Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time

Authors

Liang Zhang, Justin Lieffers, Adarsh Pyarelal

Abstract

When performing complex tasks, humans naturally reason at multiple temporal and spatial resolutions simultaneously. We contend that for an artificially intelligent agent to effectively model human teammates, i.e., demonstrate computational theory of mind (ToM), it should do the same. In this paper, we present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time and evaluate it on data collected from human subjects performing simulated urban search and rescue (USAR) missions in a Minecraft-based environment. Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously. The high-resolution extractor encodes dynamically changing goals robustly by taking as input the Manhattan distance difference between the humans' Minecraft avatars and candidate goals in the environment for the latest few actions, computed from a high-resolution gridworld representation. In contrast, the low-resolution extractor encodes participants' historical behavior using a historical state matrix computed from a low-resolution graph representation. Through supervised learning, our model acquires a robust prior for human behavior prediction, and can effectively deal with long-term observations. Our experimental results demonstrate that our method significantly improves prediction accuracy compared to approaches that only use high-resolution information.
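
As a rough illustration of the high-resolution input described in the abstract, the sketch below computes, for each of the latest few moves, the change in Manhattan distance between the avatar and every candidate goal on a grid. It is not the authors' implementation: the function names, the (x, y) cell encoding, the `window` parameter, and the sign convention (negative means the avatar moved closer) are all assumptions made for this example.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # assumed (x, y) grid coordinates


def manhattan(a: Cell, b: Cell) -> int:
    """Manhattan (L1) distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def distance_difference_features(
    trajectory: List[Cell], goals: List[Cell], window: int
) -> List[List[int]]:
    """Per-move change in distance to each goal over the last `window` moves.

    Returns one row per move and one column per candidate goal.
    A negative entry means that move brought the avatar closer to that goal.
    """
    # window moves require window + 1 consecutive positions
    recent = trajectory[-(window + 1):]
    features = []
    for prev, curr in zip(recent, recent[1:]):
        features.append(
            [manhattan(curr, g) - manhattan(prev, g) for g in goals]
        )
    return features


# Hypothetical data: an avatar path and two candidate goals.
path = [(0, 0), (1, 0), (2, 0), (2, 1)]
candidate_goals = [(3, 0), (0, 3)]
feats = distance_difference_features(path, candidate_goals, window=2)
print(feats)
```

Under these assumptions, a run of negative values in one column suggests the participant is heading toward that goal, which is the kind of signal a downstream predictor could consume.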
