Paper title
Neural Camera Models
Paper authors
Paper abstract
Modern computer vision has moved beyond the domain of internet photo collections and into the physical world, guiding camera-equipped robots and autonomous cars through unstructured environments. To enable these embodied agents to interact with real-world objects, cameras are increasingly being used as depth sensors, reconstructing the environment for a variety of downstream reasoning tasks. Machine-learning-aided depth perception, or depth estimation, predicts for each pixel in an image the distance to the imaged scene point. While impressive strides have been made in depth estimation, significant challenges remain: (1) ground truth depth labels are difficult and expensive to collect at scale, (2) camera information is typically assumed to be known, but is often unreliable, and (3) restrictive camera assumptions are common, even though a great variety of camera types and lenses are used in practice. In this thesis, we focus on relaxing these assumptions, and describe contributions toward the ultimate goal of turning cameras into truly generic depth sensors.