GAMES Webinar 2019 – Session 84 (CVPR 2018 3D Vision Paper Talks) | Yaoqing Yang (Carnegie Mellon University), Yinda Zhang (Princeton University)
Talk title: From Pixels to Scene: Recovering 3D Geometry and Semantics for Indoor Environments
Understanding the 3D geometry and semantics of the surrounding environment is in critically high demand for many applications, such as autonomous driving, robotics, and augmented reality. However, it is extremely challenging because of low-quality depth measurements caused by sensor failures and noise, limited access to ground truth data, and cluttered scenes with heavy occlusions and intervening objects. In this presentation, I will introduce a full spectrum of 3D scene understanding works that address these challenges. Starting from estimating a depth map, which is one of the most important immediate measurements of the 3D geometry of the scene, we introduce a learning-based active stereo system that learns in a self-supervised manner and reduces the disparity error to one tenth of that of other canonical stereo systems. To further handle missing depth caused by sensor failures, we propose a method to effectively complete the depth map using information from an aligned color image. Beyond per-pixel depth, we then attempt to predict other high-level semantics at each pixel, such as surface normals and object boundaries. However, given the lack of large-scale supervision, we design a synthetic data generation framework that creates photo-realistic color renderings and a variety of accurate pixel-wise ground truths to facilitate the learning process and improve performance on real data. Finally, we pursue holistic scene understanding by estimating a 3D representation of the scene, in which objects and the room layout are represented using 3D bounding boxes and planar surfaces, respectively. We propose methods to produce such a representation from either a single color panorama or a depth image by leveraging scene context. On the whole, these methods produce an understanding of both 3D geometry and semantics, from the most fine-grained pixel level to the holistic scene scale, which builds a foundation for, and could possibly inspire, future work on 3D scene understanding.
Yinda Zhang received his Ph.D. in Computer Science from Princeton University, advised by Professor Thomas Funkhouser. Before that, he received a Bachelor's degree from the Department of Automation at Tsinghua University, and a Master's degree from the Department of ECE at the National University of Singapore, co-supervised by Prof. Ping Tan and Prof. Shuicheng Yan. His research mainly focuses on machine learning, computer vision, and computer graphics. Recently, he has been working on 3D scene understanding, where the goal is to measure the 3D geometry and semantics of the surrounding environment by leveraging deep learning technology.
The "User Guide" section of the GAMES homepage has information on "How to watch the GAMES Webinar live stream?" and "How to join the GAMES WeChat group?".