GAMES Webinar 2018, Session 31 (ICCV 2017 paper presentations) | Xiaojuan Qi (The Chinese University of Hong Kong), Fangyu Liu (University of Waterloo)
Time: Thursday, January 18, 2018, 20:00 – 20:45 (Beijing time)
Title: 3D Graph Neural Networks for RGBD Semantic Segmentation
RGBD semantic segmentation requires joint reasoning about 2D appearance and 3D geometric information. In this paper we propose a 3D graph neural network (3DGNN) that builds a k-nearest-neighbor graph on top of a 3D point cloud. Each node in the graph corresponds to a set of points and is associated with a hidden representation vector initialized with an appearance feature extracted by a unary CNN from 2D images. Relying on recurrent functions, every node dynamically updates its hidden representation based on its current status and incoming messages from its neighbors. This propagation model is unrolled for a fixed number of time steps, and the final per-node representation is used to predict the semantic class of each pixel. We use back-propagation through time to train the model. Extensive experiments on the NYUD2 and SUN-RGBD datasets demonstrate the effectiveness of our approach.
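The propagation model described in the abstract can be illustrated with a small sketch: build a k-nearest-neighbor graph over 3D points, then unroll message passing for a few steps. This is a simplified, hypothetical illustration only (pure Python, scalar hidden states, mean-aggregated messages, and a fixed 50/50 blend standing in for the learned recurrent update), not the authors' implementation.

```python
import math

def knn_graph(points, k):
    """For each 3D point, return the indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors

def propagate(hidden, neighbors, steps):
    """Unroll message passing: each node averages its neighbors' hidden
    states and blends the message into its own state at every step."""
    h = [list(v) for v in hidden]
    for _ in range(steps):
        new_h = []
        for i, nbrs in enumerate(neighbors):
            msg = [sum(h[j][d] for j in nbrs) / len(nbrs)
                   for d in range(len(h[i]))]
            # toy recurrent-style update: fixed blend of state and message
            new_h.append([0.5 * h[i][d] + 0.5 * msg[d]
                          for d in range(len(h[i]))])
        h = new_h
    return h

# Tiny example: three clustered points plus one outlier, 1-D hidden states.
points = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (5, 5, 5)]
hidden = [[1.0], [0.0], [0.0], [0.0]]
nbrs = knn_graph(points, k=2)
out = propagate(hidden, nbrs, steps=3)
```

After three propagation steps the initial signal at node 0 has spread through the graph; in the real model the per-node states would then feed a classifier that predicts each pixel's semantic class.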
Speaker bio: I am currently a fourth-year Ph.D. student in the Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), supervised by Prof. Jiaya Jia. My research interests include computer vision, deep learning, and medical image analysis. Recently I have been focusing on semantic segmentation, 3D scene understanding, and image synthesis. Before that, I received my B.Eng. degree in Electronic Science and Technology from Shanghai Jiao Tong University (SJTU) in 2014.
Time: Thursday, January 18, 2018, 20:45 – 21:30 (Beijing time)
Title: 3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-scale 3D Point Clouds
Semantic parsing of large-scale 3D point clouds is an important research topic in the computer vision and remote sensing fields. Most existing approaches utilize hand-crafted features for each modality independently and combine them in a heuristic manner. They often fail to adequately consider the consistency and complementary information among features, which makes it difficult for them to capture high-level semantic structures. The features learned by most current deep learning methods can obtain high-quality image classification results. However, these methods are difficult to apply to 3D point cloud recognition because of the unorganized distribution and varying point density of the data. In this paper, we propose a 3DCNN-DQN-RNN method that fuses a 3D convolutional neural network (CNN), a Deep Q-Network (DQN), and a Residual recurrent neural network (RNN) for efficient semantic parsing of large-scale 3D point clouds. In our method, an eye window under the control of the 3D CNN and DQN can efficiently localize and segment the points of an object class. The 3D CNN and Residual RNN further extract robust and discriminative features of the points in the eye window, and thus greatly enhance the parsing accuracy on large-scale point clouds. Our method provides an automatic process that maps the raw data to the classification results, and it integrates object localization, segmentation, and classification into one framework. Experimental results demonstrate that the proposed method outperforms state-of-the-art point cloud classification methods.
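The "eye window" idea above can be pictured as an axis-aligned 3D box that an agent moves and shrinks until it concentrates on points of a single class. The sketch below is a toy, hypothetical illustration: a greedy purity score stands in for the learned DQN policy, and the CNN/RNN feature extractors are omitted entirely.

```python
def in_window(p, lo, hi):
    """Check whether 3D point p lies inside the axis-aligned box [lo, hi]."""
    return all(lo[d] <= p[d] <= hi[d] for d in range(3))

def score(points, labels, lo, hi, target):
    """Fraction of points inside the window that belong to the target class."""
    inside = [l for p, l in zip(points, labels) if in_window(p, lo, hi)]
    if not inside:
        return 0.0
    return sum(1 for l in inside if l == target) / len(inside)

def refine_window(points, labels, lo, hi, target, steps=10):
    """Greedy stand-in for DQN control: at each step, try shrinking each
    face of the window and keep the move that raises target-class purity."""
    lo, hi = list(lo), list(hi)
    for _ in range(steps):
        best_s, best_lo, best_hi = score(points, labels, lo, hi, target), lo, hi
        for d in range(3):
            delta = (hi[d] - lo[d]) / 4
            for cand_lo, cand_hi in (
                (lo[:d] + [lo[d] + delta] + lo[d + 1:], hi),   # raise lower face
                (lo, hi[:d] + [hi[d] - delta] + hi[d + 1:]),   # lower upper face
            ):
                s = score(points, labels, cand_lo, cand_hi, target)
                if s > best_s:
                    best_s, best_lo, best_hi = s, cand_lo, cand_hi
        lo, hi = best_lo, best_hi
    return lo, hi

# Tiny example: two class-1 points near the origin, one class-0 outlier.
points = [(0.5, 0.5, 0.5), (0.6, 0.5, 0.5), (3.5, 3.5, 3.5)]
labels = [1, 1, 0]
lo, hi = refine_window(points, labels, [0, 0, 0], [4, 4, 4], target=1)
```

In the actual framework the shrink/move decisions come from a DQN trained on 3D CNN features rather than this hand-written purity score; the sketch only conveys the localize-then-segment loop.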
Speaker bio: Fangyu Liu is a fourth-year undergraduate at the University of Waterloo, ON, Canada, double majoring in Computer Science and Combinatorics & Optimization and minoring in Pure Mathematics.
He is also a research assistant at the State Key Laboratory of Remote Sensing Science, Beijing Normal University, Beijing, China, focusing on applications of machine learning and deep learning methods to large-scale image and 3D scene understanding.
The "Tutorials" section of the GAMES homepage has information on "How to watch the GAMES Webinar live stream?" and "How to join the GAMES WeChat group?".