A pinboard by
Yifei Shi

Visiting PhD student, Princeton University


Building a real-time 3D reconstruction system by leveraging deep neural networks

RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it remains challenging to register RGB-D images from a handheld camera over a long video sequence into a globally consistent 3D model. It has been shown that structured registration can significantly alleviate the drift problem when it is performed over more reliable geometric proxies such as planes. However, robust plane detection algorithms are mostly confined to large planar structures (e.g., walls, tabletops), which greatly limits the utility of the structured approach, especially in cluttered scenes. Working with small planar patches (e.g., screens, box faces) gains flexibility and generality, but renders their detection and matching infeasible for traditional geometric methods.
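To make the planar-proxy idea concrete, here is a minimal sketch of how a planar patch can be parameterized from a set of 3D points (e.g., back-projected from an RGB-D frame). The function name `fit_plane` is hypothetical and not part of the described system; it uses the standard PCA least-squares fit, where the plane normal is the direction of least variance.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit to an (N, 3) array of 3D points.

    Returns (normal, centroid). The unit normal is the eigenvector of
    the point scatter matrix with the smallest eigenvalue (PCA fit).
    Hypothetical helper for illustration, not the paper's detector.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    # eigh returns eigenvalues in ascending order, so column 0 of the
    # eigenvector matrix is the direction of least variance.
    _, vecs = np.linalg.eigh(centered.T @ centered)
    normal = vecs[:, 0]
    return normal, centroid
```

A geometric fit like this works well on large, clean planes; the point of the proposed learned detector is precisely to handle the small, noisy patches where such fits become unreliable.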

We opt to harness deep neural networks to address these two tasks robustly. To achieve this, we propose two novel deep architectures: one for planar patch detection in an RGB-D frame, and one for measuring patch similarity across two frames. Fast network inference keeps the system at a real-time frame rate. Based on the extracted planar patches and their correspondences, we aim to realize both local frame-to-frame registration and a global optimization that accounts for temporal coherence.
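Once patch correspondences between two frames are available, the local frame-to-frame step reduces to estimating a rigid transform from matched patches. The sketch below aligns patch centroids with the standard Kabsch/SVD solution; the function name `register_patches` is an assumption for illustration, and a real system would additionally exploit patch normals and weights rather than centroids alone.

```python
import numpy as np

def register_patches(src_centroids, dst_centroids):
    """Least-squares rigid transform (R, t) with dst ~ R @ src + t.

    Classic Kabsch/SVD alignment of matched 3D points. Illustrative
    sketch: it aligns patch centroids only and needs at least three
    non-collinear correspondences; it is not the full registration
    pipeline described above.
    """
    src_mean = src_centroids.mean(axis=0)
    dst_mean = dst_centroids.mean(axis=0)
    # Cross-covariance of the centered correspondences.
    H = (src_centroids - src_mean).T @ (dst_centroids - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # correct a reflection to a proper rotation
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```

In a full system, such pairwise estimates would serve as the local registration, with the global optimization jointly refining all poses over the sequence for temporal coherence.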