
Article In Press





Flow-based SLAM: From geometry computation to learning


Available Online:2019-09-19

Simultaneous localization and mapping (SLAM) has attracted considerable research interest from the robotics and computer-vision communities for more than 30 years. Through steady, incremental progress, modern SLAM systems now support robust online applications in real-world scenes. We examined the evolution of this powerful perception tool in detail and noticed that the insights concerning incremental computation and temporal guidance have been persistently retained. Herein, we denote this temporal continuity as a flow basis and present, for the first time, a survey that specifically focuses on the flow-based nature of SLAM, ranging from geometric computation to emerging learning techniques. We start by reviewing two essential stages of geometric computation, presenting the de facto standard pipeline and problem formulation, along with the use of temporal cues. Recently emerging techniques are then summarized, covering a wide range of areas such as learning techniques, sensor fusion, and continuous-time trajectory modeling. This survey aims to draw attention to how robust SLAM systems benefit from continuous observation, and to the topics that merit further investigation for making better use of temporal cues.
Real-time human segmentation by BowtieNet and a SLAM-based human AR system


Available Online:2019-09-10

Background: It is generally difficult to obtain accurate pose and depth for a non-rigid moving object from a single RGB camera to create augmented reality (AR). In this study, we build an AR system for a non-rigid moving human from a single RGB camera by accurately computing pose and depth, for which the two key tasks are segmentation and monocular simultaneous localization and mapping (SLAM). Most existing monocular SLAM systems are designed for static scenes, whereas in this AR system the human body is always moving and non-rigid.
To make the SLAM system suitable for a moving human, we first segment the rigid part of the human in each frame. A segmented moving body part can be regarded as a static object, and the relative motion between each moving body part and the camera can be treated as the motion of the camera. Typical SLAM systems designed for static scenes can then be applied. In the segmentation step of this AR system, we first employ the proposed BowtieNet, which adds the atrous spatial pyramid pooling (ASPP) of DeepLab between the encoder and decoder of SegNet, to segment the human in the original frame. We then use color information to extract the face from the segmented human area.
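The idea of treating a moving body part as a static object can be illustrated with rigid transforms. The following is a minimal sketch, not the authors' implementation: it assumes the pose of a segmented part in the camera frame is available as a 4x4 homogeneous matrix, and the function names (`invert_rigid`, `apparent_camera_motion`) are hypothetical.

```python
def invert_rigid(T):
    """Invert a 4x4 rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apparent_camera_motion(part_in_cam_prev, part_in_cam_curr):
    """Camera motion between two frames if the part is assumed static.

    The camera pose in the part's frame is the inverse of the part's pose
    in the camera frame, so the current camera pose expressed in the
    previous camera frame is part_in_cam_prev * inverse(part_in_cam_curr).
    """
    return matmul4(part_in_cam_prev, invert_rigid(part_in_cam_curr))


# Assumed example: the part sits 2 units in front of a fixed camera,
# then moves 1 unit along +x in the camera frame.
prev = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 2.0],
        [0.0, 0.0, 0.0, 1.0]]
curr = [[1.0, 0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 2.0],
        [0.0, 0.0, 0.0, 1.0]]
rel = apparent_camera_motion(prev, curr)
```

In this example the part's shift along +x appears, from the part's point of view, as the camera moving along -x, which is the quantity a static-scene SLAM system would estimate.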
Based on the human segmentation results and a monocular SLAM, this system can change the video background and add a virtual object to humans.
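Background replacement from a segmentation result can be sketched as per-pixel selection with a binary mask. This is an illustrative sketch only: the abstract does not describe the actual compositing step, and the function name `replace_background` and the nested-list image layout are assumptions.

```python
def replace_background(frame, mask, background):
    """Keep frame pixels where mask is 1 (human); elsewhere use background.

    frame and background are rows of (r, g, b) tuples with the same size;
    mask is rows of 0/1 values, e.g. a thresholded segmentation output.
    """
    return [
        [f if m else b for f, m, b in zip(frame_row, mask_row, bg_row)]
        for frame_row, mask_row, bg_row in zip(frame, mask, background)
    ]


# Assumed 2x2 example: the left column is human, the right column is background.
frame = [[(200, 150, 120), (10, 10, 10)],
         [(210, 160, 130), (12, 12, 12)]]
mask = [[1, 0],
        [1, 0]]
new_bg = [[(0, 0, 255), (0, 0, 255)],
          [(0, 0, 255), (0, 0, 255)]]
composited = replace_background(frame, mask, new_bg)
```

A hard 0/1 mask gives sharp edges; a real system might blend with a soft mask near the boundary instead.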
Experiments on human image segmentation datasets show that BowtieNet achieves state-of-the-art segmentation performance and runs fast enough for real-time segmentation. Experiments on videos show that the proposed AR system can robustly add a virtual object to humans and accurately change the video background.
Multi-source data-based 3D digital preservation of large-scale ancient Chinese architecture: A case report


Available Online:2019-08-20

The 3D digitalization and documentation of ancient Chinese architecture is challenging because of architectural complexity and structural delicacy. To generate complete and detailed models of this architecture, it is better to acquire, process, and fuse multi-source data instead of single-source data. In this paper, we describe our work on 3D digital preservation of ancient Chinese architecture based on multi-source data. We first briefly introduce two surveyed ancient Chinese temples, Foguang Temple and Nanchan Temple. Then, we report the data acquisition equipment we used and the multi-source data we acquired. Finally, we provide an overview of several applications we conducted based on the acquired data, including ground and aerial image fusion, image and LiDAR (light detection and ranging) data fusion, and architectural scene surface reconstruction and semantic modeling. We believe that it is necessary to involve multi-source data for the 3D digital preservation of ancient Chinese architecture, and that the work in this paper will serve as a heuristic guideline for the related research communities.