2019, 1(4): 386–410. Published Date: 2019-08-20

DOI: 10.1016/j.vrih.2019.07.002

Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality

Abstract:

Although VSLAM/VISLAM has achieved great success, it remains difficult to quantitatively evaluate the localization results of different kinds of SLAM systems from the perspective of augmented reality (AR) due to the lack of an appropriate benchmark. In practical AR applications, a variety of challenging situations (e.g., fast motion, strong rotation, severe motion blur, dynamic interference) are easily encountered, since a home user may not move the AR device carefully and the real environment may be quite complex. In addition, for a good AR experience, the frequency of camera tracking loss should be minimized, and recovery from a failure state should be fast and accurate. Existing SLAM datasets/benchmarks generally evaluate only pose accuracy, and their camera motions are somewhat simple and do not fit the common cases in mobile AR applications well. With the above motivation, we build a new visual-inertial dataset, together with a series of evaluation criteria, for AR. We also review the existing monocular VSLAM/VISLAM approaches with detailed analyses and comparisons. In particular, we select eight representative monocular VSLAM/VISLAM approaches/systems and quantitatively evaluate them on our benchmark. Our dataset, sample code, and the corresponding evaluation tools are available at the benchmark website http://www.zjucvg.net/eval-vislam/.
Keywords: Visual-inertial SLAM; Odometry; Tracking; Localization; Mapping; Augmented reality
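
As a concrete illustration of the pose-accuracy part of such an evaluation: because a monocular trajectory is recovered only up to scale, benchmarks typically align the estimated positions to the ground truth with the least-squares similarity transform of Umeyama [58] and then report the root-mean-square absolute trajectory error (ATE). The Python sketch below shows this standard procedure for two timestamp-associated (N, 3) position arrays; it is a minimal sketch of the common metric, not the exact evaluation code of this benchmark (the official tools are on the benchmark website).

import numpy as np

def umeyama_alignment(est_xyz, gt_xyz):
    # Least-squares similarity transform (s, R, t) mapping the estimated
    # positions onto the ground truth (Umeyama, 1991). Both inputs are
    # (N, 3) arrays of positions already associated by timestamp.
    n = est_xyz.shape[0]
    mu_e, mu_g = est_xyz.mean(axis=0), gt_xyz.mean(axis=0)
    x, y = est_xyz - mu_e, gt_xyz - mu_g
    cov = y.T @ x / n                        # 3x3 cross-covariance
    U, d, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # guard against a reflection
    R = U @ S @ Vt
    s = n * np.trace(np.diag(d) @ S) / (x ** 2).sum()
    t = mu_g - s * R @ mu_e
    return s, R, t

def ate_rmse(est_xyz, gt_xyz):
    # RMSE of the absolute trajectory error after similarity alignment.
    # For visual-inertial systems, whose metric scale is observable,
    # s can instead be fixed to 1 (a rigid SE(3) alignment).
    s, R, t = umeyama_alignment(est_xyz, gt_xyz)
    aligned = s * (R @ est_xyz.T).T + t
    return np.sqrt(((aligned - gt_xyz) ** 2).sum(axis=1).mean())

For example, ate_rmse(est, gt) returns the error in the units of the ground truth (typically meters); the function names and array layout here are illustrative assumptions, not part of the published toolkit.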

Cite this article:

Jinyu LI, Bangbang YANG, Danpeng CHEN, Nan WANG, Guofeng ZHANG, Hujun BAO. Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Virtual Reality & Intelligent Hardware, 2019, 1(4): 386–410. DOI: 10.1016/j.vrih.2019.07.002

1. Cadena C, Carlone L, Carrillo H, Latif Y, Scaramuzza D, Neira J, Reid I, Leonard J J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 2016, 32(6): 1309–1332 DOI:10.1109/tro.2016.2624754

2. Fuentes-Pacheco J, Ruiz-Ascencio J, Rendón-Mancha J M. Visual simultaneous localization and mapping: A survey. Artificial Intelligence Review, 2015, 43(1): 55–81 DOI:10.1007/s10462-012-9365-8

3. Durrant-Whyte H, Bailey T. Simultaneous localization and mapping: Part I. IEEE Robotics & Automation Magazine, 2006, 13(2): 99–110 DOI:10.1109/mra.2006.1638022

4. Bailey T, Durrant-Whyte H. Simultaneous localization and mapping (SLAM): Part II. IEEE Robotics & Automation Magazine, 2006, 13(3): 108–117 DOI:10.1109/mra.2006.1678144

5. Liu H M, Zhang G F, Bao H J. A survey of monocular simultaneous localization and mapping. Journal of Computer-Aided Design & Computer Graphics, 2016, 28(6): 855−868 DOI:10.3969/j.issn.1003-9775.2016.06.001

6. Burri M, Nikolic J, Gohl P, Schneider T, Rehder J, Omari S, Achtelik M W, Siegwart R. The EuRoC micro aerial vehicle datasets. The International Journal of Robotics Research, 2016, 35(10): 1157–1163 DOI:10.1177/0278364915620033

7. Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In: IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: 2012, 3354–3361 DOI:10.1109/CVPR.2012.6248074

8. Hartley R, Zisserman A. Three-View Geometry. Multiple View Geometry in Computer Vision. Cambridge: Cambridge University Press, 2004, 363−364 DOI:10.1017/cbo9780511811685.020

9. Ma Y, Soatto S, Kosecka J, Sastry S S. An Invitation to 3-D Vision: From Images to Geometric Models. Springer Science & Business Media, 2012, 26

10. Triggs B, McLauchlan P F, Hartley R I, Fitzgibbon A W. Bundle Adjustment—A Modern Synthesis. Vision Algorithms: Theory and Practice. Berlin, Heidelberg: Springer Berlin Heidelberg, 2000: 298−372 DOI:10.1007/3-540-44480-7_21

11. Sola J. Quaternion kinematics for the error-state KF. Technical report, 2012

12. Forster C, Carlone L, Dellaert F, Scaramuzza D. On-manifold preintegration for real-time visual-inertial odometry. IEEE Transactions on Robotics, 2017, 33(1): 1–21 DOI:10.1109/tro.2016.2597321

13. Eckenhoff K, Geneva P, Huang G. Continuous preintegration theory for graph-based visual-inertial navigation. arXiv:1805.02774, 2018

14. Davison A J, Reid I D, Molton N D, Stasse O. MonoSLAM: real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1052–1067 DOI:10.1109/tpami.2007.1049

15. Mourikis A I, Roumeliotis S I. A multi-state constraint Kalman filter for vision-aided inertial navigation. In: IEEE International Conference on Robotics and Automation. Roma, Italy: 2007, 3565–3572 DOI:10.1109/ROBOT.2007.364024

16. Li M Y, Mourikis A I. Improving the accuracy of EKF-based visual-inertial odometry. In: IEEE International Conference on Robotics and Automation. Saint Paul, MN, USA: 2012, 828–835 DOI:10.1109/ICRA.2012.6225229

17. Huang G P, Mourikis A I, Roumeliotis S I. Analysis and improvement of the consistency of extended Kalman filter based SLAM. In: IEEE International Conference on Robotics and Automation. Pasadena, CA, USA: 2008, 473–479 DOI:10.1109/ROBOT.2008.4543252

18. Jones E S, Soatto S. Visual-inertial navigation, mapping and localization: A scalable real-time causal approach. The International Journal of Robotics Research, 2011, 30(4): 407–430 DOI:10.1177/0278364910388963

19. Huang G P, Mourikis A I, Roumeliotis S I. An observability-constrained sliding window filter for SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems. San Francisco, CA, USA: 2011, 65–72 DOI:10.1109/IROS.2011.6095161

20. Huang G P, Mourikis A I, Roumeliotis S I. A quadratic-complexity observability-constrained unscented Kalman filter for SLAM. IEEE Transactions on Robotics, 2013, 29(5): 1226–1243 DOI:10.1109/tro.2013.2267991

21. Barrau A, Bonnabel S. An EKF-SLAM algorithm with consistency properties. arXiv:1510.06263, 2015

22. Strasdat H, Montiel J M M, Davison A J. Visual SLAM: why filter? Image and Vision Computing, 2012, 30(2): 65–77 DOI:10.1016/j.imavis.2012.02.009

23. Dellaert F, Kaess M. Square root SAM: Simultaneous localization and mapping via square root information smoothing. The International Journal of Robotics Research, 2006, 25(12): 1181–1203 DOI:10.1177/0278364906072768

24. Thrun S, Montemerlo M. The graph SLAM algorithm with applications to large-scale mapping of urban structures. The International Journal of Robotics Research, 2006, 25(5/6): 403–429 DOI:10.1177/0278364906065387

25. Chen Y, Davis T A, Hager W W, Rajamanickam S. Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Transactions on Mathematical Software, 2008, 35(3): 1−14 DOI:10.1145/1391989.1391995

26. Davis T A, Gilbert J R, Larimore S I, Ng E G. A column approximate minimum degree ordering algorithm. ACM Transactions on Mathematical Software, 2004, 30(3): 353−376

27. Kaess M, Ranganathan A, Dellaert F. iSAM: incremental smoothing and mapping. IEEE Transactions on Robotics, 2008, 24(6): 1365–1378 DOI:10.1109/tro.2008.2006706

28. Kaess M, Johannsson H, Roberts R, Ila V, Leonard J J, Dellaert F. iSAM2: Incremental smoothing and mapping using the Bayes tree. The International Journal of Robotics Research, 2012, 31(2): 216–235 DOI:10.1177/0278364911430419

29. Ila V, Polok L, Solony M, Svoboda P. SLAM++: A highly efficient and temporally scalable incremental SLAM framework. The International Journal of Robotics Research, 2017, 36(2): 210–230 DOI:10.1177/0278364917691110

30. Ila V, Polok L, Solony M, Istenic K. Fast incremental bundle adjustment with covariance recovery. In: International Conference on 3D Vision (3DV). Qingdao, China: 2017, 175–184 DOI:10.1109/3DV.2017.00029

31. Liu H M, Chen M Y, Zhang G F, Bao H J, Bao Y Z. ICE-BA: incremental, consistent and efficient bundle adjustment for visual-inertial SLAM. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: 2018, 1974–1982 DOI:10.1109/CVPR.2018.00211

32. Klein G, Murray D. Parallel tracking and mapping for small AR workspaces. In: 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan: 2007, 225–234 DOI:10.1109/ISMAR.2007.4538852

33. Mur-Artal R, Montiel J M M, Tardos J D. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 2015, 31(5): 1147–1163 DOI:10.1109/tro.2015.2463671

34. Mur-Artal R, Tardos J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 2017, 33(5): 1255–1262 DOI:10.1109/tro.2017.2705103

35. Mur-Artal R, Tardos J D. Visual-inertial monocular SLAM with map reuse. IEEE Robotics and Automation Letters, 2017, 2(2): 796–803 DOI:10.1109/lra.2017.2653359

36. Leutenegger S, Lynen S, Bosse M, Siegwart R, Furgale P. Keyframe-based visual-inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 2015, 34(3): 314–334 DOI:10.1177/0278364914554813

37. Qin T, Li P L, Shen S J. VINS-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 2018, 34(4): 1004–1020 DOI:10.1109/tro.2018.2853729

38. Lu F, Milios E. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 1997, 4(4): 333–349 DOI:10.1023/A:1008854305733

39. Li P L, Qin T, Hu B T, Zhu F Y, Shen S J. Monocular visual-inertial state estimation for mobile augmented reality. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Nantes, France: 2017, 11–21 DOI:10.1109/ISMAR.2017.18

40. Galvez-López D, Tardos J D. Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 2012, 28(5): 1188–1197 DOI:10.1109/tro.2012.2197158

41. Engel J, Koltun V, Cremers D. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611–625 DOI:10.1109/tpami.2017.2658577

42. Jin H L, Favaro P, Soatto S. A semi-direct approach to structure from motion. The Visual Computer, 2003, 19(6): 377–394 DOI:10.1007/s00371-003-0202-6

43. Newcombe R A, Lovegrove S J, Davison A J. DTAM: Dense tracking and mapping in real-time. In: International Conference on Computer Vision. Barcelona, Spain: 2011, 2320–2327 DOI:10.1109/ICCV.2011.6126513

44. Engel J, Schöps T, Cremers D. LSD-SLAM: Large-Scale Direct Monocular SLAM. Computer Vision–ECCV 2014. Cham: Springer International Publishing, 2014: 834−849 DOI:10.1007/978-3-319-10605-2_54

45. Newcombe R A, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison A J, Kohli P, Shotton J, Hodges S, Fitzgibbon A. KinectFusion: Real-time dense surface mapping and tracking. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Basel, Switzerland: 2011, 127−136

46. Whelan T, Salas-Moreno R F, Glocker B, Davison A J, Leutenegger S. ElasticFusion: Real-time dense SLAM and light source estimation. The International Journal of Robotics Research, 2016, 35(14): 1697–1716 DOI:10.1177/0278364916669237

47. Zhu A Z, Atanasov N, Daniilidis K. Event-based visual inertial odometry. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: 2017, 5816–5824 DOI:10.1109/CVPR.2017.616

48. Zhou H Z, Zou D P, Pei L, Ying R D, Liu P L, Yu W X. StructSLAM: visual SLAM with building structure lines. IEEE Transactions on Vehicular Technology, 2015, 64(4): 1364−1375 DOI:10.1109/tvt.2015.2388780

49. Hsiao M, Westman E, Kaess M. Dense planar-inertial SLAM with structural constraints. In: IEEE International Conference on Robotics and Automation (ICRA). Brisbane, QLD, Australia: 2018, 6521−6528 DOI:10.1109/ICRA.2018.8461094

50. Radwan N, Valada A, Burgard W. VLocNet++: deep multitask learning for semantic visual localization and odometry. IEEE Robotics and Automation Letters, 2018, 3(4): 4407−4414 DOI:10.1109/lra.2018.2869640

51. Wang S, Clark R, Wen H K, Trigoni N. DeepVO: Towards end-to-end visual odometry with deep recurrent convolutional neural networks. In: IEEE International Conference on Robotics and Automation (ICRA). Singapore: 2017, 2043–2050 DOI:10.1109/ICRA.2017.7989236

52. Lianos K N, Schönberger J L, Pollefeys M, Sattler T. VSO: Visual Semantic Odometry. Computer Vision—ECCV 2018. Cham: Springer International Publishing, 2018: 246−263. DOI:10.1007/978-3-030-01225-0_15

53. Schubert D, Goll T, Demmel N, Usenko V, Stückler J, Cremers D. The TUM VI benchmark for evaluating visual-inertial odometry. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid, Spain: 2018, 1680−1687 DOI:10.1109/IROS.2018.8593419

54. Pfrommer B, Sanket N, Daniilidis K, Cleveland J. PennCOSYVIO: A challenging visual inertial odometry benchmark. In: IEEE International Conference on Robotics and Automation (ICRA). Singapore: 2017, 3847–3854 DOI:10.1109/ICRA.2017.7989443

55. Cortés S, Solin A, Rahtu E, Kannala J. ADVIO: An Authentic Dataset for Visual-Inertial Odometry. Computer Vision—ECCV 2018. Cham: Springer International Publishing, 2018, 425−440 DOI:10.1007/978-3-030-01249-6_26

56. Furgale P, Rehder J, Siegwart R. Unified temporal and spatial calibration for multi-sensor systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: 2013, 1280–1286 DOI:10.1109/IROS.2013.6696514

57. Olson E. AprilTag: A robust and flexible visual fiducial system. In: IEEE International Conference on Robotics and Automation. Shanghai, China: 2011, 3400–3407 DOI:10.1109/ICRA.2011.5979561

58. Umeyama S. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991, 13(4): 376–380 DOI:10.1109/34.88573

59. Weiss S, Achtelik M W, Lynen S, Chli M, Siegwart R. Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments. In: IEEE International Conference on Robotics and Automation. Saint Paul, MN, USA: 2012, 957–964 DOI:10.1109/ICRA.2012.6225147

60. Delmerico J, Scaramuzza D. A benchmark comparison of monocular visual-inertial odometry algorithms for flying robots. In: IEEE International Conference on Robotics and Automation (ICRA). Brisbane, QLD, Australia: 2018, 2502–2509 DOI:10.1109/ICRA.2018.8460664

61. Qin T, Shen S. Online temporal calibration for monocular visual-inertial systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid, Spain: 2018, 3662–3669 DOI:10.1109/IROS.2018.8593603

62. Klein G, Murray D. Parallel tracking and mapping on a camera phone. In: 8th IEEE International Symposium on Mixed and Augmented Reality. Orlando, FL, USA: 2009, 83–86 DOI:10.1109/ISMAR.2009.5336495
