
2020,  2 (1):   56 - 69   Published Date：2020-2-20

DOI: 10.1016/j.vrih.2019.12.004

Abstract

Background　In recent decades, unmanned aerial vehicles (UAVs) have developed rapidly and been widely applied in many domains, including photography, reconstruction, monitoring, and search and rescue. In such applications, one key issue is path and view planning, which tells UAVs exactly where to fly and how to search. Methods　With specific consideration for three popular UAV applications (scene reconstruction, environment exploration, and aerial cinematography), we present a survey that should assist researchers in positioning and evaluating their works in the context of existing solutions. Results/Conclusions　This survey should also help newcomers and practitioners in related fields quickly gain an overview of the vast literature. In addition to the current research status, we analyze and elaborate on the advantages, disadvantages, and potential explorative trends for each application domain.

Content

1 Introduction
The maturity of drone technology has spawned a wide range of novel drone applications. Consumer unmanned aerial vehicles (UAVs) are light and portable with the characteristics of interactivity and expansibility. Additionally, UAVs have a high degree of freedom, allowing them to capture many perspectives that cannot be captured by ground equipment, thereby opening new avenues for research.
In recent years, many studies related to drones have been conducted in the fields of urban reconstruction, scene exploration, cinematography, and biological investigation. Figure 1 presents an overview of three typical applications. By utilizing UAVs, researchers can obtain data very rapidly. However, one challenging issue is path and view planning. When planning flights for drones, different missions require different flight paths and camera control policies. Additionally, for flight safety, the flight trajectory must be free of obstacles. Therefore, it is necessary for a UAV to maintain an accurate perception of the relationship between itself and the surrounding environment at all times. This perception should include its own position and an environmental map based on a combination of visual systems and global positioning system (GPS) signals.
A complete overview of issues related to UAV path and view planning in the field of computer graphics is required. Such an overview must cover the current situation and field limitations to guide immediate future research. In this paper, we present a survey of the planning problem from the perspective of graphics applications of UAVs. We expect to provide a comprehensive understanding of this domain to many different research communities to facilitate progress in this direction.
In Section 2, we introduce works published on the topic of planning for scene reconstruction. Section 3 provides a thorough review of environment exploration with obstacle avoidance research. Section 4 summarizes related works on aerial cinematography. Our conclusions are provided in Section 5.
2 Planning for scene reconstruction
In recent years, there has been an increasing demand for large-scale scene models in many industries[1,2,3]. For example, in games and simulation software, to make users feel more immersed, reconstructed city models are utilized to make scenes feel as real as possible. A complete urban model can provide an accurate three-dimensional environment for automatic driving and map navigation, which serves as the basis for real-time decision making. Scene reconstruction models provide data sources for landscaping planning, disaster analysis, and emergency rescue analysis.
To satisfy these requirements, large-scale scene reconstruction has made significant progress by utilizing light detection and ranging (LiDAR) points, satellite images, and aerial images to generate 3D models. However, many challenges remain. The method of utilizing LiDAR points is limited by high-cost sensors, incompleteness, and sparse point clouds. Additionally, LiDAR systems mounted on ground vehicles cannot capture tall buildings accurately, while LiDAR systems mounted on UAVs have difficulty observing the ground or the vertical faces of buildings. Cheng et al.[4] and Schwarz et al.[5] proposed the merging of LiDAR data from ground vehicles and UAVs, but this method raises the difficulty of registration. Qin et al.[6,7] used the semi-global matching (SGM) algorithm to generate dense point clouds from rational polynomial coefficient satellite stereo images; however, there is still room for improvement in accuracy and completeness. With the boom in consumer UAVs and mature image-based modeling techniques, methods based on aerial images are becoming increasingly popular. Existing approaches can be classified into two main groups: off-the-shelf flight planners and explore-then-exploit methods[8]. In Figure 2, a UAV collects images according to the computed views along the path (left), then utilizes the conventional multi-view stereo (MVS) algorithm to generate a 3D reconstruction (right) of that region.
This technique can be represented mathematically based on the fact that images captured along a trajectory generated under certain constraints will result in a high-quality reconstruction (Equation 1). Different constraints represent different research directions:

$\max_{V} \; Q(S, V) \quad \mathrm{s.t.} \quad l(V) \le l_h, \quad t(V) \le t_h, \quad r(V) \le r_h,$

where $Q(S, V)$ denotes the total reconstructability of all surface samples in $S$ under the view set $V$, $l(V) \le l_h$ is the trajectory length constraint, $t(V) \le t_h$ is the flight time constraint, and $r(V) \le r_h$ is the view set redundancy constraint.
2.1　Off-the-shelf flight planners
Commercial flight planners, such as Pix4D[9], Altizure[10], and DJI-TERRA[11], are common and simple solutions. These planners have limited parameter settings. Users simply select a preset mode to generate a simple UAV path, such as a zig-zag or circle, to cover the area to be reconstructed, regardless of the structure and distribution of buildings in the scene, as shown in Figure 3. Owing to this lack of flexibility, the captured scene information is often insufficient, especially in dense areas, where the lower portions of buildings are obstructed. These issues make it difficult to obtain a high-quality 3D reconstruction.
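As a concrete illustration, the preset zig-zag (lawnmower) pattern produced by such planners can be sketched as follows; the rectangular bounds, line spacing, and fixed nadir altitude are illustrative assumptions, not any vendor's actual implementation.

```python
# Minimal sketch of a preset "zig-zag" (lawnmower) coverage pattern over a
# rectangular area, as generated by off-the-shelf flight planners.

def zigzag_path(x_min, x_max, y_min, y_max, spacing, altitude):
    """Return waypoints (x, y, z) sweeping the rectangle in parallel lines."""
    waypoints = []
    y = y_min
    left_to_right = True
    while y <= y_max:
        if left_to_right:
            waypoints.append((x_min, y, altitude))
            waypoints.append((x_max, y, altitude))
        else:
            waypoints.append((x_max, y, altitude))
            waypoints.append((x_min, y, altitude))
        left_to_right = not left_to_right  # reverse direction each sweep line
        y += spacing
    return waypoints

# A 100 m x 40 m area swept with 20 m line spacing at a fixed 50 m altitude.
path = zigzag_path(0, 100, 0, 40, spacing=20, altitude=50)
```

Note that the path depends only on the area bounds, which is exactly why such planners ignore the structure and distribution of buildings in the scene.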
2.2　Explore-then-exploit techniques
Explore-then-exploit techniques[8,12] involve two phases. In the “explore” phase, an initial path is planned with a uniform distribution of views for the area to be reconstructed, similar to an off-the-shelf flight planner. The views are always nadir views. Next, images are quickly collected by a drone and utilized by open-source (MVE[13], COLMAP[14]) or commercial software (Pix4D[9], RealityCapture[15]) to generate an initial rough model called a geometric proxy. In the “exploit” phase, a new trajectory is planned according to the geometric proxy and the collection and reconstruction processes are repeated to generate a high-quality 3D model. Many methods in this group differ in their implementation of the second phase.
Hepp et al.[16] formulated this problem as maximizing information and minimizing unobserved space with constraints for stereo matching, travel time, and safety. They combined two optimization steps of view selection and the travelling salesman problem for path planning into a single submodular objective function[17]. This optimization process does not guarantee convergence and requires the user to set termination conditions, such as running time. Roberts et al.[12] proposed a novel scene coverage model. Based on this model, they formulated a submodular optimization method to determine the optimal orientation for each candidate camera. An integer linear program was then executed to select an optimal trajectory from the candidate viewpoints such that scene coverage was maximized and path length was constrained (to account for the limited battery life of a drone). Smith et al.[18] defined a model for evaluating the pairwise view contribution of views (V1, V2) to the reconstructability of a sample in a geometric proxy according to the principles of the MVS algorithm. This model ensures that camera positions and orientations are adequate for multi-view reconstruction. Additionally, they provided a synthetic benchmark for image-based reconstruction. This benchmark can capture images based on planned paths and quantitatively evaluate reconstruction results. Peng et al.[8] proposed a visibility cone model to define the coverage of a scene. This model utilizes skeleton views on a view manifold to perform path planning to cover low-quality regions. Unlike the aforementioned methods, their approach can perform more than one exploitation pass. Additionally, Almadhoun et al.[19], Mendoza et al.[20], and Huang et al.[21] utilized the next-best-view algorithm to generate waypoints in real time.
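The submodular formulations above exploit the fact that greedily maximizing a coverage-like objective enjoys approximation guarantees[17]. The following is a minimal sketch of such a greedy selection loop under a view-count budget; the toy per-view coverage sets stand in for the real reconstructability models of [12,16] and are purely illustrative.

```python
def greedy_view_selection(candidate_views, covered_by, budget):
    """Greedily pick views that maximize the marginal coverage gain.

    covered_by: dict mapping each view to the set of surface samples it
    observes (a toy stand-in for a reconstructability model)."""
    selected, covered = [], set()
    while len(selected) < budget:
        best, best_gain = None, 0
        for v in candidate_views:
            if v in selected:
                continue
            gain = len(covered_by[v] - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no remaining view adds coverage
            break
        selected.append(best)
        covered |= covered_by[best]
    return selected, covered

# Toy candidate views with hand-made coverage sets (hypothetical data).
covered_by = {
    "v1": {1, 2, 3},
    "v2": {3, 4},
    "v3": {5},
    "v4": {1, 2},
}
views, covered = greedy_view_selection(list(covered_by), covered_by, budget=3)
```

Because the coverage function is monotone submodular, this greedy loop achieves a (1 − 1/e) approximation of the optimal budgeted selection, which is what makes the single-objective formulations tractable.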
2.3　Summary
Off-the-shelf planners do not consider the geometry or distribution of buildings within an area, meaning their results lack significant valid information. Additionally, such methods are inflexible. Explore-then-exploit methods require multiple visits and the completion of several operations in sequential order, which leads to a long data capture process and high computational power requirements at the reconstruction site. Some methods, such as that proposed in [18], adjust a fixed number of view positions and orientations, causing the final view set to contain significant redundancy. Planning a trajectory that captures good images with a less redundant view set remains a challenging problem.
In the second phase of explore-then-exploit techniques[12,16,18], the optimization of planning under constraints is an NP-hard problem. Decomposing and transforming this problem is very challenging. One method is to perform decomposition to minimize view redundancy first and maximize reconstructability second. This can significantly reduce the number of viewpoints required for acquisition while maintaining comparable reconstruction quality.
Because the core of most algorithms that utilize images to reconstruct 3D models is based on the MVS algorithm, the large numbers of glass curtain walls in cities significantly affect final reconstruction results. According to Xie et al.[22], the effects of light should also be considered.
3 Environment exploration with obstacle avoidance
Methods for exploring scenes have a wide range of applications, including target searching, accident rescue, biological surveys, and monitoring[23]. Most exploration studies have been conducted utilizing ground vehicles, but ground vehicles are subject to stricter terrain requirements and lower search efficiency.
With the boom in consumer UAVs, their large fields of view, mobility, and low cost have made them more suitable for exploration. The following technologies are involved when a UAV generates informed trajectories to explore unknown scenes:
Sensing: perception of the surrounding environment.
Mapping: representation of the scene utilizing OctoMap[24] or other volumetric maps, which should be updated in real time.
Localizing: estimation of the position of the UAV itself. This can typically be treated jointly with Mapping as a simultaneous localization and mapping problem (ORB_SLAM2[25]).
Planning: path generation based on a given map.
These four technologies also illustrate the challenges of UAV exploration. During UAV exploration, how to represent the surrounding environment dynamically and accurately, how to plan an optimal trajectory for different missions, and how to avoid obstacles efficiently and safely are all difficulties requiring additional research. In Figure 4, we present a typical 3D scene navigation process. In a simulated environment, after specifying a target, the UAV utilizes onboard sensors to collect data continuously and update the octree. The path-planning algorithm then generates a smooth and collision-free path in real time.
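The planning step in the scenario above can be sketched with a breadth-first search over an occupancy grid; the 4-connected 2D grid and the tiny map are illustrative assumptions, whereas a real system would plan on a 3D OctoMap and smooth the resulting path.

```python
from collections import deque

def bfs_path(grid, start, goal):
    """Shortest collision-free 4-connected path on an occupancy grid.

    grid[r][c] == 1 marks an obstacle cell; 0 marks free space."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}          # visited set doubling as a back-pointer map
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:          # reconstruct the path by walking back-pointers
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                   # goal unreachable

# A toy map with a wall forcing a detour around the right side.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = bfs_path(grid, (0, 0), (2, 0))
```

On a uniform-cost grid, BFS returns a shortest path; in practice the grid would be updated online from sensor data and the discrete path post-processed into a smooth, dynamically feasible trajectory.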
In this section, we mainly discuss UAV exploration based on two goals: reaching a defined target and full coverage of the region.
3.1　Reaching a defined target
The problem definition is as follows. Given start and target positions $p_s$ and $p_e$, respectively, the UAV must navigate from $p_s$ to $p_e$ along a collision-free path under different constraints, such as a finite flight time $t_h$ or a finite path length $l_h$. There have been many studies in this area. We attempt to divide such studies into classical methods and learning methods.
In non-learning methods, a global occupancy map is required to perform path planning to a target. Borenstein et al.[26] and Kim et al.[27] utilized LiDAR data, depth information, or structure from motion to acquire a 3D geometric map. Mur-Artal et al.[28] and Mur-Artal et al.[25] utilized features detected between images to construct geometric information for target scenes, localize a robot, and perform optimization to fine-tune parameters. Qin et al.[29] demonstrated that the combination of inertial measurement unit (IMU) data and RGB images could increase the accuracy of reconstructed points significantly. Although the efficiency of these systems is suitable for embedded robots, the information recovered from these methods is typically in the form of a sparse point cloud, which cannot be utilized to construct an occupancy map or perform path planning directly. Engel et al.[30] directly generated a dense point cloud to construct a map of a scene, but their method still suffered from image noise and was sensitive to camera distortion.
As the amount of available training data increases, deep neural networks are becoming increasingly popular for solving problems in many different fields. For target navigation, Zhu et al.[31] treated a robot as a virtual agent and learned to navigate to a goal object in a virtual environment utilizing an end-to-end deep reinforcement learning framework. They wanted the network to learn a mapping function between the images that the robot observed and the movements it should perform. Although this type of end-to-end framework is easy to understand and train, Gupta et al.[32] demonstrated that memories of scenes can significantly affect a robot’s choices. In their work, a local collision map was learned explicitly and a robot made decisions based on the collision map. Chen et al.[33] utilized graph nodes to extract additional information from scenes, which proved to be efficient and yielded a high success rate. However, both of these methods utilize neural networks to generate maps and plan movement simultaneously, which results in various drawbacks compared to traditional planning methods. Mishkin et al.[34] demonstrated that although learning methods show significant potential for extracting geometric information from images, they still suffer from network noise and cannot perform as well as traditional planning methods when accurate 3D maps are available.
Landing safely and efficiently during targeted navigation is also a challenging problem. This problem can be subdivided into two categories according to the representation of a landing site, namely landing on a man-made marker or landing in an unknown area. In the latter case, the landing strategy must balance exploration for landing site candidates against exploitation of the geometric details of those candidates. This task is typically divided into landing site detection and trajectory generation.
For landing site detection, Johnson et al.[35] and several other groups[36,37,38] identified regions in elevation maps with geometric features of smoothness and small slopes. Based on emerging machine-learning techniques, a few methods have adopted semantic inference to identify candidate-landing sites. The approach proposed by Maturana et al.[39] was the first to introduce a volumetric 3D convolutional neural network (CNN) to solve this problem. Free LSD[40] and SafeUAV[41] utilized 2D image segmentation to detect landing areas. These methods can detect potential landing sites, such as lawns, which are geometrically rough, but safe for landing.
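The geometric criterion above (smoothness and small slope in an elevation map) can be sketched as follows; the toy height map, thresholds, and 3x3 roughness window are illustrative assumptions, and semantic methods such as [39,40,41] would replace these geometric tests with learned classifiers.

```python
import numpy as np

def landing_candidates(elevation, cell, max_slope, max_rough):
    """Flag cells whose local slope and roughness are below thresholds.

    elevation: 2D height map (meters); cell: grid resolution (meters)."""
    gy, gx = np.gradient(elevation, cell)
    slope = np.hypot(gx, gy)                    # rise over run per cell
    # Roughness: deviation from the local 3x3 mean height (box filter).
    rows, cols = elevation.shape
    padded = np.pad(elevation, 1, mode="edge")
    local_mean = sum(
        padded[i:i + rows, j:j + cols] for i in range(3) for j in range(3)
    ) / 9.0
    rough = np.abs(elevation - local_mean)
    return (slope < max_slope) & (rough < max_rough)

# Toy map: flat ground on the left, a 5 m step (e.g. a roof edge) on the right.
terrain = np.zeros((5, 5))
terrain[:, 3:] = 5.0
mask = landing_candidates(terrain, cell=1.0, max_slope=0.2, max_rough=0.1)
```

Cells at the step edge are rejected for both slope and roughness, while both flat regions (ground level and the plateau) pass, matching the intuition that only the transition is unsafe geometrically.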
However, it is nontrivial to generate a trajectory that enables a drone to detect landing sites actively and land as quickly as possible. Desaraju et al.[42] optimized trajectories offline to gain maximum information for both candidate landing sites and unknown areas. To apply this method in a real-time environment, Forster et al.[43] utilized the improved REMODE[44] algorithm to perform dense reconstruction and presented the results as a probabilistic elevation map.
3.2　Full coverage
Planning to explore an unknown 3D area to achieve full coverage is called coverage path planning. This problem can be mathematically formulated as finding a trajectory that maximizes the coverage of the scene (Equation 2). If the final view set is $V$ and the function $C$ expresses coverage, then we wish to solve

$\max_{V} \; C(V) \quad \mathrm{s.t.} \quad l(V) \le l_h, \quad t(V) \le t_h,$

where $l(V)$ denotes the length of the trajectory that connects all views, $l_h$ is the length threshold, $t(V)$ denotes the time required for the entire trajectory, and $t_h$ is a time threshold.
There are two main types of methods that can be utilized to solve this problem: frontier-based methods[45,46,47] and sampling-based methods[48,49,50,51,52,53].
Frontier-based methods[54] explore the boundaries between known free space and unexplored space in a map. Such methods are fundamental to most of the literature on exploration. During the exploration process, frontiers are selected continuously as targets for exploration until the entire scene is covered. This makes it easy to explore large environments consisting of separate areas. However, because such methods move back and forth between regions, full exploration is slow[51]. Cieslewski et al.[47] extended the classical frontier-based method with the goal of conducting rapid exploration. Their method has two modes. The first is a reactive mode, which chooses visible frontiers to minimize velocity correction and fly at a consistently high speed. If there are no visible frontiers, the classic mode is activated to select a new frontier utilizing Dijkstra’s shortest path algorithm. These two modes can alternate during operation. This method provides fast exploration and outperforms the next-best-view (NBV) and classical frontier-based methods. However, it sometimes generates excessively long trajectories.
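Frontier detection itself reduces to scanning the occupancy map for known-free cells that border unknown space. A minimal sketch on a toy 2D grid (0 = free, 1 = occupied, -1 = unknown); real systems run the same test over a 3D volumetric map and cluster the resulting cells into frontier regions.

```python
def find_frontiers(grid):
    """Frontier cells: known-free cells bordering unknown space.

    grid values: 0 = free, 1 = occupied, -1 = unknown."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:          # only free cells can be frontiers
                continue
            neighbors = ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
            if any(0 <= nr < rows and 0 <= nc < cols
                   and grid[nr][nc] == -1 for nr, nc in neighbors):
                frontiers.append((r, c))
    return frontiers

# Toy map: the right column is still unknown; an obstacle sits in the middle.
grid = [
    [0, 0, -1],
    [0, 1, -1],
    [0, 0, -1],
]
frontiers = find_frontiers(grid)
```

Selecting among the detected frontiers (nearest, largest cluster, or best information gain) is exactly where the methods surveyed above differ.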
Sampling-based methods typically take the form of next-best-view planning. Bircher et al.[52] proposed a receding horizon planner based on the NBV. This method utilizes rapidly-exploring random trees (RRTs)[55] to store the unmapped space that can be covered from the current UAV position and repeatedly identifies the first edge of a tree before each planning phase. This planner can operate online on a UAV with limited resources. However, it sometimes gets stuck when exploring large scenes and does not achieve global coverage. Selin et al.[51] combined the receding horizon NBV planner with a frontier-based method. The former was adopted for local exploration and the latter for global goal selection. To account for costly RRT computations and local minima, Schmid et al.[48] modeled exploration as an RRT-inspired online informative path planning problem. This method keeps non-executed segments and their subtrees alive. It maintains an expansively growing tree and utilizes a single objective function to perform global searching, thereby avoiding local minima. It also utilizes a novel truncated-signed-distance-field-based 3D reconstruction gain scheme and cost-utility formulation.
3.3　Summary
Although significant research efforts have been devoted to UAV area exploration, many challenges remain. For example, the accuracy of a constructed map directly affects the quality of a planned view. Additionally, the planning process for subsequent exploration views often gets stuck in local minima. A few researchers[51,56] have proposed a local-global theory for solving this problem.
Additionally, among works focusing on drone landing, few have treated landing site detection and trajectory generation as a single process due to excessive computational complexity. Currently, most exploration algorithms are designed to explore simple environments and there are no good solutions for areas with severe occlusion. Therefore, there is still significant work to be done in this area.
4 Aerial cinematography
Recently, UAV cinematography has risen in popularity. Owing to its low cost, flexibility, and scalability, it makes shooting more diverse, lowers the difficulty of video creation, and yields interesting and unique shots, which has greatly enhanced the media industry. When utilizing drones to shoot video, at least two professional operators are typically required: one controls the UAV and the other controls the camera. This limits the development of UAV cinematography. Therefore, various automatic shooting methods are under development. In autonomous cinematography, based on the shooting intentions of users, a UAV must automatically generate smooth camera trajectories, then shoot videos according to constraints and the environment. Figure 5 presents a smooth video shooting path, where every video frame satisfies an aesthetic principle. Such applications have some operational challenges[57,58], including the definition of a shooting mission, the ability to translate a mission into a flight plan, perception of the environment and self-positioning, task assignment for multiple drones, scalability and swarming, and the capacity for autonomous and emergency management.
Xie et al.[22] formulated the cinematography problem as a local-global problem. For local planning, the cost is defined as (Equation 3)

$E_{local}(T_{s,e}) = E_{quality} + E_{axis} + E_{rot},$

where $E_{quality}$ defines the view quality, $E_{axis}$ evaluates the alignment between a trajectory and the dominant axis of a landmark, and $E_{rot}$ is utilized to penalize view orientation changes. Here, $T_{s,e}$ is the trajectory from a view $s$ to a target $e$.
For global planning, transition trajectories are added to connect local paths, with the cost (Equation 4)

$E_{trans}(T_{jj'}^{mm'}) = E_{quality} + E_{rot} + E_{turn},$

where $E_{turn}$ is utilized to discourage zig-zag trajectories.
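To make the structure of such a local cost concrete, the toy sketch below evaluates stand-ins for the three terms over a 2D trajectory. The specific penalty formulas and unit weights are illustrative assumptions and not the actual formulation of [22].

```python
import math

def local_cost(traj, landmark_axis, w_quality=1.0, w_axis=1.0, w_rot=1.0):
    """Toy stand-in for E_local = E_quality + E_axis + E_rot.

    traj: list of (position, view_direction) pairs in 2D.
    landmark_axis: unit vector of the landmark's dominant axis."""
    e_quality = e_axis = e_rot = 0.0
    for i, (pos, view) in enumerate(traj):
        # Quality placeholder: prefer unit-length view directions.
        e_quality += abs(math.hypot(*view) - 1.0)
        if i > 0:
            prev_pos, prev_view = traj[i - 1]
            step = (pos[0] - prev_pos[0], pos[1] - prev_pos[1])
            # Axis term: penalize motion misaligned with the dominant axis
            # (magnitude of the 2D cross product of step and axis).
            e_axis += abs(step[0] * landmark_axis[1]
                          - step[1] * landmark_axis[0])
            # Rotation term: penalize view-orientation changes between frames.
            e_rot += abs(math.atan2(view[1], view[0])
                         - math.atan2(prev_view[1], prev_view[0]))
    return w_quality * e_quality + w_axis * e_axis + w_rot * e_rot

axis = (1, 0)  # landmark's dominant axis along x (hypothetical)
straight = [((0, 0), (0, 1)), ((1, 0), (0, 1)), ((2, 0), (0, 1))]
turning = [((0, 0), (0, 1)), ((1, 0), (1, 0)), ((1, 1), (0, 1))]
```

Under these toy terms, the straight axis-aligned pass costs nothing, while the turning trajectory is penalized for both its sideways step and its orientation changes, which is the qualitative behavior the real cost is designed to produce.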
In this section, we discuss UAV autonomous cinematography from two perspectives: dynamic scenes and landscape videography.
4.1　Dynamic scenes
It is extremely challenging for UAVs to shoot dynamic scenes. They must track a moving target in real time, perform predictions for updating the target and environment simultaneously, generate a feasible trajectory, perform control to fly along the trajectory, avoid obstacles, accomplish the shooting mission, and handle emergencies[57].
For filming dynamic scenes, perception is the first factor to be considered, which includes target tracking and environment mapping. The former is largely based on real-time kinematic GPS and IMU sensors[59,60,61,62], a visual system[63], or motion capture system[60,61,64]. The latter is always a volumetric map. Some methods utilize a virtual camera to model the relationships between dynamic objects and the UAV camera. Galvane et al.[61] proposed a drone Toric space, which has embedded constraints to preserve the relative relationships between a drone and dynamic targets. Additionally, an interactive tool was proposed to facilitate manipulation in the screen space to control UAV shooting.
The prose storyboard language (PSL)[61,65,66] is a formal language utilized to describe all possible shooting missions uniquely. Bonatti et al.[63] focused on an actor on an artistic screen and proposed a UAV cinematography framework to film actors moving in unknown environments. This method contains three sub-systems. First, a vision sub-system utilizes a MobileNet network to detect target bounding boxes from monocular images, then estimates target headings to predict motion. Second, a mapping sub-system utilizes LiDAR data to construct a repeatedly updated volumetric map, allowing the UAV to avoid collisions. Finally, a planning sub-system formulates an objective function for smoothness, shot quality, safety, and occlusion avoidance.
Multi-drone autonomous cinematography is another research hot spot. In addition to the issues discussed above, the utilization of multiple UAVs can lead to collisions between drones and overlapping assignment of filming tasks. Capitán et al.[57] proposed a UAV team cinematography system that utilizes a mission controller to manage the shooting process at a high level based on planning, shooting, and navigation modules. However, in this method, UAV flight paths are simple preset paths, meaning there is still significant room for development. Achtelik et al.[67] attempted to utilize a swarm of UAVs to map an unknown environment autonomously. This was accomplished without any ground station commands, representing significant progress in the field.
4.2　Landscape videography
Landscape videography has made significant progress in recent years. Many methods have been developed for generating virtual paths containing keyframes, then following such paths in a real environment. The methods proposed by Gebhardt et al.[68] and Joubert et al.[69] both required users to specify keyframe parameters. The former framed the entire problem as a variable horizon trajectory optimization scheme, implementing temporal optimization for positional references and balancing smoothness against timing control. Roberts and Hanrahan[70] leveraged optimized time warping to improve the feasibility of planned trajectories. Gebhardt et al.[71] determined that it is difficult to achieve a smooth trajectory when keyframes are implemented as hard constraints. Gebhardt et al.[72] optimized trajectories to ensure their feasibility based on a virtual path, which struck a balance between user inputs and conflicting constraints. In the control theory literature, various methods, such as the model predictive control method proposed by Falaise et al.[73], have been utilized to address the trajectory optimization problem and generate quadrotor trajectories in real time with promising results.
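Keyframe-based methods of this kind interpolate smoothly between user-specified poses. The sketch below samples a single segment using the classic minimum-jerk blend (zero velocity and acceleration at both keyframes); the per-axis interpolation and sampling scheme are illustrative assumptions, not any cited system's implementation.

```python
def min_jerk_segment(p0, p1, duration, n=10):
    """Sample a minimum-jerk position profile between two 3D keyframes.

    The quintic blend 10s^3 - 15s^4 + 6s^5 has zero velocity and
    acceleration at both endpoints, giving a smooth camera move."""
    samples = []
    for k in range(n + 1):
        t = duration * k / n
        s = k / n
        blend = 10 * s**3 - 15 * s**4 + 6 * s**5
        pos = tuple(a + (b - a) * blend for a, b in zip(p0, p1))
        samples.append((t, pos))
    return samples

# A 20 m sideways dolly move at constant 10 m altitude over 5 seconds.
seg = min_jerk_segment((0.0, 0.0, 10.0), (20.0, 0.0, 10.0), duration=5.0)
```

Full systems chain many such segments through all keyframes and additionally optimize timing and feasibility against the quadrotor's dynamics, which is where the cited trajectory optimization schemes come in.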
Huang et al.[74] introduced a visual interest metric and generated a trajectory to maximize the amount of visual interest captured by a camera. Their method iteratively optimizes visual interest, travel speed, and camera position until convergence. Xie et al.[22] proposed a novel high-level design tool. This tool requires users to input rough 2.5D models and additional obstacles. It then generates a smooth trajectory for capturing a continuous high-level video that contains landmarks to be photographed and ensures obstacle avoidance. This system focuses on large-scale scenes containing multiple static landmarks, whereas the work by Nägeli et al.[60] focused on dynamic targets and cluttered environments. Yang et al.[75] presented a system that automatically generates a feasible and smooth trajectory for landscape videography. Their system requires users to express their ideas in the form of a 2D sketch. However, because this method lacks a unified quantitative evaluation standard for aesthetics, the results must be judged by humans, which introduces bias. In the future, it will be necessary to establish a benchmark for evaluating the aesthetics of UAV shooting.
4.3　Summary
In this section, we discuss recent work related to cinematography. Based on the utilization of UAVs, filmmakers can capture more interesting, attractive, and creative pictures. However, there are still many problems that must be solved in the future.
For cinematography, it is key to understand shooting tasks. However, current methods mostly utilize PSL or user interaction, meaning autonomy must be improved. Furthermore, to shoot attractive pictures, one must consider the rules of aesthetics. In current systems, the definition of aesthetic rules is overly simplified and there is a lack of efficient and accurate aesthetic evaluation procedures. Therefore, a benchmark for aesthetics is required. Additionally, the influence of light and shadow is typically ignored for trajectory planning, but these factors are very important for photography.
5 Conclusion
From the perspective of computer graphics, we outline the current research status and challenges associated with UAVs for large-scale scene reconstruction, exploration with obstacle avoidance, and cinematography.
Additionally, there are a number of issues that should be considered for drone view and path planning. For example, for complex outdoor scenes, in addition to modeling the drone itself, the environment must also be modeled based on various factors, such as wind. Coombes et al.[76] demonstrated that wind has a significant impact on UAV flight time and that it is essential to take wind into consideration. Furthermore, Coombes et al.[77] built a cost function for considering wind and mathematically analyzed minimum flight time in a direction perpendicular to the wind. However, this work assumed that wind fields are steady and uniform, which is too simple for real-world environments. Achermann et al.[78] proposed a CNN for predicting 3D wind to generate safer paths. Many of the optimization problems for UAVs are NP-hard; simplifying such problems is a significant challenge.
Based on the aforementioned techniques and problems, we can combine the advantages of UAVs, computer graphics, computer vision, and machine learning to achieve effective and easy-to-use intelligent and autonomous UAVs that can be adapted to various practical applications.

Reference

1.

Musialski P, Wonka P, Aliaga D G, Wimmer M, van Gool L, Purgathofer W. A survey of urban reconstruction. Computer Graphics Forum, 2013, 32(6): 146–177 DOI:10.1111/cgf.12077

2.

Souissi O, Benatitallah R, Duvivier D, Artiba A, Belanger N, Feyzeau P. Path planning: A 2013 survey. In: Proceedings of 2013 International Conference on Industrial Engineering and Systems Management (IESM). IEEE, 2013, 1–8

3.

Biljecki F, Stoter J, Ledoux H, Zlatanova S, Çöltekin A. Applications of 3D city models: state of the art review. ISPRS International Journal of Geo-Information, 2015, 4(4): 2842–2889 DOI:10.3390/ijgi4042842

4.

Cheng L, Gong J Y, Li M C, Liu Y X. 3D building model reconstruction from multi-view aerial imagery and lidar data. Photogrammetric Engineering & Remote Sensing, 2011, 77(2): 125–139 DOI:10.14358/pers.77.2.125

5.

Schwarz B. Mapping the world in 3D. Nature Photonics, 2010, 4(7): 429–430 DOI:10.1038/nphoton.2010.148

6.

Qin R. RPC stereo processor (RSP): a software package for digital surface model and orthophoto generation from satellite stereo imagery. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2016, III-1: 77–82 DOI:10.5194/isprs-annals-iii-1-77-2016

7.

Qin R. Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping. arXiv: Computer Vision and Pattern Recognition, 2019

8.

Peng C, Isler V. In: 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, 2981–2987

9.

Pix4DmapperPro. 2015

10.

Altizure[EB/OL]. 2019-11-19. https://www.altizure.cn/

11.

DJI-TERRA[EB/OL]. 2019-11-19. https://www.dji.com/cn/dji-terra

12.

Roberts M, Shah S, Dey D, Truong A, Sinha S, Kapoor A, Hanrahan P, Joshi N. Submodular trajectory optimization for aerial 3D scanning. In: 2017 IEEE International Conference on Computer Vision (ICCV). Venice, IEEE, 2017 DOI:10.1109/iccv.2017.569

13.

Fuhrmann S, Langguth F, Moehrle N, Waechter M, Goesele M. MVE: An image-based reconstruction environment. Computers & Graphics, 2015, 53: 44–53 DOI:10.1016/j.cag.2015.09.003

14.

Schonberger J L, Frahm J M. Structure-from-motion revisited. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA, IEEE, 2016 DOI:10.1109/cvpr.2016.445

15.

RealityCapture[EB/OL]. 2019-11-19. https://www.capturingreality.com/

16.

Hepp B, Nießner M, Hilliges O. Plan3D. ACM Transactions on Graphics, 2019, 38(1): 1–17 DOI:10.1145/3233794

17.

Krause A, Golovin D. Submodular function maximization. 2014

18.

Smith N, Moehrle N, Goesele M, Heidrich W. Aerial path planning for urban scene reconstruction. ACM Transactions on Graphics, 2019, 37(6): 1–15 DOI:10.1145/3272127.3275010

19.

Almadhoun R, Abduldayem A, Taha T, Seneviratne L, Zweiri Y. Guided next best view for 3D reconstruction of large complex structures. Remote Sensing, 2019, 11(20): 2440 DOI:10.3390/rs11202440

20.

Mendoza M, Vasquez-Gomez J I, Taud H, Sucar L E, Reta C J. Supervised learning of the next-best-view for 3D object reconstruction. 2019: 1–15

21.

Huang R, Zou D, Vaughan R, Tan P. Active image-based modeling with a toy drone. arXiv e-prints, 2017

22.

Xie K, Yang H, Huang S Q, Lischinski D, Christie M, Xu K, Gong M L, Cohen-Or D, Huang H. Creating and chaining camera moves for quadrotor videography. ACM Transactions on Graphics, 2018, 37(4): 1–13 DOI:10.1145/3197517.3201284

23.

He X, Bourne J R, Steiner J A, Mortensen C, Hoffman K C, Dudley C J, Rogers B, Cropek D M, Leang K K. Autonomous chemical-sensing aerial robot for urban/suburban environmental monitoring. IEEE Systems Journal, 2019, 13(3): 3524–3535 DOI:10.1109/jsyst.2019.2905807

24.

Hornung A, Wurm K M, Bennewitz M, Stachniss C, Burgard W. OctoMap: an efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots, 2013, 34(3): 189–206 DOI:10.1007/s10514-012-9321-0

25.

Mur-Artal R, Tardos J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 2017, 33(5): 1255–1262 DOI:10.1109/tro.2017.2705103

26.

Borenstein J, Koren Y. Real-time obstacle avoidance for fast mobile robots. IEEE Transactions on Systems, Man, and Cybernetics, 1989, 19(5): 1179–1187 DOI:10.1109/21.44033

27.

Kim D, Nevatia R. Symbolic navigation with a generic map. Autonomous Robots, 1999, 6(1): 69–88 DOI:10.1023/A:1008824626321

28.

Mur-Artal R, Montiel J M M, Tardos J D. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 2015, 31(5): 1147–1163 DOI:10.1109/tro.2015.2463671

29.

Qin T, Li P L, Shen S J. VINS-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 2018, 34(4): 1004–1020 DOI:10.1109/tro.2018.2853729

30.

Engel J, Koltun V, Cremers D. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611–625 DOI:10.1109/tpami.2017.2658577

31.

Zhu Y K, Mottaghi R, Kolve E, Lim J J, Gupta A, Li F F, Farhadi A. Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: 2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore, IEEE, 2017 DOI:10.1109/icra.2017.7989381

32.

Gupta S, Davidson J, Levine S, Sukthankar R, Malik J. Cognitive mapping and planning for visual navigation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, IEEE, 2017 DOI:10.1109/cvpr.2017.769

33.

Chen K, de Vicente J P, Sepulveda G, Xia F, Soto A, Vázquez M, Savarese S. A behavioral approach to visual navigation with graph localization networks. 2019

34.

Mishkin D, Dosovitskiy A, Koltun V. Benchmarking Classic and Learned Navigation in Complex 3D Environments. 2019

35.

Johnson A E, Klumpp A R, Collier J B, Wolf A A. Lidar-based hazard avoidance for safe landing on Mars. Journal of Guidance, Control, and Dynamics, 2002, 25(6): 1091–1099 DOI:10.2514/2.4988

36.

Bosch S, Lacroix S, Caballero F. Autonomous detection of safe landing areas for an UAV from monocular images. In: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. Beijing, China, IEEE, 2006 DOI:10.1109/iros.2006.282188

37.

Shen Y F, Rahman Z, Krusienski D, Li J. A vision-based automatic safe landing-site detection system. IEEE Transactions on Aerospace and Electronic Systems, 2013, 49(1): 294–311 DOI:10.1109/taes.2013.6404104

38.

Liu X, Li S, Jiang X Q, Huang X Y. Planetary landing site detection and selection using multilevel optimization strategy. Acta Astronautica, 2019, 163: 272–286 DOI:10.1016/j.actaastro.2019.01.004

39.

Maturana D, Scherer S. 3D Convolutional Neural Networks for landing zone detection from LiDAR. In: 2015 IEEE International Conference on Robotics and Automation (ICRA). Seattle, WA, USA, IEEE, 2015 DOI:10.1109/icra.2015.7139679

40.

Hinzmann T, Stastny T, Cadena C, Siegwart R, Gilitschenski I. Free LSD: prior-free visual landing site detection for autonomous planes. IEEE Robotics and Automation Letters, 2018, 3(3): 2545–2552 DOI:10.1109/lra.2018.2809962

41.

Marcu A, Costea D, Licăreţ V, Pîrvu M, Sluşanschi E, Leordeanu M. SafeUAV: learning to estimate depth and safe landing areas for UAVs from synthetic data//Lecture Notes in Computer Science. Cham: Springer International Publishing, 2019: 43–58 DOI:10.1007/978-3-030-11012-3_4

42.

Desaraju V, Michael N, Humenberger M, Brockers R, Weiss S, Matthies L. Vision-based landing site evaluation and trajectory generation toward rooftop landing. In: Robotics: Science and Systems X. Robotics: Science and Systems Foundation, 2014 DOI:10.15607/rss.2014.x.044

43.

Forster C, Faessler M, Fontana F, Werlberger M, Scaramuzza D. Continuous on-board monocular-vision-based elevation mapping applied to autonomous landing of micro aerial vehicles. In: 2015 IEEE International Conference on Robotics and Automation (ICRA). Seattle, WA, USA, IEEE, 2015 DOI:10.1109/icra.2015.7138988

44.

Pizzoli M, Forster C, Scaramuzza D. REMODE: Probabilistic, monocular dense reconstruction in real time. In: 2014 IEEE International Conference on Robotics and Automation (ICRA). Hong Kong, China, IEEE, 2014 DOI:10.1109/icra.2014.6907233

45.

Ravankar A, Ravankar A, Kobayashi Y, Emaru T. Autonomous mapping and exploration with unmanned aerial vehicles using low cost sensors. Proceedings, 2018, 4(1): 44 DOI:10.3390/ecsa-5-05753

46.

Cesare K, Skeele R, Yoo S H, Zhang Y, Hollinger G. Multi-UAV exploration with limited communication and battery. In: 2015 IEEE International Conference on Robotics and Automation. Seattle, WA, USA, IEEE, 2015 DOI:10.1109/icra.2015.7139494

47.

Cieslewski T, Kaufmann E, Scaramuzza D. Rapid exploration with multi-rotors: A frontier selection method for high speed flight. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, BC, IEEE, 2017 DOI:10.1109/iros.2017.8206030

48.

Schmid L M, Pantic M, Khanna R, Ott L, Siegwart R, Nieto J. An Efficient Sampling-based Method for Online Informative Path Planning in Unknown Environments. arXiv: Robotics, 2019

49.

Dang T, Papachristos C, Alexis K. Autonomous exploration and simultaneous object search using aerial robots. In: 2018 IEEE Aerospace Conference. Big Sky, MT, IEEE, 2018 DOI:10.1109/aero.2018.8396632

50.

Papachristos C, Khattak S, Alexis K. Uncertainty-aware receding horizon exploration and mapping using aerial robots. In: 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, 4568–4575

51.

Selin M, Tiger M, Duberg D, Heintz F, Jensfelt P. Efficient autonomous exploration planning of large-scale 3-D environments. IEEE Robotics and Automation Letters, 2019, 4(2): 1699–1706 DOI:10.1109/lra.2019.2897343

52.

Bircher A, Kamel M, Alexis K, Oleynikova H, Siegwart R. Receding horizon "Next-best-view" planner for 3D exploration. In: 2016 IEEE International Conference on Robotics and Automation (ICRA). Stockholm, Sweden, IEEE, 2016 DOI:10.1109/icra.2016.7487281

53.

Bircher A, Kamel M, Alexis K, Oleynikova H, Siegwart R. Receding horizon path planning for 3D exploration and surface inspection. Autonomous Robots, 2018, 42(2): 291–306 DOI:10.1007/s10514-016-9610-0

54.

Yamauchi B. A frontier-based approach for autonomous exploration. In: IEEE International Symposium on Computational Intelligence in Robotics and Automation. Monterey, CA, 1997: 146–151

55.

LaValle S M. Rapidly-exploring random trees: A new tool for path planning. 1998

56.

Oleynikova H, Taylor Z, Siegwart R, Nieto J. Safe local exploration for replanning in cluttered unknown environments for micro aerial vehicles. IEEE Robotics and Automation Letters, 2018, 3(3): 1474–1481 DOI:10.1109/lra.2018.2800109

57.

Capitán J, Torres-González A, Ollero A. Autonomous cinematography with teams of drones. In: Workshop on Aerial Swarms, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2019: 1–3

58.

Mademlis I, Mygdalis V, Nikolaidis N, Pitas I. Challenges in autonomous UAV cinematography: an overview. In: 2018 IEEE International Conference on Multimedia and Expo. San Diego, CA, USA, IEEE, 2018 DOI:10.1109/icme.2018.8486586

59.

Joubert N, Goldman D B, Berthouzoz F, Roberts M, Landay J A, Hanrahan P. Towards a drone cinematographer: guiding quadrotor cameras using visual composition principles. 2016

60.

Nägeli T, Meier L, Domahidi A, Alonso-Mora J, Hilliges O. Real-time planning for automated multi-view drone cinematography. ACM Transactions on Graphics, 2017, 36(4): 1–10 DOI:10.1145/3072959.3073712

61.

Galvane Q, Lino C, Christie M, Fleureau J, Servant F, Tariolle F L, Guillotel P. Directing cinematographic drones. ACM Transactions on Graphics, 2018, 37(3): 1–18 DOI:10.1145/3181975

62.

Galvane Q, Fleureau J, Tariolle F L, Guillotel P. Automated cinematography with unmanned aerial vehicles. 2017

63.

Bonatti R, Ho C, Wang W, Choudhury S, Scherer S. Towards a Robust Aerial Cinematography Platform: Localizing and Tracking Moving Targets in Unstructured Environments. 2019

64.

Mellinger D, Kumar V. Minimum snap trajectory generation and control for quadrotors. In: 2011 IEEE International Conference on Robotics and Automation. Shanghai, China, IEEE, 2011 DOI:10.1109/icra.2011.5980409

65.

Ronfard R, Gandhi V, Boiron L. The Prose Storyboard Language: A Tool for Annotating and Directing Movies. 2013

66.

Richter C, Bry A, Roy N. Polynomial trajectory planning for aggressive quadrotor flight in dense indoor environments//Springer Tracts in Advanced Robotics. Cham: Springer International Publishing, 2016: 649–666 DOI:10.1007/978-3-319-28872-7_37

67.

Achtelik M, Achtelik M, Brunet Y, Chli M, Chatzichristofis S, Decotignie J-D, Doth K-M, Fraundorfer F, Kneip L, Gurdan D. Sfly: Swarm of micro flying robots. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. 2012, IEEE, 2649–2650

68.

Gebhardt C, Stevšić S, Hilliges O. Optimizing for aesthetically pleasing quadrotor camera motion. ACM Transactions on Graphics, 2018, 37(4): 1–11 DOI:10.1145/3197517.3201390

69.

Joubert N, Roberts M, Truong A, Berthouzoz F, Hanrahan P. An interactive tool for designing quadrotor camera shots. ACM Transactions on Graphics, 2015, 34(6): 1–11 DOI:10.1145/2816795.2818106

70.

Roberts M, Hanrahan P. Generating dynamically feasible trajectories for quadrotor cameras. ACM Transactions on Graphics, 2016, 35(4): 1–11 DOI:10.1145/2897824.2925980

71.

Gebhardt C, Hilliges O. WYFIWYG: Investigating Effective User Support in Aerial Videography. 2018

72.

Gebhardt C, Hepp B, Nägeli T, Stevšić S, Hilliges O. Airways: optimization-based planning of quadrotor trajectories according to high-level user goals. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. San Jose, California, USA, ACM, 2016, 2508–2519 DOI:10.1145/2858036.2858353

73.

Faulwasser T, Kern B, Findeisen R. Model predictive path-following for constrained nonlinear systems. In: Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference. Shanghai, China, IEEE, 2009 DOI:10.1109/cdc.2009.5399744

74.

Huang H, Lischinski D, Hao Z, Gong M, Christie M, Cohen-Or D. Trip synopsis: 60km in 60sec. Computer Graphics Forum, 2016, 35(7): 107–116 DOI:10.1111/cgf.13008

75.

Yang H, Xie K, Huang S Q, Huang H. Uncut aerial video via a single sketch. Computer Graphics Forum, 2018, 37(7): 191–199 DOI:10.1111/cgf.13559

76.

Coombes M, Chen W H, Liu C J. Boustrophedon coverage path planning for UAV aerial surveys in wind. In: 2017 International Conference on Unmanned Aircraft Systems (ICUAS). Miami, FL, USA, IEEE, 2017 DOI:10.1109/icuas.2017.7991469

77.

Coombes M, Fletcher T, Chen W H, Liu C J. Optimal polygon decomposition for UAV survey coverage path planning in wind. Sensors, 2018, 18(7): 2132 DOI:10.3390/s18072132

78.

Achermann F, Lawrance N R J, Ranftl R, Dosovitskiy A, Chung J J, Siegwart R. Learning to predict the wind for safe aerial vehicle planning. In: 2019 International Conference on Robotics and Automation (ICRA). Montreal, QC, Canada, IEEE, 2019 DOI:10.1109/icra.2019.8793547