Visual SLAM in UAV

Dibyendu Biswas
4 min read · Nov 28, 2021

Visual Simultaneous Localization and Mapping (VSLAM) on multirotor Unmanned Aerial Vehicles (UAVs) in unknown environments has grown in popularity for both research and education. This article gives an in-depth perspective on the modules used in the SLAM process for UAVs, namely the localization, mapping, and path planning kits.

Overview:

The localization (pose estimation) kit uses onboard sensor information, such as a stereo camera, to estimate the vehicle’s 6 degree of freedom (DoF) pose in real time. The pose is fed into the flight control unit (FCU) to achieve position-level control. Given the vehicle pose and sensor input, such as a point cloud, the mapping kit reconstructs the environment throughout the mission. Typically, the environment is represented by a 3D occupancy voxel map augmented with Euclidean signed distance information. The path planning kit finds a minimum-cost path to the destination that avoids obstacles and generates a trajectory from it. The trajectory is then sent to the FCU as a time sequence of setpoints to navigate the vehicle.
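To make the pipeline concrete, the sketch below shows how the three kits interact in a perception-planning-control loop. It is purely illustrative: every function and object it uses (estimate_pose, update_map, plan_trajectory, the camera/imu/fcu interfaces) is a hypothetical placeholder, not the API of any specific library.

```python
# Illustrative perception-planning-control loop for a VSLAM-equipped UAV.
# All callables and interfaces passed in are hypothetical placeholders.

def navigation_loop(camera, imu, fcu, goal,
                    estimate_pose, update_map, plan_trajectory):
    world_map = {}                                   # environment representation (e.g. a voxel map)
    while not fcu.mission_complete():
        frame = camera.read()                        # latest stereo frame
        imu_batch = imu.read()                       # IMU samples since the previous frame

        pose = estimate_pose(frame, imu_batch)       # localization kit: 6-DoF pose
        fcu.send_pose(pose)                          # feed the pose back for position-level control

        update_map(world_map, pose, frame)           # mapping kit: fuse the new measurement
        trajectory = plan_trajectory(world_map, pose, goal)  # planning kit: collision-free trajectory
        fcu.send_trajectory(trajectory)              # time-stamped setpoints for the controller
```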

1) Localization: Localization, or pose estimation, in VSLAM can be performed with visual sensors such as monocular, stereo, or RGB-D cameras (e.g., the Kinect). A monocular setup has the advantage of a simple structure, low weight, and cost efficiency. However, correctly recovering metric scale is the main challenge for such a system. To address this and improve accuracy, IMU information is normally integrated with the monocular visual system.
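As a toy illustration of the scale problem, a monocular trajectory is only known up to an unknown scale factor; one simplified way to recover it during initialization is to compare the distances travelled according to vision with the metric distances implied by integrating the IMU. This is only a sketch of the idea under that assumption, not the full visual-inertial alignment used in real systems.

```python
import numpy as np

def estimate_monocular_scale(vision_positions, imu_positions):
    """Estimate the unknown metric scale of a monocular trajectory.

    vision_positions: (N, 3) camera positions from monocular VO (arbitrary scale).
    imu_positions:    (N, 3) positions obtained by integrating IMU data (metric, but drifty).
    Returns a least-squares scale factor s such that s * vision is approximately metric.
    """
    v = np.diff(np.asarray(vision_positions), axis=0)   # vision displacement vectors
    m = np.diff(np.asarray(imu_positions), axis=0)      # metric displacement vectors
    lv, lm = np.linalg.norm(v, axis=1), np.linalg.norm(m, axis=1)
    # Least-squares scale aligning segment lengths: minimize sum (s*|v| - |m|)^2
    return float(np.dot(lv, lm) / np.dot(lv, lv))

# Example: the vision trajectory is the true path shrunk by a factor of 4
true_path = np.cumsum(np.random.randn(50, 3), axis=0)
print(estimate_monocular_scale(true_path / 4.0, true_path))   # ~4.0
```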

Nowadays, stereo camera solutions are available off the shelf. Because depth information can be extracted directly from every frame, the accuracy and robustness of a stereo system are better than those of a monocular setup. Admittedly, the stereo data stream is much heavier than the monocular one, but powerful onboard computers can compensate for this.
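Since the two views are separated by a known baseline, depth can be recovered per pixel from disparity as depth = f·B / d. The OpenCV sketch below shows one way to do this; the image paths, focal length, and baseline are made-up placeholders for a rectified stereo rig.

```python
import cv2
import numpy as np

# Placeholder calibration values and image paths; substitute your rectified rig's data.
FX = 700.0        # focal length in pixels
BASELINE = 0.12   # distance between the two cameras in metres

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching produces a dense disparity map.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point

# depth = f * B / disparity (valid only where disparity > 0)
depth = np.where(disparity > 0, FX * BASELINE / disparity, 0.0)
```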

In UAV applications, the visual information is usually fused with IMU data through either a filter-based framework or an optimization-based framework. In the filter-based framework, the pose and the landmarks are part of the system state, and the IMU input propagates the pose states and the associated covariance matrix. In the optimization-based framework, the IMU measurements are incorporated through pre-integration factors (edges) in the graph. According to several studies, the optimization-based approach outperforms the filter-based approach in terms of accuracy but requires more computational resources.
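To make the filter-based idea concrete, the sketch below shows the kind of IMU propagation (prediction) step an EKF-style visual-inertial filter performs between camera frames. It is a simplified illustration: bias estimation, noise modelling, and the covariance propagation of a real filter are omitted.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])   # world-frame gravity, z up

def propagate_state(p, v, R, accel, gyro, dt):
    """One simplified IMU prediction step of a visual-inertial filter.

    p, v : position and velocity in the world frame (3-vectors)
    R    : rotation matrix from body to world frame (3x3)
    accel, gyro : IMU specific force and angular rate in the body frame
    """
    a_world = R @ accel + GRAVITY              # rotate specific force to world frame, compensate gravity
    p_new = p + v * dt + 0.5 * a_world * dt**2 # constant-acceleration position update
    v_new = v + a_world * dt                   # velocity update
    # First-order rotation update from the gyro (small-angle approximation)
    wx, wy, wz = gyro * dt
    dR = np.array([[1.0, -wz,  wy],
                   [ wz, 1.0, -wx],
                   [-wy,  wx, 1.0]])
    R_new = R @ dR
    return p_new, v_new, R_new
```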

2) Mapping: The mapping system, which provides the foundation for onboard motion planning, is an essential component of the perception-planning-control pipeline. A mapping system needs to balance measurement accuracy against storage overhead.

Three kinds of maps have been used successfully in UAV navigation: the point cloud map, the occupancy map, and the Euclidean Signed Distance Field (ESDF) map. A point cloud map can be obtained simply by stitching point measurements together. However, this kind of map is only suitable for high-precision sensors in static environments, since sensor noise and dynamic objects, once inserted, cannot be revisited and corrected.
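Stitching here just means transforming each scan into the world frame with the estimated pose and accumulating the points, which is also why noise and moving objects stay in the map forever. A minimal sketch:

```python
import numpy as np

def stitch_point_clouds(scans, poses):
    """Accumulate local scans into a single world-frame point cloud map.

    scans: list of (N_i, 3) arrays of points in the sensor frame.
    poses: list of (R, t) tuples, each the sensor pose in the world frame.
    """
    world_points = []
    for points, (R, t) in zip(scans, poses):
        world_points.append(points @ R.T + t)   # rotate then translate each scan into the world frame
    return np.vstack(world_points)              # naive map: every point is kept forever
```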

Occupancy maps, such as OctoMap, store occupancy probabilities in a hierarchical octree structure. The main restriction of these approaches is the fixed voxel size, which must be chosen in advance and cannot be changed dynamically.
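Each voxel's occupancy probability is typically maintained as a log-odds value that is increased for a hit and decreased for a miss. The sketch below shows that per-voxel update rule on a plain dictionary rather than OctoMap's octree, with assumed tuning values for the increments.

```python
import math

LOG_ODDS_HIT = 0.85    # added when a voxel is observed occupied (assumed tuning value)
LOG_ODDS_MISS = -0.4   # added when a ray passes through a voxel (assumed tuning value)

def update_voxel(log_odds_map, voxel, occupied):
    """Bayesian log-odds occupancy update for a single voxel key."""
    delta = LOG_ODDS_HIT if occupied else LOG_ODDS_MISS
    log_odds_map[voxel] = log_odds_map.get(voxel, 0.0) + delta

def occupancy_probability(log_odds_map, voxel):
    """Convert the stored log-odds back to a probability in [0, 1]."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds_map.get(voxel, 0.0)))
```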

Nowadays, the Euclidean Signed Distance Field (ESDF) map has gained popularity. This kind of map is suitable for dynamically growing environments and has the advantage of providing distance and gradient information with respect to obstacles.
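Given an occupancy grid, a batch ESDF can be computed with two Euclidean distance transforms, one over free space and one over occupied space; incremental ESDF mapping frameworks maintain the same quantity on the fly. A sketch using SciPy:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def esdf_from_occupancy(occupied, voxel_size=0.1):
    """Euclidean Signed Distance Field from a boolean occupancy grid.

    occupied: boolean array, True where a voxel is occupied.
    Returns distances in metres: positive in free space, negative inside obstacles.
    """
    dist_outside = distance_transform_edt(~occupied)   # distance to the nearest obstacle
    dist_inside = distance_transform_edt(occupied)     # distance to the nearest free voxel
    return (dist_outside - dist_inside) * voxel_size

# The ESDF gradient points away from obstacles and can serve as a repulsive term in planning.
grid = np.zeros((50, 50), dtype=bool)
grid[20:30, 20:30] = True                  # a square obstacle
esdf = esdf_from_occupancy(grid)
grad_y, grad_x = np.gradient(esdf)         # obstacle-avoidance gradient
```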

3) Planning: For UAV route planning, algorithms can be classified into two main categories: sampling-based and optimization-based. The Rapidly-exploring Random Tree (RRT) is the representative sampling-based algorithm. In this method, samples are drawn randomly from the configuration space and guide the tree to grow towards the target. Other sampling-based planners include RRT* and Probabilistic Roadmaps (PRM).
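A bare-bones 2D RRT sketch (collision checking and goal biasing kept deliberately simple) shows the sample-steer-extend loop; the step size, goal bias, and tolerance values are illustrative assumptions.

```python
import random
import math

def rrt(start, goal, is_free, bounds, step=0.5, goal_tol=0.5, max_iters=5000):
    """Minimal 2D RRT. `is_free(p)` must return True if point p is collision-free."""
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample a random point, with a small bias towards the goal.
        sample = goal if random.random() < 0.05 else (
            random.uniform(*bounds[0]), random.uniform(*bounds[1]))
        # Find the nearest tree node and steer one step towards the sample.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0:
            continue
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        # Reconstruct the path once the tree reaches the goal region.
        if math.dist(new, goal) < goal_tol:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None

# Example: empty 10x10 world with no obstacles
path = rrt((0, 0), (9, 9), is_free=lambda p: True, bounds=((0, 10), (0, 10)))
```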

Even though sampling-based methods are good at finding safe paths, the resulting paths are often not smooth enough for a UAV to follow, and they are commonly complemented by search-based planners that produce optimal or shortest paths on a grid, such as A*, LPA*, and JPS (Jump Point Search).
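For contrast, grid-search planners such as A* expand cells in order of cost-so-far plus a heuristic and return the shortest grid path. A compact 2D sketch on a 4-connected grid with a Manhattan heuristic:

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a 2D occupancy grid (0 = free, 1 = occupied), 4-connected."""
    rows, cols = len(grid), len(grid[0])
    heuristic = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan distance
    open_set = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost_so_far = {start: 0}
    while open_set:
        _, g, current = heapq.heappop(open_set)
        if current == goal:                      # reconstruct the path back to the start
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (current[0] + dr, current[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                new_g = g + 1
                if new_g < cost_so_far.get(nxt, float("inf")):
                    cost_so_far[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_set, (new_g + heuristic(nxt), new_g, nxt))
    return None
```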

In optimization-based methods, one way to add constraints to the optimization problem is to first obtain a series of waypoints by sampling or grid search, and then optimize motion primitives to generate a smooth trajectory through those waypoints under the UAV’s dynamic constraints. The global planner uses an algorithm such as Jump Point Search (JPS) to output a series of waypoints that represent the shortest path.

This hybrid approach offers better computational efficiency than pure optimization-based algorithms. However, the safety radius and other parameters must be tuned carefully.
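One common way to turn the discrete waypoints into a dynamically feasible reference is to fit polynomial segments between them. The sketch below fits a minimum-jerk quintic between consecutive waypoints with rest-to-rest boundary conditions; real planners typically solve a joint optimization over all segments (e.g., minimum snap), so this is only an illustrative simplification.

```python
import numpy as np

def quintic_segment(p0, p1, T, n_samples=20):
    """Minimum-jerk quintic from p0 to p1 in time T, starting and ending at rest.

    p0, p1: start and end waypoints (arrays of any dimension).
    Returns an (n_samples, dim) array of positions along the segment.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    t = np.linspace(0.0, T, n_samples) / T                 # normalized time in [0, 1]
    # Classic minimum-jerk blend: s(0)=0, s(1)=1, zero velocity and acceleration at both ends.
    s = 10 * t**3 - 15 * t**4 + 6 * t**5
    return p0 + s[:, None] * (p1 - p0)

# Chain segments through the JPS/A* waypoints to get a smooth reference trajectory.
waypoints = [(0, 0, 1), (2, 1, 1.5), (4, 3, 2)]
trajectory = np.vstack([quintic_segment(a, b, T=2.0) for a, b in zip(waypoints, waypoints[1:])])
```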
