When we capture photos in 2D, all the depth information is lost through a process called perspective projection. When an image is taken, the scene is converted from 3D to 2D: it is presented on a 2D plane where the distance of each point (or pixel) away from the…


Camera calibration is the estimation of a camera's parameters: the values required to determine an accurate relationship between a 3D point in the real world and its corresponding 2D projection (pixel) in an image captured by that camera.
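
As a rough illustration of the 3D-to-2D relationship that calibration is meant to pin down, here is a minimal pinhole-camera sketch. The function name `project_point`, the focal lengths, the principal point, and the identity pose are all assumptions chosen for the example, and lens distortion is ignored.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    # Extrinsics: move the point from the world frame into the camera frame.
    x_cam, y_cam, z_cam = (
        sum(rotation[row][col] * point_world[col] for col in range(3)) + translation[row]
        for row in range(3)
    )
    if z_cam <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective divide, then scale and shift into pixels.
    u = fx * x_cam / z_cam + cx
    v = fy * y_cam / z_cam + cy
    return u, v


# Assumed example values: identity pose, 800 px focal length, centre at (320, 240).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
NO_SHIFT = [0.0, 0.0, 0.0]

near = project_point([0.5, 0.25, 2.0], IDENTITY, NO_SHIFT, 800.0, 800.0, 320.0, 240.0)
far = project_point([1.0, 0.5, 4.0], IDENTITY, NO_SHIFT, 800.0, 800.0, 320.0, 240.0)
print(near, far)
```

The two points lie on the same ray through the camera centre, one twice as far away as the other, and they land on the same pixel. That is the loss of depth in a single image: the pixel alone cannot say which of the two points was seen.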

The major purpose of camera calibration is…


The geometry that relates the cameras, the points in 3D, and the corresponding observations is referred to as the epipolar geometry of a stereo pair. It is independent of the scene structure and depends only on the internal and external parameters of the cameras.
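
One way to see that this geometry does not depend on the scene is to check the epipolar constraint numerically. The sketch below assumes identity intrinsics (normalised image coordinates), a made-up relative pose, and the convention that a point moves from camera 1 to camera 2 as X2 = R X1 + t, so that the essential matrix is E = [t]x R. All helper names are hypothetical.

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def mat_vec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def epipolar_residual(point_cam1, rotation, translation):
    """Return x2^T E x1 for one 3D point seen by both cameras."""
    essential = mat_mul(cross_matrix(translation), rotation)
    # Normalised image coordinates in camera 1 (intrinsics assumed to be identity).
    x1 = [c / point_cam1[2] for c in point_cam1]
    # The same point expressed in camera 2: X2 = R X1 + t.
    p2 = [r + s for r, s in zip(mat_vec(rotation, point_cam1), translation)]
    x2 = [c / p2[2] for c in p2]
    return sum(a * b for a, b in zip(x2, mat_vec(essential, x1)))


# Made-up relative pose: 90 degree rotation about the z axis plus a translation.
ROTATION = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
TRANSLATION = [1.0, 0.0, 0.2]

scene = [[0.3, -0.2, 4.0], [-1.0, 0.5, 6.0], [2.0, 1.0, 3.0]]
residuals = [epipolar_residual(p, ROTATION, TRANSLATION) for p in scene]
print(residuals)
```

Every residual is zero up to floating-point error, whichever 3D points are chosen: the constraint is fixed by the pose alone.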

The general setup of epipolar geometry. The…


Panoramic stitching is the process of combining multiple images with overlapping fields of view to produce a panorama or high-resolution image. To combine images with overlapping fields of view, we need to detect key points in each image and match their features across the image pair.
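
The matching step can be sketched with a brute-force nearest-neighbour search plus a ratio test. This is only one common choice, not necessarily the one the article goes on to use, and the three-element descriptors below are toy values invented for the example (real descriptors are much longer). The function name `match_descriptors` is hypothetical.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio test.

    Assumes desc_b holds at least two descriptors.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every descriptor in the other image.
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        best, second = ranked[0], ranked[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best[0] < ratio * second[0]:
            matches.append((i, best[1]))
    return matches


# Toy descriptors for two overlapping images (made up for illustration).
image_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
image_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = match_descriptors(image_a, image_b)
print(matches)
```

The third descriptor in `image_a` is about equally close to two candidates, so the ratio test rejects it as ambiguous instead of guessing.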


A homography is a planar projective transformation that maps points from one plane to another. It encodes the relative position and orientation of the camera, which can be retrieved by decomposing the homography matrix when the camera intrinsics are known.
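
A small sketch of how a 3x3 homography maps a point: the point is lifted to homogeneous coordinates, multiplied by the matrix, and divided by the last component. The two matrices below are illustrative values and do not come from any real camera; `apply_homography` is a hypothetical helper name.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    # Lift (x, y) to (x, y, 1) and multiply by H.
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Divide by w to return to ordinary 2D coordinates.
    return xh / w, yh / w


# Illustrative matrices: a pure shift, and one with a perspective term.
SHIFT = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
PROJECTIVE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]

shifted = apply_homography(SHIFT, (1.0, 1.0))
warped = apply_homography(PROJECTIVE, (2.0, 2.0))
print(shifted, warped)
```

The division by `w` is what separates a homography from an affine map: with a non-zero bottom row, points farther along x are scaled down more strongly.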

Homogeneous Coordinates

Since there is a conversion from 3D to 2D when taking a picture, the scale of depth is…


The Hough transform is a feature extraction method used in image analysis. In its simplest form, it can be used to detect straight lines in an image.
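
A minimal sketch of line detection by voting, assuming the usual normal-form parameterisation rho = x cos(theta) + y sin(theta), a 1 degree angular step, and rho rounded to the nearest integer. The edge points are synthetic and `hough_peak` is a hypothetical name.

```python
import math
from collections import Counter


def hough_peak(points, theta_step_deg=1):
    """Vote in (rho, theta) space and return the most-voted line."""
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            # Normal form of a line: rho = x cos(theta) + y sin(theta).
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    (rho, theta_deg), count = votes.most_common(1)[0]
    return rho, theta_deg, count


# Ten edge points on the horizontal line y = 5 (synthetic data).
edge_points = [(x, 5) for x in range(0, 100, 10)]
peak = hough_peak(edge_points)
print(peak)
```

Each point votes for every line that could pass through it. Only the cell for the true line, rho = 5 at theta = 90 degrees, collects a vote from all ten points.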

Intuition behind Hough Transformation


DeepSORT is one of the finest object tracking algorithms. However, it makes some assumptions; for example, there should be no ego-motion, which in simple words means that the camera should be stationary. Also, DeepSORT tracks only people. For any other object, we would need to train it again.

Optical…


The Fourier transform (FT) is a mathematical transform that decomposes a function (often a function of time, such as a signal) into its constituent frequencies, taking it from the time domain into the frequency domain.
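
For sampled data, the discrete Fourier transform (DFT) plays the role of the FT. The sketch below is a naive O(N^2) DFT written for clarity, not speed; the eight-sample sine wave is an assumed test signal, and a real application would typically use an FFT library instead.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Eight samples of a sine wave that completes one cycle over the window.
samples = [math.sin(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(samples)

# Each bin is a complex number; its magnitude says how strong that frequency is.
for k, value in enumerate(spectrum[:4]):
    print(k, round(abs(value), 6))

# The phase of the dominant bin says where in its cycle that component starts.
print(round(cmath.phase(spectrum[1]), 6))
```

Only bin 1 carries energy among the first four bins, matching the single cycle in the input, and its phase of about -pi/2 reflects that the input is a sine, not a cosine.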

The output of the Fourier transform is a complex number, which is represented as


With the development of AR/VR and self-driving cars, 3D vision is becoming more and more important, since it provides much richer information than 2D. A 3D image measures one more dimension: depth.


A self-driving car using Cruise Automation LiDAR

LiDAR, or Light Detection and Ranging, is an active remote sensing system used to measure distances. It sends out light pulses to detect any object in the path of the light.

How Does LiDAR Work?

LiDAR is an active remote sensing system. A LiDAR system measures the time it takes for emitted light to…
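
As a hedged sketch of the time-of-flight idea, assume the measured quantity is the round-trip time of a pulse (out to the target and back). The range is then half the distance light covers in that time. The function name and the 200 ns measurement are made up for the example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds):
    """Range to a target from the round-trip time of a light pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels out and back, so halve the total path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# A pulse that returns after 200 nanoseconds (made-up measurement).
range_m = distance_from_round_trip(200e-9)
print(f"{range_m:.2f} m")
```

A 200 ns round trip corresponds to a target roughly 30 m away, which shows why the timing electronics need nanosecond-level resolution.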

Dibyendu Biswas

Robotics Enthusiast. Well versed in computer vision, path planning algorithms, SLAM, and ROS
