For the last assignment in our robotic vision course, we were asked to implement a beacon-based localization program using a single camera. The final goal is to visualise the estimated location of the camera in the 3D world.
Before going into the implementation details, it helps to split the assignment into subtasks. The following three steps cover the whole problem, and each is explained in more detail later in this post.
Calibration
Beacon detection
Pose estimation
We used an Android phone with the Droidcam app to estimate the position of the camera relative to the beacon while recording.
We also use the AprilTag library and its tags to detect the beacon and its corners.
Calibration
Calibration is needed to obtain the intrinsic parameters of the camera. Because the camera will be moving, the extrinsic parameters (position and rotation) are not needed at this stage.
To calibrate, we used a checkerboard pattern and the OpenCV functions findChessboardCorners and calibrateCamera. More than 20 images of the pattern were taken with the camera from different angles.
First, we need to find all the corners of the 10x7 pattern in each of the checkerboard images. As seen in the next figure, OpenCV's findChessboardCorners does this automatically.
Once this is done, we pass the detected corners and the real-world dimensions of the chessboard to calibrateCamera to obtain the intrinsic matrix of our Android phone camera.
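The two calibration steps above can be sketched as follows. This is a minimal sketch, not our exact code: it assumes that 10x7 counts the inner corners of the board, and the square size (0.025 m) is a placeholder that has to be replaced by the measured value.

```python
import numpy as np


def chessboard_object_points(cols, rows, square_size):
    """3D coordinates of the inner corners on the board plane (z = 0)."""
    grid = np.zeros((rows * cols, 3), np.float32)
    # x varies fastest, matching the row-by-row order of the detected corners.
    grid[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2) * square_size
    return grid


def calibrate(image_paths, pattern_size=(10, 7), square_size=0.025):
    """Return (rms_error, camera_matrix, dist_coeffs) from chessboard images.

    pattern_size and square_size are assumptions, not values from the post.
    """
    import cv2  # imported here so the helper above works without OpenCV

    template = chessboard_object_points(*pattern_size, square_size)
    object_points, image_points, image_size = [], [], None
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if gray is None:
            continue  # skip unreadable files
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            object_points.append(template)
            image_points.append(corners)
            image_size = gray.shape[::-1]  # (width, height)
    if not object_points:
        raise ValueError("no chessboard detected in any image")
    rms, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        object_points, image_points, image_size, None, None)
    return rms, camera_matrix, dist_coeffs
```

Calling calibrate with the list of image paths returns the reprojection error, the intrinsic matrix and the distortion coefficients. A low reprojection error is a quick sanity check that the corners were detected consistently.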
Beacon detection
The 4 corners of our beacon can be obtained with the AprilTag library. Using a Detector from the library, we get the 4 corners as shown in the next images.
The advantage of this library is that each tag carries an ID, which gives us a simple way to tell tags apart if we want to use more than one beacon.
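As a rough illustration of the detection step, the sketch below collects the corners of every detected tag by ID. It assumes the pupil-apriltags Python binding and the tag36h11 family, which may differ from the binding and family used in our setup. Any detector whose results expose tag_id and corners would work the same way.

```python
from types import SimpleNamespace

import numpy as np


def corners_by_tag_id(detections):
    """Map each detected tag ID to its 4x2 array of pixel corners."""
    return {d.tag_id: np.asarray(d.corners, dtype=np.float64).reshape(4, 2)
            for d in detections}


def detect_tags(gray_image, detector=None):
    """Detect tags in a grayscale uint8 image and return {tag_id: corners}."""
    if detector is None:
        # Assumed binding and tag family; replace with the ones actually used.
        from pupil_apriltags import Detector
        detector = Detector(families="tag36h11")
    return corners_by_tag_id(detector.detect(gray_image))


# Stand-in detection so the helper can be tried without a camera or the library.
fake = [SimpleNamespace(tag_id=3, corners=[[0, 0], [10, 0], [10, 10], [0, 10]])]
print(corners_by_tag_id(fake)[3].shape)  # (4, 2)
```

In a live loop it is better to create the detector once and pass it in, rather than building a new one for every frame.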
Pose estimation
Finally, for the pose estimation, once we have the intrinsic matrix, the 3D coordinates of the beacon corners and the detected 2D corner coordinates, we can use OpenCV's Perspective-n-Point (PnP) function to obtain the extrinsic parameters of the camera with respect to the beacon.
With A being the intrinsic matrix and b the translation vector, we can obtain the optical center of the camera with the following operation.
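A minimal sketch of this step is shown below, under a few assumptions. The PnP call is taken to be cv2.solvePnP, which returns a rotation vector and a translation t such that x_cam = R x_tag + t, so the optical center in the tag frame is C = -Rᵀ t. Equivalently, if the projection matrix is written as P = [A | b] (with A = K R and b = K t), then C = -A⁻¹ b. The tag size and the corner order are placeholders, and the order must match the one returned by the detector.

```python
import numpy as np


def tag_object_points(tag_size):
    """Tag corners in the tag's own frame: a square centred at the origin, z = 0.

    The corner order here is an assumption; it must match the detector's order.
    """
    h = tag_size / 2.0
    return np.array([[-h, -h, 0], [h, -h, 0], [h, h, 0], [-h, h, 0]], np.float64)


def camera_center(R, t):
    """Optical centre in the tag frame, given x_cam = R @ x_tag + t."""
    return -R.T @ np.asarray(t, dtype=np.float64).reshape(3)


def camera_center_from_projection(A, b):
    """Same point from P = [A | b]: solve A @ C = -b."""
    return np.linalg.solve(A, -np.asarray(b, dtype=np.float64).reshape(3))


def estimate_camera_center(image_corners, tag_size, camera_matrix, dist_coeffs):
    """Solve PnP for one tag and return the camera position in the tag frame."""
    import cv2  # imported here so the NumPy helpers work without OpenCV

    ok, rvec, tvec = cv2.solvePnP(tag_object_points(tag_size),
                                  np.asarray(image_corners, np.float64),
                                  camera_matrix, dist_coeffs)
    if not ok:
        raise RuntimeError("solvePnP failed")
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return camera_center(R, tvec)
```

Both helper functions give the same point. The first is convenient when working directly with the PnP output, the second when only a full projection matrix is available.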
We also tested the setup with two beacons. With more beacons we obtain one extrinsic estimate per beacon, and averaging these estimates gives a more robust estimate of the camera position in the world. To do so, however, we need a "map" of the real world, meaning that we need to store the position of each tag relative to the world reference system.
Our implementation estimates the position of the camera correctly as long as all beacons share the same orientation.
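The averaging over several beacons can be sketched as follows. This is an illustrative sketch with made-up coordinates: the "map" is a dictionary from tag ID to the tag's position in the world, and every tag frame is assumed to be axis-aligned with the world frame, which is the same restriction as in our implementation.

```python
import numpy as np


def fuse_camera_position(tag_map, centers_in_tag_frames):
    """Average the world-frame camera positions implied by each visible tag.

    tag_map: {tag_id: tag origin in world coordinates}
    centers_in_tag_frames: {tag_id: camera position in that tag's frame}
    Assumes every tag frame is axis-aligned with the world frame, so only a
    translation is needed to go from a tag frame to the world frame.
    """
    estimates = [
        np.asarray(tag_map[tag_id], dtype=np.float64)
        + np.asarray(center, dtype=np.float64)
        for tag_id, center in centers_in_tag_frames.items()
        if tag_id in tag_map  # ignore tags that are not in the map
    ]
    if not estimates:
        raise ValueError("no mapped tag is visible")
    return np.mean(estimates, axis=0)


# Hypothetical map and per-tag estimates, in metres.
tag_map = {0: [0.0, 0.0, 0.0], 1: [0.5, 0.0, 0.0]}
seen = {0: [0.2, 0.1, 1.0], 1: [-0.3, 0.1, 1.2]}
print(fuse_camera_position(tag_map, seen))
```

Supporting beacons with different orientations would require storing a rotation per tag in the map and applying it before the translation.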
Results and conclusion
With our implementation, the camera localizes itself correctly and its position is represented in the 3D world. In the first video, with single-beacon localization, the estimate becomes more unstable as the distance between the camera and the beacon increases.
When estimating with 2 beacons, there are some faulty Z-axis estimates from the right beacon, but apart from that the localization is fairly stable. The result of the 2-beacon estimation can be seen in the next video: