Using Keypoints Effectively in Deep Learning Projects

In annotation projects it can be useful not only to label an object but also to invest more time in details and label important points in the image. These points of interest are called keypoints.

For example, keypoints can be the corners of an object or other distinctive features such as the eyes in a human face. With these features you can identify some objects more precisely and tell more about them. In a human face, for instance, keypoints make it possible to recognize facial expressions such as smiling.

There are multiple ways to label an object with keypoints. Some of the methods we are taking a close look at today are:

Connected keypoints with polygon lines
Generated keypoints with forms
Generated and manually set keypoints for 3D object detection

Connected keypoints with polygon lines

Labeling an object with dotted lines tells the system that the order in which the keypoints are set matters. As the picture below shows, each eye is always labeled with a fixed number of keypoints. Here, the position of the keypoints and their relation to each other are important. Every point gets its own unique ID, so each point can be assigned unambiguously when the face is detected.

Figure: Face detection with keypoints

The dotted-line technique makes this process easier. While labeling, the person training the system recognizes that certain structures belong together. It is very important that a polygon line always contains the same keypoints. With this technique, the person labeling the data decides which points of the object are important.
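The idea of a polygon line with a fixed, ordered set of uniquely identified keypoints can be sketched as a small data structure. This is a minimal illustration, not the format of any particular annotation tool: the `Keypoint` class, the `LEFT_EYE_SCHEMA` names, the ID numbering, and the pixel coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keypoint:
    kp_id: int   # unique ID, fixed per landmark (hypothetical numbering)
    name: str    # human-readable landmark name
    x: float     # pixel column
    y: float     # pixel row


# Hypothetical schema: every left eye is labeled with these four
# landmarks, always in this order.
LEFT_EYE_SCHEMA = ("outer_corner", "upper_lid", "inner_corner", "lower_lid")


def make_polyline(schema, coords, first_id=0):
    """Pair each schema entry with a coordinate, keeping the schema order."""
    if len(coords) != len(schema):
        # A polygon line must always contain the same keypoints.
        raise ValueError(f"expected {len(schema)} keypoints, got {len(coords)}")
    return [
        Keypoint(first_id + i, name, x, y)
        for i, (name, (x, y)) in enumerate(zip(schema, coords))
    ]


# Made-up pixel coordinates for one annotated eye.
left_eye = make_polyline(LEFT_EYE_SCHEMA, [(120, 85), (132, 78), (145, 86), (132, 91)])
```

Rejecting annotations with a missing or extra point at creation time is one way to enforce the rule that the polygon line always contains the same keypoints.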

A technique like this is especially useful in projects where it is important to distinguish between certain structures, for example:

Detection and interpretation of faces and facial expressions
Marker detection
Document detection
Font detection

Generated keypoints with forms

Generating support points (points that assist the system) from predefined geometric forms has the advantage that objects in the image can be labeled with much less effort. This technique is very useful when you need to detect lines and trajectories. Keep in mind that the keypoints need to lie on the same line or edge, and that all keypoints on a given edge or line need to share the same class. Every new edge gets its own keypoint class. The relation of the points to each other is less important in this variant.

To create these points as simply as possible, you can generate them dynamically and automatically. With a predefined form you can spread keypoints evenly along the form, so the system receives a well-placed set of training data. From the detected keypoints you can then calculate and output the desired structure. An additional benefit: not every keypoint needs to be found to detect and calculate the structure.
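Spreading keypoints evenly along a form could look roughly like the following sketch, which assumes the form is given as an open polyline and spaces the points by arc length. The function name `resample_polyline` and the class label `lane_left` are made up for illustration.

```python
import math


def resample_polyline(vertices, n_points):
    """Return n_points spaced evenly by arc length along an open polyline."""
    if n_points < 2 or len(vertices) < 2:
        raise ValueError("need at least two vertices and two points")
    # Cumulative arc length at each vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    points = []
    seg = 0
    for i in range(n_points):
        target = total * i / (n_points - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(vertices) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = vertices[seg], vertices[seg + 1]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


# All keypoints generated on one edge share one class (hypothetical label).
edge = [(0, 0), (4, 0), (4, 4)]
labeled = [{"class": "lane_left", "x": x, "y": y} for x, y in resample_polyline(edge, 5)]
```

For closed forms such as circles, the same idea applies with points placed at equal angle steps; for ellipses, equal angle steps are only an approximation of equal spacing.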

Evenly spread keypoints with the same class on a form are especially useful when you want to:

Detect lines and their trajectory
Detect streets and their trajectory
Detect areas
Detect forms like circles and ellipses
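For the line case, the claim that not every keypoint needs to be found can be illustrated with a small fit. The sketch below uses a total-least-squares line fit, which is one possible choice and not necessarily what a given system uses. Any two or more detected keypoints of the same class are enough to recover the line.

```python
import math


def fit_line(points):
    """Total-least-squares line through 2D points.

    Returns (centroid, unit direction vector). Works for vertical lines too,
    because it does not solve for a slope.
    """
    if len(points) < 2:
        raise ValueError("need at least two keypoints")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the principal axis of the point scatter.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(theta), math.sin(theta))


# Ten keypoints were labeled on the line y = 2x + 1, but only four were detected.
detected = [(x, 2 * x + 1) for x in (0, 3, 4, 8)]
centroid, direction = fit_line(detected)
```

The returned centroid lies on the line and the direction vector gives its trajectory, so missing keypoints reduce robustness to noise but do not prevent reconstruction.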

Generated and manually set keypoints for 3D object detection

3D annotation does not necessarily have to be carried out on a 3D model of the object. With keypoints in a 2D image and the corresponding 3D points in a 3D model (in a CAD format, for example), you can use reprojection to calculate the position of the object. Reprojection uses rotation and translation data to build a matching transformation matrix. This technique requires a calibrated camera system. As in the variant where keypoints are connected with a dotted line, the keypoints in this method are uniquely set, each with its own class. This makes it easier to trace the matching between 2D and 3D points.
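The reprojection step can be sketched with a simple pinhole camera model. The code below assumes an undistorted camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a known rotation matrix `R`, and a known translation `t`; all numeric values are invented. It only projects model points into the image and measures how well they match the labeled 2D keypoints. Estimating `R` and `t` from the 2D-3D correspondences in the first place is a separate step (the perspective-n-point problem), for which libraries such as OpenCV provide solvers like `cv2.solvePnP`.

```python
import math


def project(point_3d, R, t, fx, fy, cx, cy):
    """Pinhole projection of a model point into pixel coordinates."""
    # Object frame -> camera frame: X_cam = R * X + t
    xc, yc, zc = (
        sum(R[i][j] * point_3d[j] for j in range(3)) + t[i] for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def mean_reprojection_error(model_points, image_points, R, t, intrinsics):
    """Mean pixel distance between projected model points and labeled keypoints.

    Both inputs are dicts keyed by the unique keypoint ID; IDs that were
    not found in the image are skipped.
    """
    errors = []
    for kp_id, p3d in model_points.items():
        if kp_id not in image_points:
            continue
        u, v = project(p3d, R, t, *intrinsics)
        ou, ov = image_points[kp_id]
        errors.append(math.hypot(u - ou, v - ov))
    if not errors:
        raise ValueError("no matching keypoints")
    return sum(errors) / len(errors)


# Invented example: object 2 m in front of the camera, no rotation.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 2.0)
intrinsics = (800.0, 800.0, 320.0, 240.0)  # fx, fy, cx, cy
model = {0: (0.1, -0.05, 0.0), 1: (-0.1, -0.05, 0.0), 2: (0.0, 0.1, 0.0)}
labeled = {0: (360.0, 220.0), 1: (280.0, 220.0)}  # keypoint 2 was not found
error = mean_reprojection_error(model, labeled, R, t, intrinsics)
```

A low reprojection error indicates that the pose explains the labeled keypoints well. Keying both point sets by the unique keypoint ID is what makes the 2D-3D matching straightforward.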

To label a 3D object, you first need to define the relevant keypoints. This can be done manually or automatically. Defining them manually is often better when a human will label the data later; in this case you could use the corners of an object, for example. The automated process is preferred when the object does not have very specific or unique parts. In this case the system sets a large number of keypoints, which allows it to keep working later even when some keypoints are not found, since not all of them are needed. Algorithms that set keypoints automatically often use eigenvalues, but other algorithms such as the ORB detector can be used as well.
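As an illustration of the eigenvalue-based approach, the following sketch computes the smaller eigenvalue of the local gradient structure tensor for each pixel, which is the corner measure used by Shi-Tomasi-style detectors. It is a simplified, pure-Python version written for this article: it uses central differences and a 3x3 window, and it skips the smoothing and non-maximum suppression that production detectors apply. In practice you would more likely call an existing implementation such as OpenCV's `cv2.goodFeaturesToTrack` or `cv2.ORB_create`.

```python
import math


def min_eigenvalue_response(img):
    """Smaller eigenvalue of the 3x3-window structure tensor at each pixel."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference gradients; the image border stays at zero.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = b = c = 0.0  # structure tensor [[a, b], [b, c]]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            resp[y][x] = (a + c) / 2.0 - math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return resp


def detect_keypoints(img, max_points, min_response=1e-6):
    """Return up to max_points (x, y) positions with the strongest response."""
    resp = min_eigenvalue_response(img)
    candidates = [
        (r, x, y)
        for y, row in enumerate(resp)
        for x, r in enumerate(row)
        if r > min_response
    ]
    candidates.sort(reverse=True)
    return [(x, y) for _, x, y in candidates[:max_points]]


# Synthetic 12x12 test image: a bright square covering pixels 3..8.
img = [[1.0 if 3 <= x <= 8 and 3 <= y <= 8 else 0.0 for x in range(12)] for y in range(12)]
corners = detect_keypoints(img, 4)
```

On this synthetic image the four strongest responses fall on the corners of the square, while pixels in the middle of a straight edge score zero because the gradient there varies in only one direction.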

Figure: Detected 3D object in a 2D image with keypoints

Detecting a 3D object is useful when you want to:

Guide a robot to interact with certain objects, for example to grab them
Calculate distance between objects
Develop augmented reality software