Main problems in image processing

The first concept to introduce is related to images, which can be seen as a two-dimensional (2D) view of a 3D world. A digital image is a numeric representation, normally stored in binary form, of a 2D image as a finite set of digital values, which are called pixels (the concept of a pixel is explained in detail in the Concepts of pixels, colors, channels, images, and color spaces section). Therefore, the goal of computer vision is to transform this 2D data into a new representation (for example, a new image) or a decision (for example, performing a concrete task).

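To make this idea concrete, here is a minimal sketch of a digital image as a finite grid of numeric values. It assumes the opencv-python and NumPy packages are installed and uses a hypothetical test.png file in the working directory:

```python
# Minimal sketch: a digital image is a finite grid of numeric pixel values.
# Assumes opencv-python is installed; 'test.png' is a hypothetical file.
import cv2

img = cv2.imread('test.png')  # loaded as a NumPy array (BGR channel order)
if img is None:
    raise IOError('Could not load test.png')

print(img.shape)   # (height, width, channels), for example (480, 640, 3)
print(img.dtype)   # typically uint8: each value lies in the range 0-255
print(img[0, 0])   # the pixel at row 0, column 0: three numeric values (B, G, R)

# A grayscale image is a plain 2D grid with one numeric value per pixel
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(gray.shape, gray[0, 0])
```
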
Computer vision has to tackle some common problems (or difficulties) when dealing with image-processing techniques, such as changes in illumination, variations in scale and viewpoint, rotations of the objects of interest, and occlusions.

To put all of these difficulties together, imagine that you want to develop a face-detection system. This system should be robust enough to deal with changes in illumination or weather conditions. Additionally, it should handle movements of the head, and even the fact that the user can be farther from or closer to the camera. It should be able to detect the user's head with some degree of rotation around every axis (yaw, roll, and pitch). For example, many face-detection algorithms show good performance when the head is near frontal, but they fail to detect a face that is not frontal (for example, a face in profile). Moreover, you may want to detect the face even if the user is wearing glasses or sunglasses, which produce an occlusion in the eye region. When developing a computer vision project, you must take all of these factors into consideration. A good approach is to have many test images that incorporate these difficulties to validate your algorithm. You can also classify your test images according to the main difficulty they contain, so that you can easily detect the weak points of your algorithm, as shown in the sketch below.
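For instance, here is a minimal sketch of this kind of per-difficulty evaluation. It assumes a hypothetical test_images/ directory whose subfolders (illumination/, rotation/, scale/, occlusion/) group the test images by their main difficulty, and it uses OpenCV's pre-trained Haar cascade frontal-face detector only as a stand-in for whatever detector you are validating:

```python
# Minimal sketch: measure a face detector's behavior per difficulty category.
# Assumptions: opencv-python is installed, and a hypothetical 'test_images/'
# directory contains one subfolder per difficulty, each holding images
# that show exactly one face.
import glob
import os

import cv2

# Pre-trained Haar cascade shipped with opencv-python (stand-in detector)
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

categories = ['illumination', 'rotation', 'scale', 'occlusion']

for category in categories:
    paths = glob.glob(os.path.join('test_images', category, '*.png'))
    detected = 0
    for path in paths:
        img = cv2.imread(path)
        if img is None:
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:
            detected += 1
    total = len(paths)
    rate = detected / total if total else 0.0
    print(f'{category}: {detected}/{total} images with a detection ({rate:.0%})')
```

A noticeably lower detection rate in one category (for example, occlusion) then points directly to the weak point of the algorithm you are evaluating.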