Stereo-Based Object Tracking and Pedestrian Recognition in Traffic Environments

Project funded by Volkswagen AG, Germany (2006-2007)



Pedestrians are the most vulnerable participants in urban traffic. The first step toward protecting them is to detect them reliably. However, although recognizing the human shape is very easy for humans, it is still very difficult for computer vision systems, especially in highly cluttered urban environments. High variance in appearance, occlusions, and differences in pose and distance all complicate pedestrian classification.



The goal is the development of a real-time, dense-stereo camera-based system able to detect, classify and track pedestrians in urban environments. Based on the detection parameters, the system can alert the driver to an imminent accident or take control of the car to avoid a collision with a pedestrian. The system needs high accuracy and speed in order to handle difficult situations and react faster than a human driver can.

Boundary conditions:
• Detection range: 0 to 30 m
• Field of view: 72 degrees
• Maximum traveling speed: 80 km/h
• Processing rate: 20 frames/second
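To make these figures concrete, the short calculation below works out how far the vehicle travels between two processed frames and how long it takes to cover the full detection range at maximum speed. It is only an illustrative sketch: the function name is hypothetical, and the calculation assumes a stationary pedestrian directly ahead and ignores braking and reaction delays.

```python
def frame_budget(speed_kmh, fps, range_m):
    """Derive simple timing figures from the stated boundary conditions."""
    speed_ms = speed_kmh / 3.6                     # km/h -> m/s
    return {
        "speed_m_per_s": speed_ms,
        "metres_per_frame": speed_ms / fps,        # distance travelled between frames
        "seconds_to_cover_range": range_m / speed_ms,
        "frames_to_cover_range": range_m / speed_ms * fps,
    }


# Values taken from the list above: 80 km/h, 20 frames/second, 30 m range.
budget = frame_budget(80.0, 20.0, 30.0)
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

At 80 km/h the vehicle moves a little over one metre per frame and covers the 30 m range in about 1.35 s, which is why detection and classification have to complete within a small number of frames.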



The acquisition system is composed of two digital grayscale cameras integrated in a rigid, calibrated stereo head. The camera parameters are calibrated using a dedicated methodology. Dense stereo reconstruction is performed by a dedicated hardware board (DeepSea by TYZX) or by a software algorithm. The 3D points are used to determine the surface of the road in front of the car. The 3D points above the detected road surface are grouped into pedestrian hypotheses, taking into account the density of the 3D points, vicinity criteria and 2D image information such as similar texture and connecting edges. Kalman-filter tracking is used to improve and stabilize the detected pedestrian hypotheses. The pedestrians are classified based on shape and motion features. The real-time implementation uses a Visual C++ framework and MMX instructions for code optimization. Detection results can be further fused with information provided by other sensors (radar and laser scanner).
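The tracking step can be illustrated with a minimal constant-velocity Kalman filter along a single axis. This is a sketch under stated assumptions, not the project's tracker: the class name, the state layout (position and velocity), the simplified diagonal process noise and all noise values are hypothetical choices made for the example.

```python
class ConstantVelocityKalman1D:
    """Tracks position and velocity along one axis from position measurements."""

    def __init__(self, pos, vel=0.0, pos_var=1.0, vel_var=1.0,
                 process_var=0.01, meas_var=0.05):
        self.x = [pos, vel]                        # state: [position, velocity]
        self.P = [[pos_var, 0.0], [0.0, vel_var]]  # state covariance
        self.q = process_var                       # assumed process noise (diagonal)
        self.r = meas_var                          # assumed measurement noise variance

    def predict(self, dt):
        p, v = self.x
        self.x = [p + v * dt, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (a simplification)
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                             # innovation variance, H = [1, 0]
        k0, k1 = a / s, c / s                      # Kalman gain
        y = z - self.x[0]                          # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# Hypothetical scenario: an object starts 20 m away and approaches at 1 m/s,
# measured without noise at 20 frames/second for 5 seconds.
dt = 1.0 / 20.0
kf = ConstantVelocityKalman1D(pos=20.0)
for k in range(1, 101):
    kf.predict(dt)
    kf.update(20.0 - 1.0 * dt * k)
print(kf.x)
```

Although the filter starts with zero velocity, the estimate settles close to the true approach speed after a few seconds; a real tracker would use a multi-dimensional state and noise parameters tuned to the sensor.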



Detection results of the DESPED application are presented in fig. 1:

Fig. 1. DESPED output.

Dense stereovision

Real-time dense stereo reconstruction is performed using a dedicated board (DeepSea by TYZX) (fig. 2). A software algorithm was also developed for off-line validation purposes (fig. 3).
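For a rectified stereo pair, the standard relation between disparity and depth is Z = f * B / d, where f is the focal length in pixels, B the baseline and d the disparity. The sketch below applies it and also estimates how much depth changes for a one-pixel disparity step. The focal length and baseline used here are hypothetical values chosen for illustration, not the parameters of the DeepSea board or of the project's stereo head.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point in a rectified stereo pair; None if disparity is invalid."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def depth_step(depth_m, focal_px, baseline_m, disparity_step_px=1.0):
    """Approximate depth change caused by a small disparity change at a given depth."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px


FOCAL_PX = 800.0     # hypothetical focal length in pixels
BASELINE_M = 0.30    # hypothetical baseline in metres

far = depth_from_disparity(8.0, FOCAL_PX, BASELINE_M)
near = depth_from_disparity(48.0, FOCAL_PX, BASELINE_M)
print(far, depth_step(far, FOCAL_PX, BASELINE_M))
print(near, depth_step(near, FOCAL_PX, BASELINE_M))
```

With these assumed parameters, a disparity of 8 pixels corresponds to roughly 30 m, where a single pixel of disparity error shifts the depth by several metres, while at 5 m the same error costs only about 10 cm. Depth uncertainty growing with the square of the distance is one reason a stereo system has a bounded useful range.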

Detection of pedestrian hypotheses
The pedestrian detection method relies on the fact that pedestrians are usually well reconstructed by the 3D reconstruction engines. This results in a high density of 3D points in the space occupied by the pedestrian. The density map is constructed from a subset of the reconstructed 3D points. Positions in the density map with a high density of 3D points are candidate pedestrian positions. These candidates are further validated using texture validation methods and 3D validation methods (fig. 4).
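The density-map idea can be sketched as a top-view grid that counts 3D points per cell. The code below is an illustration only: the point layout (lateral position, height above the road, distance), the cell size, the height band used to select the subset of points and the threshold are all assumptions, not values from the project.

```python
from collections import Counter


def density_map(points, cell_m=0.25, min_h=0.2, max_h=2.0):
    """Count 3D points per ground-plane cell, keeping only points in a height band.

    Each point is (lateral_m, height_m, distance_m), with height measured
    above the detected road surface.
    """
    counts = Counter()
    for x, h, z in points:
        if min_h <= h <= max_h:
            counts[(int(x // cell_m), int(z // cell_m))] += 1
    return counts


def dense_cells(counts, min_points):
    """Cells whose point count reaches the threshold: candidate positions."""
    return sorted(cell for cell, n in counts.items() if n >= min_points)


# Hypothetical scene: a column of points 10.1 m ahead and 1.1 m to the right,
# road points at zero height, and one isolated point elsewhere.
points = [(1.1, 0.3 + 0.05 * i, 10.1) for i in range(28)]
points += [(0.1 * i, 0.0, 5.0 + i) for i in range(20)]
points += [(-2.0, 1.0, 15.0)]

counts = density_map(points)
print(dense_cells(counts, min_points=20))
```

Only the cell containing the vertical column of points passes the threshold; the road points are rejected by the height band and the isolated point by the count. In the described system such candidates would then go through the texture and 3D validation steps.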

Shape Based Classification
One of the most important characteristics of the human body is its general shape. One of our classification methods tries to match this characteristic shape against a database containing pedestrian models in different positions and poses. Based on this matching, the classifier is able to determine whether the pedestrian hypothesis is a pedestrian or not. The 3D information is used to determine the features belonging to the pedestrian and the scale for the models in the database (fig. 6).
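One way to picture this kind of matching is a chamfer-style distance between scaled model contours and observed edge points. The text does not say which matching measure the project uses, so the sketch below swaps in this well-known technique purely for illustration; the model contours, the focal length and the function names are hypothetical.

```python
import math


def scale_model(model_m, distance_m, focal_px):
    """Project a model contour given in metres to pixels at a known distance."""
    s = focal_px / distance_m
    return [(x * s, y * s) for x, y in model_m]


def chamfer_distance(model_px, edges_px):
    """Mean distance from each model point to its nearest observed edge point."""
    total = 0.0
    for mx, my in model_px:
        total += min(math.hypot(mx - ex, my - ey) for ex, ey in edges_px)
    return total / len(model_px)


def best_model(models_m, edges_px, distance_m, focal_px):
    """Return the name and score of the best-fitting model (lower is better)."""
    scores = {
        name: chamfer_distance(scale_model(pts, distance_m, focal_px), edges_px)
        for name, pts in models_m.items()
    }
    name = min(scores, key=scores.get)
    return name, scores[name]


# Toy database with two very coarse contours (coordinates in metres).
models = {
    "standing": [(0.0, 0.0), (0.0, 1.0), (0.0, 1.8)],
    "wide": [(-0.5, 0.0), (0.5, 0.0), (0.0, 1.8)],
}
# Observed edges generated from the "standing" model seen at 10 m.
edges = scale_model(models["standing"], 10.0, 800.0)
print(best_model(models, edges, 10.0, 800.0))
```

The distance measured by stereo fixes the model scale, so each model needs to be tested at one size only rather than over a range of scales. A decision on whether the hypothesis is a pedestrian would then compare the best score with a threshold.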

Motion Based Classification
The motion-based pedestrian classification tries to classify objects based on their movement properties. The legs move in opposite directions and generate a motion signature different from that of a rigid object such as a car. This motion signature is characteristic of pedestrians and can be used to validate pedestrian hypotheses. The frequency of this motion is another characteristic used for classification (fig. 5).
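The frequency cue can be illustrated by estimating the dominant frequency of a per-frame motion signal with a discrete Fourier transform. This is a sketch only: the choice of signal (for example, a measure of leg separation per frame), the window length and the example frequency are assumptions, and the text does not state how the project computes the frequency.

```python
import cmath
import math


def dominant_frequency(samples, fps):
    """Frequency in Hz of the strongest non-constant component of the signal.

    Returns 0.0 when the signal has no periodic component at all.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * fps / n


FPS = 20.0
# Hypothetical walking signal: a 2 Hz oscillation observed for 2 seconds.
walking = [math.sin(2 * math.pi * 2.0 * t / FPS) for t in range(40)]
# Hypothetical rigid object: the same measure stays constant.
rigid = [1.0] * 40

print(dominant_frequency(walking, FPS), dominant_frequency(rigid, FPS))
```

A periodic signal in a plausible gait frequency band supports the pedestrian hypothesis, while a flat signal is more consistent with a rigid object. The frequency resolution is fps divided by the window length, so a longer window gives a finer estimate at the cost of a slower decision.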

Multifeature Object Classification
There is no single classification feature that is able to reliably classify a pedestrian. Our system takes the approach of combining a set of features (features which are properties of the hypothesis or results of classifiers). The features are combined using a naive Bayes classifier in order to determine the probability that the pedestrian hypothesis is an actual pedestrian.
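The combination step can be sketched as follows: under the naive Bayes assumption the features are treated as conditionally independent given the class, so the per-feature likelihoods multiply. The prior and the likelihood values below are made up for the example and do not come from the project; the function name is likewise hypothetical.

```python
import math


def naive_bayes_posterior(prior_ped, likelihoods):
    """Posterior probability of 'pedestrian' given independent feature observations.

    likelihoods is a list of (P(observation | pedestrian),
                              P(observation | not pedestrian)) pairs.
    """
    log_ped = math.log(prior_ped)
    log_not = math.log(1.0 - prior_ped)
    for p_ped, p_not in likelihoods:
        log_ped += math.log(p_ped)
        log_not += math.log(p_not)
    # Normalise in log space to avoid underflow with many features.
    m = max(log_ped, log_not)
    e_ped = math.exp(log_ped - m)
    e_not = math.exp(log_not - m)
    return e_ped / (e_ped + e_not)


# Hypothetical observations: a good shape match and a gait-like motion signature.
observations = [
    (0.8, 0.2),   # shape classifier fired
    (0.7, 0.3),   # motion signature present
]
posterior = naive_bayes_posterior(0.5, observations)
print(round(posterior, 3))
```

Two moderately reliable cues that agree produce a posterior higher than either would alone, which is the motivation for combining features rather than relying on a single one.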


Concluding remarks

We have developed a stereovision-based system able to perform pedestrian detection and classification in real time in urban scenarios. The system functions are integrated into a dedicated stereovision framework, which can easily be extended with other capabilities (e.g. image analysis tasks) or used to build a specific application for an active safety or driving assistance system for the automotive industry.


Our services

The research team's expertise has reached state-of-the-art level in several specific fields of computer vision usable in robotics or automotive applications: camera calibration, high-resolution stereo reconstruction, stereo measurements, object detection and tracking, pattern matching, motion estimation (control, programming, design).


Recent publications

S. Nedevschi, C. Tomiuc, S. Bota, "Stereo Based Pedestrian Detection for Collision Avoidance Applications", Proceedings of ICRA 2007 Workshop: Planning, Perception and Navigation for Intelligent Vehicles, April, 2007, Roma, pp. 39-44.

S. Bota, S. Nedevschi, "Multi-Feature Walking Pedestrian Detection Using Dense Stereo Motion", WIT 2007, 20-21 March, 2007, Hamburg, pp. 113-118.

Corneliu Tomiuc, Sergiu Nedevschi, "Real time object classification exploiting 2D and 3D information", Proceedings of IEEE 2nd International Conference on Intelligent Computer Communication and Processing, 1-2 September, Cluj-Napoca, Romania, 2006, pp. 129-134.

Silviu Bota, Sergiu Nedevschi, "Walking Pedestrians Detection Using Motion Field and Dense Stereo", Poster volume of IEEE 2nd International Conference on Intelligent Computer Communication and Processing, 1-2 September, Cluj-Napoca, Romania, 2006, pp. 45-51.

S. Nedevschi, C. Vancea, T. Marita, T. Graf, "On-Line Calibration Method for Stereovision Systems Used in Vehicle Applications", Proceedings of the IEEE Intelligent Transportation Systems Conference (ITSC 2006), Toronto, Canada, September 17-20, 2006, pp. 957-962.

Fig. 2. Hardware dense stereo reconstruction

Fig. 3. Software dense stereo reconstruction

Fig. 4. Pedestrian detection

Fig. 5. Pedestrian motion detection

Fig. 6. Shape-based classification