Research funded by the Romanian Ministry of Research in the frame of the following projects:
• Model-based industrial objects recognition with application in robotics (1994-1996)
• Study of the computational matching strategies based on fitting the model to symbolic structures. Implementation of an experimental model-based recognition system for industrial objects, with application in robotics.
• Study and implementation of a method for symbolic representation of knowledge in model-based recognition.
The model-based object recognition problem can be divided into three sub-problems:
• feature selection and extraction;
• model and scene representation;
• matching between model and scene representations.
An effective solution to this problem can be obtained only by taking into account the aims of the recognition task:
• the class of recognizable objects;
• the type of sensor system used (intensity or depth images);
• the required performance of the recognition system (speed, accuracy, ability to recognize partially occluded objects, ability to recognize an object from any viewpoint).
The general problem of automatic object recognition from an arbitrary viewpoint requires object-centered 3-D models. To implement an effective recognition algorithm, a method for matching the model and image data representations must be found. Since the object model contains more information than the input data, it is not possible to convert the input data into complete model data and do the matching in the model format. To reduce the dimension of the problem, it is advantageous to work in an intermediate space that can be computed from both the input data and the model.
The objective is the development of a robust and effective method for model-based recognition of 2D or 3D objects from real-life intensity images.
To improve the matching process, an "intermediate representation" was proposed that is easily obtainable both from the intensity image, by feature extraction (edge detection and contour extraction), and from the CAD model. The matching procedure has the following features:
• matching the models with symbolic structures of the scene coded in the intermediate representation;
• use of inexact matching techniques;
• intensive use of indexing techniques for search space reduction.
Scene and model representation
For scene and model representation, an intermediate representation is used, as defined in [1,5]. It is characterized by:
• local features;
• the representation primitives are straight line segments and ellipsoidal arcs;
• 2D index features associated with 3D forms, which correspond to projection-invariant properties of those forms;
• the association of localization and identification attributes with each feature;
• the association of a significance factor with each feature;
• hierarchical nature of the representation.
Aggregating the elementary features that belong to the same physical entity into index features generates compound features with greater discrimination and indexing power.
The intermediate representation can be viewed as an explicit relational attributed graph in which the straight line segment elementary features represent the nodes and the compound features expressing relational constraints between elementary features represent the arcs.
The hierarchical nature of the representation allows the optimal hierarchical level to be selected for indexing or matching.
The significance factors allow privileged features to be selected and features to be ordered according to their importance.
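The graph view described above can be illustrated with a minimal sketch. It is not the representation defined in [1,5]: the class names (Segment, AngleFeature, Representation), the restriction to segment primitives, and the choice of the summed segment length as significance factor are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Elementary feature (graph node): a straight line segment."""
    p0: tuple
    p1: tuple

    @property
    def length(self):
        return math.dist(self.p0, self.p1)

    @property
    def direction(self):
        return math.atan2(self.p1[1] - self.p0[1], self.p1[0] - self.p0[0])


@dataclass
class AngleFeature:
    """Compound index feature (graph arc) relating two segments."""
    a: int               # index of the first segment
    b: int               # index of the second segment
    angle: float         # identification attribute, in radians
    significance: float  # assumed here: sum of the two segment lengths


@dataclass
class Representation:
    segments: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def add_angle(self, a, b):
        """Create an angle feature between segments a and b."""
        sa, sb = self.segments[a], self.segments[b]
        d = abs(sa.direction - sb.direction) % math.pi
        feat = AngleFeature(a, b, min(d, math.pi - d), sa.length + sb.length)
        self.relations.append(feat)
        return feat

    def ranked_relations(self):
        """Order compound features by decreasing significance."""
        return sorted(self.relations, key=lambda r: r.significance, reverse=True)
```

Ordering the arcs by significance is one simple way to pick the privileged features first; any other significance measure could be substituted without changing the structure.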
Fig. 1. Extraction of ellipsoidal arc primitives from industrial objects
Extracting the models sequentially from the model library and matching each one against the sensor features is acceptable only for small libraries of 2D objects. For a large number of 2D objects, or for a library of 3D objects, each specified by projections corresponding to its different aspects, indexing techniques are necessary.
The index features defined in the intermediate representation are used to implement indexing. The technique accumulates votes at the model level: each successful indexing of a model by an index feature from the scene adds a vote to that model. The vote count of each model is then normalized.
The models are sorted by accumulated vote count, and the recognition process tries the models with the highest counts first.
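The voting scheme can be sketched as follows. This is only an illustration under stated assumptions: index features are reduced to a single angle value in degrees, the index key is a quantized angle bin, and votes are normalized by the number of index features in the model. The text does not specify the key or the normalization, and the names quantize, build_index and rank_models are hypothetical.

```python
from collections import defaultdict


def quantize(angle_deg, bin_width=10.0):
    """Map an index-feature value to a discrete index key (assumed binning)."""
    return int(angle_deg // bin_width)


def build_index(models, bin_width=10.0):
    """models: dict mapping model name -> list of index-feature values."""
    index = defaultdict(set)
    for name, feats in models.items():
        for f in feats:
            index[quantize(f, bin_width)].add(name)
    return index


def rank_models(index, models, scene_feats, bin_width=10.0):
    """Accumulate one vote per scene feature for every model it indexes,
    normalize per model, and sort by decreasing score."""
    votes = defaultdict(int)
    for f in scene_feats:
        for name in index.get(quantize(f, bin_width), ()):
            votes[name] += 1
    # Assumed normalization: divide by the number of model index features.
    scores = {name: votes[name] / len(models[name]) for name in votes}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


models = {"bracket": [90, 45, 120], "plate": [90, 90, 90, 90]}
index = build_index(models)
ranking = rank_models(index, models, [91, 47, 93])
```

The models at the head of ranking would be passed to the matching stage first; models that receive no votes do not appear at all.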
For inexact matching, the hypothesis generation and verification method was used. This method allows the search to be focused at the model, feature and algorithm levels.
Focusing at the model level consists of selecting the set of sensor features that corresponds to the virtual appearance of the model in the scene. It is carried out for each generated hypothesis.
Hypotheses are generated by aligning model features with sensor features of the same kind that have similar identification attributes, and determining the associated rigid transformations. To keep the number of generated hypotheses small and each hypothesis as probable as possible, privileged features with high discrimination power must be used. In this respect the index features of the adopted intermediate representation, which express relations between elementary features, have a higher discrimination power than the elementary features themselves. In addition, to reduce the number of hypotheses, each feature pair used has to admit only one possible alignment. Hypothesis generation is therefore realized by aligning angle-type features.
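A minimal sketch of hypothesis generation from one pair of angle features is given below. It assumes that an angle feature is described by its vertex, the direction of one designated arm and its opening angle, and that the transformation is a 2D rotation plus translation. These conventions and the names align_angles and transform_point are assumptions, not taken from the original system.

```python
import math


def align_angles(model, scene, tol=math.radians(5)):
    """model, scene: (vertex_xy, arm_direction_rad, opening_angle_rad).
    Returns the rigid transformation (theta, tx, ty) that maps the model
    angle onto the scene angle, or None if the opening angles (the
    identification attribute assumed here) differ by more than tol."""
    (mx, my), m_dir, m_open = model
    (sx, sy), s_dir, s_open = scene
    if abs(m_open - s_open) > tol:
        return None
    theta = s_dir - m_dir                 # rotation aligning the arms
    c, s = math.cos(theta), math.sin(theta)
    tx = sx - (c * mx - s * my)           # translation aligning the vertices
    ty = sy - (s * mx + c * my)
    return theta, tx, ty


def transform_point(T, p):
    """Apply the rigid transformation T = (theta, tx, ty) to point p."""
    theta, tx, ty = T
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


model_angle = ((1.0, 0.0), 0.0, math.pi / 2)
scene_angle = ((5.0, 5.0), math.pi / 2, math.pi / 2)
T = align_angles(model_angle, scene_angle)
```

Because the vertex fixes the translation and the designated arm fixes the rotation, a pair of angle features yields a single alignment, which is the property required of the feature pairs above.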
The hypothesis is verified by predicting the zones in the scene where the virtual model features appear: mi* = T(mi), where T is the rigid transformation of the hypothesis. The virtual segment mi* is used to focus the search again, this time at the feature level. From the set of sensor segments associated with the hypothesis, only those are selected that intersect or are contained in a rectangle whose median line is the mi* segment and that have an orientation close to that of mi*.
Fig. 3. Focus at the feature level.
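The feature-level focus can be sketched as a rectangle test in the local frame of the predicted segment mi*. The sketch assumes segments given as pairs of 2D points, an undirected orientation comparison, and a Liang-Barsky clipping test for "intersecting or contained"; the parameters half_width and max_angle and all function names are hypothetical.

```python
import math


def to_local(p, a, b):
    """Coordinates of p in the frame with origin a and x-axis along a->b."""
    length = math.dist(a, b)
    ux, uy = (b[0] - a[0]) / length, (b[1] - a[1]) / length
    dx, dy = p[0] - a[0], p[1] - a[1]
    return (dx * ux + dy * uy, -dx * uy + dy * ux)


def touches_rect(p, q, xmin, xmax, ymin, ymax):
    """Liang-Barsky test: does segment p-q intersect or lie inside the
    axis-aligned rectangle?"""
    t0, t1 = 0.0, 1.0
    deltas = (q[0] - p[0], q[1] - p[1])
    bounds = ((xmin, xmax), (ymin, ymax))
    for d, (lo, hi), c in zip(deltas, bounds, p):
        if d == 0:
            if c < lo or c > hi:
                return False
        else:
            ta, tb = sorted(((lo - c) / d, (hi - c) / d))
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True


def orientation_diff(a, b, p, q):
    """Smallest angle between the support lines of a-b and p-q."""
    d = abs(math.atan2(b[1] - a[1], b[0] - a[0])
            - math.atan2(q[1] - p[1], q[0] - p[0])) % math.pi
    return min(d, math.pi - d)


def focus(predicted, scene_segments, half_width, max_angle):
    """Keep scene segments close in orientation to the predicted segment
    that touch the rectangle having the predicted segment as median line."""
    a, b = predicted
    length = math.dist(a, b)
    return [
        (p, q) for p, q in scene_segments
        if orientation_diff(a, b, p, q) <= max_angle
        and touches_rect(to_local(p, a, b), to_local(q, a, b),
                         0.0, length, -half_width, half_width)
    ]
```

Working in the local frame of mi* turns the oriented rectangle into an axis-aligned one, so a standard clipping test is sufficient.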
A 1:1 pairing is tried first. For edge line features, the likeness factor of two features mi and sij can be calculated. If a 1:1 matching is not possible, an inexact matching is attempted. For each sensor segment associated with the virtual appearance of the model segment in the scene, the distance dm between its midpoint and the support line of the mi* segment is calculated. Only the segments with dm < Dmax and the same gradient orientation are retained. Each of these segments contributes to the matching in proportion to the length of its projection on the mi* segment and in inverse proportion to the distance dm. The likeness factor between the mi* segment and the fragments of sensor segments that satisfy the above conditions is used to update the quality factor and the covering factor. The quality and covering factors are used to heuristically guide, abandon or terminate the search. The 1:1 pairings are used for iterative refinement of the transformation.
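The inexact matching step can be sketched as follows. The exact likeness formula is not given in the text, so the weighting proj / (1 + dm) and the normalization by the length of mi* are assumptions chosen only to respect the stated proportionality; the gradient orientation is modeled as a sign attached to each segment, and overlaps between fragments are not merged. The function names are hypothetical.

```python
import math


def distance_and_projection(m_seg, s_seg):
    """Return (dm, proj): the distance from the midpoint of s_seg to the
    support line of m_seg, and the length of the projection of s_seg on
    m_seg, clipped to the extent of m_seg."""
    (ax, ay), (bx, by) = m_seg
    length = math.dist(m_seg[0], m_seg[1])
    ux, uy = (bx - ax) / length, (by - ay) / length

    def local(p):
        dx, dy = p[0] - ax, p[1] - ay
        return dx * ux + dy * uy, -dx * uy + dy * ux

    (x0, y0), (x1, y1) = local(s_seg[0]), local(s_seg[1])
    dm = abs((y0 + y1) / 2.0)
    lo, hi = max(0.0, min(x0, x1)), min(length, max(x0, x1))
    return dm, max(0.0, hi - lo)


def inexact_match(m, candidates, d_max):
    """m and each candidate are (segment, gradient_sign) pairs.
    Returns (likeness, covering), both in the range [0, 1]."""
    m_seg, m_sign = m
    length = math.dist(m_seg[0], m_seg[1])
    covered = weighted = 0.0
    for seg, sign in candidates:
        if sign != m_sign:
            continue                      # different gradient orientation
        dm, proj = distance_and_projection(m_seg, seg)
        if dm >= d_max or proj == 0.0:
            continue                      # too far, or no overlap with mi*
        covered += proj
        weighted += proj / (1.0 + dm)     # assumed weighting, see lead-in
    return min(1.0, weighted / length), min(1.0, covered / length)


m_star = (((0.0, 0.0), (10.0, 0.0)), +1)
fragments = [
    (((0.0, 0.0), (4.0, 0.0)), +1),
    (((5.0, 1.0), (9.0, 1.0)), +1),
    (((0.0, 0.5), (10.0, 0.5)), -1),
    (((0.0, 5.0), (10.0, 5.0)), +1),
]
likeness, covering = inexact_match(m_star, fragments, d_max=2.0)
```

In this example two fragments are accepted, one is rejected for its gradient orientation and one for its distance, so a broken edge still contributes to the match of the model segment.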
The matching method presented above was tested on a set of industrial objects under the following conditions: poor lighting; noise; touching and partially overlapping objects.
Fig. 4 presents the results of recognizing a polygonal object using the presented method and the HYPER method, respectively. The use of inexact matching improves the recognition accuracy: the quality factor obtained with the presented method is 0.882, compared with 0.689 with the HYPER method.
Fig. 4. Left: polygonal object recognition using inexact matching; Right: polygonal object recognition using the HYPER method.
Figures 5 and 6 (left) show the recognition of some objects in noisy scenes with partial overlapping. In all these situations the quality factor of the recognition is close to the fraction of the object's perimeter that is visible. Fig. 6 (right) shows the recognition of a cube with strong 3D features from a noisy, occluded image.
Fig. 5. Object recognition from a noisy scene with occlusions.
Fig. 6. Left: "ferastrau" object recognition from a noisy scene with occlusions; Right: "cub" object recognition from a noisy scene with occlusions.
1. S. Nedevschi, T. Marita, D. Puiu, "Intermediate Representation in Model Based Recognition Using Straight Line and Ellipsoidal Arc Primitives", Proceedings of the 11th International Conference on Image Analysis and Processing, Palermo, Italy, 26-28 September 2001, pp. 156-161.
2. S. Nedevschi, M. Marton, "Recognition Oriented Modeling Environment for Robust 3D Object Identification and Positioning from a Single Intensity Image", 9th DAAAM International Symposium, Technical University Cluj, Cluj-Napoca, Romania, 22-24 October 1998.
3. S. Nedevschi, L. Todoran, "Inexact Matching in Bidimensional Recognition of 2D and 3D Objects from Intensity Images", Proceedings of the International Conference on Control Systems and Computer Science, Bucuresti, May 1995, Vol. 3, pp. 37-40.
4. S. Nedevschi, "A Robust and Effective Method for Bidimensional Recognition of 2D and 3D Objects from Intensity Images", Intelligent Robots and Systems, Munich, Germany, 12-24 September 1994.
5. S. Nedevschi, C. Goina, "Intermediate Representation for 3D Model Based Recognition from Intensity Image", ACAM, Vol. 2, No. 1, 1993, pp. 26-33.
6. S. Nedevschi, C. Goina, "Edge Extraction and Contour Closing with Subpixel Accuracy", ACAM, No. 2, 1992.