Several factors affect the performance of a 3D scene reconstruction system. Among the most important are the choice of feature detectors and descriptors, the number of visual features and the correctness of matches between them, and reliable tracking of the correspondences along selected keyframes.
In this work, we propose a fast method for generating a 3D map from a time sequence of RGB-D images by selecting the minimum number of keypoints and keyframes that still ensures correct feature correspondences and, as a result, a high-quality 3D map. The performance of the proposed 3D scene reconstruction algorithm is evaluated by computer simulation using real indoor environment data.
It is well known that the accuracy and resolution of depth data decrease as the distance from an RGB-D sensor to a 3D object of interest increases, affecting the performance of 3D scene reconstruction systems based on an ICP algorithm. In this paper, to improve the accuracy of the 3D map obtained by aligning multiple point clouds, we propose: first, to split the depth data into sub-clouds with similar resolution; then, to select in each sub-cloud a minimum number of keypoints and align the sub-clouds separately with an ICP algorithm; finally, to merge all sub-clouds into a dense 3D map. Computer simulation results show the performance of the proposed 3D scene reconstruction algorithm using real indoor environment data.
In order to design a tracking algorithm invariant to pose, occlusion, clutter, and illumination changes of a scene, non-overlapping signal models for input scenes and objects of interest, together with the Synthetic Discriminant Function approach, are exploited. A set of correlation filters optimal with respect to the peak-to-output energy criterion is derived for different target versions in each frame. A prediction method is utilized to locate the target patch in the coming frame. The algorithm's performance is tested in terms of recognition and localization errors in real scenarios and compared with that of state-of-the-art tracking algorithms.
With the development of RGB-D sensors, a new alternative for the generation of 3D maps has appeared. First, features extracted from color and depth images are used to localize them in a 3D scene. Next, the Iterative Closest Point (ICP) algorithm is used to align RGB-D frames. As a result, a new frame is added to the dense 3D model. However, the spatial distribution and resolution of depth data affect the performance of 3D scene reconstruction systems based on ICP. In this paper we propose to divide the depth data into sub-clouds with similar resolution, to align them separately, and to unify them into the entire point cloud. The presented computer simulation results show an improvement in the accuracy of 3D scene reconstruction using real indoor environment data.
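The split-then-align scheme described in the two abstracts above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the helper names are assumptions, the depth bands are hypothetical, and the alignment shown is a single least-squares (Kabsch) step given known point correspondences, whereas a full ICP would re-estimate correspondences iteratively.

```python
import numpy as np

def split_by_depth(points, edges):
    """Partition an Nx3 point cloud into depth bands [edges[i], edges[i+1]),
    so that each sub-cloud has roughly similar sensor resolution."""
    z = points[:, 2]
    return [points[(z >= lo) & (z < hi)] for lo, hi in zip(edges[:-1], edges[1:])]

def rigid_align(src, dst):
    """Kabsch algorithm: least-squares rotation R and translation t such that
    src @ R.T + t ~= dst. Assumes src[i] corresponds to dst[i], i.e. one
    alignment step after correspondence matching (the core of an ICP iteration)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)            # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

Each sub-cloud returned by `split_by_depth` would be aligned with its counterpart via `rigid_align` (iterated inside an ICP loop), and the transformed sub-clouds merged into the dense map.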
In this work, we propose a new algorithm for matching incoming video frames in a simultaneous localization and mapping system based on an RGB-D camera. Essentially, this system estimates the trajectory of camera motion in real time and generates a 3D map of the indoor environment. The proposed algorithm is based on composite correlation filters with adjustable training sets that depend on the appearance of the indoor environment as well as the relative position and perspective from the camera to environment components. The algorithm is scale-invariant because it utilizes the depth information from the RGB-D camera. The performance of the proposed algorithm is evaluated in terms of accuracy, robustness, and processing time and compared with that of common feature-based matching algorithms based on the SURF descriptor.
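The depth-based scale invariance mentioned above follows from the pinhole camera model, in which apparent size is inversely proportional to depth. The sketch below is an illustrative assumption rather than the paper's formulation, and the focal length used in the example is a typical value for consumer RGB-D sensors, not a figure from the paper.

```python
def expected_pixel_size(real_size_m, depth_m, focal_px):
    """Pinhole projection: on-image size (in pixels) of an object of known
    physical size at a given depth. Knowing the depth from the RGB-D sensor
    lets a matcher rescale its template once instead of searching over scales."""
    return focal_px * real_size_m / depth_m
```

For example, a 0.2 m wide surface patch seen at 2 m with a 525 px focal length projects to about 52.5 px, and to half that width at 4 m, so the filter's training template can be resized accordingly.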
This paper considers the face identification task in video sequences where the individual's face presents variations such as expression, pose, scale, shadow/lighting, and occlusion. The principles of Synthetic Discriminant Functions (SDF) and k-law filters are used to design an adaptive unconstrained correlation filter (AUNCF). We developed a face tracking algorithm that, together with a face recognition algorithm, was carefully integrated into a video-based face identification method. First, a manually selected face in the first video frame is identified. Then, in order to build an initial correlation filter, the selected face is distorted to generate a training set. Finally, the face tracking task is performed using the initial correlation filter, which is updated throughout the video sequence. The efficiency of the proposed method is shown by experiments on video sequences presenting different facial variations. The proposed method correctly identifies and tracks the face under observation in the tested video sequences.
Correlation filters have become an important tool for detection, localization, recognition, and tracking of objects in digital media. Interest in correlation filters has grown thanks to advances in processing speed that enable the implementation of digital correlation filters in real time. This paper compares the performance of three correlation filters on the task of object recognition, specifically human faces with variations in facial expression, pose, rotation, partial occlusion, illumination, and additive white Gaussian noise. The analyzed filters are the k-law, MACE, and OTSDF filters. Simulation results show that the k-law nonlinear composite filter has the best performance in terms of accuracy and false acceptance rate. Finally, we conclude that a preprocessing algorithm significantly improves the performance of correlation filters for recognizing objects with variations in illumination and noise.
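As an illustration of how a k-law nonlinearity operates, the sketch below applies the basic mechanism: spectral magnitudes of both the scene and the reference are raised to a power 0 < k < 1 while their phases are preserved, which sharpens the correlation peak relative to linear correlation. This is a minimal single-reference sketch under assumed parameters, not the trained composite filters evaluated in the paper.

```python
import numpy as np

def klaw_correlate(scene, ref, k=0.3):
    """Nonlinear k-law correlation: raise the magnitude of each spectrum to
    the power k (0 < k < 1), keep the phase, then correlate in the Fourier
    domain. The peak of the output locates the target in the scene."""
    S = np.fft.fft2(scene)
    R = np.fft.fft2(ref, s=scene.shape)      # zero-pad reference to scene size
    Sk = np.abs(S) ** k * np.exp(1j * np.angle(S))
    Rk = np.abs(R) ** k * np.exp(1j * np.angle(R))
    return np.real(np.fft.ifft2(Sk * np.conj(Rk)))
```

The location of the maximum of the returned correlation plane gives the estimated target position; lower values of k emphasize phase (edge/contour) information more strongly.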
Automatic estimation of human activities is a widely studied topic. However, the process becomes difficult when we want to estimate activities from a video stream, because human activities are dynamic and complex. Furthermore, we have to take into account the amount of information that images provide, which makes modelling and estimating activities hard work. In this paper we propose a method for activity estimation based on object behavior. Objects are located in a delimited observation area and their handling is recorded with a video camera. Activity estimation can then be done automatically by analyzing the video sequences. The proposed method is called "signature recognition" because it considers a space-time signature of the behaviour of objects that are used in particular activities (e.g. patients' care in a healthcare environment for elderly people with restricted mobility). A pulse is produced when an object appears in or disappears from the observation area, that is, when there is a change from zero to one or vice versa. These changes are produced by identifying the objects with a bank of nonlinear correlation filters. Each object is processed independently and produces its own pulses; hence we are able to recognize several objects with different patterns at the same time. The method is applied to estimate three healthcare-related activities of elderly people with restricted mobility.
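The pulse mechanism described above can be sketched as follows, assuming a per-frame binary presence signal for each object has already been produced by thresholding that object's correlation-filter output (the function name and the thresholding stage are illustrative assumptions):

```python
import numpy as np

def presence_pulses(presence):
    """Given a per-frame binary presence signal for one object (1 = detected
    by its correlation filter, 0 = absent), return the frame indices where
    the object appears (0 -> 1) and disappears (1 -> 0)."""
    p = np.asarray(presence, dtype=int)
    d = np.diff(p)
    appear = np.flatnonzero(d == 1) + 1      # +1: transition lands on the new frame
    disappear = np.flatnonzero(d == -1) + 1
    return appear, disappear
```

Running this independently for each object in the bank yields the per-object pulse trains whose space-time pattern forms the activity signature.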
During cognitive stimulation sessions in which elders with cognitive decline perform stimulation activities, such as solving puzzles, we observed that they require constant supervision and support from their caregivers, and that caregivers must be able to monitor the stimulation activity of more than one patient at a time. In this paper, aiming to support the caregiver, we developed a vision-based system using a Phase-SDF filter that generates a composite reference image, which is correlated with a captured wooden-puzzle image. The output correlation value makes it possible to automatically verify progress on the puzzle-solving task and to assess its completeness and correctness.
This work presents the development and use of vectorial signature filters, obtained by applying properties of scale and the Fourier transform, for image recognition. The filters were applied to different input scenes consisting of the 26 letters of the alphabet. Each letter is a 256 × 256 pixel image with a black background and a centered white Arial letter. Each image was rotated through 360 degrees in increments of 1° and scaled from 70% to 130% in increments of 0.5%. In order to obtain a new invariant digital correlation system, we computed two one-dimensional vectors after applying different mathematical transformations to the target as well as to the input scene. To recognize a target, the signatures were compared by calculating the Euclidean distance between the target and the input scene; confidence levels were then obtained. The results demonstrate that this system discriminates well between letters.
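The distance-based decision step above can be sketched as follows. This is a minimal illustration that assumes the one-dimensional signatures have already been extracted; `classify_by_signature` is a hypothetical helper name, not from the paper.

```python
import numpy as np

def classify_by_signature(scene_sig, target_sigs):
    """Compare a 1-D signature extracted from the input scene against a bank
    of target signatures; the class with the smallest Euclidean distance wins.
    Returns the winning label and all distances (usable for confidence levels)."""
    dists = {name: float(np.linalg.norm(scene_sig - sig))
             for name, sig in target_sigs.items()}
    best = min(dists, key=dists.get)
    return best, dists
```

A confidence level can then be derived from the gap between the smallest and second-smallest distances.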
One of the main problems in visual image processing is incomplete information owing to occlusion of objects by other objects. Since correlation filters mainly use contour information of objects to carry out pattern recognition, conventional correlation filters without training often perform poorly when recognizing partially occluded objects. Adaptive correlation filters based on synthetic discriminant functions are proposed for the recognition of partially occluded objects embedded in a cluttered background. The designed correlation filters are adaptive to an input test scene, which is constructed from fragments of the target, false objects, and background to be rejected. These filters are able to suppress sidelobes of the given background as well as false objects. The performance of the adaptive filters in real scenes is compared with that of various correlation filters in terms of discrimination capability and robustness to noise.
New adaptive correlation filters based on a conventional synthetic discriminant function (SDF) are proposed for reliable recognition of an object in a cluttered background. Information about the object to be recognized, false objects, and the background to be rejected is utilized in an iterative training procedure to design a correlation filter with a given value of discrimination capability. Computer simulation results obtained with the proposed adaptive filter on test scenes are discussed and compared with those of various correlation filters in terms of discrimination capability, tolerance to input additive noise, which is always present in image sensors, and tolerance to small geometric image distortions.
One of the main problems in visual signal processing is incomplete information owing to occlusion of objects by other objects. It is well known that correlation filters mainly use contour information of objects to carry out pattern recognition. However, in real applications object contours often disappear. In these cases conventional correlation filters without training yield a poor performance. In this paper, two novel methods based on trained correlation filters for recognition of partially occluded objects are proposed. The methods significantly improve the discrimination capability of conventional correlation filters. The first method trains a correlation filter with both a target and objects to be rejected. In the second proposal, two different correlation filters are designed; they deal independently with contour and texture information to improve recognition of partially occluded objects. Computer simulation results for various test images are provided and discussed.