Paper | 29 May 2013
A cognitive approach to vision for a mobile robot
D. Paul Benjamin, Christopher Funk, Damian Lyons
Abstract
We describe a cognitive vision system for a mobile robot. This system works in a manner similar to the human vision system, using saccadic, vergence and pursuit movements to extract information from visual input. At each fixation, the system builds a 3D model of a small region, combining information about distance, shape, texture and motion. These 3D models are embedded within an overall 3D model of the robot's environment. This approach turns the computer vision problem into a search problem, with the goal of constructing a physically realistic model of the entire environment. At each step, the vision system selects a point in the visual input to focus on. The distance, shape, texture and motion information are computed in a small region around that point and used to build a mesh in a 3D virtual world. Background knowledge is used to extend this structure as appropriate; e.g., if a patch of wall is seen, it is hypothesized to be part of a large wall and the entire wall is created in the virtual world, or if part of an object is recognized, the whole object's mesh is retrieved from the library of objects and placed into the virtual world. The input from the real camera and the input from the virtual camera are then compared using local Gaussians, creating an error mask that indicates the main differences between them. This mask is used to select the next points to focus on. This approach permits us to use computationally expensive algorithms on small localities, thus generating very accurate models. It is also task-oriented, permitting the robot to use its knowledge about its task and goals to decide which parts of the environment need to be examined. The software components of this architecture include PhysX for the 3D virtual world, OpenCV and the Point Cloud Library for visual processing, and the Soar cognitive architecture, which controls the perceptual processing and robot planning. The hardware is a custom-built pan-tilt stereo color camera. We describe experiments using both static and moving objects.
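To make the fixation-selection step concrete, the sketch below illustrates one plausible reading of the error-mask idea in Python with OpenCV. It is not the authors' implementation: the function name next_fixation, the single Gaussian scale, the grayscale same-size frames, and the choice of the mask's peak as the next fixation point are all assumptions introduced here for illustration.

```python
# Minimal sketch of the error-mask comparison described in the abstract.
# Assumptions (not from the paper): same-size grayscale frames, one Gaussian
# scale, and the next fixation is taken at the peak of the error mask.
import cv2
import numpy as np

def next_fixation(real_frame: np.ndarray, virtual_frame: np.ndarray,
                  ksize: int = 15, sigma: float = 3.0):
    """Compare the real and virtual camera images with local Gaussians and
    return (x, y) of the largest remaining difference plus the error mask."""
    real = cv2.GaussianBlur(real_frame.astype(np.float32), (ksize, ksize), sigma)
    virt = cv2.GaussianBlur(virtual_frame.astype(np.float32), (ksize, ksize), sigma)
    error_mask = np.abs(real - virt)              # per-pixel discrepancy
    _, _, _, max_loc = cv2.minMaxLoc(error_mask)  # location of the largest error
    return max_loc, error_mask
```

In this reading, a small error mask means the virtual world already predicts the camera input well, so the robot can attend elsewhere; a large localized error marks a region whose 3D model needs to be refined at the next fixation.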
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
D. Paul Benjamin, Christopher Funk, and Damian Lyons "A cognitive approach to vision for a mobile robot", Proc. SPIE 8756, Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications 2013, 87560I (29 May 2013); https://doi.org/10.1117/12.2018856
CITATIONS
Cited by 5 scholarly publications.
KEYWORDS
Visual process modeling
3D modeling
Cameras
Data modeling
Visualization
Cognitive modeling
Data processing