
Autonomous learning of object models on mobile robots using visual cues

by Xiang Li




Institution: Texas Tech University
Year: 2013
Keywords: Visual Learning; Object Recognition; Wheeled Robots
Record ID: 2015511
Full text PDF: http://hdl.handle.net/2346/58650


Abstract

Mobile robots are increasingly being used in real-world application domains such as disaster rescue, surveillance, health care, and navigation. These domains are typically characterized by partial observability, non-deterministic action outcomes, and unforeseen changes. A major challenge to the widespread deployment of robots in such domains is learning models of domain objects automatically and efficiently, and adapting the learned models in response to changes. Although sophisticated algorithms have been developed for modeling and recognizing objects using different visual cues, existing algorithms are typically computationally expensive and require considerable prior knowledge or many labeled training samples of the desired objects to learn object models. Enabling robots to learn object models and recognize objects with minimal human supervision thus remains an open problem.

These challenges are offset by some observations. First, many objects have distinctive characteristics, locations, and motion patterns, although these parameters may not be known in advance and may change over time. Second, images encode information about objects in the form of many different visual cues. Third, any specific task performed by a robot typically requires accurate models of only a small number of domain objects.

This dissertation describes algorithms that exploit these observations to achieve the following objectives:

1. Learn object models from a small number (3-8) of images. Robots consider objects that move to be interesting, and efficiently identify the corresponding image regions using motion cues (illustrated in the sketch following the abstract).
2. Exploit the complementary strengths of appearance-based and contextual visual cues to efficiently learn representative models of these objects from the relevant image regions.
3. Use the learned object models in generative models of information fusion and in energy minimization algorithms for reliable and efficient recognition of stationary and moving objects in novel scenes with minimal human supervision.

These objectives promote incremental learning, enabling robots to acquire and use sensor inputs and human feedback based on need and availability. The object models consist of spatial arrangements of gradient features, graph-based models of neighborhoods of gradient features, parts-based models of image segments, color distributions, and local context models. Although the visual cues underlying the individual components of the object model have been used in other algorithms, our representation of these cues fully exploits their complementary strengths, resulting in reliable and efficient learning and recognition in indoor and outdoor domains. All algorithms are evaluated on wheeled robots in indoor and outdoor domains and on images drawn from benchmark datasets.
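As a rough illustration of the first objective, the sketch below finds candidate object regions from motion cues (simple frame differencing between consecutive frames) and summarizes each region with a color histogram, one of the cue types listed above. This is a minimal sketch assuming OpenCV, not the dissertation's actual algorithm; the function names, thresholds, and histogram parameters are illustrative assumptions.

    import cv2

    # Hypothetical sketch: treat image regions that changed between two
    # frames as candidate objects, then attach a simple color model to each.

    def moving_regions(prev_gray, curr_gray, min_area=500, thresh=25):
        """Return bounding boxes of regions that changed between two
        grayscale frames (illustrative stand-in for richer motion cues)."""
        diff = cv2.absdiff(prev_gray, curr_gray)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        mask = cv2.dilate(mask, None, iterations=2)  # merge nearby fragments
        # OpenCV 4 returns (contours, hierarchy)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [cv2.boundingRect(c) for c in contours
                if cv2.contourArea(c) >= min_area]

    def color_model(bgr_frame, box, bins=16):
        """Normalized hue-saturation histogram for one candidate region,
        a simple example of the color-distribution cue."""
        x, y, w, h = box
        patch = bgr_frame[y:y + h, x:x + w]
        hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [bins, bins],
                            [0, 180, 0, 256])
        return cv2.normalize(hist, hist).flatten()

In use, each pair of consecutive frames would yield a few candidate boxes from moving_regions(), and the histograms from color_model() for the same object across a handful of frames (the 3-8 images mentioned above) could be averaged into one component of a multi-cue object model. The dissertation combines several such complementary cues (gradient features, parts-based segment models, local context) rather than relying on color alone.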