
Qualitative Distances and Qualitative Description of Images for Indoor Scene Description and Recognition in Robotics

The automatic extraction of knowledge from the world by a robotic system, in the way human beings interpret their environment through their senses, is still an unsolved task in Artificial Intelligence. A robotic agent is in contact with the world through its sensors and other electronic components, which obtain and process mainly numerical information. Sonar, infrared and laser sensors obtain distance information. Webcams obtain digital images that are represented internally as matrices of red, green and blue (RGB) colour coordinate values. All these numerical values obtained from the environment need later interpretation in order to provide the knowledge the robotic agent requires to carry out a task.
Similarly, light of specific wavelengths and amplitudes is captured by the cone cells of the human eye, which also yields stimuli without meaning. However, the information that human beings can describe and remember from what they see is expressed in words, that is, qualitatively.
The exact process that takes place between our eyes perceiving light and our brain interpreting it is largely unknown. However, it is a fact of human cognition that people go beyond the purely perceptual experience to classify things as members of categories and attach linguistic labels to them.
As the information provided by the electronic components of a robotic agent is numerical, the first approaches in the literature to interpreting this information followed a mathematical trend. This thesis addresses the problem from the other side: its main aim is to process these numerical data in order to obtain qualitative information, as human beings do.
The research work done in this thesis tries to narrow the gap between the acquisition of low-level information by robot sensors and the need to obtain high-level or qualitative information for enhancing human-machine communication and for applying logical reasoning processes based on concepts. Moreover, qualitative concepts can be given meaning by relating them to other concepts. They can be used for reasoning by applying the qualitative models developed over the last twenty years for describing and interpreting metrical and mathematical concepts such as orientation, distance, velocity and acceleration. They can also be understood by human users, both in writing and when read aloud.
The first contributions presented are the definition of a method for obtaining fuzzy distance patterns (which include qualitative distances such as ‘near’, ‘far’, ‘very far’ and so on) from the data obtained by any kind of distance sensor incorporated in a mobile robot, and the definition of a factor to measure the dissimilarity between those fuzzy patterns. Both have been applied to the integration of the distances obtained by the sonar and laser distance sensors incorporated in a Pioneer 2 dx mobile robot and, as a result, special obstacles such as ‘glass window’, ‘mirror’ and so on have been detected. Moreover, the fuzzy distance patterns have also been defuzzified in order to obtain a smooth robot speed, and used to classify orientation reference systems as ‘open’ (defining an open space to be explored) or ‘closed’.
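As a rough illustration of the ideas named above (fuzzy distance patterns, a dissimilarity factor between them, and defuzzification), the following Python sketch may help. It is not the method defined in the thesis: the label breakpoints in `LABELS`, the trapezoidal membership shape, the half-sum-of-differences dissimilarity and the centre-weighted defuzzification are all assumptions chosen for brevity.

```python
# Hypothetical qualitative labels as trapezoids (a, b, c, d) in metres.
# These breakpoints are illustrative only; they are not taken from the thesis.
LABELS = {
    "near": (0.0, 0.0, 1.0, 2.0),
    "far": (1.0, 2.0, 3.0, 4.0),
    "very far": (3.0, 4.0, 8.0, 8.0),
}


def membership(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify(distance):
    """Turn one numeric distance reading into a fuzzy pattern (label -> degree)."""
    return {name: membership(distance, *shape) for name, shape in LABELS.items()}


def dissimilarity(p, q):
    """Assumed dissimilarity: half the summed absolute difference of degrees.

    For patterns whose degrees sum to 1 the result lies in [0, 1].
    """
    return 0.5 * sum(abs(p[name] - q[name]) for name in LABELS)


def defuzzify(pattern):
    """Assumed defuzzification: average of label centres weighted by degree."""
    total = sum(pattern.values())
    if total == 0:
        raise ValueError("pattern has no active label")
    centres = {name: (b + c) / 2 for name, (_, b, c, _) in LABELS.items()}
    return sum(pattern[name] * centres[name] for name in LABELS) / total


if __name__ == "__main__":
    sonar = fuzzify(0.5)   # hypothetical sonar reading
    laser = fuzzify(3.5)   # hypothetical laser reading in the same direction
    print(sonar, laser, dissimilarity(sonar, laser))
    print(defuzzify(fuzzify(1.5)))
```

In this sketch, a high dissimilarity between the sonar and laser patterns for the same direction is the kind of disagreement that could flag a special obstacle, and the defuzzified value is a single number that could feed a speed controller. How the thesis actually defines and uses these quantities is described in the full text.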
The second contribution presented is the definition of a model for qualitative image description (QID), which applies the newly defined models for qualitative shape and colour description together with the topology model by Egenhofer and Al-Taha [1992] and the orientation models by Hernández [1991] and Freksa [1992]. This model can qualitatively describe any kind of digital image and is independent of the image segmentation method used. The QID model has been tested in two robotics scenarios: (i) the description of digital images captured by the camera of a Pioneer 2 dx mobile robot and (ii) the description of digital images of tile mosaics taken by an industrial camera located on a platform used by a robot arm to assemble tile mosaics.
In order to provide a formal and explicit meaning to the qualitative descriptions of the images generated, a Description Logic (DL) based ontology has been designed and is presented as the third contribution. Our approach can automatically process any random image and obtain a set of DL axioms that describe it visually and spatially. The objects included in the images are then classified according to the ontology schema using a DL reasoner. Tests have been carried out using digital images captured by a webcam incorporated in a Pioneer 2 dx mobile robot. The images correspond to the corridors of a building at University Jaume I, and the objects within them have been classified into ‘walls’, ‘floor’, ‘office doors’ and ‘fire extinguishers’ under different illumination conditions and from different observer viewpoints.
The final contribution is the definition of similarity measures between qualitative descriptions of shape, colour, topology and orientation, and the integration of those measures into a general similarity measure between two qualitative descriptions of images. These similarity measures have been applied to: (i) extract objects with similar shapes from the MPEG7 CE Shape-1 library; (ii) assemble tile mosaics by qualitative shape and colour similarity matching; (iii) compare images of tile compositions; and (iv) compare images of natural landmarks in a mobile robot world for their recognition.
The contributions made in this thesis are only a small step forward in the direction of enhancing robot knowledge acquisition from the world. The thesis is also written with the aim of inspiring others in their research, so that bigger contributions can be achieved in the future that improve the quality of life of our society.

Identifier: oai:union.ndltd.org:TDX_UJI/oai:www.tdx.cat:10803/52897
Date: 28 November 2011
Creators: Falomir Llansola, Zoe
Contributors: Escrig Monferrer, M.Teresa; Freksa, Christian; Universitat Jaume I. Departament d'Enginyeria i Ciència dels Computadors; Universität Bremen
Publisher: Universitat Jaume I
Source Sets: Universitat Jaume I
Language: English
Detected Language: English
Type: info:eu-repo/semantics/doctoralThesis, info:eu-repo/semantics/publishedVersion
Format: 209 p., application/pdf
Source: TDX (Tesis Doctorals en Xarxa)
Rights: info:eu-repo/semantics/openAccess. NOTICE. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It may be used for consultation or personal study, as well as in research and teaching activities or materials, under the terms established in Art. 32 of the Revised Text of the Intellectual Property Law (RDL 1/1996). Any other use requires the prior and express authorization of the author. In any case, when its contents are used, the full name of the author and the title of the doctoral thesis must be clearly indicated. Its reproduction or other forms of exploitation for profit are not authorized, nor is its public communication from a site other than the TDX service. Presenting its content in a window or frame external to TDX (framing) is also not authorized. This reservation of rights applies both to the contents of the thesis and to its abstracts and indexes.
