This thesis develops a self-adaptive architecture for image understanding that addresses certain kinds of lack of robustness common in image understanding programs. The architecture supports building image understanding programs that can manipulate their own semantics and thereby adjust their structure in response to changes in the environment that might cause static image understanding systems to fail.

The general approach has been to explore the ideas of self-adaptive software and to implement an architectural framework that addresses a class of problems, common in image understanding, that we term "interpretation problems". Self-adaptive software is a relatively new idea, and this thesis represents one of the first implementations of it. The general idea is that, to be robust to changing environmental conditions, programs should be "aware" of their relationship with the environment and be able to restructure themselves at runtime in order to "track" changes in the environment.

The implementation takes the form of a multi-layered reflective interpreter that manipulates and runs simple agents. The interpreter framework uses Monte Carlo sampling as a mechanism for estimating the most likely solutions, uses Minimum Description Length (MDL) as a central coordinating device, and includes a theorem-prover-based compiler to restructure the program when necessary.

To test the architectural ideas developed in the thesis, the interpretation of aerial images was chosen as a test domain. Much of the research described in the thesis addresses issues in that problem domain. The task of the program is to segment, label, and parse aerial images so as to produce an image description similar to those produced by a human expert. An image corpus is developed and used as the source of domain knowledge. The first processing stage of the program segments the aerial images into regions similar to those found in the annotated corpus.
To accomplish this, a new segmentation algorithm, which we call semantic segmentation, was developed; it not only uses MDL as a principle to drive the low-level segmentation but also allows higher-level semantics to influence the segmentation. In our usage of the algorithm, those semantics take the form of labeling and parsing the resulting segmentation. The second stage labels the regions and assembles them into a parse tree. To do this we develop a 2D statistical parser. Grammar rules are induced from the corpus, and an MDL parser finds approximations to the most probable parse of the regions of the segmented image.
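As a rough illustration of how Monte Carlo sampling and MDL can work together, the sketch below samples candidate region labelings and keeps the one with the shortest description length. It is a toy under stated assumptions, not the thesis's interpreter: the label set, the probability tables `LABEL_PRIOR` and `LIKELIHOOD`, the single colour feature per region, and the function names are all hypothetical stand-ins for statistics that would be induced from an annotated corpus.

```python
import math
import random

# Hypothetical label priors, standing in for corpus-derived statistics.
LABEL_PRIOR = {"field": 0.5, "road": 0.3, "building": 0.2}

# Hypothetical likelihoods P(feature | label), one coarse feature per region.
LIKELIHOOD = {
    "field": {"green": 0.8, "grey": 0.2},
    "road": {"green": 0.1, "grey": 0.9},
    "building": {"green": 0.2, "grey": 0.8},
}


def description_length(labels, features):
    """Bits to encode the labels plus bits to encode the features given them."""
    bits = 0.0
    for label, feature in zip(labels, features):
        bits += -math.log2(LABEL_PRIOR[label])           # cost of the model
        bits += -math.log2(LIKELIHOOD[label][feature])   # cost of the data
    return bits


def sample_labeling(n_regions, rng):
    """Draw one candidate labeling, region by region, from the label prior."""
    names = list(LABEL_PRIOR)
    weights = [LABEL_PRIOR[name] for name in names]
    return tuple(rng.choices(names, weights=weights, k=n_regions))


def best_by_mdl(features, n_samples=2000, seed=0):
    """Monte Carlo search: keep the sampled labeling with the fewest bits."""
    rng = random.Random(seed)
    best, best_bits = None, math.inf
    for _ in range(n_samples):
        candidate = sample_labeling(len(features), rng)
        bits = description_length(candidate, features)
        if bits < best_bits:
            best, best_bits = candidate, bits
    return best, best_bits


if __name__ == "__main__":
    labels, bits = best_by_mdl(["green", "grey", "green"])
    print(labels, round(bits, 3))
```

Minimizing the description length here is equivalent to maximizing the joint probability of labels and features, which is why the shortest encoding can serve as a proxy for the most probable interpretation. A real system would sample structured parses rather than independent labels.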
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:393372 |
Date | January 2001 |
Creators | Robertson, Paul |
Contributors | Brady, Michael |
Publisher | University of Oxford |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://ora.ox.ac.uk/objects/uuid:01f16b87-63be-4b55-9e52-14738fefed57 |