The ability to visualize a solution space can be very beneficial, and it is generally accepted that the objective of visualization is to aid researchers in gathering insight. However, insight cannot be gathered effectively if the source data is misrepresented. This dissertation begins by demonstrating that the adaptive landscape visualization in widespread use frequently misrepresents the neighbourhood structure of genotypic space and, consequently, will mislead users about the manner in which solution space is traversed by the genetic algorithm. Bernhard Riemann, the father of topology, explicitly noted that a measurement of the distance between entities should represent the manner in which one can be brought towards the other. Thus, the commonly used Hamming distance, for example, is not representative of traversals of genotypic space by the genetic algorithm; a representative measure must account for both mutation and recombination. This dissertation separately explores the properties that mutational and recombinational distances should have, and ultimately establishes a measure that is representative of the traversals made by both operators simultaneously.
It follows that these measures can be used to enhance the adaptive landscape, by minimizing the discrepancy between the interpoint distances in genotypic space and the interpoint distances in the two-dimensional representation from which the landscape is extruded. This research also establishes a methodology for evaluating measures defining neighbourhood structures that are purportedly representative of traversals of genotypic space, by comparing them against an empirically generated norm. Through this approach it is conclusively demonstrated that the Hamming distance between genotypes is less representative than the proposed measures, and should not be used to define the neighbourhood structure from which visualizations would be constructed.
While the proposed measures do not distort the data or otherwise mislead the user, they do require a significant computational expense. Fortunately, the choice to use these measures is always made at the discretion of the user, with additional costs incurred only when accuracy and representativity are of paramount importance. These measures will ultimately find further application in population diversity measurement, cluster analysis, and any other task where the representativity of the neighbourhood structure of the genotypic space is vital.
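As an illustration of the discrepancy the abstract refers to, the following minimal sketch compares interpoint distances in genotypic space with interpoint distances in a two-dimensional layout. It is not the dissertation's method: the function names `hamming` and `stress`, the normalization used, and the sample genotypes and coordinates are all assumptions made for this example. The Hamming distance appears only as the baseline the abstract argues against; a mutational or recombinational distance could be passed in through the hypothetical `distance` parameter.

```python
from itertools import combinations
from math import dist, sqrt


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length genotypes differ."""
    if len(a) != len(b):
        raise ValueError("genotypes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def stress(genotypes, layout, distance=hamming) -> float:
    """Normalized discrepancy between genotypic and planar distances.

    Assumption: a stress-style measure, the square root of the summed
    squared differences divided by the summed squared genotypic distances.
    A value of 0.0 means the layout reproduces every distance exactly.
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(genotypes)), 2):
        d_geno = distance(genotypes[i], genotypes[j])
        d_plane = dist(layout[i], layout[j])  # Euclidean distance in the plane
        numerator += (d_geno - d_plane) ** 2
        denominator += d_geno ** 2
    return sqrt(numerator / denominator) if denominator else 0.0


# Hypothetical data: four genotypes placed on a line, one unit apart.
genotypes = ["000", "001", "011", "111"]
layout = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# A layout that collapses the last two genotypes onto the same point.
distorted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 0.0)]

print(stress(genotypes, layout))
print(stress(genotypes, distorted))
```

In this sketch the first layout preserves every pairwise Hamming distance, so its stress is zero, while the distorted layout yields a positive value. A landscape extruded from a layout with lower stress, under a distance that reflects how the genetic algorithm actually moves, would misrepresent the neighbourhood structure less.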
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OGU.10214/3565 |
Date | 04 May 2012 |
Creators | Collier, Robert |
Contributors | Wineberg, Mark |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | http://creativecommons.org/licenses/by-nc-sa/2.5/ca/ |