Return to search

Assessing program code through static structural similarity

Learning to write software requires much practice and frequent assessment. Consequently, the use of computers to assist in the assessment of computer programs has been important in supporting large classes at universities. The main approaches to the problem are dynamic analysis (testing student programs for expected output) and static analysis (direct analysis of the program code). The former is very sensitive to all kinds of errors in student programs, while the latter has traditionally only been used to assess quality, and not correctness. This research focusses on the application of static analysis, particularly structural similarity, to marking student programs. Existing traditional measures of similarity are limiting in that they are usually only effective on tree structures. In this regard they do not easily support dependencies in program code. Contemporary measures of structural similarity, such as similarity flooding, usually rely on an internal normalisation of scores. The effect is that the scores only have relative meaning, and cannot be interpreted in isolation, ie. they are not meaningful for assessment. The SimRank measure is shown to have the same problem, but not because of normalisation. The problem with the SimRank measure arises from the fact that its scores depend on all possible mappings between the children of vertices being compared. The main contribution of this research is a novel graph similarity measure, the Weighted Assignment Similarity measure. It is related to SimRank, but derives propagation scores from only the locally optimal mapping between child vertices. The resulting similarity scores may be regarded as the percentage of mutual coverage between graphs. The measure is proven to converge for all directed acyclic graphs, and an efficient implementation is outlined for this case. Attributes on graph vertices and edges are often used to capture domain specific information which is not structural in nature. It has been suggested that these should influence the similarity propagation, but no clear method for doing this has been reported. The second important contribution of this research is a general method for incorporating these local attribute similarities into the larger similarity propagation method. An example of attributes in program graphs are identifier names. The choice of identifiers in programs is arbitrary as they are purely symbolic. A problem facing any comparison between programs is that they are unlikely to use the same set of identifiers. This problem indicates that a mapping between the identifier sets is required. The third contribution of this research is a method for applying the structural similarity measure in a two step process to find an optimal identifier mapping. This approach is both novel and valuable as it cleverly reuses the similarity measure as an existing resource. In general, programming assignments allow a large variety of solutions. Assessing student programs through structural similarity is only feasible if the diversity in the solution space can be addressed. This study narrows program diversity through a set of semantic preserving program transformations that convert programs into a normal form. The application of the Weighted Assignment Similarity measure to marking student programs is investigated, and strong correlations are found with the human marker. It is shown that the most accurate assessment requires that programs not only be compared with a set of good solutions, but rather a mixed set of programs of varying levels of correctness. This research represents the first documented successful application of structural similarity to the marking of student programs.

Identiferoai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:nmmu/vital:10478
Date January 2007
CreatorsNaude, Kevin Alexander
PublisherNelson Mandela Metropolitan University, Faculty of Science
Source SetsSouth African National ETD Portal
LanguageEnglish
Detected LanguageEnglish
TypeThesis, Masters, MSc
Formatviii, 114 leaves ; 30 cm, pdf
RightsNelson Mandela Metropolitan University

Page generated in 0.0027 seconds