Developers frequently use XML files when building Web applications or Java EE applications. However, maintaining XML files is challenging and time-consuming because the correct usage of XML entities is domain-specific and rarely well documented. Moreover, existing compilers and program analysis tools seldom examine XML files. In this thesis, we developed Xeditor, a novel approach to XML file debugging that extracts XML consistency rules from open-source projects and uses these rules to detect XML bugs. Xeditor has two phases: rule inference and rule application. To infer rules, Xeditor mines XML-based deployment descriptors in open-source projects, extracting XML entity pairs that frequently co-exist in the same files and refer to the same string literals. Xeditor then applies association rule mining to the extracted pairs. For rule application, given a program commit, Xeditor checks whether any updated XML file violates the inferred rules; if so, Xeditor reports the violation and suggests an edit for correction. Our evaluation shows that Xeditor inferred rules with high precision (83%). For injected XML bugs, Xeditor detected rule violations and suggested changes with 74.6% precision and 50% recall. More importantly, Xeditor identified 31 genuinely erroneous XML updates in version history, 17 of which were fixed by developers in later program commits. This observation implies that by using Xeditor, developers would have avoided introducing errors when writing XML files. Finally, we compared Xeditor with a baseline approach that suggests changes based on frequently co-changed entities and found that Xeditor outperformed the baseline in both rule inference and rule application.
XML files are frequently used in Java programming and in building Web applications. However, maintaining XML files is challenging because these files must follow various domain-specific rules, and existing program analysis tools seldom check XML files. In this thesis, we introduce Xeditor, a new approach to XML file debugging that extracts XML consistency rules from open-source projects and uses these rules to detect XML bugs. To extract the rules, Xeditor first examines working XML files and finds all pairs of entities A and B that coexist in one file and have the same value on at least one occasion. Xeditor then computes the probability that B also occurs when A occurs. If the probability is high enough, Xeditor infers a rule that A is associated with B. To apply the rules, Xeditor checks XML files that may contain errors. If a file violates a previously inferred rule, Xeditor reports the violation and suggests a change. Our evaluation shows that Xeditor inferred correct rules with high precision (83%). More importantly, Xeditor identified issues in previous versions of XML files, and many of those issues were fixed by developers in later versions. Therefore, Xeditor can help developers find and fix errors when they write XML files.
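The two phases described above can be illustrated with a minimal sketch. This is not Xeditor's implementation: the function names, the path-based naming of entities, and the 0.8 confidence threshold are all assumptions made for illustration, and the association rule mining step is reduced to a plain confidence ratio (the fraction of files containing A in which some B shares a string literal with A).

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict
from itertools import permutations


def entity_values(xml_text):
    """Map each entity (element or attribute path) to the string literals it holds."""
    values = defaultdict(set)

    def walk(node, path):
        here = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            values[here].add(text)
        for name, val in node.attrib.items():
            values[f"{here}@{name}"].add(val)
        for child in node:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return values


def infer_rules(xml_texts, min_confidence=0.8):
    """Return {(A, B): confidence} for ordered pairs that pass the (assumed) threshold."""
    occurs = Counter()  # number of files in which entity A appears
    paired = Counter()  # number of files in which A and B share a string literal
    for xml_text in xml_texts:
        values = entity_values(xml_text)
        occurs.update(values.keys())
        for a, b in permutations(values, 2):
            if values[a] & values[b]:
                paired[(a, b)] += 1
    rules = {}
    for (a, b), count in paired.items():
        confidence = count / occurs[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def check(xml_text, rules):
    """List rules (A, B) the file violates: A is present, but no B shares a value with it."""
    values = entity_values(xml_text)
    return [
        (a, b)
        for (a, b) in rules
        if a in values and not (values[a] & values.get(b, set()))
    ]
```

Under these assumptions, a set of deployment descriptors in which every `servlet-mapping` names a declared `servlet` would yield a rule linking the two `servlet-name` entities, and `check` would flag a descriptor whose mapping refers to a misspelled servlet name. Suggesting a corrective edit, as the thesis describes, is outside the scope of this sketch.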
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/96481 |
Date | 12 1900 |
Creators | Wen, Chengyuan |
Contributors | Computer Science, Meng, Na, Tilevich, Eli, Servant Cortes, Francisco Javier |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | en_US |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |