Global ETD Search

1	Reverse Engineering Behavioural Models by Filtering out Utilities from Execution Traces Braun, Edna 10 September 2013 (has links) An important issue in software evolution is the time and effort needed to understand existing applications. Reverse engineering software to recover behavioural models is difficult and is complicated due to the lack of a standardized way of extracting and visualizing knowledge. In this thesis, we study a technique for automatically extracting static and dynamic data from software, filtering and analysing the data, and visualizing the behavioural model of a selected feature of a software application. We also investigate the usefulness of the generated diagrams as documentation for the software. We present a literature review of studies that have used static and dynamic data analysis for software comprehension. A set of criteria is created, and each approach, including this thesis’ technique, is compared using those criteria. We propose an approach to simplify lengthy traces by filtering out software components that are too low level to give a high-level picture of the selected feature. We use static information to identify and remove small and simple (or uncomplicated) software components from the trace. We define a utility method as any element of a program designed for the convenience of the designer and implementer and intended to be accessed from multiple places within a certain scope of the program. Utilityhood is defined as the extent to which a particular method can be considered a utility. Utilityhood is calculated using different combinations of selected dynamic and static variables. Methods with high utilityhood values are detected and removed iteratively. By eliminating utilities, we are left with a much smaller trace which is then visualized using the Use Case Map (UCM) notation. UCM is a scenario language used to specify and explain behaviour of complex systems. By doing so, we can identify the algorithm that generates a UCM closest to the mental model of the designers. Although when analysing our results we did not identify an algorithm that was best in all cases, there is a trend in that three of the best four algorithms (out of a total of eight algorithms investigated) used method complexity and method lines of code in their parameters. We also validated the algorithm results by doing a comparison with a list of methods given to us by the creators of the software and doing precision and recall calculations. Seven out of the eight participants agreed or strongly agreed that using UCM diagrams to visualize reduced traces is valid approach, with none who disagreed. Reverse Engineering Behavioural Models Software Maintenance Software Comprehension Trace Compression Trace Visualization Feature Location
2	Reverse Engineering Behavioural Models by Filtering out Utilities from Execution Traces Braun, Edna January 2013 (has links) An important issue in software evolution is the time and effort needed to understand existing applications. Reverse engineering software to recover behavioural models is difficult and is complicated due to the lack of a standardized way of extracting and visualizing knowledge. In this thesis, we study a technique for automatically extracting static and dynamic data from software, filtering and analysing the data, and visualizing the behavioural model of a selected feature of a software application. We also investigate the usefulness of the generated diagrams as documentation for the software. We present a literature review of studies that have used static and dynamic data analysis for software comprehension. A set of criteria is created, and each approach, including this thesis’ technique, is compared using those criteria. We propose an approach to simplify lengthy traces by filtering out software components that are too low level to give a high-level picture of the selected feature. We use static information to identify and remove small and simple (or uncomplicated) software components from the trace. We define a utility method as any element of a program designed for the convenience of the designer and implementer and intended to be accessed from multiple places within a certain scope of the program. Utilityhood is defined as the extent to which a particular method can be considered a utility. Utilityhood is calculated using different combinations of selected dynamic and static variables. Methods with high utilityhood values are detected and removed iteratively. By eliminating utilities, we are left with a much smaller trace which is then visualized using the Use Case Map (UCM) notation. UCM is a scenario language used to specify and explain behaviour of complex systems. By doing so, we can identify the algorithm that generates a UCM closest to the mental model of the designers. Although when analysing our results we did not identify an algorithm that was best in all cases, there is a trend in that three of the best four algorithms (out of a total of eight algorithms investigated) used method complexity and method lines of code in their parameters. We also validated the algorithm results by doing a comparison with a list of methods given to us by the creators of the software and doing precision and recall calculations. Seven out of the eight participants agreed or strongly agreed that using UCM diagrams to visualize reduced traces is valid approach, with none who disagreed. Reverse Engineering Behavioural Models Software Maintenance Software Comprehension Trace Compression Trace Visualization Feature Location
3	CASE STUDY FOR A LIGHTWEIGHT IMPACT ANALYSIS TOOL Lewis, Alice 20 April 2009 (has links) No description available. Computer Science Information Systems
4	Feature Location using Unit Test Coverage in an Agile Development Environment DeLozier, Gregory Steven 04 August 2014 (has links) No description available. Computer Science software feature location unit tests agile extreme programming code coverage test coverage
5	Limits on visual working memory for feature-location bound objects in early development: representational capacity, stability, complexity, and fidelity Applin, Jessica B. 30 September 2022 (has links) Tracking the identity of occluded objects requires binding an object’s features to its location to represent exactly which objects are located where, relying heavily on capacity-limited visual working memory. This dissertation aims to examine the capacity and stability of object working memory, and the complexity and fidelity of object working memory representations, in toddlers and young children. A series of four experiments used a novel task to examine 28- to 40-month-old toddlers’ and 5- to 6-year-old children’s visual working memory recall of specific objects in specific locations. I predicted capacity limits would vary with age, presentation/occlusion type, and complexity, and that older children would be able to monitor these limits successfully. Children observed arrays of featurally-distinct objects that were hidden from view either simultaneously (Chapter 2, Experiment 1 and Chapter 3, Experiments 1 & 2) or sequentially (Chapter 2, Experiment 2) and were asked to recall an object’s location. When objects were hidden simultaneously, toddlers showed a capacity of 3 feature-location bindings (Chapter 2, Experiment 1) and 5- to 6-year-old children showed a capacity of 4 feature-location bindings (Chapter 3, Experiment 1), and both showed capacity development, supporting the hypotheses. When objects were hidden sequentially, toddlers’ performance was impacted by whether they had the easier (set size 2) or harder (set size 3) block first, suggesting the structure of the task may have influenced how children divided attention between maintaining and encoding of representations in working memory. Additionally, in Chapter 3, the number of feature bindings that children had to maintain was varied. Children could remember more single-feature objects than multi-feature objects (limit of 4 vs. 3, respectively), suggesting that binding additional features to a representation taxes cognitive resources, as hypothesized. Finally, the study in Chapter 3 explored children’s ability to monitor the fidelity of their visual working memories by asking them to gauge their confidence by placing bets with tangible, at-risk resources. Children modulated their bets appropriately, betting more after providing correct answers and fewer after incorrect answers, as hypothesized. Together, these data help to inform our understanding of visual working memory for feature-location bound objects across early development. Developmental psychology Feature-location binding Metacognitive monitoring Metamemory Object representation Visual working memory
6	A New Feature Coding Scheme for Video-Content Matching Tasks Qiao, Yingchan January 2017 (has links) This thesis present a new feature coding scheme for video-content matching tasks. The purpose of this feature coding scheme is to compress features under a strict bitrate budget. Features contain two parts of information: the descriptors and the feature locations. We propose a variable level scalar quantizer for descriptors and a variable block size location coding scheme for feature locations. For descriptor coding, the SIFT descriptors are transformed using Karhunen-Loéve Transform (KLT). This K-L transformation matrix is trained using the descriptors extracted from the 25K-MIRFLICKR image dataset. The quantization of descriptors is applied after descriptor transformation. Our proposed descriptor quantizer allocates different bitrates to the elements in the transformed descriptor according to the sequence order. We establish the correlation between the descriptor quantizer distortion and the video matching performance, given a strict bitrate budget. Our feature location coding scheme is built upon the location histogram coding method. Instead of using uniform block size, we use different sizes of blocks to quantize different areas of a video frame. We have achieved nearly 50% reduction in the bitrate allocated for location information compared to the bitrate allocated by the coding schemes that use uniform block size. With this location coding scheme, we achieve almost the same video matching performance as that of the uniform block size coding. By combining the descriptor and location coding schemes, experimental results have shown that the overall feature coding scheme achieves excellent video matching performance. / Thesis / Master of Applied Science (MASc)
7	Révéler le contenu latent du code source : à la découverte des topoi de programme / Unveiling source code latent knowledge : discovering program topoi Ieva, Carlo 23 November 2018 (has links) Le développement de projets open source à grande échelle implique de nombreux développeurs distincts qui contribuent à la création de référentiels de code volumineux. À titre d'exemple, la version de juillet 2017 du noyau Linux (version 4.12), qui représente près de 20 lignes MLOC (lignes de code), a demandé l'effort de 329 développeurs, marquant une croissance de 1 MLOC par rapport à la version précédente. Ces chiffres montrent que, lorsqu'un nouveau développeur souhaite devenir un contributeur, il fait face au problème de la compréhension d'une énorme quantité de code, organisée sous la forme d'un ensemble non classifié de fichiers et de fonctions.Organiser le code de manière plus abstraite, plus proche de l'homme, est une tentative qui a suscité l'intérêt de la communauté du génie logiciel. Malheureusement, il n’existe pas de recette miracle ou bien d’outil connu pouvant apporter une aide concrète dans la gestion de grands bases de code.Nous proposons une approche efficace à ce problème en extrayant automatiquement des topoi de programmes, c'est à dire des listes ordonnées de noms de fonctions associés à un index de mots pertinents. Comment se passe le tri? Notre approche, nommée FEAT, ne considère pas toutes les fonctions comme égales: certaines d'entre elles sont considérées comme une passerelle vers la compréhension de capacités de haut niveau observables d'un programme. Nous appelons ces fonctions spéciales points d’entrée et le critère de tri est basé sur la distance entre les fonctions du programme et les points d’entrée. Notre approche peut être résumée selon ses trois étapes principales : 1) Preprocessing. Le code source, avec ses commentaires, est analysé pour générer, pour chaque unité de code (un langage procédural ou une méthode orientée objet), un document textuel correspondant. En outre, une représentation graphique de la relation appelant-appelé (graphe d'appel) est également créée à cette étape. 2) Clustering. Les unités de code sont regroupées au moyen d’une classification par clustering hiérarchique par agglomération (HAC). 3) Sélection du point d’entrée. Dans le contexte de chaque cluster, les unités de code sont classées et celles placées à des positions plus élevées constitueront un topos de programme.La contribution de cette thèse est triple: 1) FEAT est une nouvelle approche entièrement automatisée pour l'extraction de topoi de programme, basée sur le regroupement d'unités directement à partir du code source. Pour exploiter HAC, nous proposons une distance hybride originale combinant des éléments structurels et sémantiques du code source. HAC requiert la sélection d’une partition parmi toutes celles produites tout au long du processus de regroupement. Notre approche utilise un critère hybride basé sur la graph modularity et la cohérence textuelle pour sélectionner automatiquement le paramètre approprié. 2) Des groupes d’unités de code doivent être analysés pour extraire le programme topoi. Nous définissons un ensemble d'éléments structurels obtenus à partir du code source et les utilisons pour créer une représentation alternative de clusters d'unités de code. L’analyse en composantes principales, qui permet de traiter des données multidimensionnelles, nous permet de mesurer la distance entre les unités de code et le point d’entrée idéal. Cette distance est la base du classement des unités de code présenté aux utilisateurs finaux. 3) Nous avons implémenté FEAT comme une plate-forme d’analyse logicielle polyvalente et réalisé une étude expérimentale sur une base ouverte de 600 projets logiciels. Au cours de l’évaluation, nous avons analysé FEAT sous plusieurs angles: l’étape de mise en grappe, l’efficacité de la découverte de topoi et l’évolutivité de l’approche. / During the development of long lifespan software systems, specification documents can become outdated or can even disappear due to the turnover of software developers. Implementing new software releases or checking whether some user requirements are still valid thus becomes challenging. The only reliable development artifact in this context is source code but understanding source code of large projects is a time- and effort- consuming activity. This challenging problem can be addressed by extracting high-level (observable) capabilities of software systems. By automatically mining the source code and the available source-level documentation, it becomes possible to provide a significant help to the software developer in his/her program understanding task.This thesis proposes a new method and a tool, called FEAT (FEature As Topoi), to address this problem. Our approach automatically extracts program topoi from source code analysis by using a three steps process: First, FEAT creates a model of a software system capturing both structural and semantic elements of the source code, augmented with code-level comments; Second, it creates groups of closely related functions through hierarchical agglomerative clustering; Third, within the context of every cluster, functions are ranked and selected, according to some structural properties, in order to form program topoi.The contributions of the thesis is three-fold:1) The notion of program topoi is introduced and discussed from a theoretical standpoint with respect to other notions used in program understanding ;2) At the core of the clustering method used in FEAT, we propose a new hybrid distance combining both semantic and structural elements automatically extracted from source code and comments. This distance is parametrized and the impact of the parameter is strongly assessed through a deep experimental evaluation ;3) Our tool FEAT has been assessed in collaboration with Software Heritage (SH), a large-scale ambitious initiative whose aim is to collect, preserve and, share all publicly available source code on earth. We performed a large experimental evaluation of FEAT on 600 open source projects of SH, coming from various domains and amounting to more than 25 MLOC (million lines of code).Our results show that FEAT can handle projects of size up to 4,000 functions and several hundreds of files, which opens the door for its large-scale adoption for program understanding. Apprentissage automatique Clustering hiérarchique Programme compréhension Machine Learning Hierarchical Clustering Program Comprehension Feature location Feature Extraction
8	Investigating topic modeling techniques for historical feature location. Schulte, Lukas January 2021 (has links) Software maintenance and the understanding of where in the source code features are implemented are two strongly coupled tasks that make up a large portion of the effort spent on developing applications. The concept of feature location investigated in this thesis can serve as a supporting factor in those tasks as it facilitates the automation of otherwise manual searches for source code artifacts. Challenges in this subject area include the aggregation and composition of a training corpus from historical codebase data for models as well as the integration and optimization of qualified topic modeling techniques. Building up on previous research, this thesis provides a comparison of two different techniques and introduces a toolkit that can be used to reproduce and extend on the results discussed. Specifically, in this thesis a changeset-based approach to feature location is pursued and applied to a large open-source Java project. The project is used to optimize and evaluate the performance of Latent Dirichlet Allocation models and Pachinko Allocation models, as well as to compare the accuracy of the two models with each other. As discussed at the end of the thesis, the results do not indicate a clear favorite between the models. Instead, the outcome of the comparison depends on the metric and viewpoint from which it is assessed. feature location topic modeling changesets latent dirichlet distribution pachinko alloca-tion mining software repositories source code comprehension Software Engineering Programvaruteknik
9	Construction de lignes de produits logiciels par rétro-ingénierie de modèles de caractéristiques à partir de variantes de logiciels : l'approche REVPLINE / Reverse Engineering Feature Models From Software Variants to Build Software Product Lines : RIVEPLINE Approach Al-Msie' Deen, Ra'Fat 24 June 2014 (has links) Les lignes de produits logicielles constituent une approche permettant de construire et de maintenir une famille de produits logiciels similaires mettant en œuvre des principes de réutilisation. Ces principes favorisent la réduction de l'effort de développement et de maintenance, raccourcissent le temps de mise sur le marché et améliorent la qualité globale du logiciel. La migration de produits logiciels similaires vers une ligne de produits demande de comprendre leurs similitudes et leurs différences qui s'expriment sous forme de caractéristiques (features) offertes. Dans cette thèse, nous nous intéressons au problème de la construction d'une ligne de produits à partir du code source de ses produits et de certains artefacts complémentaires comme les diagrammes de cas d'utilisation, quand ils existent. Nous proposons des contributions sur l'une des étapes principales dans cette construction, qui consiste à extraire et à organiser un modèle de caractéristiques (feature model) dans un mode automatisé. La première contribution consiste à extraire des caractéristiques dans le code source de variantes de logiciels écrits dans le paradigme objet. Trois techniques sont mises en œuvre pour parvenir à cet objectif : l'Analyse Formelle de Concepts, l'Indexation Sémantique Latente et l'analyse des dépendances structurelles dans le code. Elles exploitent les parties communes et variables au niveau du code source. La seconde contribution s'attache à documenter une caractéristique extraite par un nom et une description. Elle exploite le code source mais également les diagrammes de cas d'utilisation, qui contiennent, en plus de l'organisation logique des fonctionnalités externes, des descriptions textuelles de ces mêmes fonctionnalités. En plus des techniques précédentes, elle s'appuie sur l'Analyse Relationnelle de Concepts afin de former des groupes d'entités d'après leurs relations. Dans la troisième contribution, nous proposons une approche visant à organiser les caractéristiques, une fois documentées, dans un modèle de caractéristiques. Ce modèle de caractéristiques est un arbre étiqueté par des opérations et muni d'expressions logiques qui met en valeur les caractéristiques obligatoires, les caractéristiques optionnelles, des groupes de caractéristiques (groupes ET, OU, OU exclusif), et des contraintes complémentaires textuelles sous forme d'implication ou d'exclusion mutuelle. Ce modèle est obtenu par analyse d'une structure obtenue par Analyse Formelle de Concepts appliquée à la description des variantes par les caractéristiques. L'approche est validée sur trois cas d'étude principaux : ArgoUML-SPL, Health complaint-SPL et Mobile media. Ces cas d'études sont déjà des lignes de produits constituées. Nous considérons plusieurs produits issus de ces lignes comme s'ils étaient des variantes de logiciels, nous appliquons notre approche, puis nous évaluons son efficacité par comparaison entre les modèles de caractéristiques extraits automatiquement et les modèles de caractéristiques initiaux (conçus par les développeurs des lignes de produits analysées). / The idea of Software Product Line (SPL) approach is to manage a family of similar software products in a reuse-based way. Reuse avoids repetitions, which helps reduce development/maintenance effort, shorten time-to-market and improve overall quality of software. To migrate from existing software product variants into SPL, one has to understand how they are similar and how they differ one from another. Companies often develop a set of software variants that share some features and differ in other ones to meet specific requirements. To exploit existing software variants and build a software product line, a feature model must be built as a first step. To do so, it is necessary to extract mandatory and optional features in addition to associate each feature with its name. Then, it is important to organize the mined and documented features into a feature model. In this context, our thesis proposes three contributions.Thus, we propose, in this dissertation as a first contribution a new approach to mine features from the object-oriented source code of a set of software variants based on Formal Concept Analysis, code dependency and Latent Semantic Indexing. The novelty of our approach is that it exploits commonality and variability across software variants, at source code level, to run Information Retrieval methods in an efficient way. The second contribution consists in documenting the mined feature implementations based on Formal Concept Analysis, Latent Semantic Indexing and Relational Concept Analysis. We propose a complementary approach, which aims to document the mined feature implementations by giving names and descriptions, based on the feature implementations and use-case diagrams of software variants. The novelty of our approach is that it exploits commonality and variability across software variants, at feature implementations and use-cases levels, to run Information Retrieval methods in an efficient way. In the third contribution, we propose an automatic approach to organize the mined documented features into a feature model. Features are organized in a tree which highlights mandatory features, optional features and feature groups (and, or, xor groups). The feature model is completed with requirement and mutual exclusion constraints. We rely on Formal Concept Analysis and software configurations to mine a unique and consistent feature model. To validate our approach, we applied it on three case studies: ArgoUML-SPL, Health complaint-SPL, Mobile media software product variants. The results of this evaluation validate the relevance and the performance of our proposal as most of the features and its constraints were correctly identified. Ingénierie des lignes de produits Variante de logiciel Réingénierie Identification de caractéristique Modèle de caractéristiques Variabilité Software Product Line Engineering Software Product Variants Re-Engineering Feature mining Feature location Feature model
10	Supporting Source Code Comprehension During Software Evolution and Maintenance Alhindawi, Nouh Talal 30 July 2013 (has links) No description available. Computer Science Software Comprehension Software Evolution Software Maintenance Information Retrieval Latent Semantic Indexing Stereotypes Traceability Commits Topic Modeling Corpus Feature Location Concept Location Source Code Quering

Search results