Global ETD Search

1	Using Machine Learning Techniques to Improve Static Code Analysis Tools Usefulness
Alikhashashneh, Enas A. 08 1900 Indiana University-Purdue University Indianapolis (IUPUI)
This dissertation proposes an approach that uses Machine Learning (ML) techniques to reduce the cost of manually inspecting the large number of false positive warnings reported by Static Code Analysis (SCA) tools. The proposed approach neither assumes a particular SCA tool nor depends on the programming language used to write the target source code or application. To reduce the number of false positive warnings, we first evaluated a number of SCA tools in terms of software engineering metrics using a synthetic source code named the Juliet test suite. From this evaluation, we concluded that the SCA tools report many false positive warnings that require manual inspection. We then generated a number of datasets from source code that forced the SCA tool to generate true positive, false positive, or false negative warnings. These datasets were used to train four ML classifiers to classify the warnings collected from the synthetic source code. The experimental results showed that the classifier built using the Random Forests (RF) technique outperformed the other classifiers. Lastly, using this classifier and an instance-based transfer learning technique, we ranked a number of warnings aggregated from various open-source software projects. The experimental results show that the proposed approach to reducing the cost of manually inspecting false positive warnings outperformed the random ranking algorithm and was highly correlated with the ranked list generated by the optimal ranking algorithm.
Keywords: Static code analysis; Source code metrics; Machine learning; False positives; Reduction

2	Monitoramento de métricas de código-fonte em projetos de software livre / Source code metrics tracking on free and open source projects
Meirelles, Paulo Roberto Miranda 20 May 2013
This Ph.D. dissertation presents an approach to tracking source code metrics. We studied the distributions and associations of source code metrics and discuss their causal relationships and their practical and managerial implications for monitoring. Our studies assessed the distributions and correlations of metric values in 38 free software projects, chosen from among those with the most active contributors in their repositories. To do so, we collected and analyzed the values of each metric for more than 344,872 classes and modules of the evaluated projects. Additionally, to show the usefulness of tracking metrics, we extended and adapted the causal model of free software project attractiveness to include source code metrics. The resulting technical attractiveness model indicates a statistical relationship between source code metric values and the number of downloads, contributors, and commits in the project repositories. For that, we conducted empirical studies with 8,450 free software projects. From a practical point of view, we also contributed an innovative set of tools for automating the evaluation of free software projects, with an emphasis on the study and selection of metrics; these tools allow source code to be analyzed according to the quality perceptions of free software communities. Among the main contributions of this thesis is a detailed analysis, covering behavior, values, and case studies, of 15 source code metrics, which advances the related literature by increasing the number of metrics evaluated and by proposing an approach that aims to reduce the contradictions among metric analyses.
Keywords: experimental software engineering; free and open source software; source code metrics

3	Monitoramento de métricas de código-fonte em projetos de software livre / Source code metrics tracking on free and open source projects
Paulo Roberto Miranda Meirelles 20 May 2013
This Ph.D. dissertation presents an approach to tracking source code metrics. We studied the distributions and associations of source code metrics and discuss their causal relationships and their practical and managerial implications for monitoring. Our studies assessed the distributions and correlations of metric values in 38 free software projects, chosen from among those with the most active contributors in their repositories. To do so, we collected and analyzed the values of each metric for more than 344,872 classes and modules of the evaluated projects. Additionally, to show the usefulness of tracking metrics, we extended and adapted the causal model of free software project attractiveness to include source code metrics. The resulting technical attractiveness model indicates a statistical relationship between source code metric values and the number of downloads, contributors, and commits in the project repositories. For that, we conducted empirical studies with 8,450 free software projects. From a practical point of view, we also contributed an innovative set of tools for automating the evaluation of free software projects, with an emphasis on the study and selection of metrics; these tools allow source code to be analyzed according to the quality perceptions of free software communities. Among the main contributions of this thesis is a detailed analysis, covering behavior, values, and case studies, of 15 source code metrics, which advances the related literature by increasing the number of metrics evaluated and by proposing an approach that aims to reduce the contradictions among metric analyses.
Keywords: experimental software engineering; free and open source software; source code metrics

4	USING MACHINE LEARNING TECHNIQUES TO IMPROVE STATIC CODE ANALYSIS TOOLS USEFULNESS
Enas Ahmad Alikhashashneh (7013450) 16 October 2019
This dissertation proposes an approach that uses Machine Learning (ML) techniques to reduce the cost of manually inspecting the large number of false positive warnings reported by Static Code Analysis (SCA) tools. The proposed approach neither assumes a particular SCA tool nor depends on the programming language used to write the target source code or application. To reduce the number of false positive warnings, we first evaluated a number of SCA tools in terms of software engineering metrics using a synthetic source code named the Juliet test suite. From this evaluation, we concluded that the SCA tools report many false positive warnings that require manual inspection. We then generated a number of datasets from source code that forced the SCA tool to generate true positive, false positive, or false negative warnings. These datasets were used to train four ML classifiers to classify the warnings collected from the synthetic source code. The experimental results showed that the classifier built using the Random Forests (RF) technique outperformed the other classifiers. Lastly, using this classifier and an instance-based transfer learning technique, we ranked a number of warnings aggregated from various open-source software projects. The experimental results show that the proposed approach to reducing the cost of manually inspecting false positive warnings outperformed the random ranking algorithm and was highly correlated with the ranked list generated by the optimal ranking algorithm.
Keywords: Software Engineering
