101

USING MACHINE LEARNING TECHNIQUES TO IMPROVE STATIC CODE ANALYSIS TOOLS USEFULNESS

Enas Ahmad Alikhashashneh (7013450) 16 October 2019 (has links)
This dissertation proposes an approach to reduce the cost of manually inspecting the large number of false positive warnings reported by Static Code Analysis (SCA) tools, using Machine Learning (ML) techniques. The proposed approach neither assumes a particular SCA tool nor depends on the specific programming language used to write the target source code or application. To reduce the number of false positive warnings, we first evaluated a number of SCA tools in terms of software engineering metrics using a synthetic source code suite, the Juliet test suite. From this evaluation, we concluded that the SCA tools report many false positive warnings that require manual inspection. We then generated a number of datasets from source code constructed to force the SCA tool to generate either true positive, false positive, or false negative warnings. These datasets were used to train four ML classifiers to classify the warnings collected from the synthetic source code. From the experimental results, we observed that the classifier built using the Random Forests (RF) technique outperformed the rest of the classifiers. Lastly, using this classifier and an instance-based transfer learning technique, we ranked warnings aggregated from various open-source software projects. The experimental results show that the proposed approach to reducing the cost of manually inspecting false positive warnings outperformed a random ranking algorithm and was highly correlated with the ranked list generated by the optimal ranking algorithm.
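A minimal sketch of the classification step described in this abstract, assuming a scikit-learn Random Forest and a hypothetical feature set and CSV layout (the dissertation's actual features and pipeline are not given here):

```python
# Minimal sketch: classifying SCA warnings with a Random Forest.
# The feature names and CSV layout below are assumptions for illustration,
# not the dissertation's actual dataset.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical dataset: one row per warning, with code metrics as features
# and a label indicating whether manual inspection confirmed the warning.
data = pd.read_csv("sca_warnings.csv")               # assumed file
features = ["loc", "cyclomatic_complexity", "nesting_depth", "num_params"]
X = data[features]
y = data["is_true_positive"]                          # 1 = true positive, 0 = false positive

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))

# Rank unlabeled warnings by the predicted probability of being a true positive,
# so reviewers inspect the most likely real defects first.
ranked = data.assign(score=clf.predict_proba(X)[:, 1]).sort_values("score", ascending=False)
print(ranked.head())
```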
102

Enabling Timing Analysis of Complex Embedded Software Systems

Kraft, Johan January 2010 (has links)
Cars, trains, trucks, telecom networks and industrial robots are examples of products relying on complex embedded software systems, running on embedded computers. Such systems may consist of millions of lines of program code developed by hundreds of engineers over many years, often decades. Over the long life-cycle of such systems, the main part of the product development costs is typically not the initial development, but the software maintenance, i.e., improvements and corrections of defects, over the years. Of the maintenance costs, a major cost is the verification of the system after changes have been applied, which often requires a huge amount of testing. However, today's techniques are not sufficient, as defects are often found post-release, by the customers. This area is therefore of high relevance for industry. Complex embedded systems often control machinery where timing is crucial for accuracy and safety. Such systems therefore have important requirements on timing, such as maximum response times. However, when maintaining complex embedded software systems, it is difficult to predict how changes may impact the system's run-time behavior and timing, e.g., response times. Analytical and formal methods for timing analysis exist, but are often hard to apply in practice on complex embedded systems, for several reasons. As a result, the industrial practice in deciding the suitability of a proposed change, with respect to its run-time impact, is to rely on the subjective judgment of experienced developers and architects. This is a risky and inefficient trial-and-error approach, which may waste large amounts of person-hours on implementing unsuitable software designs with potential timing or performance problems. Such problems generally cannot be detected until late stages of testing, when the updated software system can be tested on the system level, under realistic conditions. Even then, it is easy to miss such problems. If products are released containing software with latent timing errors, this may cause huge costs, such as car recalls, or even accidents. Even when such problems are found by testing, they necessitate design changes late in the development project, which cause delays and increase costs. This thesis presents an approach for impact analysis with respect to run-time behavior, such as timing and performance, for complex embedded systems. The impact analysis is performed through optimizing simulation, where the simulation models are automatically generated from the system implementation. This approach allows for predicting the consequences of proposed designs, for new or modified features, by prototyping the change in the simulation model on a high level of abstraction, e.g., by increasing the execution time for a particular task. Thereby, designs leading to timing, performance, or resource usage problems can be identified early, before implementation, and late redesigns are thereby avoided, which improves development efficiency and predictability, as well as software quality. The contributions presented in this thesis are within four areas related to simulation-based analysis of complex embedded systems: (1) simulation and simulation optimization techniques, (2) automated extraction of simulation models from source code, (3) methods for validation of such simulation models and (4) run-time recording techniques for model extraction, impact analysis and model validation purposes.
Several tools have been developed during this work, of which two are being commercialized in the spin-off company Percepio AB. Note that the Katana approach, in area (2), is the subject of a recent patent application (patent pending). / PROGRESS
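A toy illustration of the kind of impact question the abstract describes — how does increasing one task's execution time change response times? The sketch below is a minimal fixed-priority, preemptive, discrete-time simulation written for this listing; it is not the model-extraction or simulation-optimization tooling developed in the thesis:

```python
# Minimal sketch: exploring the response-time impact of a proposed change by
# simulation of a toy fixed-priority, preemptive, discrete-time task model.
# Task parameters are invented; it assumes each job finishes before its next release.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    period: int       # release period (time units)
    exec_time: int    # execution time per job (time units)
    priority: int     # lower number = higher priority

def simulate(tasks, horizon):
    """Step one time unit at a time; return the worst observed response time per task."""
    remaining = {t.name: 0 for t in tasks}   # work left in the current job
    release = {t.name: 0 for t in tasks}     # release time of the current job
    worst = {t.name: 0 for t in tasks}
    for now in range(horizon):
        for t in tasks:                       # release new jobs periodically
            if now % t.period == 0:
                remaining[t.name] = t.exec_time
                release[t.name] = now
        ready = [t for t in tasks if remaining[t.name] > 0]
        if ready:
            t = min(ready, key=lambda t: t.priority)   # highest-priority ready task runs
            remaining[t.name] -= 1
            if remaining[t.name] == 0:                 # job finished this time unit
                worst[t.name] = max(worst[t.name], now + 1 - release[t.name])
    return worst

baseline = [Task("sensor", 10, 2, 1), Task("control", 20, 5, 2)]
changed  = [Task("sensor", 10, 4, 1), Task("control", 20, 5, 2)]  # proposed change: sensor takes longer

print("baseline:", simulate(baseline, 1000))
print("changed: ", simulate(changed, 1000))
```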
103

The Open Source Revolution: Transforming the Software Industry with Help from the Government

Stoltz, Mitchell L. 30 April 1999 (has links)
A new method for making software is stealthily gaining ground in the computer industry, offering a promise of better, cheaper software and the empowerment of the user. The open source movement could revolutionize the software industry...if it succeeds. Open source means software that you are allowed to copy, modify, and give to friends. Source code, the lists of instructions which tell computers how to run, is readily available, allowing you to look inside the workings of a program and change it to suit your needs. A group of programmers, companies, users, and activists have gathered in support of this empowering technology, seeking to persuade businesses and users that open source is the way to go. However, open source faces stiff challenges. The economic basis for the software industry is to charge users by the copy when they buy software. Copying and modification are illegal. The industry and its customers are so mired in this worldview that the idea of giving out a program's "recipe," along with a license to change or copy it at will, seems preposterous. Powerful players in the software industry, such as Microsoft, see open source as a threat to their bottom line, and have devoted their energies to discrediting and marginalizing the movement. Beginning from the assumption that cheap, reliable software that empowers the user is a good thing, this thesis looks at the claims made by advocates about the benefits of open source. I explore how the advocates make their case to the business world, the public, and government. I also look at ways in which the government could help bring about an open source revolution, using the policy tools of procurement, research funding, standards enforcement, and antitrust law. I conclude that programmers and public interest lobbyists must join forces to carry this revolution forward, and that the time for action is now, while Microsoft is on trial.
104

Power Analysis and Low Power Scheduling Techniques for Intelligent Memory System

Cheng, Lien-Fu 27 July 2001 (has links)
Power consumption is gradually becoming an important issue in the design of computing systems. Most research on low power issues has focused on semiconductor techniques or hardware architecture designs, and less on software optimization techniques. This paper presents a new scheduling methodology at the source code level for the Intelligent Memory System, which reduces energy consumption by means of code compilation techniques. The scheduling kernel provides two options for users, performance-oriented low power scheduling and energy-oriented low power scheduling, to balance the goals of high performance and low power. The experimental results are also presented and discussed.
105

Correlation-based communication in wireless multimedia sensor networks

Dai, Rui 19 August 2011 (has links)
Wireless multimedia sensor networks (WMSNs) are networks of interconnected devices that allow retrieving video and audio streams, still images, and scalar data from the environment. In a densely deployed WMSN, there exists correlation among the observations of camera sensors with overlapped coverage areas, which introduces substantial data redundancy in the network. In this dissertation, efficient communication schemes are designed for WMSNs by leveraging the correlation of visual information observed by camera sensors. First, a spatial correlation model is developed to estimate the correlation of visual information and the joint entropy of multiple correlated camera sensors. The compression performance of correlated visual information is then studied. An entropy-based divergence measure is proposed to predict the compression efficiency of performing joint coding on the images from correlated cameras. Based on the predicted compression efficiency, a clustered coding technique is proposed that maximizes the overall compression gain of the visual information gathered in WMSNs. The correlation of visual information is then utilized to design a network scheduling scheme to maximize the lifetime of WMSNs. Furthermore, as many WMSN applications require QoS support, a correlation-aware QoS routing algorithm is introduced that can efficiently deliver visual information under QoS constraints. Evaluation results show that, by utilizing the correlation of visual information in the communication process, the energy efficiency and networking performance of WMSNs could be improved significantly.
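As a rough, hedged illustration of the redundancy that correlation-aware joint coding exploits (a generic information-theoretic toy, not the spatial correlation model or divergence measure proposed in the dissertation):

```python
# Minimal sketch: measuring redundancy between two correlated camera observations.
# The joint distribution below is made up for illustration; the dissertation derives
# correlation from camera geometry, which this toy example does not model.
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution over 4 quantized intensity levels for cameras X and Y.
joint = np.array([
    [0.20, 0.05, 0.00, 0.00],
    [0.05, 0.20, 0.05, 0.00],
    [0.00, 0.05, 0.20, 0.05],
    [0.00, 0.00, 0.05, 0.10],
])

h_x  = entropy(joint.sum(axis=1))   # H(X)
h_y  = entropy(joint.sum(axis=0))   # H(Y)
h_xy = entropy(joint.flatten())     # H(X, Y)

# Redundancy (mutual information): bits per sample saved by coding X and Y jointly
# instead of independently.
print(f"H(X)={h_x:.3f}  H(Y)={h_y:.3f}  H(X,Y)={h_xy:.3f}  redundancy={h_x + h_y - h_xy:.3f}")
```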
106

Vizitų registravimo sistemos projektavimas ir testavimas / Design and testing of call reporting system

Prelgauskas, Justinas 10 July 2008 (has links)
Šiame dokumente aprašytas darbas susideda ir trijų pagrindinių dalių. Pirmojoje, inžinerinėje dalyje atlikome vizitų registravimo sistemos (toliau - „PharmaCODE“) analizę ir projektavimą. Čia pateikėme esmines verslo aplinkos, reikalavimų ir konkurentų analizės, o taipogi ir projektavimo detales. Pateikėme pagrindinius architektūrinius sprendimus. Antrojoje darbo dalyje aprašėme sistemos kokybės tyrimus, naudojant statinės išeities kodų analizės įrankius ir metodus. Šioje dalyje aprašėme kokius įrankius naudojome ir pateikėme pagrindinius kodo analizės rezultatus. Trečiojoje darbo dalyje gilinomės į išeities tekstų analizės metodus ir įrankius, sukūrėme patobulintą analizės taisyklę. Mūsų taisyklės pagalba pavyko aptikti daugiau potencialių SQL-įterpinių saugumo spragų nei aptiko jos pirmtakė – Microsoft projektuota kodo analizės taisyklė. / This work consists of three major parts. The first, engineering, part is the analysis and design of a call reporting system (codename “PharmaCODE”). We provide the main details of the business analysis and design decisions. The second part is about testing and ensuring system quality, mainly by means of static source code analysis tools and methods. In this part we describe the tools used and provide the main results of the source code analysis. Finally, in the third part of this work we go deeper into static source code analysis and improve one of the analysis rules. These days, when there are plenty of evolving web-based applications, security is gaining more and more importance. Most of those systems have, and depend on, back-end databases. However, web-based applications are vulnerable to SQL-injection attacks. In this paper we present a technique for solving this problem using secure-coding guidelines and the .NET Framework's static code analysis methods for enforcing those guidelines. This approach lets developers discover vulnerabilities in their code early in the development process. We provide research on and a realization of an improved code analysis rule, which can automatically discover SQL-injection vulnerabilities in MSIL code.
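For illustration only, a naive Python sketch of the kind of rule such analyzers apply — flagging SQL commands built by string concatenation or interpolation. This is not the MSIL-level .NET Framework rule developed in the thesis; it only shows the general idea:

```python
# Minimal sketch: a naive static check that flags SQL queries built by string
# concatenation or f-strings, the pattern behind most SQL injection defects.
# Illustrative toy in Python, not the thesis's .NET/MSIL code analysis rule.
import ast

SQL_KEYWORDS = ("select ", "insert ", "update ", "delete ")

def looks_like_sql(node):
    return isinstance(node, ast.Constant) and isinstance(node.value, str) \
        and node.value.lower().lstrip().startswith(SQL_KEYWORDS)

def check(source, filename="<input>"):
    findings = []
    for node in ast.walk(ast.parse(source, filename)):
        # "SELECT ..." + user_input  -> dynamic SQL via concatenation
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add) \
                and (looks_like_sql(node.left) or looks_like_sql(node.right)):
            findings.append((node.lineno, "SQL built by string concatenation"))
        # f"SELECT ... {user_input}"  -> dynamic SQL via interpolation
        if isinstance(node, ast.JoinedStr):
            literal = "".join(v.value for v in node.values
                              if isinstance(v, ast.Constant) and isinstance(v.value, str))
            if literal.lower().lstrip().startswith(SQL_KEYWORDS):
                findings.append((node.lineno, "SQL built by f-string interpolation"))
    return findings

if __name__ == "__main__":
    code = 'q = "SELECT * FROM users WHERE name = \'" + user_name + "\'"'
    for line, msg in check(code):
        print(f"line {line}: {msg}")
```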
107

Un formalisme pour la traçabilité des transformations / A formalism for the traceability of transformations

Lemoine, Mathieu 12 1900 (has links)
Dans le développement logiciel en industrie, les documents de spécification jouent un rôle important pour la communication entre les analystes et les développeurs. Cependant, avec le temps, les changements de personnel et les échéances toujours plus courtes, ces documents sont souvent obsolètes ou incohérents avec l'état effectif du système, i.e., son code source. Pourtant, il est nécessaire que les composants du système logiciel soient conservés à jour et cohérents avec leurs documents de spécifications pour faciliter leur développement et maintenance et, ainsi, pour en réduire les coûts. Maintenir la cohérence entre spécification et code source nécessite de pouvoir représenter les changements sur les uns et les autres et de pouvoir appliquer ces changements de manière cohérente et automatique. Nous proposons une solution permettant de décrire une représentation d'un logiciel ainsi qu'un formalisme mathématique permettant de décrire et de manipuler l'évolution des composants de ces représentations. Le formalisme est basé sur les triplets de Hoare pour représenter les transformations et sur la théorie des groupes et des homomorphismes de groupes pour manipuler ces transformations et permettre leur application sur les différentes représentations du système. Nous illustrons notre formalisme sur deux représentations d'un système logiciel : PADL, une représentation architecturale de haut niveau (semblable à UML), et JCT, un arbre de syntaxe abstrait basé sur Java. Nous définissons également des transformations représentant l'évolution de ces représentations et la transposition permettant de reporter les transformations d'une représentation sur l'autre. Enfin, nous avons développé et décrivons brièvement une implémentation de notre illustration, un plugiciel pour l'IDE Eclipse détectant les transformations effectuées sur le code par les développeurs et un générateur de code pour l'intégration de nouvelles représentations dans l'implémentation. / When developing software systems in industry, system specifications are heavily used in communication among analysts and developers. However, system evolution, employee turnover and shorter deadlines lead those documents either not to be up-to-date or not to be consistent with the actual system source code. Yet, having up-to-date documents would greatly help analysts and developers and reduce development and maintenance costs. Therefore, we need to keep those documents up-to-date and consistent. We propose a novel mathematical formalism to describe and manipulate the evolution of these documents. The mathematical formalism is based on Hoare triples to represent the transformations, and on group theory and group homomorphisms to manipulate these transformations and apply them on different representations. We illustrate our formalism using two representations of the same system: PADL, an abstract design specification (similar to UML), and JCT, an Abstract Syntax Tree for Java. We also define transformations describing their evolution, and the transposition of transformations from one representation to another. Finally, we provide an implementation of our illustration: a plugin for the Eclipse IDE detecting source code transformations made by a developer, and a source code generator for integrating new representations in the implementation.
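A hedged sketch of the shape of such a formalism, in notation invented for this listing rather than taken from the thesis: transformations are Hoare triples, they compose into groups, and transposition between representations is a group homomorphism:

```latex
% Illustrative notation only (an editorial assumption); the thesis's exact definitions may differ.
% A transformation t on a representation is specified as a Hoare triple:
\[
  \{P\}\; t \;\{Q\}
\]
% Transformations of each representation form a group under composition
% (identity = no change, inverse = undoing a change). Transposition from the
% PADL representation to the JCT representation is then a group homomorphism:
\[
  \phi : G_{\mathrm{PADL}} \to G_{\mathrm{JCT}}, \qquad
  \phi(t_2 \circ t_1) = \phi(t_2) \circ \phi(t_1),
\]
% so a sequence of changes recorded on one representation can be replayed
% coherently on the other.
```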
108

Pride: uma ferramenta de detecção de similaridade em código-fonte / Pride: a tool for detecting similarity in source code

Almeida, Diogo Cabral de 31 March 2015 (has links)
Plagiarism among students of introductory programming courses has been increasing over time. The ease of information exchange brought by the Internet may be one of the factors responsible for this increase. In many cases, students try to disguise the plagiarism by making some modifications to the source code. However, some disguising techniques are extremely complex and may not be detected with the naked eye. In this dissertation, detection techniques were analyzed and, on this basis, a system able to detect plagiarism in source code was developed. This system is based on representing the code as an abstract syntax tree and on the Karp-Rabin Greedy String Tiling algorithm. The system was evaluated using a source code base from students of programming courses. An oracle-based comparison was performed to compare the system with others. The oracle was created from the course teacher's manual analysis, in which each pair of source codes was marked as plagiarized or not. To represent the results, ROC curves and confusion matrices were used. The same procedure was applied to existing systems, allowing direct comparison of the results. More specifically, we used the value of the area under the curve and the minimum distance to the point (0, 1) of the ROC space, since these figures represent classification performance. The analysis of the results shows that, for the sample used, the developed system obtained the largest area under the curve and also the shortest distance to the point (0, 1) of the ROC space. However, we find that the choice of a similarity detection tool for source code will depend largely on the conservative or liberal profile of the teacher. / O plágio entre alunos de disciplinas introdutórias de programação vem aumentando ao longo do tempo. A facilidade na troca de informações trazida pela Internet pode ser um dos fatores responsáveis por esse aumento. Em muitos casos, os alunos tentam disfarçar o plágio fazendo algumas modificações no código-fonte. Porém, algumas técnicas de disfarce são extremamente complexas e podem não ser detectadas a olho nu. Neste trabalho, foram analisadas as técnicas de detecção e, com base nelas, foi desenvolvido um sistema capaz de detectar plágio em código-fonte. Este sistema é baseado na representação do código como uma árvore sintática abstrata e no algoritmo Karp-Rabin Greedy String Tiling. O sistema foi avaliado utilizando uma base de códigos-fonte de alunos de disciplinas programação. Foi realizada uma comparação baseada em oráculo para comparar o sistema com os demais. O oráculo foi criado a partir da análise do docente da disciplina, onde foi marcado se havia plágio ou não em cada par de código-fonte. Para representar os resultados, foram utilizadas curvas ROC e matrizes de confusão. O mesmo procedimento foi aplicado aos sistemas já existentes, o que permitiu a comparação direta entre os resultados. Mais especificamente, utilizamos o valor da área sob a curva e a distância mínima para o ponto (0, 1) do espaço ROC, uma vez que esses valores representam o desempenho de classificação. A análise dos resultados indica que, para a amostra utilizada, o sistema desenvolvido obteve o maior valor da área sob a curva e também a menor distância para o ponto (0, 1) do espaço ROC. No entanto, concluímos que a escolha de uma ferramenta de detecção de similaridade em código-fonte dependerá bastante do perfil conservador ou liberal do docente.
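A simplified sketch of the similarity computation named in the abstract: greedy string tiling over token sequences, without the Karp-Rabin hashing speed-up and without the abstract syntax tree representation the system actually uses; the similarity score is twice the tiled length over the combined token count:

```python
# Minimal sketch: token-based similarity with a simplified Greedy String Tiling.
# For clarity it omits the Karp-Rabin hashing optimization and works on plain
# token lists instead of an abstract syntax tree.
def greedy_string_tiling(a, b, min_match=3):
    """Return the total length of tiles (maximal non-overlapping common token runs)."""
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    total = 0
    while True:
        best, best_len = [], min_match - 1      # matches of the best length found this round
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > best_len:
                    best, best_len = [(i, j, k)], k
                elif k == best_len and k >= min_match:
                    best.append((i, j, k))
        if best_len < min_match:
            break
        for i, j, k in best:
            # Skip matches that now overlap a tile marked earlier in this round.
            if any(marked_a[i + x] or marked_b[j + x] for x in range(k)):
                continue
            for x in range(k):
                marked_a[i + x] = marked_b[j + x] = True
            total += k
    return total

def similarity(tokens_a, tokens_b, min_match=3):
    tiled = greedy_string_tiling(tokens_a, tokens_b, min_match)
    return 2 * tiled / (len(tokens_a) + len(tokens_b))

# Toy usage with whitespace tokenization; identifiers were renamed in the "copy".
src1 = "int soma ( int a , int b ) { return a + b ; }".split()
src2 = "int add ( int x , int y ) { return x + y ; }".split()
print(f"similarity = {similarity(src1, src2, min_match=2):.2f}")
```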
109

SISTEMA MULTIUSUÁRIO PARA PUBLICAÇÃO DE INFORMAÇÕES GEORREFERENCIADAS COM BASE EM FERRAMENTAS DE CÓDIGO FONTE ABERTO / MULTIUSER SYSTEM TO PUBLISH GEOREFERENCED INFORMATION BASED ON OPEN SOURCE CODE TOOLS

Schetinger Filho, Henrique 26 April 2005 (has links)
Geographic information systems (GIS) are tools with great potential to support the management of organizations. The fundamental requirements for deploying and using such systems are computational resources (hardware and software) and human resources (qualified personnel). Due to the high cost of some systems and the involvement of many professionals in the geoprocessing area, the deployment of these systems can become costly. A possible solution to reduce costs is the development of geographic information systems that can be managed remotely, through a web interface and using free software and open source code tools. The objective of this work is the development of a multiuser system to publish georeferenced information in support of management tasks. The system is based on: a web visualization interface built with free software and open source code tools; tools for data administration through the web; platform independence on the client side through a web browser; a database management system; tools to generate web maps over the Internet; and means to create new queries through the web. / Os sistemas de informações geográficas (SIG) são ferramentas de grande potencial para apoiar a gestão de organizações. A implantação e uso desses sistemas requerem, como requisitos fundamentais, recursos computacionais (hardware e software) e recursos humanos (pessoal qualificado). Devido ao custo elevado de alguns sistemas e envolvimento de muitos profissionais na área de geoprocessamento, a implantação destes sistemas pode se tornar onerosa. Uma possível solução para reduzir custos é o desenvolvimento de sistemas de informação geográfica capazes de serem administrados remotamente, através de interface web e da utilização de ferramentas de software livre e de código fonte aberto. Este trabalho tem por objetivo o desenvolvimento de um sistema multiusuário para publicação de informações georreferenciadas para apoio à tomada de decisões. O sistema criado está baseado em: interface de visualização para web, construído a partir de ferramentas de software livre e código fonte aberto; mecanismo para criação de novas consultas através da web; mecanismo de administração dos dados através da web; independência de plataforma no cliente com o uso de navegador web; sistema de gerenciamento de banco de dados e ferramenta para geração de mapas através da Internet (webmaps).
110

Investigação sobre uso de vocabulário de código fonte para identificação de especialistas. / Research on the use of source code vocabulary to identify specialists.

SANTOS, Katyusco de Farias. 08 May 2018 (has links)
Identificadores e comentários de um código fonte constituem o vocabulário de software. Pesquisas apontam vocabulários como uma fonte valorosa de informação sobre o projeto. Para entender a natureza e o potencial dos vocabulários, desenvolvemos um ferramental capaz de extraí-los a partir de código fonte. Explorando os dados estatisticamente, identificamos duas propriedades de vocabulários: tamanho, expresso como função de potência de LOC (Lines-Of-Code); e a repetição de seus termos, que se ajusta a uma distribuição log-normal. Vocabulários, bem como suas propriedades e operações foram formalizadas baseadas no conceito de multisets. O ferramental de extração e a formalização viabilizaram cooperações científicas sobre a utilidade de vocabulário sem atividades de manutenção. Esse conhecimento acumulado revelou que vocabulário pouco foi explorado como insumo à modelagem de conhecimento de código. Desenvolvemos então uma abordagem para identificar especialistas de código cujo conhecimento é definido pela similaridade existente entre vocabulários das entidades e dos desenvolvedores. Comparamos a precisão e cobertura da nossa abordagem com de duas outras: baseada em commits e baseada em percentual de LOC modificadas. Os resultados apontam que para indicar um único especialista, top-1, a nossa abordagem tem uma precisão menor, entre 29.9% e 10% que as abordagens de baseline. Já para indicar mais de um desenvolvedor especialista, até top-3, a nossa abordagem tem uma acurácia melhor de até 18.7% em relação as de baseline. Identificamos também que o conhecimento definido por similaridade quando combinado com um modelo baseado em autoria aumenta a capacidade de identificar especialistas, no R2 do modelo, em mais de 4 pontos percentuais. Concluímos que além de poder ser utilizado de forma isolada para modelar conhecimento de código e assim identificar especialistas, o vocabulário pode ser um componente adicional a modelos de conhecimento baseados em autoria e propriedade, já que capturam aspectos diferentes dos existentes nesse modelos. / Identifiers and comments in source code constitute the software vocabulary. Research points to vocabularies as a valuable source of information about the project. To understand the nature and potential of vocabularies, we developed a tool that extracts them from source code. Exploring the data statistically, we identified two vocabulary properties: vocabulary size, which is a power function of LOC (Lines-Of-Code), and the repetition of vocabulary terms, which fits a log-normal distribution. Vocabularies, as well as their properties and operations, were formalized based on the concept of multisets. The extraction tool and the formalization enabled scientific cooperation on the use of vocabulary in maintenance activities. This accumulated knowledge showed that vocabulary had been little explored as an input to code knowledge modeling. We then developed an approach to identify code experts in which knowledge is defined by the similarity between entity and developer vocabularies. 
We compared the precision and recall of our approach with two baseline approaches: one based on commits and one based on the percentage of modified LOC. The results show that, to indicate a single expert (top-1), our approach has a lower precision, between 29.9% and 10%, than the baseline approaches. To indicate more than one expert developer, up to top-3, our approach has a better accuracy, of up to 18.7%, over the baselines. We also found that knowledge defined by similarity, when combined with an authorship-based model, improves the ability to identify experts, as measured by the model's R2, by more than 4 percentage points. We conclude that vocabulary can be used on its own to model code knowledge and thus identify experts. In addition, vocabulary can be an additional component for knowledge models based on authorship and ownership, since it captures aspects different from those existing in such models.
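A minimal sketch of expert identification by vocabulary similarity, assuming term-frequency vocabularies compared with cosine similarity; the tokenization, the similarity measure, and the data are illustrative assumptions, not the dissertation's extraction tooling:

```python
# Minimal sketch: ranking developers as likely experts for a code entity by the
# cosine similarity between term-frequency vocabularies (multisets of terms from
# identifiers and comments).  The data and tokenization are illustrative only.
import math
import re
from collections import Counter

def vocabulary(source):
    """Split identifiers and comments into lower-case terms, including camelCase parts."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", source):
        terms += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return Counter(t.lower() for t in terms)

def cosine(v1, v2):
    dot = sum(v1[t] * v2[t] for t in set(v1) & set(v2))
    norm = math.sqrt(sum(c * c for c in v1.values())) * math.sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0

# Hypothetical data: the entity's vocabulary and each developer's accumulated vocabulary.
entity_vocab = vocabulary("class PaymentGateway { void processRefund(Order order) { /* refund payment */ } }")
developers = {
    "alice": vocabulary("void chargeCard(Payment payment) { /* process payment gateway call */ }"),
    "bob":   vocabulary("int parseConfigFile(String path) { /* read xml configuration */ }"),
}

ranking = sorted(developers, key=lambda d: cosine(entity_vocab, developers[d]), reverse=True)
for dev in ranking:
    print(dev, round(cosine(entity_vocab, developers[dev]), 3))
```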
