561.
Generation of Control Logic from Ordinary Speech. Haghjo, Hamed; Vahlberg, Elias (January 2022)
Automatic code generation is evolving remarkably fast, with companies and researchers competing to reach human-level accuracy and capability. Advances in this field primarily focus on using machine learning models for end-to-end code generation. This project introduces the system CodeFromVoice, which explores an alternative method for code generation that relies on existing Natural Language Processing models combined with traditional parsing methods. CodeFromVoice shows that this approach can generate code from text or from speech transcribed with Automatic Speech Recognition. The generated code is limited in complexity and restricted to the context of an existing application, but achieves a Word Error Rate of less than 25%.
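A minimal sketch of the parsing-based idea (not CodeFromVoice's actual implementation; the command patterns and the `device` API are hypothetical): a transcribed utterance is matched against a small grammar and mapped to a code template.

```python
import re

# Hypothetical command patterns mapped to code templates; a real system would
# combine an NLP model's intent/entity output with templates like these.
PATTERNS = [
    (re.compile(r"set (\w+) to (\d+)"), "device.set_{0}({1})"),
    (re.compile(r"turn (on|off) the (\w+)"), "device.{1}.power('{0}')"),
]

def transcript_to_code(utterance: str) -> str | None:
    """Map a transcribed utterance to a code snippet, or return None if unmatched."""
    for pattern, template in PATTERNS:
        match = pattern.search(utterance.lower())
        if match:
            return template.format(*match.groups())
    return None

print(transcript_to_code("Set temperature to 25"))  # -> device.set_temperature(25)
```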
562.
MLpylint: Automating the Identification of Machine Learning-Specific Code Smells. Hamfelt, Peter (January 2023)
Background. Machine learning (ML) has rapidly grown in popularity, becoming a vital part of many industries. This swift expansion has brought new challenges to technical debt, maintainability, and the general software quality of ML systems. With ML applications becoming more prevalent, there is an emerging need for extensive research to keep up with the pace of developments. Currently, research on code smells in ML applications is limited, and there is a lack of tools and studies that address these issues in depth. This gap highlights the necessity for a focused investigation into the validity of ML-specific code smells in ML applications, setting the stage for this study.

Objectives. This study addresses the limited research on ML-specific code smells within Python-based ML applications. It begins with the identification of these ML-specific code smells. Once recognized, the next objective is to choose suitable methods and tools to design and develop a static code analysis tool based on code smell criteria. After development, an empirical evaluation assesses both the tool's efficacy and performance. Additionally, feedback from industry professionals is sought to measure the tool's feasibility and usefulness.

Methods. This research employed the Design Science Methodology. In the problem identification phase, a literature review was conducted to identify ML-specific code smells. In solution design, a secondary literature review and consultations with experts were performed to select methods and tools for implementing the tool. Additionally, 160 open-source ML applications were sourced from GitHub. The tool was empirically tested against these applications, with a focus on assessing its performance and efficacy. Furthermore, using the static validation method, feedback on the tool's usefulness was gathered through an expert survey involving 15 ML professionals from Ericsson.

Results. The study introduced MLpylint, a tool designed to identify 20 ML-specific code smells in Python-based ML applications. MLpylint analyzed 160 ML applications within 36 minutes, identifying 5380 code smells in total, while also highlighting the need for further refinement of each code smell checker to accurately identify specific patterns. In the expert survey, 15 ML professionals from Ericsson acknowledged the tool's usefulness, user-friendliness, and efficiency. However, they also indicated room for improvement in fine-tuning the tool to avoid ambiguous smells.

Conclusions. Current studies on ML-specific code smells are limited, and few tools address them. The development and evaluation of MLpylint is a significant advancement in the ML software quality domain, enhancing reliability and reducing associated technical debt in ML applications. As the industry integrates such tools, it is vital that they evolve to detect code smells from new ML libraries. Such tools not only aid developers in upholding software quality but also promote further research in the ML software quality domain.
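As a hedged sketch of what a static ML-smell check can look like (not MLpylint's actual implementation; the smell chosen, a training call without a prior random seed, and all names are illustrative assumptions):

```python
import ast

class MissingSeedChecker(ast.NodeVisitor):
    """Flag training calls that are not preceded by a random-seed call.

    Illustrative only: a real checker needs scope analysis and a fuller
    catalog of seeding APIs (random.seed, numpy.random.seed, torch.manual_seed).
    """

    SEED_CALLS = {"seed", "manual_seed"}
    TRAIN_CALLS = {"fit", "train"}

    def __init__(self):
        self.seed_seen = False
        self.warnings = []

    def visit_Call(self, node):
        # Extract the called name from either obj.method(...) or func(...).
        name = node.func.attr if isinstance(node.func, ast.Attribute) else getattr(node.func, "id", "")
        if name in self.SEED_CALLS:
            self.seed_seen = True
        elif name in self.TRAIN_CALLS and not self.seed_seen:
            self.warnings.append(f"line {node.lineno}: training call without a prior random seed")
        self.generic_visit(node)

checker = MissingSeedChecker()
checker.visit(ast.parse("model.fit(data)\n"))
print(checker.warnings)  # ['line 1: training call without a prior random seed']
```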
563.
Are Open-Source Systems Developed with Good Code Quality? An Empirical Study. Jonsson, Sebastian; Safavi, Nima (January 2023)
Due to the surge in software development, the software industry needs higher code quality across different programming languages. "Code with good quality" can be defined as code written in a way that follows conventions for, e.g., comments, proper indentation, clear notation, simplicity, and naming. Coding style guidelines derived from the Java and Oracle code conventions exist to keep source code readable and maintainable; however, current studies do not answer to what extent open-source systems follow these guidelines. Finding violations of conventions at the early stages of software development is essential, because changes become costly, and at times impossible, in the later stages. Adhering to coding conventions thus facilitates code readability and maintainability. This study therefore analyzes the results from several code quality tools, compares them, and, based on the outcomes, develops a new tool that covers conventions the studied code-checking tools may miss. A convention check of the kind such tools perform is sketched below.
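A minimal hedged sketch of one such check (the convention shown, Java-style lowerCamelCase method naming, and the regex are illustrative assumptions, not the authors' tool):

```python
import re

# Java convention: method names should be lowerCamelCase.
CAMEL_CASE = re.compile(r"^[a-z][a-zA-Z0-9]*$")
METHOD_DECL = re.compile(r"\b(?:public|private|protected)\s+\w+\s+(\w+)\s*\(")

def check_method_names(java_source: str) -> list[str]:
    """Return one violation message per method name that is not lowerCamelCase."""
    violations = []
    for lineno, line in enumerate(java_source.splitlines(), start=1):
        for name in METHOD_DECL.findall(line):
            if not CAMEL_CASE.match(name):
                violations.append(f"line {lineno}: method '{name}' is not lowerCamelCase")
    return violations

print(check_method_names("public void Do_Stuff() {}"))
# ["line 1: method 'Do_Stuff' is not lowerCamelCase"]
```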
564.
Supporting Software Exploration with a Syntactic Aware Source Code Query Language. Bartman, Brian M. (26 July 2017)
No description available.
565.
Joint random linear network coding and convolutional code with interleaving for multihop wireless network. Susanto, Misfa; Hu, Yim Fun; Pillai, Prashant (January 2013)
Error control techniques are designed to ensure reliable data transfer over unreliable communication channels that are frequently subject to channel errors. In this paper, the effect of applying a convolutional code to the Scattered Random Network Coding (SRNC) scheme over a multi-hop wireless channel was studied. An interleaver was implemented for bit scattering in SRNC, with the purpose of dividing the encoded data into protected blocks and vulnerable blocks to achieve error diversity within one modulation symbol while randomising erroneous bits in both blocks. By combining the interleaver with the convolutional encoder, the network decoder in the receiver has a sufficient number of correctly received network-coded blocks to perform the decoding process efficiently. Extensive simulations were carried out to study the performance of three systems: 1) SRNC with convolutional encoding; 2) SRNC alone; and 3) a system with neither convolutional encoding nor interleaving. Simulation results, in terms of block error rate for a 2-hop wireless transmission scenario over an Additive White Gaussian Noise (AWGN) channel, were presented. The results showed that the system with interleaving and convolutional coding achieved coding gains of at least 1.29 dB over system 2 and 2.08 dB on average over system 3 at a block error rate of 0.01.
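A hedged sketch of the bit-scattering idea (a simple row-column block interleaver; the dimensions and the exact scattering used in SRNC are assumptions for illustration):

```python
def block_interleave(bits: list[int], rows: int, cols: int) -> list[int]:
    """Write bits row-by-row into a rows x cols matrix, read them column-by-column.

    Adjacent input bits end up `rows` positions apart, so a burst of channel
    errors is spread across many coded blocks, in the spirit of SRNC's split
    into protected and vulnerable blocks.
    """
    assert len(bits) == rows * cols, "input must fill the matrix exactly"
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def block_deinterleave(bits: list[int], rows: int, cols: int) -> list[int]:
    """Invert block_interleave by swapping the dimensions."""
    return block_interleave(bits, cols, rows)

data = [1, 0, 1, 1, 0, 0]                 # 2 x 3 example
scattered = block_interleave(data, 2, 3)  # [1, 1, 0, 0, 1, 0]
assert block_deinterleave(scattered, 2, 3) == data
```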
566.
Towards Measuring & Improving Source Code Quality. Iftikhar, Umar (January 2024)
Context: Software quality has a multi-faceted description encompassing several quality attributes. Central to efforts to enhance software quality is improving the quality of the source code, since poor source code quality impacts the quality of the delivered product. Empirical studies have investigated how to improve source code quality and how to quantify the improvement. However, the reported evidence linking internal code structure information to the quality attributes observed by users is varied and, at times, conflicting. Furthermore, there is a need for further research on improving source code quality by understanding trends in feedback from code review comments.

Objective: This thesis contributes towards improving source code quality and synthesizes metrics to measure improvement in source code quality. The objectives are: 1) to synthesize evidence of links between source code metrics and external quality attributes, and to identify source code metrics; and 2) to identify areas in which to improve source code quality by finding recurring code quality issues through the analysis of code review comments.

Method: We conducted a tertiary study to achieve the first objective, and an archival analysis and a case study to investigate the second.

Results: To quantify source code quality improvement, we report a comprehensive catalog of source code metrics and a small set of source code metrics consistently linked with maintainability, reliability, and security. To improve source code quality through the analysis of code review comments, the methodology we explored improves on the state of the art, with promising results.

Conclusions: The thesis provides a promising way to analyze themes in code review comments. Researchers can use the source code metrics provided to estimate these quality attributes reliably. In future work, we aim to derive a software improvement checklist based on the analysis of trends in code review comments.
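As a hedged illustration of the kind of source code metric such a catalog contains (a simplified McCabe cyclomatic complexity estimate using the common "decision points + 1" approximation; this is not the thesis's own metric set):

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: one plus the number of decision points."""
    decision_nodes = (ast.If, ast.For, ast.While, ast.And, ast.Or,
                      ast.ExceptHandler, ast.IfExp)
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, decision_nodes) for node in ast.walk(tree))

print(cyclomatic_complexity("def f(x):\n    if x > 0:\n        return x\n    return -x\n"))  # 2
```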
567.
Implementation of Parallel and Serial Concatenated Convolutional Codes. Wu, Yufei (27 April 2000)
Parallel concatenated convolutional codes (PCCCs), called "turbo codes" by their discoverers, have been shown to perform close to the Shannon bound at bit error rates (BERs) between 1e-4 and 1e-6. Serial concatenated convolutional codes (SCCCs), which perform better than PCCCs at BERs lower than 1e-6, were developed borrowing the same principles as PCCCs, including code concatenation, pseudorandom interleaving and iterative decoding.
The first part of this dissertation introduces the fundamentals of concatenated convolutional codes. The theoretical and simulated BER performance of PCCCs and SCCCs is discussed. Encoding and decoding structures are explained, with emphasis on the Log-MAP decoding algorithm and the general soft-input soft-output (SISO) decoding module. Sliding window techniques, which can be employed to reduce memory requirements, are also briefly discussed.
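A hedged sketch of the core Log-MAP operation (the max* or Jacobian logarithm in its standard textbook form, not necessarily the exact implementation in this dissertation):

```python
import math

def max_star(a: float, b: float) -> float:
    """Exact Jacobian logarithm: ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a-b|).

    Log-MAP uses this in place of max(); dropping the correction term
    yields the cheaper Max-Log-MAP approximation.
    """
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

print(max_star(1.0, 1.2))  # ~1.798; plain max() would give 1.2
```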
The second part of this dissertation presents four major contributions to the field of concatenated convolutional coding developed through this research. First, the effects of quantization and fixed-point arithmetic on decoding performance are studied. Analytic bounds and modular renormalization techniques are developed to improve the efficiency of the SISO module implementation without compromising performance. Second, a new stopping criterion, the sign difference ratio (SDR), is introduced; when its complexity and performance are evaluated against existing criteria, it is found to perform well at the lowest cost. Third, a new type-II code-combining automatic repeat request (ARQ) technique is introduced which makes use of the related PCCC and SCCC. Fourth, a new code-assisted synchronization technique is presented, which uses a list approach to leverage the simplicity of the correlation technique and the soft information of the decoder. In particular, the variant that uses the SDR criterion achieves superb performance with low complexity.
Finally, the third part of this dissertation discusses the FPGA-based implementation of the turbo decoder, which is the fruit of cooperation with fellow researchers. / Ph. D.
568.
Codes from norm-trace curves: local recovery and fractional decoding. Murphy, Aidan W. (04 April 2022)
Codes from curves over finite fields were first developed in the late 1970s by V. D. Goppa and are known as algebraic geometry codes. Since that time, the construction has been tailored to fit particular applications, such as erasure recovery and error correction using less received information than in the classical case. The Hermitian-lifted code construction of López, Malmskog, Matthews, Piñero-González, and Wootters (2021) provides codes from the Hermitian curve over $F_{q^2}$ which have the same locality as the well-known one-point Hermitian codes but with a rate bounded below by a positive constant independent of the field size. However, obtaining explicit expressions for the code is challenging.
In this dissertation, we consider codes from norm-trace curves, which are a generalization of the Hermitian curve. We develop norm-trace-lifted codes and demonstrate an explicit basis of the codes. We then consider fractional decoding of codes from norm-trace curves, extending the results obtained for codes from the Hermitian curve by Matthews, Murphy, and Santos (2021). / Doctor of Philosophy / Coding theory focuses on recovering information, whether that data is corrupted and changed (called an error) or is simply lost (called an erasure). Classical codes achieve this goal by accessing all received symbols. Because long codes, meaning those with many symbols, are common in applications, it is useful for codes to be able to correct errors and recover erasures by accessing less information than classical codes allow. That is the focus of this dissertation.
Codes with locality are designed for erasure recovery using fewer symbols than in the classical case. Such codes are said to have locality $r$ and availability $s$ if each symbol can be recovered from $s$ disjoint sets of $r$ other symbols. Algebraic curves, such as the Hermitian curve or the more general norm-trace curves, offer a natural structure for designing codes with locality. This is done by considering lines intersected with the curve to form repair groups, which are sets of $r+1$ points where the information from one point can be recovered using the rest of the points in the repair group.
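For reference, the defining equations of these curves (standard in the literature, added here as a hedged aid since the abstract does not display them):

```latex
% Hermitian curve over \mathbb{F}_{q^2} (the special case u = 2):
x^{q+1} = y^{q} + y
% Norm-trace curve over \mathbb{F}_{q^u}: points where the norm of x equals
% the trace of y, with N(x) = x^{(q^u-1)/(q-1)} and
% Tr(y) = y^{q^{u-1}} + y^{q^{u-2}} + \cdots + y:
x^{(q^u - 1)/(q - 1)} = y^{q^{u-1}} + y^{q^{u-2}} + \cdots + y
```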
An error correction method which uses less data than the classical case is that of fractional decoding. Fractional decoding takes advantage of algebraic properties of the field trace to correct errors by downloading only a $\lambda$-proportion of the received information, where $\lambda < 1$.
In this work, we consider a new family of codes resulting from norm-trace curves, and study their locality and availability, as well as apply the ideas of fractional decoding to these codes.
569.
Deep Learning for Code Generation using Snippet Level Parallel Data. Jain, Aneesh (05 January 2023)
In the last few years, interest in applying deep learning methods to software engineering tasks has surged. A variety of approaches, including transformer-based methods, statistical machine translation models, and models inspired by natural language settings, have been proposed and shown to be effective at tasks like code summarization, code synthesis, and code translation. Multiple benchmark data sets have also been released, but all suffer from one limitation or another. Some data sets support only a select few programming languages, while others support only certain tasks. These limitations restrict researchers' ability to perform thorough analyses of their proposed methods. In this work we aim to alleviate some of the limitations faced by researchers who work on deep learning applications for software engineering tasks. We introduce a large, parallel, multilingual programming language data set that supports tasks like code summarization, code translation, code synthesis, and code search in 7 different languages. We provide benchmark results for current state-of-the-art models on all these tasks, and we also explore some limitations of current evaluation metrics for code-related tasks. We provide a detailed analysis of the compilability of code generated by deep learning models, because compilability is a better measure of the usability of code than scores like BLEU and CodeBLEU. Motivated by our findings about compilability, we also propose a reinforcement learning based method that incorporates code compilability and syntax-level feedback as rewards, and we demonstrate its effectiveness in generating code with fewer syntax errors than baselines. In addition, we develop a web portal that hosts the models we have trained for code translation. The portal allows translation between 42 possible language pairs and also allows users to check the compilability of the generated code. The intent of this website is to give researchers and other audiences a chance to interact with and probe our work in a user-friendly way, without requiring them to write their own code to load the models and run inference. / Master of Science / Deep neural networks have now become ubiquitous and find applications in almost every technology and service we use today. In recent years, researchers have also started applying neural network based methods to problems in the software engineering domain. Software engineering by its nature requires a lot of documentation, and creating this natural language documentation automatically, using programs as input to the neural networks, has been one of their first applications in this domain. Other applications include translating code between programming languages and searching for code using natural language, as one does on websites like Stack Overflow. All of these tasks now have the potential to be powered by deep neural networks. It is common knowledge that neural networks are data hungry, and in this work we present a large data set containing code in multiple programming languages: Java, C++, Python, C#, JavaScript, PHP, and C. Our data set is intended to foster more research into automating software engineering tasks using neural networks. We provide an analysis of the performance of multiple state-of-the-art models on our data set in terms of compilability, which measures the number of syntax errors in the code, as well as other metrics.
In addition, we propose our own deep neural network based model for code translation, which uses feedback from programming language compilers to reduce the number of syntax errors in the generated code. We also develop and present a website where some of our code translation models are hosted. The website allows users to interact with our work easily, without any knowledge of deep learning, and to get a sense of how these technologies are being applied to software engineering tasks.
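A hedged sketch of a compilability-based reward signal (a minimal syntax-only check for generated Python; the reward values and the use of the built-in `compile()` are illustrative assumptions, not the thesis's exact reward design):

```python
def compilability_reward(generated_code: str) -> float:
    """Return 1.0 if the snippet compiles as Python, else 0.0.

    A syntax-level reward like this can be added to a reinforcement
    learning objective so the generator is penalised for emitting
    code containing syntax errors.
    """
    try:
        compile(generated_code, "<generated>", "exec")
        return 1.0
    except SyntaxError:
        return 0.0

print(compilability_reward("def add(a, b): return a + b"))  # 1.0
print(compilability_reward("def add(a, b) return a + b"))   # 0.0
```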
570.
All English and No Code-switching: A thematic analysis of writing behaviours among EMI master's students. James, Calum (January 2022)
As an education strategy, English as a medium of instruction (EMI) has become increasingly widespread across the world in recent years. Its increased adoption means that many students perform study activities such as reading, writing, and giving presentations in English, all while maintaining and using a native language in other situations. One area of interest within EMI research is how it may relate to academic writing, and here there are relatively few studies examining the interactions between EMI and writing among master's students. This paper collected qualitative interview data from five EMI master's students, who were asked to describe how they go about writing academic texts, what experiences and opinions they have of multilingualism in their lives, and how they may utilise the languages available to them to assist in their writing through code-switching or translanguaging. A thematic analysis was conducted, which generated ten themes within two overarching categories: "language use and multilingualism" and "writing behaviours". Participants in the present study reported no code-switching behaviours at any point in their writing, contrasting with previous research in multilingual university settings. This may be due to constraints of the EMI environment, where all materials produced by students need to be in English, discouraging the use of multiple languages and leading to the view that sticking to one language is easier. Future research could usefully examine language use within EMI educational contexts with a focus on how such contexts facilitate or otherwise affect code-switching tendencies.