Global ETD Search

1	Investigating student experiences with GitHub and Stack Overflow: an exploratory study Bhasin, Trishala 29 July 2021 Programmers who want to improve their skills and background in software development rely heavily on developer social platforms such as GitHub and Stack Overflow to enhance their learning. Stack Overflow provides answers to questions they have about languages or library skills they wish to acquire, while contributing to open-source projects hosted on sites like GitHub gives them valuable experience. Students also use these platforms during their education: most will rely heavily on Stack Overflow at some point in their schooling, while many can benefit from contributing to GitHub projects to build their expertise and professional portfolios. We already know from previous research that developers face barriers participating on these platforms, and therefore we may expect that at least some students will experience similar or possibly even bigger barriers. This research describes a semi-structured interview study followed by a survey with university students to explore how they use the GitHub and Stack Overflow platforms. I identified the benefits the students report from using these tools and the barriers they face. I have concluded with some preliminary recommendations on how to reduce the hurdles students may face with these and other developer social platforms, and I have also suggested future work to mitigate these roadblocks. / Graduate opensource students GitHub Stack Overflow Programming
2	On the implications of unsafe eBPF composition Somaraju, Sai Roop 10 June 2024 In the era of Linux being omnipresent, the demand for dynamically extending kernel capabilities without requiring changes to kernel source code or loading kernel modules at runtime is increasing. This is driven by numerous use cases such as observability, security, and networking, which can be efficiently addressed at the system level, underscoring the importance of such extensions. Any extension requires programmers to possess high levels of skill and thorough testing to ensure complete safety. The eBPF subsystem in the Linux kernel addresses this challenge by allowing applications to enhance the kernel's capabilities at runtime, while ensuring stability and security. This guaranteed safety is facilitated by the verifier engine, which statically verifies BPF code. In this thesis, we identify that the verifier implicitly relies on safety assumptions about its runtime execution environment, which are not being upheld in certain scenarios. One such critical aspect of the execution environment is the availability of stack space for use while executing the BPF program. Specifically, we highlight this fundamental issue in certain configurations of the BPF runtime environment within the Linux kernel and how this unsafe composition allowed for kernel stack overflow, thus violating safety guarantees. To tackle this problem, we propose a stack switching approach to ensure stack safety and evaluate its effectiveness. / Master of Science / Many platforms worldwide, including Meta, Netflix, Google, Cloudflare, and others, rely on the Linux kernel to manage their servers. To ensure system security, improve monitoring, and enhance networking efficiency, various kernel capabilities are dynamically added or removed at runtime without the need for reboots, thus minimizing downtime for users.
The Linux Extended Berkeley Packet Filter (eBPF) subsystem facilitates dynamic and safe extension by securely verifying the code injected into the kernel. This eases server maintenance tasks, eliminating concerns about system crashes when making runtime changes, because eBPF guarantees safety at all times. In our research, we demonstrate that if we attach verified eBPF programs in a certain manner, we can potentially overflow the kernel stack and crash the whole kernel due to unsafe composition with the kernel. We also propose two solutions to this problem, which can ensure that eBPF remains safe while adhering to the guarantees it provides. Extended Berkeley Packet Filter Linux Stack Overflow
3	Implementing a Lambda Architecture to perform real-time updates Gudipati, Pramod Kumar January 1900 Master of Science / Department of Computing and Information Sciences / William Hsu / The Lambda Architecture is the new paradigm for big data that helps data processing balance throughput, latency, and fault tolerance. No single tool provides a complete solution that combines better accuracy, low latency, and high throughput. This initiated the idea of using a set of tools and techniques to build a complete big data system. The Lambda Architecture defines a set of layers into which these tools and techniques fit: the Speed Layer, the Serving Layer, and the Batch Layer. Each layer satisfies a set of properties and builds upon the functionality provided by the layers beneath it. The Batch Layer is where the master dataset is stored, which is an immutable, append-only set of raw data. The Batch Layer also pre-computes results using a distributed processing system, such as Hadoop or Apache Spark, that can handle large quantities of data. The Speed Layer captures new data coming in real time and processes it. The Serving Layer contains a parallel processing query engine, which takes results from both the Batch and Speed Layers and responds to queries in real time with low latency. Stack Overflow is a Question & Answer forum with a huge user community and millions of posts, and it has grown rapidly over the years. This project demonstrates the Lambda Architecture by constructing a data pipeline to add a new “Recommended Questions” section to the Stack Overflow user profile and update the suggested questions in real time. In addition, various statistics, such as trending tags and user performance numbers such as UpVotes and DownVotes, are shown in the user dashboard by querying through the batch processing layer. Lambda Architecture Big Data Stack overflow computer science
4	A Behavior-Driven Recommendation System for Stack Overflow Posts Greco, Chase D 01 January 2018 Developers are often tasked with maintaining complex systems. Regardless of prior experience, there will inevitably be times in which they must interact with parts of the system with which they are unfamiliar. In such cases, recommendation systems may serve as a valuable tool to assist the developer in implementing a solution. Many recommendation systems in software engineering utilize the Stack Overflow knowledge-base as the basis of forming their recommendations. Traditionally, these systems have relied on the developer to explicitly invoke them, typically in the form of specifying a query. However, there may be cases in which the developer is in need of a recommendation but unaware that their need exists. A new class of recommendation systems deemed Behavior-Driven Recommendation Systems for Software Engineering seeks to address this issue by relying on developer behavior to determine when a recommendation is needed, and once such a determination is made, formulate a search query based on the software engineering task context. This thesis presents one such system, StackInTheFlow, a plug-in integrating into the IntelliJ family of Java IDEs. StackInTheFlow allows the user to interact with it as a traditional recommendation system, manually specifying queries and browsing returned Stack Overflow posts. However, it also provides facilities for detecting when the developer is in need of a recommendation, defined as when the developer has encountered an error message or when a difficulty detection model based on indicators of developer progress fires. Once such a determination has been made, a query formulation model constructed based on a periodic data dump of Stack Overflow posts will automatically form a query from the software engineering task context extracted from source code currently open within the IDE.
StackInTheFlow also provides mechanisms to personalize, over time, the results displayed to a specific set of Stack Overflow tags based on the results previously selected by the user. The effectiveness of these mechanisms is examined, and results based on the collection of anonymous user logs and a small-scale study are presented. Based on the results of these evaluations, it was found that some of the queries issued by the tool are effective; however, limitations regarding the extraction of the appropriate context of the software engineering task have yet to be overcome. Stack Overflow Recommendation Systems IDE Developer Tools Software Engineering
5	Predicting the programming language of questions and snippets of stack overflow using natural language processing Alrashedy, Kamel 11 September 2018 Stack Overflow is the most popular Q&A website among software developers. As a platform for knowledge sharing and acquisition, the questions posted in Stack Overflow usually contain a code snippet. Stack Overflow relies on users to properly tag the programming language of a question and assumes that the programming language of the snippets inside a question is the same as the tag of the question itself. In this thesis, a classifier is proposed to predict the programming language of questions posted in Stack Overflow using Natural Language Processing (NLP) and Machine Learning (ML). The classifier achieves an accuracy of 91.1% in predicting the 24 most popular programming languages by combining features from the title, body and code snippets of the question. We also propose a classifier that only uses the title and body of the question and has an accuracy of 81.1%. Finally, we propose a classifier of code snippets only that achieves an accuracy of 77.7%. Thus, deploying ML techniques on the combination of text and code snippets of a question provides the best performance. These results demonstrate that it is possible to identify the programming language of a snippet of only a few lines of source code. We visualize the feature space of two programming languages, Java and SQL, in order to identify some properties of the information inside the questions corresponding to these languages. / Graduate Stack overflow knowledge sharing Natural Language Processing Machine Learning
6	Evaluating Stack Overflow Usability Posts in Conjunction with Usability Heuristics Jalali, Hamed 05 1900 This thesis explores the critical role of usability in software development and uses usability heuristics as a cost-effective and efficient method for evaluating various software functions and interfaces. With the proliferation of software development in the modern digital age, developing user-friendly interfaces that meet the needs and preferences of users has become a complex process. Usability heuristics, a set of guidelines based on principles of human-computer interaction, provide a starting point for designers to create intuitive, efficient, and easy-to-use interfaces that provide a seamless user experience. The study uses Jakob Nielsen's ten usability heuristics to evaluate the usability of posts on Stack Overflow, a popular Q&A website for developers. Through the analysis of 894 posts related to usability, the study identifies common usability problems faced by users and developers, providing valuable insights into the effectiveness of usability guidelines in software development practice. The research findings emphasize the need for ongoing evaluation and improvement of software interfaces to ensure a seamless user experience. The thesis concludes by highlighting the potential of usability heuristics in guiding the design of user-friendly software interfaces and improving the overall user experience in software development. Usability Stack Overflow Usability testing Heuristics Computer Science
7	How Reliable is the Crowdsourced Knowledge of Security Implementation? Chen, Mengsu 12 1900 The successful crowdsourcing model and gamification design of the Stack Overflow (SO) Q&A platform have attracted many programmers to ask and answer technical questions, regardless of their level of expertise. Researchers have recently found evidence of security-vulnerable code snippets being possibly copied from SO to production software. This inspired us to study how reliable SO is in providing secure coding suggestions. In this project, we automatically extracted answer posts related to Java security APIs from the entire SO site. Then, based on the known misuses of these APIs, we manually labeled each extracted code snippet as secure or insecure. In total, we extracted 953 groups of code snippets in terms of their similarity detected by clone detection tools, which correspond to 785 secure answer posts and 644 insecure answer posts. Compared with secure answers, counter-intuitively, insecure answers have higher view counts (36,508 vs. 18,713), higher scores (14 vs. 5), and more duplicates (3.8 vs. 3.0) on average. We also found that 34% of answers provided by the so-called trusted users who have administrative privileges are insecure. Our findings reveal that there are comparable numbers of secure and insecure answers. Users cannot rely on community feedback to differentiate secure answers from insecure answers either. Therefore, solutions need to be developed beyond the current mechanism of SO or on the utilization of SO in security-sensitive software development. / Master of Science / Stack Overflow (SO), the most popular question and answer platform for programmers today, has accumulated and continues to accumulate a tremendous number of question and answer posts since its launch a decade ago. Contributed by numerous users all over the world, these posts are a type of crowdsourced knowledge.
In the past few years, they have been the main reference source for software developers. Studies have shown that code snippets in answer posts are copied into production software. This is a dangerous sign because the code snippets contributed by SO users are not guaranteed to be secure implementations of critical functions, such as transferring sensitive information on the internet. In this project, we conducted a comprehensive study on answer posts related to Java security APIs. By labeling code snippets as secure or insecure and contrasting their distributions over associated attributes such as post score and user reputation, we found that there are a significant number of insecure answers (644 insecure vs. 785 secure in our study) on Stack Overflow. Our statistical analysis also revealed the infeasibility of differentiating between secure and insecure posts by leveraging the current community feedback system (e.g., voting) of Stack Overflow. Stack Overflow crowdsourced knowledge social dynamics security implementation clone detection
8	An empirical case study on Stack Overflow to explore developers’ security challenges Rahman, Muhammad Sajidur January 1900 Master of Science / Department of Computing and Information Sciences / Eugene Vasserman / The unprecedented growth of ubiquitous computing infrastructure has brought new challenges for security, privacy, and trust. New problems range from mobile apps with an incomprehensible permission (trust) model to the OpenSSL Heartbleed vulnerability, which disrupted the security of a large fraction of the world's web servers. As almost all software bugs and flaws boil down to programming errors or misalignment in requirements, we need to retrace the Software Development Life Cycle (SDLC) and supply chain to check and properly place security and privacy considerations and implementation plans. Historically, there has been a divergent point of view between security teams and developers regarding security. Security is often thought of as a "consideration" or "toll gate" within the project plan rather than being built in from the early stage of project planning, development and production cycles. We argue that security can be effectively made into everyone's business in the SDLC through a broader exploration of the users and their social-cultural contexts, gaining insight into their mental models of security and privacy and usage patterns of technology, trying to see why and how security practices are being satisfied or not satisfied, then transferring those observations into new tool building and protocol/interaction design. The overall goal of our current study is to understand the common challenges and/or misconceptions regarding security-related issues among developers. In order to investigate this issue, we conduct a mixed-method analysis on the data obtained from Stack Overflow (SO), one of the most popular online Q&A sites for the software developer community to communicate, collaborate, and share information with one another.
In this study, we have adopted techniques from the mining software repositories research paradigm and have employed topic modeling for analyzing security-related topics in the SO dataset. To our knowledge, our work in SO data mining is one of the earliest systematic attempts to understand the roots of challenges, misconceptions, and deterrent factors, if any, among developers while they try to implement security features during software development. We argue that a proper understanding of these issues is a necessary first step towards a "build security in" culture in the SDLC. Mining Software Security Security & Privacy Software Engineering Topic Model Stack Overflow
9	Webbramverk för sökning av heterogen data utifrån single-page sökfunktion / Web framework for searching heterogeneous data by single-page search function Andersson, Mattias January 2019 This work focuses on developing two web applications, one in the Django framework and one in Node.js together with Express, to answer the question of which framework achieves the best search times for a Q&A platform that uses a PostgreSQL database whose content is part of the Stack Overflow dataset. The frameworks are compared using the experimental method because of the advantages it offers. The result was that Node.js gave better search times for smaller web applications, while Django performed better for larger web applications. By halving the size of the search results' body text to 150 characters, Node.js achieved search times that were on average better than Django's for larger projects. The number of search results has an effect, with each framework having its own ranges in which it gives the best search times. In the short term, the work can continue by performing further measurements for each factor; in the long term, these frameworks can be compared with others to see whether these two are among the better or worse ones for this application. Django Node.js PostgreSQL AJAX Stack Overflow Computer and Information Sciences
10	Knowledge Curation in a Developer Community: A Study of Stack Overflow and Mailing Lists Gomez Teshima, Carlos Arturo 05 January 2016 Media channels play an important role in the flow, construction, and curation of knowledge in software development. Understanding how developers use media channels is key to improving developer practices and supporting channel evolution. In this thesis, I investigate the way developers use media channels to curate knowledge within the R software development community. By applying a case study methodology consisting of mining archival data and survey methods, I investigate the R community on Stack Overflow and the R-help mailing list, using a qualitative approach. The findings reveal that Stack Overflow and mailing lists foster knowledge co-construction differently: crowd-sourced and participatory, respectively. Furthermore, developers actively use both channels to optimize knowledge exchange and curation. My thesis contributes to the understanding of knowledge curation by developer communities, and describes a model for a systematic comparison of two or more media channels within a community of practice. This model allows knowledge categorization and can be used in future studies to explore knowledge flow within multiple media channels. Moreover, based on my observations in conjunction with the survey data analysis, I extracted a set of recommendations to assist practitioners in the use of multiple Question and Answer (Q&A) channels. / Graduate Media channel Stack Overflow Mailing list Curation Data Mining Case Study
