21 |
Validating the Quality of a Big Data Java Corpus
Palmqvist, Simon January 2018 (has links)
Recent research in software engineering has used GitHub, the largest hub for open source projects with almost 20 million users and 57 million repositories, to mine large amounts of source code and obtain more trustworthy results when developing machine learning and deep learning models. Mining GitHub comes with many challenges because the dataset is large and does not contain only quality software projects. In this project, we try to mine projects from GitHub based on earlier research by others and to validate their quality by comparing them with a small subset of quality projects using software complexity metrics.
|
22 |
Software Developers Using Signals in Transparent Environments
Tsay, Jason Tye 01 April 2017 (has links)
One of the main challenges that modern software developers face is the coordination of dependent agents such as software projects and other developers. Transparent development environments that make low-level software development activities visible hold much promise for assisting developers in making coordination decisions. However, the wealth of information that transparent environments provide can be overwhelming when developers must wade through information from potentially millions of developers and millions of software repositories to make decisions about tasks that require coordination with projects or other developers. Overcoming the risk of overload and better assisting developers in these environments requires a principled understanding of what exactly developers need to know about dependencies to make their decisions. My approach to a principled understanding of how developers use information in transparent environments is to model the process using signaling theory as a theoretical lens. Developers making key coordination decisions often must determine qualities of projects and other developers that are not directly observable. Developers infer these unobservable qualities by interpreting information in their environment as signals and use this judgment about the project or developer to inform their decision. In contrast to the current software engineering literature, which focuses on technical coordination between modules or within projects, such as modularity or task assignment mechanisms, this work aims to understand how developers use signals to inform coordination decisions with dependencies such as other projects or developers. Through this understanding of the signaling process, I can create improved signals that more accurately represent desired unobservable qualities. My dissertation work examines the qualities and signals that developers use to inform specific coordination tasks through a series of three empirical studies.
The specific key coordination tasks studied are evaluating code contributions, discussing problems around contributions, and evaluating projects. My results suggest that when project managers evaluate code contributions, they prefer social signals over technical signals. When project managers discuss contributions, I found that they attend to political signals regarding influence from stakeholders to prioritize which problems need solutions. I found that developers evaluating projects tend to use signals that are related to how the core team works and the potential utility a project provides. In a fourth study, using signaling theory and findings from the qualities and signals that developers use to evaluate projects, I create and evaluate an improved signal called “supportiveness” for community support in projects. I compare this signal against the current signal that developers use, stars count, and find evidence suggesting that my designed signal is more robust and is a stronger indicator of support. The findings of these studies inform the design of tools and environments that assist developers in coordination tasks through suggestions of what signals to show and potentially improving existing signals. My thesis as a whole also suggests opportunities for exploring useful signals for other coordination tasks or even in different transparent environments.
|
23 |
Analýza dat na sociálních sítích s využitím dolování dat / Analysis of Data on Social Networks Based on Data Mining
Sedlák, Jan January 2015 (has links)
This thesis deals with data mining on social networks. It introduces data mining itself and its use in data analysis on social networking services. It analyses the APIs of Facebook, Twitter, Google+, LinkedIn and GitHub with respect to data mining. It presents the implementation of an application for downloading a dataset from GitHub and describes experiments with the obtained dataset. Finally, it introduces the design and implementation of an application that analyses future project activity.
|
24 |
Authentication and SQL-Injection Prevention Techniques in Web Applications
Cetin, Cagri 17 June 2019 (has links)
This dissertation addresses the top two “most critical web-application security risks” by combining two high-level contributions.
The first high-level contribution introduces and evaluates collaborative authentication, or coauthentication, a single-factor technique in which multiple registered devices work together to authenticate a user. Coauthentication provides security benefits similar to those of multi-factor techniques, such as mitigating theft of any one authentication secret, without some of the inconveniences of multi-factor techniques, such as having to enter passwords or biometrics. Coauthentication provides additional security benefits, including: preventing phishing, replay, and man-in-the-middle attacks; basing authentications on high-entropy secrets that can be generated and updated automatically; and availability protections against, for example, device misplacement and denial-of-service attacks. Coauthentication is amenable to many applications, including m-out-of-n, continuous, group, shared-device, and anonymous authentications. The principal security properties of coauthentication have been formally verified in ProVerif, and implementations have performed efficiently compared to password-based authentication.
The second high-level contribution defines a class of SQL-injection attacks that are based on injecting identifiers, such as table and column names, into SQL statements. An automated analysis of GitHub shows that 15.7% of 120,412 posted Java source files contain code vulnerable to SQL-Identifier Injection Attacks (SQL-IDIAs). We have manually verified that some of the 18,939 Java files identified during the automated analysis are indeed vulnerable to SQL-IDIAs, including deployed Electronic Medical Record software for which SQL-IDIAs enable discovery of confidential patient information. Although prepared statements are the standard defense against SQL injection attacks, existing prepared-statement APIs do not protect against SQL-IDIAs. This dissertation therefore proposes and evaluates an extended prepared-statement API to protect against SQL-IDIAs.
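The identifier-injection problem described above can be illustrated with a minimal sketch. It assumes a hypothetical `patients` table and a hypothetical allowlist of sortable columns; none of these names come from the dissertation. The `?` placeholder in a standard prepared statement binds a value, so a column name supplied by the user ends up concatenated into the SQL text. The allowlist check shown here is a common application-level mitigation, not the extended prepared-statement API that the dissertation proposes.

```java
import java.util.Set;

public class IdentifierAllowlist {
    // Hypothetical set of column names the application permits for sorting.
    private static final Set<String> SORTABLE_COLUMNS = Set.of("id", "name", "created_at");

    // Vulnerable pattern: the value uses a placeholder, but the identifier
    // is concatenated straight into the statement text.
    public static String unsafeQuery(String sortColumn) {
        return "SELECT id, name FROM patients WHERE clinic_id = ? ORDER BY " + sortColumn;
    }

    // Defensive pattern: accept the identifier only if it is on the allowlist.
    public static String safeQuery(String sortColumn) {
        if (!SORTABLE_COLUMNS.contains(sortColumn)) {
            throw new IllegalArgumentException("Unexpected sort column: " + sortColumn);
        }
        return "SELECT id, name FROM patients WHERE clinic_id = ? ORDER BY " + sortColumn;
    }

    public static void main(String[] args) {
        // An attacker-controlled "column name" that smuggles in a subquery.
        String injected = "(SELECT ssn FROM patients LIMIT 1)";

        // The unsafe builder embeds the subquery in the statement text.
        System.out.println(unsafeQuery(injected));

        // The allowlisted builder accepts a known column and rejects the rest.
        System.out.println(safeQuery("name"));
        try {
            safeQuery(injected);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The string returned by `safeQuery` would still be passed to a prepared statement so that `clinic_id` is bound as a value; the allowlist covers only the part that placeholders cannot.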
|
25 |
Systém na správu programovacích konvencí v projektu / Coding Conventions Management System
Orlíček, Michal January 2021 (has links)
The goal of this thesis is to design and implement a coding conventions management system for a project. Before the system was created, the benefits of coding conventions were researched, and both the technologies used in open source projects on GitHub and the existing technologies for managing coding conventions were analysed. On that basis, usage scenarios were designed, requirements were specified, and the system architecture was determined. The system was then implemented as a web application based on Blazor and EditorConfig. The main aim was to create a system that can store all types of programming conventions and also lets users check and generate them automatically. It is published under an open source license on GitHub and deployed on the Azure cloud platform.
|
26 |
DependencyVis: Helping Developers Visualize Software Dependency Information
Lui, Nathan 01 June 2021 (has links) (PDF)
The use of dependencies has been increasing in popularity over the past decade, especially as package managers such as JavaScript's npm have made getting these packages as simple as running a command. However, while incidents such as the left-pad incident have increased awareness of how risky relying on these packages is, there is still work to be done in getting developers to take the extra research step of determining whether a package is up to standards. Finding metrics for different packages and comparing them is a difficult and time-consuming task, especially since potential vulnerabilities are not the only metric to consider. For example, how popular and how actively maintained a package is matters just as much.
Therefore, we propose DependencyVis, a visualization tool specific to JavaScript projects and npm packages. It analyzes a project's dependencies and looks up basic metrics that address each dependency's popularity, activeness, and vulnerabilities, such as the number of GitHub stars, forks, and issues, as well as security advisory information from npm audit. This thesis then proposes many use cases for DependencyVis that help users compare dependencies by displaying them in a graph, with metrics represented by aspects such as node color or node size.
|
27 |
A GitHub-Based Voice Assistant for Software Developers and Teams
Sereesathien, Siriwan 01 June 2021 (has links) (PDF)
Software developers and teams typically rely on source code and task management tools for their projects. They tend to depend on platforms such as GitHub, Azure DevOps, Bitbucket, and GitLab for task-tracking, feature-tracking, and bug-tracking to develop and maintain their software repositories. Individually, developers may lose concentration when they have to navigate numerous screens across various platforms to perform daily tasks. Additionally, in non-virtual meetings, teams are often away from their machines and must rely on pure recollection of the tasks and issues related to their work. This can delay decision-making and take away developers' valuable focus hours. Although there is usually one person with a laptop who guides the meeting and has access to the source code management tools, this can take a lot of time because that person is not familiar with every developer's independent work. Therefore, a new tool is needed to help improve the productivity of individuals and of team meetings. In this thesis, we continue the work on Robin, a voice assistant built to answer questions regarding GitHub issues and source code management. Robin can answer questions and complete actions on behalf of the developer. This thesis presents Robin's abilities, architecture, and implementation and examines its usability through a user study. Our study suggests that some people love the idea of having a conversational agent for software development. However, much more research and iteration are needed before Robin delivers the user experience we imagined. This thesis lays the foundation for the idea and reports the lessons we learned.
|
28 |
Whole-Lake Primary Production Calculator
Leong, Colin D. January 2015 (has links)
No description available.
|
29 |
The differences in requirement elicitation between community- and firm-driven open source software projects on Github
Filip, Harald, Teddy, Andersson January 2017 (has links)
Knowledge about different development methods when starting a new software development project is crucial for the developers, the governing bodies and the end product. Therefore, new and unfamiliar options are often taken out of the equation to make sure that the work gets done and that the solution is delivered on time and with high quality. In the long term, however, this behaviour excludes new and better ways of carrying out the work.
To shine light on new development methods and inform those who need insight into a viable new option, we chose to investigate the differences in requirement elicitation within Open Source Software (OSS) development. By examining how and by whom requirements are elicited in a firm-driven project compared to a community-driven project, we framed three research questions on which to base our case study.
The case study showed that in community-driven OSS projects external users have low participation, that is, contributions to project artefacts, compared to firm-driven projects, where the participation of external users is high. Finally, we discuss the potential implications of the findings for both community- and firm-driven OSS projects. We conclude that for both types it is possible to increase both development speed and customer product value.
|
30 |
GitHub Uncovered: Revealing the Social Fabric of Software Development Communities
Al Rubaye, Abduljaleel 01 January 2024 (has links) (PDF)
The proliferation of open-source software development platforms has given rise to various online social communities where developers can seamlessly collaborate, showcase their projects, and exchange knowledge and ideas. GitHub stands out as a preeminent platform within this ecosystem. It offers developers a space to host and disseminate their code, participate in collaborative ventures, and engage in meaningful dialogues with fellow community members. This dissertation embarks on a comprehensive exploration of various facets of software development communities on GitHub, with a specific focus on innovation diffusion, repository popularity dynamics, code quality enhancement, and user commenting behaviors. This dissertation introduces a popularity-based model that elucidates the diffusion of innovation on GitHub. We scrutinize the influence of a repository's popularity on the transfer of knowledge and the adoption of innovative practices, relying on a dataset encompassing GitHub fork events. Through a meticulous analysis of developers' collaborative coding efforts, this dissertation furnishes valuable insights into the impact of social factors, particularly popularity, on the diffusion of innovation. Furthermore, we introduce a novel approach to computing a weight-based popularity score, denoted as the Weighted Trend Popularity Score (WTPS), derived from the historical trajectory of repository popularity indicators, such as fork and star counts. The accuracy of WTPS as a comprehensive repository popularity indicator is assessed, and the significance of having a singular metric to represent repository popularity is underscored. We delve into the realm of code quality on GitHub by examining it from the perspective of code reviews. Our analysis centers on understanding the code review process and presents an approach rooted in regularity to foster superior code quality by enforcing coding standards. 
In the concluding phase of our research, we investigate the intricacies of communication within technology-related online communities. Our attention is drawn to the impact of user popularity on communication, as elucidated through an examination of comment timelines and commenting communities. To contextualize our findings, we compare the behavioral patterns of GitHub developers and users on other platforms, such as Reddit and Stack Overflow.
|