221 |
Development of Enhanced Pavement Deterioration Curves / Ercisli, Safak / 17 September 2015
Modeling pavement deterioration and predicting pavement performance are crucial for optimal pavement network management. Currently, only a few models incorporate the structural capacity of pavements into deterioration modeling. This thesis develops pavement deterioration models that take into account, along with the age of the pavement, the pavement structural condition expressed in terms of the Modified Structural Index (MSI). Using the Akaike Information Criterion (AIC), the research found MSI to be a significant input parameter affecting the rate of deterioration of a pavement section. The AIC comparison suggests that a model that includes the MSI is at least 10^21 times more likely to be closer to the true model than a model that does not include it. The developed models display the average deterioration of pavement sections for specific ages and MSI values.
The Virginia Department of Transportation (VDOT) annually collects pavement condition data on road sections of various lengths. Due to the nature of data collection practices, many biased measurements or influential outliers exist in these data. Upon investigation of data quality and characteristics, the models were built on filtered and cleansed data. Following the regression models, an empirical Bayesian approach was employed to reduce the variance between observed and predicted conditions and to deliver a more accurate prediction model. / Master of Science
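To make the AIC claim concrete, a minimal sketch of an AIC comparison on synthetic pavement data is shown below; the relative likelihood exp((AIC_without - AIC_with)/2) is the quantity behind statements such as "10^21 times more likely." The data, models, and coefficients here are illustrative assumptions, not the thesis's actual models or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pavement data (illustrative only): a condition index driven by age and MSI.
n = 500
age = rng.uniform(0, 20, n)
msi = rng.uniform(0.5, 1.5, n)
cci = 100 - 2.5 * age + 8.0 * msi + rng.normal(0, 4, n)

def ols_aic(y, X):
    """Fit OLS and return the Gaussian AIC = n*log(RSS/n) + 2k (up to an additive constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * X.shape[1]

X_age = np.column_stack([np.ones(n), age])        # model without MSI
X_msi = np.column_stack([np.ones(n), age, msi])   # model with MSI
aic_without, aic_with = ols_aic(cci, X_age), ols_aic(cci, X_msi)

# Relative likelihood: how much more likely the MSI model is to minimize information loss.
rel_likelihood = np.exp((aic_without - aic_with) / 2)
print(f"AIC without MSI: {aic_without:.1f}, with MSI: {aic_with:.1f}")
print(f"Model with MSI is ~{rel_likelihood:.3g} times more likely to be closer to the true model")
```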
222 |
On the Impact and Defeat of Regular Expression Denial of Service / Davis, James Collins / 28 May 2020
Regular expressions (regexes) are a widely-used yet little-studied software component. Engineers use regexes to match domain-specific languages of strings. Unfortunately, many regex engine implementations perform these matches with worst-case polynomial or exponential time complexity in the length of the string. Because they are commonly used in user-facing contexts, super-linear regexes are a potential denial of service vector known as Regular expression Denial of Service (ReDoS). Part I gives the necessary background to understand this problem.
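To make the super-linear behavior concrete, the following sketch (an illustrative example, not one drawn from the dissertation) times a classic catastrophically backtracking regex in Python's backtracking engine; each additional character roughly doubles the match time.

```python
import re
import time

# Nested quantifiers over the same character: a classic super-linear (exponential) pattern.
pattern = re.compile(r'^(a+)+$')

for n in range(16, 25, 2):
    attack = 'a' * n + '!'          # almost matches, forcing the engine to try ~2^(n-1) splits
    start = time.perf_counter()
    pattern.match(attack)           # returns None, but only after exponential backtracking
    print(f"n={n:2d}  {time.perf_counter() - start:.3f}s")
```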
In Part II of this dissertation, I present the first large-scale empirical studies of super-linear regex use. Guided by case studies of ReDoS issues in practice (Chapter 3), I report that the risk of ReDoS affects up to 10% of the regexes used in practice (Chapter 4), and that these findings generalize to software written in eight popular programming languages (Chapter 5). ReDoS appears to be a widespread vulnerability, motivating the consideration of defenses.
In Part III I present the first systematic comparison of ReDoS defenses. Based on the necessary conditions for ReDoS, a ReDoS defense can be erected at the application level, the regex engine level, or the framework/runtime level. In my experiments I report that application-level defenses are difficult and error-prone to implement (Chapter 6), that finding a compatible higher-performing regex engine is unlikely (Chapter 7), that optimizing an existing regex engine using memoization incurs (perhaps acceptable) space overheads (Chapter 8), and that incorporating resource caps into the framework or runtime is feasible but faces barriers to adoption (Chapter 9).
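As a rough, application-level approximation of the resource-cap idea (a sketch under assumed requirements, not the dissertation's implementation), a match can be confined to a worker process and abandoned after a deadline:

```python
import re
from multiprocessing import Process, Queue

def _match_worker(pattern, text, out):
    out.put(bool(re.match(pattern, text)))

def match_with_cap(pattern, text, timeout_s=1.0):
    """Run a regex match in a separate process and abandon it after timeout_s seconds."""
    out = Queue()
    worker = Process(target=_match_worker, args=(pattern, text, out), daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():            # the match blew its budget: treat it as a likely ReDoS input
        worker.terminate()
        worker.join()
        raise TimeoutError("regex match exceeded resource cap")
    return out.get()

if __name__ == "__main__":
    print(match_with_cap(r'^(a+)+$', 'a' * 10 + '!'))   # finishes quickly -> False
    try:
        match_with_cap(r'^(a+)+$', 'a' * 40 + '!')       # super-linear input -> capped
    except TimeoutError as exc:
        print(exc)
```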
In Part IV of this dissertation, we reflect on our findings. By leveraging empirical software engineering techniques, we have exposed the scope of potential ReDoS vulnerabilities, and given strong motivation for a solution. To assist practitioners, we have conducted a systematic evaluation of the solution space. We hope that our findings assist in the elimination of ReDoS, and more generally that we have provided a case study in the value of data-driven software engineering. / Doctor of Philosophy / Software commonly performs pattern-matching tasks on strings. For example, when validating input in a Web form, software commonly tests whether an input fits the pattern of a credit card number or an email address. Software engineers often implement such string-based pattern matching using a tool called regular expressions (regexes). Regexes permit software engineers to succinctly describe the sequences of characters that make up common "languages" like the set of valid Visa credit card numbers (16 digits, starting with a 4) or the set of valid emails (some characters, an '@', and more characters including at least one '.'). Using regexes on untrusted user input in this manner may be a dangerous decision because some regexes take a long time to evaluate. These slow regexes can be exploited by attackers in order to carry out a denial of service attack known as Regular expression Denial of Service (ReDoS). To date, ReDoS has led to outages affecting hundreds of websites and tens of thousands of users.
While the risk of ReDoS is well known in theory, in this dissertation I present the first large-scale empirical studies measuring the extent to which slow regular expressions are used in practice. I found that about 10% of real regular expressions extracted from hundreds of thousands of software projects can exhibit longer-than-expected worst-case behavior in popular programming languages including JavaScript, Python, and Ruby. Motivated by these findings, I then consider a range of ReDoS solution approaches: application refactoring, regex engine replacement, regex engine optimization, and resource caps. I report that application refactoring is error-prone, and that regex engine replacement seems unlikely due to incompatibilities between regex engines. Some resource caps are more successful than others, but all resource cap approaches struggle with adoption. My novel regex engine optimizations seem the most promising approach for protecting existing regex engines, offering significant time reductions with acceptable space overheads.
223 |
Investigating the Reproducibility of NPM packages / Goswami, Pronnoy / 19 May 2020
The meteoric rise in the popularity of JavaScript and its large developer community have led to the emergence of a large ecosystem of third-party packages available via the Node Package Manager (NPM) repository, which contains over one million published packages and witnesses a billion daily downloads. Most developers download these pre-compiled published packages from the NPM repository instead of building them from the available source code. Unfortunately, recent articles have revealed repackaging attacks on NPM packages. To carry out such attacks, attackers primarily follow three steps – (1) download the source code of a highly depended-upon NPM package, (2) inject malicious code, and (3) publish the modified package either as a misnamed package (i.e., a typo-squatting attack) or as the official package on the NPM repository using compromised maintainer credentials. These attacks highlight the need to verify the reproducibility of NPM packages. Reproducible Build is a concept that allows the verification of build artifacts for pre-compiled packages by re-building the packages using the same build environment configuration documented by the package maintainers. This motivates us to conduct an empirical study (1) to examine the reproducibility of NPM packages, (2) to assess the influence of any non-reproducible packages, and (3) to explore the reasons for non-reproducibility. Firstly, we downloaded all versions/releases of the 226 most-depended-upon NPM packages and built each version from the source code available on GitHub. Secondly, we applied diffoscope, a differencing tool, to compare the versions we built against the versions downloaded from the NPM repository. Finally, we conducted a systematic investigation of the reported differences. At least one version of 65 packages was found to be non-reproducible. Moreover, these non-reproducible packages are downloaded millions of times per week, which could impact a large number of users. Based on our manual inspection and static analysis, most reported differences were semantically equivalent but syntactically different. Such differences result from non-deterministic factors in the build process. We also infer that semantic differences are introduced by shortcomings in JavaScript uglifiers. Our research reveals the challenges of verifying the reproducibility of NPM packages with existing tools, reveals points of failure using case studies, and sheds light on future directions for developing better verification tools. / Master of Science / Software packages are distributed as pre-compiled binaries to facilitate software development. There are various package repositories for various programming languages, such as NPM (JavaScript), pip (Python), and Maven (Java). Developers install these pre-compiled packages in their projects to implement certain functionality. Additionally, these package repositories allow developers to publish new packages, helping the developer community reduce delivery time and enhance the quality of software products. Unfortunately, recent articles have revealed an increasing number of attacks on package repositories. Moreover, developers trust the pre-compiled binaries, which may contain malicious code. To address this challenge, we conduct an empirical investigation of the reproducibility of NPM packages in the JavaScript ecosystem.
Reproducible Builds is a concept that allows any individual to verify build artifacts by replicating the build process of software packages. For instance, if developers could verify that the build artifacts of the pre-compiled software packages available in the NPM repository are identical to the ones generated when they build that specific package themselves, they could become aware of and mitigate vulnerabilities in the software packages. The build process is usually described in configuration files such as package.json and the Dockerfile. We chose the NPM registry for our study for three primary reasons – (1) it is the largest package repository, (2) JavaScript is the most widely used programming language, and (3) no prior dataset or investigation of this kind has been published by researchers. We took a two-step approach in our study – (1) dataset collection, and (2) source-code differencing for each pair of software package versions. For the dataset collection phase, we downloaded all available releases/versions of 226 popularly used NPM packages, and for the code-differencing phase, we used an off-the-shelf tool called diffoscope. We revealed some interesting findings. Firstly, at least one version of each of 65 packages was found to be non-reproducible, and these packages have millions of downloads per week. Secondly, we found 50 package versions to have divergent program semantics, which highlights potential vulnerabilities in the source code and improper build practices. Thirdly, we found that uglification of JavaScript code introduces non-determinism into the build process. Our research sheds light on the challenges of verifying the reproducibility of NPM packages with current state-of-the-art tools and on the need to develop better verification tools in the future. To conclude, we believe that our work is a step towards realizing the reproducibility of NPM packages and making the community aware of the implications of non-reproducible build artifacts.
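A minimal sketch of the kind of reproducibility check described above, assuming npm, git, and diffoscope are installed; the package name, repository URL, and tag are placeholders, and the study's actual pipeline was more involved than this:

```python
import os
import subprocess
import sys
import tempfile
import urllib.request

PKG, VERSION = "left-pad", "1.3.0"                       # hypothetical target package/version
REPO = "https://github.com/left-pad/left-pad.git"        # placeholder source repository

# 1. Fetch the pre-compiled tarball published on the NPM registry.
url = subprocess.run(["npm", "view", f"{PKG}@{VERSION}", "dist.tarball"],
                     capture_output=True, text=True, check=True).stdout.strip()
registry_tgz = os.path.abspath("registry.tgz")
urllib.request.urlretrieve(url, registry_tgz)

# 2. Rebuild the same version from source and pack it locally.
src = tempfile.mkdtemp()
subprocess.run(["git", "clone", "--depth", "1", "--branch", f"v{VERSION}", REPO, src], check=True)
subprocess.run(["npm", "install"], cwd=src, check=True)
local_tgz = subprocess.run(["npm", "pack"], cwd=src, capture_output=True,
                           text=True, check=True).stdout.strip().splitlines()[-1]

# 3. Compare the two artifacts; diffoscope exits non-zero when they differ.
result = subprocess.run(["diffoscope", registry_tgz, os.path.join(src, local_tgz)])
sys.exit(result.returncode)   # 0 => reproducible for this version, non-zero => differences found
```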
224 |
Empirical Ionospheric Models: The Road To Conductivity / Edwards, Thomas Raymond / 15 April 2019
The Earth's polar ionosphere is a highly dynamic region of the upper atmosphere, and acts as the closure of the greater magnetospheric current system. This region plays host to many electrodynamic effects that impact terrestrial systems, such as power grids, railroads, and pipelines. These effects are fundamentally related to the currents, electric fields, and conductivity present in the polar ionosphere. Understanding and predicting the electrodynamics of this region is vital to being able to determine the physical impacts on terrestrial systems and provide predictions to government and commercial entities.
Empirical models play a key role in research on, and forecasting of, the impact of the solar wind and interplanetary magnetic field on the polar ionosphere, and their development remains an active area of research. Recent interest in polar ionospheric conductivity has led to a community-wide campaign to develop our understanding of this portion of the electrodynamic system.
Characterizing the interactions between the solar wind and the polar ionosphere is a difficult task, as the region of interest is highly data-starved in many respects. In particular, satellite-based data products are scarce because they are costly and logistically difficult to obtain. Recent advancements in data sources (such as the Swarm and CHAMP satellite missions), as well as continued research into the physical relationships between solar wind and interplanetary magnetic field drivers, have provided the opportunity to develop novel tools to study this region of interest. In this dissertation, two polar ionosphere models are described in Chapters 3 and 4, along with the original research that their construction has produced in Chapter 1. These two models are combined to provide a foundation for future research in this area, which is described in Chapter 5. / Doctor of Philosophy / The Earth is subjected to a constant bombardment of solar particles and magnetic fields, known as the solar wind. Our planet's geomagnetic field protects the atmosphere from this bombardment and directs the plasma from the solar wind into the magnetic poles of the Earth. This plasma flows through a region of the atmosphere called the ionosphere, where its energy is dissipated. This energy has many impacts on the surface of the planet, including driving currents in power grids and generating auroral displays. The polar ionosphere is the fundamental connection between the solar wind and the planet, and being able to predict how and where this connection occurs is vital to studying its nature. This work describes two models of the plasma properties in the polar ionosphere and provides some description of the original research that these models have enabled.
225 |
The conditional relationship between beta and returns: a re-assessment / Freeman, Mark C., Guermat, C. / January 2006
Several recent empirical tests of the Capital Asset Pricing Model have been based on the conditional relationship between betas and market returns. This paper shows that this method needs reconsideration. An adjusted version of the test is presented. It is then demonstrated that the adjusted technique has power similar to, or lower than, the more easily implemented CAPM test of Fama and MacBeth (1973) when returns are normally distributed.
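For context, a minimal sketch of an (unadjusted) conditional, Fama-MacBeth-style test on synthetic data is shown below; it splits months into up and down markets and tests the cross-sectional beta premium in each. All data and parameters are illustrative assumptions, and this is not the paper's adjusted test.

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, n_months = 50, 240

beta = rng.uniform(0.3, 1.8, n_assets)                  # assumed known betas (first-pass estimates)
market = rng.normal(0.006, 0.045, n_months)             # synthetic market excess returns
returns = beta[None, :] * market[:, None] + rng.normal(0, 0.05, (n_months, n_assets))

# Second pass: one cross-sectional regression of asset returns on betas per month.
gammas = np.empty(n_months)
for t in range(n_months):
    X = np.column_stack([np.ones(n_assets), beta])
    coef, *_ = np.linalg.lstsq(X, returns[t], rcond=None)
    gammas[t] = coef[1]                                  # slope on beta in month t

# Conditional test: the beta premium should be positive in up markets and negative in down markets.
for label, mask in [("up markets", market > 0), ("down markets", market <= 0)]:
    g = gammas[mask]
    t_stat = g.mean() / (g.std(ddof=1) / np.sqrt(len(g)))
    print(f"{label}: mean gamma = {g.mean():+.4f}, t = {t_stat:+.2f}, n = {len(g)}")
```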
226 |
Unfolding the Rationale for Code Commits / Alsafwan, Khadijah Ahmad / 06 June 2018
One of the main reasons why developers investigate code history is to search for the rationale for code commits. Existing work found that developers report rationale to be one of the most important pieces of information for understanding code changes, and that it can be quite difficult to find. While this finding strongly indicates that understanding the rationale for code commits is a serious problem for software engineers, no current research efforts have pursued a detailed understanding of what specifically developers are searching for when they search for rationale. In other words, while the rationale for code commits is informally defined as, "Why was this code implemented this way?" this question could refer to aspects of the code as disparate as, "Why was it necessary to implement this code?"; "Why is this the way in which it was implemented?"; or "Why was the code implemented at that moment?" Our goal with this study is to improve our understanding of what software developers mean when they talk about the rationale for code commits, i.e., how they "unfold" rationale. We additionally study which components of rationale developers find important, which ones they normally need to find, which ones they consider specifically difficult to find, and which ones they normally record in their own code commits. This new, detailed understanding of the components of the rationale for code commits may serve as inspiration for novel techniques to support developers in seeking and accurately recording rationale. / MS / The evolution of modern software systems is based on the contributions of a large number of developers. In version control systems, developers introduce packaged changes called code commits for various reasons. In the process of modifying the code, software developers make decisions, and these decisions need to be understood by other software developers. The question "Why is the code this way?" is used by software developers to ask for the rationale behind code changes. The question could refer to aspects of the code as disparate as, "Why was it necessary to implement this code?"; "Why is this the way in which it was implemented?"; or "Why was the code implemented at that moment?" Our goal with this study is to improve our understanding of what software developers mean when they talk about the rationale for code commits, i.e., how they "unfold" rationale. We additionally study which components of rationale developers find important, which ones they normally need to find, which ones they consider specifically difficult to find, and which ones they normally record in their own code commits. This new, detailed understanding of the components of the rationale for code commits will allow researchers and tool builders to understand what developers mean when they mention rationale, thereby assisting the development of tools and techniques to support developers when seeking and recording rationale.
227 |
Notes on partial fronting and copy spell-out / Müller, Gereon / 21 November 2024
Central to Trinh’s (2009) analysis of bare predicate fronting is the Prosodic Condition on Deletable Chains, a version of which is given in (1).
228 |
Understanding User and Developer Perceptions of Dark Patterns in Online Environments / Liang, Huayu / 03 January 2025
With the rapid development of technology, software applications have become essential in people's daily lives. The number of digital platforms (e.g., websites and mobile apps) is continuously growing, and so are the persuasive designs that shape users' experience and decision-making in online environments. Deceptive patterns, also known as dark patterns, are user interface (UI) design choices crafted to manipulate or trick users into actions they did not intend to take in digital environments. These patterns, found throughout digital interfaces, exploit users' psychological vulnerabilities and manipulate them into actions that benefit stakeholders at the expense of users' interests. Scholarship aimed at raising awareness of dark patterns is growing rapidly. However, there is limited research on how dark patterns affect users' perceptions of, and interactions with, applications. Furthermore, prior work has yet to investigate dark patterns from the perspective of software engineers, the developers who implement user interface designs. To that end, our study explores users' and developers' perspectives on dark patterns. In this study, we used a mixed-method approach, surveying each stakeholder group (N_user=66 and N_developer=38) and mining GitHub data (N=2556) to understand end users' perceptions and experiences and developers' discussions of and attitudes toward dark patterns. Our findings reveal that users often encounter dark patterns online with limited options for avoidance, and that these encounters evoke negative emotions. Developers report that external pressures influence their decisions to implement dark patterns, and most recognize their adverse effects on trust and user experience. Discussions on GitHub primarily focus on the existence and prevention of dark patterns, often reflecting negative sentiments. With our findings, we aim to raise stakeholders' awareness of dark patterns and promote ethical UI design to mitigate the use of deceptive designs in online environments. / Master of Science / As technology becomes more integral to our daily lives, more digital platforms, such as websites and mobile apps, are being developed. Unfortunately, some designs manipulate users into making choices they did not intend to make; for example, signing up takes one click, but unsubscribing is hard. These are known as "dark patterns": user interface tricks that take advantage of how people think or behave online, benefiting companies at the users' expense. While research on these deceptive designs is increasing, there is little information on how they affect users or what developers think about them.
For this study, we investigated how users and developers perceive dark patterns in online environments. We surveyed 66 users and 38 developers and analyzed 2,556 discussions from GitHub, a popular code hosting platform for open-source projects.
Our findings reveal that users frequently encounter dark patterns online, have few options to avoid them, and experience negative emotions as a result. A minority of developers admit to implementing dark patterns due to external pressures, while most recognize their harmful impact on trust and user experience. GitHub discussions primarily focus on the existence and prevention of dark patterns, often reflecting negative sentiments such as stress and frustration.
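A minimal sketch of the kind of GitHub mining used in such a study, via the public GitHub issue-search API; the query and printed fields are illustrative assumptions, and the thesis's actual collection and sentiment-analysis pipeline is not reproduced here.

```python
import requests

# Search public issues and pull requests that mention "dark pattern".
# Unauthenticated requests are heavily rate-limited; pass a token for real use.
resp = requests.get(
    "https://api.github.com/search/issues",
    params={"q": '"dark pattern" in:title,body', "per_page": 30},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()

for item in resp.json()["items"]:
    kind = "PR" if "pull_request" in item else "issue"
    print(f"[{kind}] {item['title']!r}  ({item['html_url']})")
```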
229 |
Empirical Correlates of the Personality Assessment Inventory (PAI) in an Outpatient Sample: A Replication and Extension / Pan, Minqi / 05 1900
The Personality Assessment Inventory (PAI) has gained widespread favor since its publication. However, validation studies of its interpretive descriptors remain limited to date. As such, the primary goal of the current study was to validate the interpretive descriptors through the lens of empirical correlates, using the PDSQ as the external criterion. It also served as a replication and extension of the 2018 study conducted by Rogers and colleagues. The final archival sample included 204 clients from the UNT Psychology Clinic who were administered the PAI between May 2016 and December 2020. Overall, reliability and construct validity were strongly supported for the PAI clinical scales. Further, the current study replicated a large majority of the correlates identified by Rogers and colleagues, which boosts confidence in reproducible interpretations based on empirical correlates. Importantly, investigation of item-level and gendered correlates provided crucial interpretive information that would otherwise be obscured. For example, item-level correlates refined interpretation by clarifying the nature of scale-level correlates, particularly those of moderate strength. On the other hand, notable gender differences were identified for certain scales, which led to drastic differences in the patterns of gendered versus non-gendered correlates. Finally, several methodological considerations are proposed in the hope of facilitating empirical research on measurement validity and of combating the current replication crisis. The need to adopt more stringent standards for effect sizes, as well as the instability of correlates of moderate strength, are discussed. Implications for clinical practice and future directions for research are also discussed.
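As a rough sketch of what an empirical-correlate analysis looks like computationally (the column names, file name, and effect-size cutoff below are illustrative assumptions, not the study's actual variables or standards):

```python
import pandas as pd

# Hypothetical data: PAI clinical scale T-scores and PDSQ subscale totals for the same clients.
df = pd.read_csv("clinic_sample.csv")   # placeholder file name
pai_scales = ["SOM", "ANX", "DEP", "PAR", "SCZ", "BOR", "ANT", "ALC", "DRG"]
pdsq_scales = ["pdsq_depression", "pdsq_panic", "pdsq_psychosis", "pdsq_alcohol"]

MIN_R = 0.30   # illustrative minimum effect size for a correlate worth interpreting

correlates = []
for pai in pai_scales:
    for pdsq in pdsq_scales:
        r = df[pai].corr(df[pdsq])                    # pairwise Pearson correlation
        if abs(r) >= MIN_R:
            correlates.append((pai, pdsq, round(r, 2)))

print(pd.DataFrame(correlates, columns=["PAI scale", "PDSQ criterion", "r"]))

# Gendered correlates: repeat the analysis within each group to spot divergent patterns.
for group, sub in df.groupby("gender"):
    corr = sub[pai_scales + pdsq_scales].corr()
    print(f"\nCorrelations within gender group {group!r}:")
    print(corr.loc[pai_scales, pdsq_scales].round(2))
```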
230 |
Jackknife Empirical Likelihood Method and its Applications / Yang, Hanfang / 01 August 2012
In this dissertation, we investigate jackknife empirical likelihood methods motivated by recent research in statistics and related fields. The computational intensity of empirical likelihood can be significantly reduced by using jackknife empirical likelihood methods without losing computational accuracy or stability. We demonstrate that the proposed jackknife empirical likelihood methods are able to handle several challenging and open problems, with elegant asymptotic properties and accurate simulation results in finite samples. These problems include ROC curves with missing data, the difference of two ROC curves in two-dimensional correlated data, a novel inference for the partial AUC, and the difference of two quantiles with one or two samples. In addition, empirical likelihood methodology can be successfully applied to the linear transformation model using adjusted estimating equations. Comprehensive simulation studies of coverage probabilities and average interval lengths for these topics demonstrate that the proposed jackknife empirical likelihood methods perform well in finite samples under various settings. Moreover, several related real-data problems are studied to support our conclusions. In the end, we provide an extensive discussion of some interesting and feasible ideas based on our jackknife EL procedures for future studies.
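A minimal sketch of the basic jackknife empirical likelihood recipe for a simple nonlinear statistic (the sample variance); the data and hypothesis are illustrative assumptions, and the dissertation's ROC, partial-AUC, and quantile applications require problem-specific estimating equations:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

def jackknife_pseudo_values(x, stat):
    """V_i = n*T(x) - (n-1)*T(x without i): jackknife pseudo-values of a statistic."""
    n = len(x)
    full = stat(x)
    leave_one_out = np.array([stat(np.delete(x, i)) for i in range(n)])
    return n * full - (n - 1) * leave_one_out

def el_log_ratio_for_mean(v, mu):
    """-2 log empirical likelihood ratio that the mean of v equals mu (Owen's EL for a mean)."""
    d = v - mu
    if d.max() <= 0 or d.min() >= 0:          # mu outside the convex hull: EL ratio is zero
        return np.inf
    g = lambda lam: np.sum(d / (1.0 + lam * d))
    eps = 1e-10
    lam = brentq(g, -1.0 / d.max() + eps, -1.0 / d.min() - eps)   # solve the Lagrange condition
    return 2.0 * np.sum(np.log1p(lam * d))

# Example: jackknife EL test that Var(X) equals 1 for standard normal data.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
v = jackknife_pseudo_values(x, lambda s: np.var(s, ddof=1))

stat = el_log_ratio_for_mean(v, mu=1.0)
print(f"-2 log EL ratio = {stat:.3f}, p-value = {chi2.sf(stat, df=1):.3f}")   # ~chi2(1) under H0
```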