Global ETD Search

851	Assisting bug report triage through recommendation Anvik, John 05 1900 A key collaborative hub for many software development projects is the issue tracking system, or bug repository. The use of a bug repository can improve the software development process in a number of ways, including allowing developers who are geographically distributed to communicate about project development. However, reports added to the repository need to be triaged by a human, called the triager, to determine if reports are meaningful. If a report is meaningful, the triager decides how to organize the report for integration into the project's development process. We call triager decisions with the goal of determining if a report is meaningful repository-oriented decisions, and triager decisions that organize reports for the development process development-oriented decisions. Triagers can become overwhelmed by the number of reports added to the repository. Time spent triaging also typically diverts valuable resources away from the improvement of the product to the managing of the development process. To assist triagers, this dissertation presents a machine learning approach to create recommenders that assist with a variety of development-oriented decisions. In this way, we strive to reduce human involvement in triage by moving the triager's role from having to gather information to make a decision to that of confirming a suggestion. This dissertation introduces a triage-assisting recommender creation process that can create a variety of different development-oriented decision recommenders for a range of projects. The recommenders created with this approach are accurate: recommenders for which developer to assign a report have a precision of 70% to 98% over five open source projects, recommenders for which product component the report is for have a recall of 72% to 92%, and recommenders for who to add to the cc: list of a report have a recall of 46% to 72%.
We have evaluated recommenders created with our triage-assisting recommender creation process using both an analytic evaluation and a field study. In addition, we present in this dissertation an approach to assist project members to specify the project-specific values for the triage-assisting recommender creation process, and show that such recommenders can be created with a subset of the repository data. bug report triage machine learning recommender
852	Design of a self-paced brain computer interface system using features extracted from three neurological phenomena Fatourechi, Mehrdad 05 1900 Self-paced brain computer interface (SBCI) systems allow individuals with motor disabilities to use their brain signals to control devices whenever they wish. These systems are required to identify the user’s “intentional control (IC)” commands, and they must remain inactive during all periods in which users do not intend control (called “no control (NC)” periods). This dissertation addresses three issues related to the design of SBCI systems: 1) their presently high false positive (FP) rates, 2) the presence of artifacts and 3) the identification of a suitable evaluation metric. To improve the performance of SBCI systems, the following are proposed: 1) a method for the automatic user-customization of a 2-state SBCI system, 2) a two-stage feature reduction method for selecting wavelet coefficients extracted from movement-related potentials (MRP), 3) an SBCI system that classifies features extracted from three neurological phenomena: MRPs and changes in the power of the Mu and Beta rhythms, 4) a novel method that effectively combines the methods developed in 2) and 3), and 5) a generalization of the system developed in 3) from detecting a right index finger flexion to detecting a right hand extension. Results of these studies using actual movements show an average true positive (TP) rate of 56.2% at an FP rate of 0.14% for the finger flexion study and an average TP rate of 33.4% at an FP rate of 0.12% for the hand extension study. These FP results are significantly lower than those achieved in other SBCI systems, where FP rates vary between 1% and 10%. We also conduct a comprehensive survey of the BCI literature. We demonstrate that many BCI papers do not properly deal with artifacts. We show that the proposed BCI achieves a good performance of TP=51.8% and FP=0.4% in the presence of eye movement artifacts.
Further tests of the performance of the proposed system in a pseudo-online environment show an average TP rate of 48.8% at an FP rate of 0.8%. Finally, we propose a framework for choosing a suitable evaluation metric for SBCI systems. This framework shows that the Kappa coefficient is more suitable than other metrics for evaluating performance during the model selection procedure. brain computer interface pattern recognition machine learning
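The abstract above does not say how the Kappa coefficient was computed or which metrics it was compared against. As a rough illustration only, the sketch below assumes the standard two-class Cohen's kappa and uses made-up sample counts (they are not from the dissertation) to show why plain accuracy can mislead when NC periods dominate the data. The function names are hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of samples labelled correctly (IC = positive, NC = negative)."""
    return (tp + tn) / (tp + fp + fn + tn)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix: agreement beyond chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement, computed from the row and column marginals.
    chance_pos = ((tp + fn) / n) * ((tp + fp) / n)
    chance_neg = ((fp + tn) / n) * ((fn + tn) / n)
    expected = chance_pos + chance_neg
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (observed - expected) / (1.0 - expected)


# Illustrative counts only: 100 IC samples and 9900 NC samples.
# A detector that never activates still scores high accuracy but zero kappa.
always_idle = dict(tp=0, fp=0, fn=100, tn=9900)
print(accuracy(**always_idle), cohen_kappa(**always_idle))
```

Under these assumed counts, a detector that stays inactive all the time looks nearly perfect by accuracy yet shows no agreement beyond chance, which is one reason a chance-corrected metric can be preferable for model selection on imbalanced IC/NC data.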
853	Data analysis in proteomics: novel computational strategies for modeling and interpreting complex mass spectrometry data Sniatynski, Matthew John 11 1900 Contemporary proteomics studies require computational approaches to deal with both the complexity and the volume of the data generated. The amalgamation of mass spectrometry -- the analytical tool of choice in proteomics -- with the computational and statistical sciences is still recent, and several avenues of exploratory data analysis and statistical methodology remain relatively unexplored. The current study focuses on three broad analytical domains, and develops novel exploratory approaches and practical tools in each. Data transform approaches are the first explored. These methods re-frame data, allowing for the visualization and exploitation of features and trends that are not immediately evident. An exploratory approach making use of the correlation transform is developed, and is used to identify mass-shift signals in mass spectra. This approach is used to identify and map post-translational modifications on individual peptides, and to identify SILAC modification-containing spectra in a full-scale proteomic analysis. Secondly, matrix decomposition and projection approaches are explored; these use an eigen-decomposition to extract general trends from groups of related spectra. A data visualization approach is demonstrated using these techniques, capable of visualizing trends in large numbers of complex spectra, and a data compression and feature extraction technique is developed suitable for use in spectral modeling. Finally, a general machine learning approach is developed based on conditional random fields (CRFs). These models are capable of dealing with arbitrary sequence modeling tasks, similar to hidden Markov models (HMMs), but are far more robust to interdependent observational features, and do not require limiting independence assumptions to remain tractable.
The theory behind this approach is developed, and a simple machine learning fragmentation model is developed to test the hypothesis that reproducible sequence-specific intensity ratios are present within the distribution of fragment ions originating from a common peptide bond breakage. After training, the model shows very good performance associating peptide sequences and fragment ion intensity information, lending strong support to the hypothesis. Proteomics Bioinformatics Machine learning Mass spectrometry
854	Machine Learning Methods and Models for Ranking Volkovs, Maksims 13 August 2013 Ranking problems are ubiquitous and occur in a variety of domains that include social choice, information retrieval, computational biology and many others. Recent advancements in information technology have opened new data processing possibilities and significantly increased the complexity of computationally feasible methods. Through these advancements ranking models are now beginning to be applied to many new and diverse problems. Across these problems data, which ranges from gene expressions to images and web-documents, has vastly different properties and is often not human generated. This makes it challenging to apply many of the existing models for ranking, which primarily originate in social choice and are typically designed for human generated preference data. As the field continues to evolve, a new trend has recently emerged where machine learning methods are being used to automatically learn the ranking models. While these methods typically lack the theoretical support of the social choice models, they often show excellent empirical performance and are able to handle large and diverse data, placing virtually no restrictions on the data type. These models have now been successfully applied to many diverse ranking problems including image retrieval, protein selection, machine translation and many others. Inspired by these promising results, the work presented in this thesis aims to advance machine methods for ranking and develop new techniques to allow effective modeling of existing and future problems. The presented work concentrates on three different but related domains: information retrieval, preference aggregation and collaborative filtering. In each domain we develop new models together with learning and inference methods and empirically verify our models on real-life data. Applied Sciences Artificial Intelligence Machine Learning 0800
856	The Mechanical Pathway: Reactivating a Derelict Rail Corridor in Edmonton Nally, Michael 25 November 2010 This architectural thesis addresses a derelict urban rail corridor and the possibility of combining architecture and landscape to reactivate its latent potential as a dynamic seam in the urban fabric. Edmonton is a city built on a foundation of interconnectedness with the nation. Rail access has established the city as a staging hub for various industrial practices since the mid-19th century: import and export, agriculture, oil and gas, etc. As inner city rail access has been discontinued, parcels of rail land have been left as relics: nostalgic reminders of a formerly expansive arterial mechanical network that in turn connected the city to a mechanical backbone spanning the nation. This architectural intervention will reactivate a piece of rail land in the northwestern part of downtown Edmonton by establishing a dynamic activity corridor around an energy-harnessing machine. / Apart from in-depth studies in renewable resource harvesting and climate, the thesis is driven by studies in rail and agricultural mechanisms, as well as existing post-industrial park typologies. Edmonton linear park rail machine post-industrial
857	Document Clustering with Dual Supervision Hu, Yeming 19 June 2012 Nowadays, academic researchers maintain a personal library of papers, which they would like to organize based on their needs, e.g., research, projects, or courseware. Clustering techniques are often employed to achieve this goal by grouping the document collection into different topics. Unsupervised clustering does not require any user effort but only produces one universal output with which users may not be satisfied. Therefore, document clustering needs user input for guidance to generate personalized clusters for different users. Semi-supervised clustering incorporates prior information and has the potential to produce customized clusters. Traditional semi-supervised clustering is based on user supervision in the form of labeled instances or pairwise instance constraints. However, alternative forms of user supervision exist, such as labeling features. For document clustering, document supervision involves labeling documents while feature supervision involves labeling features. Their joint use has been called dual supervision. In this thesis, we first explore and propose a framework to use feature supervision for interactive feature selection by indicating whether a feature is useful for clustering. Second, we enhance semi-supervised clustering with feature supervision using feature reweighting. Third, we propose a unified framework to combine document supervision and feature supervision through seeding. The newly proposed algorithms are evaluated using oracles and are shown to produce clusters that better match a single user's point of view than document clustering without any supervision or with only document supervision. Finally, we conduct a user study to confirm that different users have different understandings of the same document collection and prefer personalized clusters.
At the same time, we demonstrate that document clustering with dual supervision is able to produce good personalized clusters even with noisy user input. Dual supervision is also demonstrated to be more effective in personalized clustering than no supervision or any single supervision. We also analyze users' behaviors during the user study and present suggestions for the design of document management software. Document Management Text Mining Machine Learning
858	Towards Coevolutionary Genetic Programming with Pareto Archiving Under Streaming Data Atwater, Aaron 13 August 2013 Classification under streaming data constraints implies that training must be performed continuously, can only access individual exemplars for a short time after they arrive, must adapt to dynamic behaviour over time, and must be able to retrieve a current classifier at any time. A coevolutionary genetic programming framework is adapted to operate in non-stationary streaming data environments. Methods to generate synthetic datasets for benchmarking streaming classification algorithms are introduced, and the proposed framework is evaluated against them. The use of Pareto archiving is evaluated as a mechanism for retaining access to a limited number of useful exemplars throughout training, and several fitness sharing heuristics for archiving are evaluated. Fitness sharing alone is found to be most effective under streams with continuous (incremental) changes, while the addition of an aging heuristic is preferred when the stream has stepwise changes. Tapped delay lines are explored as a method for explicitly incorporating sequence context in cyclical data streams, and their use in combination with the aging heuristic suggests a promising route forward. / Hyperref'd copy available at: https://web.cs.dal.ca/~atwater/ computer science genetic programming machine learning classification
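The abstract above mentions tapped delay lines without giving details of how the thesis implements them. As a generic illustration of the idea only (not the author's code), the sketch below turns a stream into overlapping windows so that a classifier sees each sample together with its most recent predecessors. The function name and the `taps` parameter are assumptions made for this example.

```python
from collections import deque


def tapped_delay_line(stream, taps):
    """Yield tuples holding the current sample and the taps-1 samples before it."""
    window = deque(maxlen=taps)  # oldest sample is dropped automatically
    for sample in stream:
        window.append(sample)
        if len(window) == taps:  # emit only once the line is fully populated
            yield tuple(window)


# Each emitted tuple could serve as one input vector carrying sequence context.
for context in tapped_delay_line([1, 2, 3, 4, 5], taps=3):
    print(context)
```

Because the window has a fixed length, memory use stays constant regardless of stream length, which fits the constraint that exemplars are only accessible for a short time after they arrive.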
859	Analysis of users' procedural compliance in controlling a simulated process Mann, Olga Teresa Lopez 12 1900 No description available. Human-machine systems Process control Data processing
860	Design and development of cost-effective computer interfacing systems for mathematical programming algorithms Papacostadopoulos, Christos Paul 08 1900 No description available. Human-machine systems Linear programming Algorithms
