1

De-quantizing quantum machine learning algorithms

Sköldhed, Stefanie January 2022 (has links)
Machine learning is a modern and active research area. Quantum computation, the study of information-processing tasks accomplished using quantum-mechanical systems, is another new and exciting one. This master's thesis combines both areas and investigates quantum machine learning. Kerenidis and Prakash's quantum algorithm for recommendation systems, which offered an exponential speedup over the best classical algorithms known at the time, is examined together with Tang's classical algorithm for recommendation systems, which runs only polynomially slower than the quantum algorithm. The speedup in the quantum algorithm was achieved by assuming quantum access to the data structure, so that the mapping to a quantum state could be performed in polylog(mn) time. The speedup in the classical algorithm was attained by assuming that sampling could be performed in O(log n) time for vectors and O(log mn) time for matrices.
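A central primitive in Tang's de-quantized approach is ℓ²-norm (length-squared) sampling: drawing index i with probability |v_i|²/‖v‖². The sketch below is illustrative only; Tang's algorithm assumes a tree-based data structure that supports this sampling in O(log n) time, whereas here a plain O(n) computation is used for clarity, and all names are invented.

```python
import numpy as np

def l2_sample(v: np.ndarray, rng: np.random.Generator) -> int:
    """Draw index i with probability |v_i|^2 / ||v||^2 (l2-norm sampling).

    Illustrative stand-in: the actual algorithm assumes a binary tree
    over squared entries that yields O(log n) samples; this is O(n).
    """
    p = np.abs(v) ** 2
    p /= p.sum()
    return int(rng.choice(len(v), p=p))

rng = np.random.default_rng(0)
v = np.array([3.0, 0.0, 4.0])    # ||v||^2 = 25
print(l2_sample(v, rng))         # returns 0 w.p. 9/25, 2 w.p. 16/25
```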
2

Using Peak Intensity and Fragmentation Patterns in Peptide SeQuence IDentification (SQID) - A Bayesian Learning Algorithm for Tandem Mass Spectra

Ji, Li January 2006 (has links)
As DNA sequence information becomes increasingly available, researchers are now tackling the great challenge of characterizing and identifying peptides and proteins from complex mixtures, and automatic database-searching algorithms have been developed to meet this challenge. This dissertation aims to improve these algorithms so that peptide and protein identification is more accurate, more efficient, and made with greater confidence, by incorporating peak-intensity information and peptide cleavage patterns obtained in gas-phase ion dissociation research. The underlying hypothesis is that these algorithms can benefit from knowledge about the molecular-level fragmentation behavior of particular amino acid residues or residue combinations.

SeQuence IDentification (SQID), developed in this dissertation research, is a novel Bayesian-learning-based method that incorporates intensity information from peptide cleavage patterns into a database-searching algorithm. It directly uses estimated peak-intensity distributions for cleavage at amino acid pairs, derived from probability histograms generated from experimental MS/MS spectra. Rather than assuming amino acid cleavage patterns artificially or disregarding intensity information, SQID takes advantage of observed fragmentation intensity behavior. In addition, SQID avoids generating a theoretical spectrum prediction for each candidate sequence, which other sequencing methods, including SEQUEST, require; as a result, computational efficiency is significantly improved.

Extensive testing has been performed to evaluate SQID, using datasets from the Pacific Northwest National Laboratory, the University of Colorado, and the Institute for Systems Biology. The computational results show that incorporating peak-intensity distribution information greatly enhances the program's ability to distinguish correct peptides from incorrect matches. This observation is consistent across experiments involving various peptides and searches against larger databases with distractor proteins, which indirectly verifies that peptide dissociation behavior underpins peptide sequencing and protein identification in MS/MS. Furthermore, testing SQID on previously identified clusters of spectra associated with unique chemical structure motifs leads to the following conclusions: (1) the improvement in identification confidence is observed across a range of peptides displaying different fragmentation behaviors; and (2) the magnitude of the improvement agrees with the selectivity of the peptide cleavage, that is, more significant improvements are observed for more selective cleavages.
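The core scoring idea, rating a candidate peptide by how well its observed fragment-peak intensities match learned intensity distributions for cleavage at each residue pair, can be sketched as a log-likelihood sum. The sketch below is a loose illustration of that idea only: the binning scheme, the histogram values, and all names are assumptions, not SQID's actual implementation.

```python
import math

# learned P(intensity bin | residue pair) tables, e.g. from training spectra;
# the values here are made up for illustration
INTENSITY_HIST = {
    ("D", "P"): [0.05, 0.15, 0.80],   # cleavage D|P tends to give strong peaks
    ("A", "G"): [0.60, 0.30, 0.10],   # cleavage A|G tends to give weak peaks
}
UNIFORM = [1 / 3] * 3                  # fallback for unseen residue pairs

def intensity_bin(rel_intensity: float) -> int:
    """Map a relative peak intensity in [0, 1] to a low/medium/high bin."""
    return 0 if rel_intensity < 0.2 else (1 if rel_intensity < 0.6 else 2)

def bayesian_score(candidate_pairs, observed_intensities) -> float:
    """Sum log-likelihoods of observed intensities under per-pair histograms."""
    score = 0.0
    for pair, inten in zip(candidate_pairs, observed_intensities):
        hist = INTENSITY_HIST.get(pair, UNIFORM)
        score += math.log(hist[intensity_bin(inten)])
    return score

# candidate sequence ...D|P...A|G... with two matched fragment peaks
print(bayesian_score([("D", "P"), ("A", "G")], [0.9, 0.1]))
```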
3

An Ensemble Method for Large Scale Machine Learning with Hadoop MapReduce

Liu, Xuan 25 March 2014 (has links)
We propose a new ensemble algorithm, the meta-boosting algorithm, which enables the original AdaBoost algorithm to improve the decisions made by different weak learners using a meta-learning approach. Better accuracy is achieved because the algorithm reduces both bias and variance. However, higher accuracy also brings higher computational complexity, especially on big data. We therefore propose a parallelized meta-boosting algorithm, Parallelized-Meta-Learning (PML), built on the MapReduce programming paradigm on Hadoop. Experimental results on the Amazon EC2 cloud computing infrastructure show that PML reduces the computational cost enormously while retaining lower error rates than the single-machine results. Although MapReduce has the inherent weakness that it cannot directly support iteration within an algorithm, our approach is a win-win: it overcomes this weakness while securing good accuracy. We also compare this approach with AdaBoost.PL, a contemporary algorithm.
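The meta-learning idea, training a second-level learner on the decisions of boosted weak learners, can be sketched with scikit-learn's stacking API. This is only an illustrative stand-in for the thesis's meta-boosting algorithm (and runs on one machine, not Hadoop); the choice of base ensembles and meta-learner is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# base level: AdaBoost ensembles over different weak learners
base = [
    ("stumps", AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50)),
    ("depth2", AdaBoostClassifier(DecisionTreeClassifier(max_depth=2), n_estimators=50)),
]
# meta level: a learner trained on the base ensembles' predictions
meta = StackingClassifier(estimators=base, final_estimator=LogisticRegression())
meta.fit(X_tr, y_tr)
print(f"meta-level accuracy: {meta.score(X_te, y_te):.3f}")
```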
4

Investigation on integration of sustainable manufacturing and mathematical programming for technology selection and capacity planning

Nejadi, Fahimeh January 2016 (has links)
Concerns about energy supply and climate change have been driving companies towards more sustainable manufacturing, while they keep an eye on the economics as well. One practicable step towards sustainability in manufacturing is to choose the more sustainable of the available technologies. The combination of the two functions of technology selection and capacity planning is rarely addressed in the research literature, so integrated decisions on technology selection and capacity planning at this strategic level are especially important. This is supported by justifications in selected manufacturing areas, particularly concerning economies of scale and accumulated knowledge. Furthermore, manufacturing firms operate in a continuously changing, globally competitive environment; strategic design of systems under such circumstances requires a carefully modelled approach to deal with the complexity of the uncertainties involved.

The overall aim of the project is to develop an integrated methodological approach to the combined technology-selection and capacity-planning problem in the manufacturing sector, incorporating the multi-perspective concept of sustainability while taking uncertainties into account. A framework consisting of four modules is proposed. The problem-structuring module adopts an ontology method to map the technology-mix combinations and to capture input data. The 'Optimisation for Sustainable Manufacturing' module addresses the optimisation of technology selection and capacity planning decisions in an integrated way using a goal mixed-integer programming method. The model takes the multi-criteria aspect of sustainable development into account, covering three criteria: (a) environmental (e.g. energy consumption and emissions), (b) economic, and (c) technical (e.g. quality). A 'normalisation by comparison with the best value' method is adopted to facilitate systematic comparison across the criteria. The economic evaluation is based on a life-cycle analysis approach: the present value (PV) method is adopted to address the time value of money, taking both inflation and market return into account to make the proposed model more realistic, and a mathematical model of the total PV of each technology investment, including both capital and running costs, is developed. The 'Sensitivity Analysis' module addresses the uncertainty element of the problem: a controlled set of re-optimisation runs, guided by a tool coded in Visual Basic for Applications (VBA), performs intensive sensitivity analyses. Within the 'Solution Structuring' module, two knowledge-structuring schemes, a decision tree and an interactive slider diagram, are proposed to deal with the large solution sets generated by the sensitivity-analysis module. An innovative hybrid supervised/unsupervised machine learning algorithm is developed to generate a decision tree that structures the solution set: the unsupervised stage is implemented with the DBSCAN algorithm, while the supervised stage adopts the C4.5 algorithm.

The methodological approach is tested and validated using an exemplar case study on coating processes in an automotive company. The case is characterised by three operations, twelve possible technology-mix states, both capital-budget and environmental limits, and 243 sensitivity-analysis experiments. The painting systems are evaluated and compared on quality, technology life-cycle costs, and potential emissions of volatile organic compounds (VOCs) into the air.
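A minimal sketch of the hybrid scheme, clustering the solution set with DBSCAN and then fitting a decision tree to the cluster labels, is shown below. scikit-learn's CART-style DecisionTreeClassifier stands in for C4.5 here, and the data and feature names are invented for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
# hypothetical solution set: columns = [energy use, cost, quality score]
solutions = np.vstack([
    rng.normal([10.0, 5.0, 0.9], 0.3, size=(50, 3)),
    rng.normal([20.0, 3.0, 0.7], 0.3, size=(50, 3)),
])

# unsupervised stage: group similar solutions (DBSCAN, as in the thesis)
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(solutions)

# supervised stage: explain the clusters with a tree (CART here, C4.5 in the thesis)
keep = labels != -1                     # drop DBSCAN noise points
tree = DecisionTreeClassifier(max_depth=3).fit(solutions[keep], labels[keep])
print(export_text(tree, feature_names=["energy", "cost", "quality"]))
```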
5

ANOMALY DETECTION USING MACHINE LEARNING FOR INTRUSION DETECTION

Vaishnavi Rudraraju (18431880) 02 May 2024 (has links)
This thesis examines machine learning approaches for anomaly detection in network security, focusing in particular on intrusion detection over the TCP and UDP protocols. It uses logistic regression models to distinguish effectively between normal and abnormal network activity, demonstrating a strong ability to detect possible security concerns. The study uses the UNSW-NB15 dataset for model validation, allowing a thorough evaluation of the models' capacity to detect anomalies in realistic network scenarios. UNSW-NB15 is a comprehensive network-attack dataset frequently used to evaluate intrusion detection systems and anomaly detection algorithms because of its realistic attack scenarios and varied network activity.

Further investigation is carried out with a multi-task neural network built for both binary and multi-class classification. This method allows in-depth study of the network data, making it easier to identify potential threats. The model is fine-tuned over successive training epochs, with attention to validation metrics to ensure generalizability. The thesis also applies early stopping, which optimizes the training process, reduces the risk of overfitting, and improves the model's performance on new, unseen data.

In addition, the thesis uses blockchain technology to track model performance indicators, a novel strategy that improves data integrity and reliability: the blockchain-based logging system keeps an immutable record of the models' performance over time, helping to build a transparent and verifiable anomaly detection framework.

In summary, this research advances machine learning approaches for network anomaly detection and proposes scalable, effective approaches for early detection and mitigation of network intrusions, ultimately improving the security posture of network systems.
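The two training ideas above, a logistic-regression detector and validation-based early stopping, can be combined in one short sketch using scikit-learn's SGD-trained logistic regression, which halts once the validation score stops improving. The synthetic features below are a stand-in for UNSW-NB15 flow features, not the thesis's actual preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in for UNSW-NB15 flow features (label 1 = attack)
X, y = make_classification(n_samples=5000, n_features=30, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# logistic regression trained by SGD, with validation-based early stopping:
# training halts once the held-out score stops improving for 5 checks
clf = make_pipeline(
    StandardScaler(),
    SGDClassifier(loss="log_loss", early_stopping=True,
                  validation_fraction=0.1, n_iter_no_change=5, random_state=0),
)
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```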
6

Information Freshness Optimization in Real-time Network Applications

Liu, Zhongdong 12 June 2024 (has links)
In recent years, the remarkable development of ubiquitous communication networks and smart portable devices has spawned a wide variety of real-time applications that require timely information updates (e.g., autonomous vehicle systems, industrial automation systems, and live streaming services). These real-time applications all have one thing in common: they want their knowledge of the information source to be as fresh as possible. To measure this freshness, a new metric called the Age-of-Information (AoI) has been proposed: the time elapsed since the generation of the freshest delivered update. Because AoI is influenced by both the inter-arrival time and the delay of updates, it exhibits characteristics distinct from traditional delay and throughput metrics. In this dissertation, our goal is to optimize AoI in various real-time network applications. First, we investigate the fundamental question of how exactly various scheduling policies affect AoI performance: although there is a large body of work studying AoI under different scheduling policies, the use of update-size information, and its combination with other information such as arrival times and service preemption, to reduce AoI has not yet been explored. Second, since AoI is a recently introduced measure of freshness, its relationship to other performance metrics remains largely ambiguous; we analyze the tradeoffs between AoI and additional metrics, including service performance and update cost, in real-world applications.

This dissertation is organized into three parts. In the first part, we observe that scheduling policies leveraging update-size information can substantially reduce delay, one of the key components of AoI, yet it remains largely unknown how exactly such policies affect AoI. To this end, we conduct a systematic and comparative study of the impact of scheduling policies on AoI in single-server queues and provide useful guidelines for the design of AoI-efficient scheduling policies.

In the second part, we analyze the tradeoffs between AoI and other performance metrics in real-world systems, focusing on two important cases. (i) The tradeoff between service performance and AoI arising in data-driven real-time applications (e.g., Google Maps and stock-trading applications), where computing resources are shared between processing updates from information sources and queries from end users, creating a natural tension between service performance (e.g., response time to queries) and AoI (i.e., the freshness of the data used to answer them). To address this tradeoff, we introduce a simple single-server two-queue model that captures the coupled scheduling of updates and queries, design threshold-based scheduling policies that prioritize either updates or queries, and rigorously analyze their performance. (ii) The tradeoff between update cost and AoI appearing in crowdsensing-based applications (e.g., Google Waze and GasBuddy): on the one hand, users are not satisfied if the responses to their requests are stale; on the other hand, updating information about a point of interest costs the application money, since it typically must make monetary payments to incentivize users. To capture this tradeoff, we formulate an optimization problem that minimizes the sum of a staleness cost (a function of the AoI) and the update cost, and we obtain a closed-form optimal threshold-based policy by reformulating the problem as a Markov decision process (MDP).

In the third part, we study the joint minimization of staleness and transmission costs (e.g., energy cost) over an arbitrary time-varying wireless channel, both without and with machine learning (ML) advice. We consider a discrete-time system in which a resource-constrained source transmits time-sensitive data to a destination over a time-varying wireless channel; each transmission incurs a fixed cost, while not transmitting incurs a staleness cost measured by the AoI, and the source must balance the two. To tackle this challenge, we develop a robust online algorithm that minimizes the sum of transmission and staleness costs with a worst-case performance guarantee. While online algorithms are robust, they tend to be overly conservative and may perform poorly on average in typical scenarios; ML algorithms, which leverage historical data and prediction models, generally perform well on average but lack worst-case guarantees. To harness the advantages of both, we design a learning-augmented online algorithm with two key properties: (i) consistency, closely approximating the optimal offline algorithm when the ML prediction is accurate and trusted; and (ii) robustness, providing a worst-case performance guarantee even when the ML predictions are inaccurate.

/ Doctor of Philosophy / In recent years, the rapid growth of communication networks and smart devices has spurred the emergence of real-time applications such as autonomous vehicles and industrial automation systems. These applications share a common need for timely information, whose freshness can be measured by a metric called Age-of-Information (AoI). This dissertation optimizes AoI across various real-time network applications and is organized into three parts. In the first part, we explore how scheduling policies, particularly those that consider update size, affect AoI; through a systematic and comparative study in single-server queues, we provide useful guidelines for designing AoI-efficient scheduling policies. The second part explores the tradeoff between update cost and AoI in crowdsensing applications such as Google Waze and GasBuddy, where users demand fresh responses but updating information incurs costs for the application; we minimize the sum of a staleness cost (a function of AoI) and the update cost, reformulate the problem as a Markov decision process (MDP), and design a simple threshold-based policy that we prove optimal. In the third part, we study the joint minimization of staleness and transmission costs (e.g., energy cost) over a time-varying wireless channel. We first develop a robust online algorithm that achieves a competitive ratio of 3, ensuring a worst-case performance guarantee. Furthermore, when advice is available, e.g., predictions from ML models, we design a learning-augmented online algorithm with two desired properties: (i) consistency, closely approximating the optimal offline algorithm when the ML prediction is accurate and trusted; and (ii) robustness, guaranteeing worst-case performance even with inaccurate ML predictions. While this dissertation marks a significant advance in AoI research, numerous open problems remain. For instance, our learning-augmented online algorithm treats ML predictions as external inputs; exploring the co-design and training of ML and online algorithms could yield interesting insights. Additionally, AoI typically assesses the importance of an update solely by its timestamp, yet the content of updates also matters; incorporating both the age and the semantics of information is imperative in future research.
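A threshold-based policy of the kind described above has a simple shape: update exactly when the AoI reaches a threshold. The simulation sketch below compares average cost across thresholds under made-up cost parameters; it illustrates the policy structure only, not the dissertation's model, MDP formulation, or analysis.

```python
import random

def run(threshold: int, horizon: int = 100_000, c_tx: float = 5.0,
        c_stale: float = 1.0, p_success: float = 0.8, seed: int = 0) -> float:
    """Average per-slot cost of the policy 'update when AoI >= threshold'.

    Each slot: pay c_stale * AoI; if AoI >= threshold, pay c_tx and
    attempt an update that succeeds with probability p_success
    (resetting AoI to 1); otherwise the AoI grows by one.
    """
    rng = random.Random(seed)
    aoi, total = 1, 0.0
    for _ in range(horizon):
        total += c_stale * aoi
        if aoi >= threshold:
            total += c_tx
            aoi = 1 if rng.random() < p_success else aoi + 1
        else:
            aoi += 1
    return total / horizon

for tau in (1, 2, 3, 4, 5, 8):
    print(f"threshold {tau}: avg cost {run(tau):.2f}")
```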
7

Strategies for Discriminating Earthquakes Using a Repeating Signal Detector to Investigate Induced Seismicity in Eastern Ohio

Chiorini, Sutton 01 December 2019 (has links)
No description available.
8

SMART-LEARNING ENABLED AND THEORY-SUPPORTED OPTIMAL CONTROL

Sixiong You (14374326) 03 May 2023 (has links)
This work focuses on solving general optimal control problems with smart-learning-enabled and theory-supported optimal control (SET-OC) approaches. The proposed SET-OC comprises two main directions. First, following the basic idea of the direct method, a smart-learning-enabled iterative optimization algorithm (SEIOA) is proposed for solving discrete optimal control problems. Via discretization and reformulation, the optimal control problem is converted into a general quadratically constrained quadratic programming (QCQP) problem, and the SEIOA is applied to solving such QCQPs. Specifically, a structure-exploiting decomposition scheme is first introduced to reduce the complexity of the original problem; an iterative search, combined with an intersection cutting plane, is then developed to achieve global convergence. Furthermore, considering the implicit relationship between the algorithmic parameters and the convergence rate of the iterative search, deep learning is applied to design the algorithmic parameters from an appropriate amount of training data so as to improve the convergence properties. To demonstrate the effectiveness and improved computational performance of the proposed SEIOA, the developed algorithms have been implemented on extensive real-world application problems, including unmanned-aerial-vehicle path-planning problems and general QCQP problems; the theoretical analysis of global convergence and the simulation results verify the efficiency, robustness, and improved convergence rate of the optimization framework compared with state-of-the-art methods for solving general QCQPs.

Second, an onboard learning-based optimal control method (L-OCM) is proposed for solving optimal control problems. Supported by optimal control theory, the necessary conditions of optimality can be derived, leading to two two-point boundary-value problems (TPBVPs); critical parameters are then identified that approximate the complete solutions of the TPBVPs. To capture the implicit relationship between the initial states and these critical parameters, deep neural networks are constructed to predict their values in real time, trained on data obtained from offline solutions. To demonstrate the effectiveness and improved computational performance of the proposed L-OCM approaches, the developed algorithms have been implemented on extensive real-world application problems, including two-dimensional human Mars entry, powered-descent, and landing guidance problems, as well as fuel-optimal powered descent guidance (PDG) problems. In addition, since no thorough analysis existed of the properties of the optimal control profile for PDG under state constraints, a rigorous theoretical analysis of the fuel-optimal PDG problem with state constraints is provided. The theoretical analysis and simulation results verify the optimality, robustness, and real-time performance of the proposed L-OCM, indicating its potential for onboard implementation.
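The L-OCM idea of learning a map from initial states to the critical parameters of a TPBVP solution can be sketched as a plain regression problem: solve many instances offline, then train a network to predict the parameters online. The sketch below uses scikit-learn's MLPRegressor on synthetic data; the state and parameter dimensions and the "solutions" themselves are placeholders, not the thesis's guidance problems.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
# offline phase: solve the TPBVP for many initial states (here, a synthetic
# stand-in mapping a 4-d initial state to 2 critical parameters, e.g. costates)
X = rng.uniform(-1.0, 1.0, size=(5000, 4))
y = np.column_stack([np.sin(X[:, 0]) + X[:, 1] ** 2,
                     X[:, 2] * X[:, 3]])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)

# online phase: predict the critical parameters for a new initial state
x_new = np.array([[0.1, -0.3, 0.5, 0.2]])
print("predicted critical parameters:", net.predict(x_new))
print("test R^2:", round(net.score(X_te, y_te), 3))
```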
9

Development of analytical support for machine learning technology in the activities of an insurance company (master's thesis)

Денисенко, Н. С., Denisenko, N. S. January 2022 (has links)
This dissertation studies the particulars of using machine learning methods in the insurance industry. The possibilities of an architectural approach to developing a machine learning model are considered, trends in the digital transformation of the insurance industry are analyzed, and the effectiveness of using machine learning in insurance is evaluated. A complete model of the enterprise architecture of PJSC IC Rosgosstrakh was built, and an analytical machine learning model for the insurance company's tariff-setting was developed. Based on a process approach, all phases of the project to introduce the machine learning model into the insurance company's operations are considered in detail. Finally, a simulation model for managing the development and implementation of the machine learning model in the insurance company was developed and run under various scenarios.
