  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Adaptively-Halting RNN for Tunable Early Classification of Time Series

Hartvigsen, Thomas 11 November 2018
Early time series classification is the task of predicting the class label of a time series before it is observed in its entirety. In time-sensitive domains where information is collected over time, it is worth sacrificing some classification accuracy in favor of earlier predictions, ideally early enough for actions to be taken. However, since accuracy and earliness are contradictory objectives, a solution to this problem must find a task-dependent trade-off. There are two common state-of-the-art methods. The first involves an analyst selecting a timestep at which all predictions must be made. This does not capture earliness on a case-by-case basis: if the selected timestep is too early, all later signals are missed, and if a signal happens early, the classifier still waits to generate a prediction. The second method is an exhaustive search for signals, which encodes no timing information and does not scale to high dimensions or long time series. We design EARLIEST, the first early classification model to tackle this multi-objective optimization problem by jointly learning (1) at which timestep to halt and generate predictions and (2) how to classify the time series. Each of these is learned based on the task and data features. We achieve an analyst-controlled balance between the goals of earliness and accuracy by pairing a recurrent neural network that learns to classify time series as a supervised learning task with a stochastic controller network that learns a halting policy as a reinforcement learning task. The halting policy dictates sequential decisions, one per timestep, of whether or not to halt the recurrent neural network and classify the time series early. This pairing of networks optimizes a global objective function that incorporates both earliness and accuracy.
We validate our method via critical clinical prediction tasks in the MIMIC III database from the Beth Israel Deaconess Medical Center along with another publicly available time series classification dataset. We show that EARLIEST outperforms two state-of-the-art LSTM-based early classification methods. Additionally, we dig deeper into our model's performance using a synthetic dataset, which shows that EARLIEST learns to halt when it observes signals without having explicit access to signal locations. The contributions of this work are three-fold. First, our method is the first neural-network-based solution to early classification of time series, bringing the recent successes of deep learning to this problem. Second, we present the first reinforcement-learning-based solution to the unsupervised nature of early classification, learning the underlying distributions of signals through trial and error without access to this information. Third, we propose the first joint optimization of earliness and accuracy, allowing the model to learn complex relationships between these contradictory goals.
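The per-timestep halting decision and the combined earliness/accuracy objective described above can be illustrated with a toy sketch. This is not the EARLIEST implementation: the recurrent network and the controller network are replaced by hand-written stand-ins, and the names `run_with_halting`, `reward`, and the trade-off weight `lam` are assumptions introduced only for illustration.

```python
import random


def run_with_halting(series, halt_prob, classify, rng):
    """Step through `series`, sampling one halt/continue decision per timestep.

    halt_prob(state, t) -> probability of halting at timestep t (hypothetical controller)
    classify(state)     -> predicted label (hypothetical classifier)
    Returns (predicted_label, halting_timestep).
    """
    if not series:
        raise ValueError("series must not be empty")
    state = 0.0
    for t, x in enumerate(series, start=1):
        # Stand-in for a recurrent hidden-state update (an assumption, not an RNN).
        state = 0.5 * state + 0.5 * x
        # Halt stochastically, or unconditionally at the final timestep.
        if t == len(series) or rng.random() < halt_prob(state, t):
            return classify(state), t


def reward(pred, label, t, total_steps, lam):
    """Toy objective: +1/-1 for a right/wrong label, minus a penalty for waiting."""
    correct = 1.0 if pred == label else -1.0
    return correct - lam * (t / total_steps)


rng = random.Random(0)
series = [0.0, 0.1, 0.9, 1.0, 1.0]
pred, t = run_with_halting(
    series,
    halt_prob=lambda state, t: 1.0 if state > 0.4 else 0.0,  # halt once the state is large
    classify=lambda state: 1 if state > 0.4 else 0,
    rng=rng,
)
print(pred, t, reward(pred, 1, t, len(series), lam=0.5))
```

Raising `lam` makes late halting more costly, which is one simple way to expose the earliness/accuracy balance as a tunable parameter.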
2

Difference Histograms: A new tool for time series analysis applied to bearing fault diagnosis

van Wyk, BJ, van Wyk, MA, Qi, G 24 December 2008
A powerful tool for bearing time series feature extraction and classification is introduced that is computationally inexpensive, easy to implement and suitable for real-time applications. In this paper the proposed technique is applied to two rolling element bearing time series classification problems, and it is shown that in some cases no data pre-processing, artificial neural network or nearest neighbour approaches are required. From the results obtained it is clear that, for the specific applications considered, the proposed method performed as well as or better than alternative approaches based on conventional feature extraction.
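The abstract does not define the difference histogram, so the following is only one plausible reading of the name, not the authors' definition: bin the successive differences of a series into a fixed number of bins and use the normalised counts as a feature vector. The function name, the bin range (`lo`, `hi`) and the clamping of out-of-range differences are all assumptions.

```python
def difference_histogram(series, n_bins, lo, hi):
    """Normalised histogram of successive differences x[t+1] - x[t].

    Differences outside [lo, hi) are clamped into the first or last bin.
    """
    if n_bins <= 0 or hi <= lo:
        raise ValueError("need n_bins > 0 and hi > lo")
    diffs = [b - a for a, b in zip(series, series[1:])]
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for d in diffs:
        idx = int((d - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)  # clamp to a valid bin index
        counts[idx] += 1
    total = len(diffs)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


# A smooth signal concentrates its mass near zero difference;
# a spiky one spreads mass into the outer bins.
smooth = [0.0, 0.1, 0.2, 0.3, 0.4]
spiky = [0.0, 1.5, -1.5, 1.5, -1.5]
print(difference_histogram(smooth, n_bins=4, lo=-2.0, hi=2.0))
print(difference_histogram(spiky, n_bins=4, lo=-2.0, hi=2.0))
```

A single pass over the series with constant work per sample is consistent with the abstract's emphasis on low computational cost and real-time use.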
3

Concatenated Decision Paths Classification for Time Series Shapelets - A New Approach for One Dimensional Data Classification and its Application

Mitzev, Ivan Stefanov 04 May 2018
Time series are very common in presenting collected data such as economic indicators, natural phenomena, and control engineering data, among others. In the last decade, interest in time series data mining has grown as the amount of collected data has increased dramatically. Standard approaches for time series classification are based on distance measures, such as the Euclidean distance (ED) and dynamic time warping (DTW), combined with a 1-NN classifier. Recently, more advanced types of classification were found, introducing primitives (named time series shapelets) that consistently represent a certain class. A time series shapelet is a small sub-section of the entire time series that is “particularly discriminating”. Shapelet-based classification appears to produce higher accuracies on some data sets because global features are more sensitive to noise than local ones. Despite its advantages, time series shapelet classification has an apparent disadvantage: very slow training. This work attempts to improve the training time of the originally proposed time series shapelet classification algorithm and introduces a new approach for time series classification based on concatenated decision tree paths. First, the training time of the classical shapelet-based algorithm for time series classification is significantly improved. The improvement is based on using randomly generated sequences tuned in a particle-swarm-optimization (PSO) environment, instead of using sub-series from the original time series. Second, a new highly accurate classification method, based on concatenated decision tree paths, is introduced. The approach builds a unique representative pattern of a certain class based on the paths taken in a pool of decision trees. Third, the proposed method has been successfully extended to the 2-class-label classification problem, where only one decision tree can be built.
A variety of 2-class-label decision trees were built based on different splitting criteria (distance to a random shapelet), thus enlarging the pool of decision trees and increasing the overall accuracy. Fourth, the proposed method was successfully applied to a two-class image classification problem by converting the image into a time series. An accuracy of around 95% was achieved for the pedestrian detection case from the Daimler database.
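The "distance to a shapelet" used as a splitting criterion can be sketched as the smallest Euclidean distance between the shapelet and any equal-length window of the series. This is the common textbook formulation, shown here as a minimal sketch; the thesis's PSO tuning and path concatenation are not reproduced, and the function names and the `threshold` parameter are assumptions.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and every sliding window of `series`."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2 for i in range(m)))
        best = min(best, d)
    return best


def split_by_shapelet(series, shapelet, threshold):
    """Decision-tree style split: go left if the shapelet occurs closely enough."""
    return "left" if shapelet_distance(series, shapelet) <= threshold else "right"


bump = [1.0, 2.0, 1.0]
print(split_by_shapelet([0.0, 0.0, 1.0, 2.0, 1.0, 0.0], bump, threshold=0.5))  # contains the bump
print(split_by_shapelet([0.0, 0.0, 0.0, 0.0], bump, threshold=0.5))            # flat series
```

Each node of a shapelet decision tree applies one such split, so the sequence of left/right outcomes for a series forms the kind of path that the abstract describes concatenating across a pool of trees.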
4

TEST ORACLE AUTOMATION WITH MACHINE LEARNING : A FEASIBILITY STUDY

Imamovic, Nermin January 2018
A train is a complex system in which every subsystem has an important role. If a subsystem does not work as it should, the correctness of the whole train can be uncertain. To ensure that the system works properly, each subsystem should be tested individually and then integrated into the whole system. Each subsystem consists of different modules with different functionalities that should be tested, and testing different functionalities often requires different approaches. Some functionalities require domain knowledge from a human expert, such as the classification of signals in different use cases in Propulsion and Controls (PPC) at Bombardier Transportation. For this reason, we need to simulate the use of expert knowledge in a given domain. We investigate the use of machine learning techniques for these cases, creating a system that automatically classifies different signals using previous human knowledge. This case study was conducted at Bombardier Transportation (BT), Västerås, in the Train Control Management System (TCMS) and Propulsion and Controls (PPC) departments, where data was collected, analyzed and evaluated. We propose a machine-learning-based method for solving the oracle problem for a certain use case. We also explain the steps that can be used to solve the test oracle problem where signals are part of the verdict process.
5

The Application of Machine Learning Techniques in Flight Test Applications

Cooke, Alan, Melia, Thomas, Grayson, Siobhan 11 1900
This paper discusses the use of diagnostics based on machine learning (ML) within a flight test context. The paper begins by discussing some of the problems associated with instrumenting a test aircraft and how they could be ameliorated using ML-based diagnostics. We then describe a number of types of supervised ML algorithms which can be used in this context. In addition, key practical aspects of applying these algorithms, such as feature engineering and parameter selection, are also discussed. The paper then outlines a real-world application developed by Curtiss-Wright, called Machine Learning for Advanced System Diagnostics (MLASD). This description includes key challenges that were encountered during the development process and how suitable input features were identified. Real-world results are also presented. Finally, we suggest some further applications of ML techniques, in addition to describing other areas of development.
6

Bug Prediction with Machine Learning : Bloodhound 0.1

Rehnholm, Gustav, Rysjö, Felix January 2021
Introduction: Bugs in software are a problem that grows over time if they are not dealt with at an early stage, so it is desirable to find bugs as early as possible. Bugs usually correlate with low software quality, which can be measured with different code metrics. The goal of this thesis is to find out whether machine learning can be used to predict bugs using code metric trends. Method: To achieve the thesis goal, a program called Bloodhound was developed that analyses code metric trends to predict bugs using the k-nearest-neighbour machine learning algorithm. The required code metrics are extracted using the program cdbs, which in turn uses the program SonarQube to create the source code metrics. Results: Bloodhound was trained on a time frame of 42 days, from June 1, 2016 to July 13, 2016, containing 202 commits and 312 changed files from the JabRef repository. The files were changed 1.5 times on average. Bloodhound never found more than 25% of the bugs, and its bug predictions were right at most 42% of the time. Conclusion: Bloodhound did not succeed in predicting bugs, most likely because the time frame was too short to generate any significant trends.
7

Banger for the Buck : Predicting Growth of Music Tracks using Machine Learning / En sång för slanten

Nilsson, Elliot, Wensink, Liza January 2022
The advent of music streaming has made it increasingly important for actors in the music industry to understand whether tracks are going to succeed or not. This study investigates whether it is possible to accurately classify the growth of the listener base of a music track based on multivariate time series of listener behavior data. 18 popular time series classification algorithms were used to build predictive models, which were evaluated with 10-fold cross-validation. We also examined the algorithms’ potential to deliver business value for a record label. Lastly, the possibilities and challenges of applying a data-driven business model in the music industry were investigated by performing a comparative analysis of a modern and a traditional record label. Six algorithms were found to significantly outperform the baseline. Two algorithms based on convolutional kernels, RR and AMini, were found to present the biggest business value because of their accuracy and low time complexity. While it may be necessary for record labels to adopt data-driven business models to flourish in the modern market, there are difficulties regarding the competitiveness of digital solutions and complications in moving the focus from networking to developing technology.
8

IMBALANCED TIME SERIES FORECASTING AND NEURAL TIME SERIES CLASSIFICATION

Chen, Xiaoqian 01 August 2023
This dissertation will focus on the forecasting and classification of time series. Specifically, the forecasting problem will focus on imbalanced time series (ITS), which contain a mix of low-probability extreme observations and high-probability normal observations. Two approaches are proposed to improve the forecasting of ITS. In the first approach, proposed in chapter 2, an ITS will be modelled as a composition of normal and extreme observations, the input predictor variables and the associated forecast output will be combined into moving blocks, and the blocks will be categorized as extreme event (EE) or normal event (NE) blocks. Imbalance will be decreased by oversampling the minority EE blocks and undersampling the majority NE blocks using modifications of block bootstrapping and the synthetic minority oversampling technique (SMOTE). Convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) will be selected for forecast modelling. In the second approach, described in chapter 3, which focuses on improving the forecasting accuracy of LSTM models, a training strategy called Circular-Shift Circular Epoch Training (CSET) is proposed to preserve the natural ordering of observations in epochs during training without any attempt to balance the extreme and normal observations. The strategy will be universal because it could be applied to train LSTMs to forecast events in normal time series or in imbalanced time series in exactly the same manner. The CSET strategy will be formulated for both univariate and multivariate time series forecasting. The classification problem will focus on the classification of event-related potential (ERP) neural time series by exploiting information offered by the cone of influence (COI) of the continuous wavelet transform (CWT). The COI is a boundary that is superimposed on the wavelet scalogram to delineate the coefficients that are accurate from those that are inaccurate due to edge effects.
The features derived from the inaccurate coefficients are, therefore, unreliable. It is hypothesized that the classifier performance would improve if unreliable features, which are outside the COI, are zeroed out, and that the performance would improve even further if those features are cropped out completely. Two CNN multidomain models will be introduced to fuse the multichannel Z-scalograms and the V-scalograms. In the first multidomain model, referred to as the Z-CuboidNet, the input to the CNN will be generated by fusing the Z-scalograms of the multichannel ERPs into a frequency-time-spatial cuboid. In the second multidomain model, referred to as the V-MatrixNet, the CNN input will be formed by fusing the frequency-time vectors of the V-scalograms of the multichannel ERPs into a frequency-time-spatial matrix.
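The block-based rebalancing in the first forecasting approach can be sketched as follows. This is a simplified illustration under stated assumptions: blocks are labelled EE when any forecast target reaches a hypothetical `threshold`, and rebalancing uses plain random resampling rather than the dissertation's modified block bootstrapping and SMOTE. All function names are introduced here for illustration.

```python
import random


def make_blocks(series, n_in, n_out):
    """Slide a window over the series, pairing n_in inputs with the next n_out targets."""
    size = n_in + n_out
    return [
        (series[i:i + n_in], series[i + n_in:i + size])
        for i in range(len(series) - size + 1)
    ]


def label_block(block, threshold):
    """Label a block EE if any forecast target is extreme, otherwise NE."""
    _, targets = block
    return "EE" if any(abs(v) >= threshold for v in targets) else "NE"


def rebalance(blocks, threshold, rng):
    """Oversample EE blocks (with replacement) and undersample NE blocks to equal counts."""
    ee = [b for b in blocks if label_block(b, threshold) == "EE"]
    ne = [b for b in blocks if label_block(b, threshold) == "NE"]
    if not ee or not ne:
        return list(blocks)  # nothing to balance
    target = (len(ee) + len(ne)) // 2
    ee_resampled = [rng.choice(ee) for _ in range(target)]
    ne_resampled = rng.sample(ne, min(target, len(ne)))
    return ee_resampled + ne_resampled


series = [0, 0, 0, 0, 0, 9, 0, 0, 0, 0]
blocks = make_blocks(series, n_in=2, n_out=1)
balanced = rebalance(blocks, threshold=5, rng=random.Random(0))
print([label_block(b, 5) for b in balanced])
```

Resampling whole blocks rather than single observations keeps each input window attached to its forecast target, which is the reason for combining them into blocks before balancing.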
9

Time-Series Classification: Technique Development and Empirical Evaluation

Yang, Ching-Ting 31 July 2002
Many interesting applications involve decision prediction based on a time-series sequence or a set of time-series sequences; these are referred to as time-series classification problems. Past classification analysis research predominantly focused on constructing a classification model from training instances whose attributes are atomic and independent. Direct application of traditional classification analysis techniques to time-series classification problems requires transforming time-series data into non-time-series attributes by applying statistical operations (e.g., average or sum). However, such statistical transformation often results in information loss. In this thesis, we propose the Time-Series Classification (TSC) technique, based on the nearest neighbor classification approach. The empirical evaluation showed that the proposed time-series classification technique performed better than the statistical-transformation-based approach.
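The information loss mentioned above can be made concrete with a small sketch: a rising and a falling series share the same average and sum, so summary statistics cannot tell them apart, while a nearest-neighbor classifier working on the raw sequences can. The abstract does not state which distance measure the TSC technique uses; Euclidean distance is assumed here, and all names are illustrative.

```python
import math


def summary_features(series):
    """Statistical transformation: collapse a series to (average, sum)."""
    return (sum(series) / len(series), sum(series))


def euclidean(a, b):
    """Pointwise Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train, query):
    """Return the label of the training series closest to `query`.

    train: list of (series, label) pairs.
    """
    _, label = min(train, key=lambda pair: euclidean(pair[0], query))
    return label


rising = [0.0, 1.0, 2.0, 3.0]
falling = [3.0, 2.0, 1.0, 0.0]
train = [(rising, "up"), (falling, "down")]

# Identical summaries: the ordering information is gone.
print(summary_features(rising) == summary_features(falling))
# The raw sequences are still separable.
print(predict_1nn(train, [0.1, 0.9, 2.2, 2.9]))
```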
10

System Complexity Reduction via Feature Selection

January 2011
abstract: This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. 
It is well known that bias can occur when predictor attributes have different numbers of values. Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages. / Dissertation/Thesis / Ph.D. Industrial Engineering 2011
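The idea of using the time ordering to extract interpretable features can be sketched with simple interval summaries. The abstract does not list which features the time series forest uses; mean, standard deviation and least-squares slope over randomly chosen intervals are assumed here as illustrative summaries, and the function names are not from the dissertation.

```python
import math
import random


def interval_features(series, start, end):
    """Mean, population standard deviation and least-squares slope of series[start:end]."""
    seg = series[start:end]
    n = len(seg)
    if n < 2:
        raise ValueError("interval must contain at least two points")
    mean = sum(seg) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seg) / n)
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * (v - mean) for t, v in enumerate(seg))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return mean, std, num / den


def random_intervals(length, k, rng):
    """Draw k random (start, end) intervals, each covering at least two points."""
    if length < 2:
        raise ValueError("series length must be at least 2")
    intervals = []
    for _ in range(k):
        start = rng.randrange(0, length - 1)
        end = rng.randrange(start + 2, length + 1)
        intervals.append((start, end))
    return intervals


series = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0]
for start, end in random_intervals(len(series), k=3, rng=random.Random(0)):
    print((start, end), interval_features(series, start, end))
```

Each feature refers to a named interval ("slope between steps 0 and 5"), which is what makes this kind of representation easier to interpret than a raw distance to a training series; a tree ensemble would then be trained on these features.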
