681

A Boosted-Window Ensemble

Elahi, Haroon January 2014 (has links)
Context. The problem of obtaining predictions from stream data involves training on the labeled instances and suggesting class values for the unseen stream instances. The nature of data-stream environments makes this task complicated: the large number of instances, the possibility of changes in the data distribution, the presence of noise, and drifting concepts are just some of the factors that add complexity to the problem. Various supervised-learning algorithms have been designed by putting together efficient data-sampling, ensemble-learning, and incremental-learning methods. The performance of such an algorithm depends on the chosen methods, which leaves an opportunity to design new supervised-learning algorithms from different combinations of constituent methods. Objectives. This thesis proposes a fast and accurate supervised-learning algorithm for performing predictions on data streams. The algorithm, called the Boosted-Window Ensemble (BWE), is built using the mixture-of-experts technique. BWE uses a sliding window, Online Boosting, and incremental learning for data-sampling, ensemble-learning, and maintaining a consistent state with the current stream data, respectively. In this regard, a new sliding-window method is introduced: it slides the window over the data stream using partial updates and is called the Partially-Updating Sliding Window (PUSW). An investigation is carried out to compare two variants of the sliding window and three different ensemble-learning methods in order to choose the superior ones. Methods. The thesis uses an experimental approach to evaluate BWE. CPU time and prediction accuracy are used as performance indicators, where CPU time is the execution time in seconds. The benchmark algorithms are Accuracy-Updated Ensemble1 (AUE1), Accuracy-Updated Ensemble2 (AUE2), and Accuracy-Weighted Ensemble (AWE). The experiments use nine synthetic and five real-world datasets to generate performance estimates. The asymptotic Friedman test and the Wilcoxon signed-rank test are used for hypothesis testing, and the Wilcoxon-Nemenyi-McDonald-Thompson test for post-hoc analysis. Results. The hypothesis testing suggests that: 1) on both the synthetic and real-world datasets, BWE has significantly lower CPU time than two of the benchmark algorithms (AUE1 and AWE); 2) BWE attains prediction accuracy similar to AUE1 and AWE on the synthetic datasets; 3) BWE attains prediction accuracy similar to all three benchmark algorithms on the real-world datasets. Conclusions. The experimental results demonstrate that the proposed algorithm can be as accurate as state-of-the-art benchmark algorithms while obtaining predictions from stream data. The results further show that the use of the Partially-Updating Sliding Window yields lower CPU time for BWE compared with the chunk-based sliding window method used in AUE1, AUE2, and AWE.
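The abstract describes PUSW only at a high level. A minimal Python sketch of one plausible reading, in which the window advances in partial blocks rather than one instance at a time so that ensemble updates are triggered less often, might look as follows (the class name, the block-based update rule, and the update fraction are all illustrative assumptions, not the thesis's actual design):

```python
from collections import deque

class PartiallyUpdatingSlidingWindow:
    """Hypothetical sketch of a partially-updating sliding window:
    incoming instances are buffered and committed in blocks, so the
    costly ensemble update fires once per block instead of per instance."""

    def __init__(self, window_size=1000, update_fraction=0.25):
        self.window = deque(maxlen=window_size)
        self.block_size = max(1, int(window_size * update_fraction))
        self.buffer = []

    def add(self, instance):
        self.buffer.append(instance)
        if len(self.buffer) >= self.block_size:
            self.window.extend(self.buffer)  # deque drops the oldest items
            self.buffer.clear()
            return True   # window changed: retrain/update the ensemble now
        return False      # window unchanged: skip the update cost

window = PartiallyUpdatingSlidingWindow(window_size=8, update_fraction=0.5)
for i in range(20):
    if window.add(i):
        print("update at instance", i, "window:", list(window.window))
```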
682

Etude du régime d'écoulement du fluide dans le jeu d'un ensemble piston-cylindre en vue de l'optimisation du calcul du coefficient de déformation. / The study of the fluid flow in the piston-cylinder assembly gap for optimizing the deformation coefficient calculation.

Wongthep, Padipat 07 October 2013 (has links)
Pressure balances are used in the metrology of static pressure. European projects such as Euromet 463 have highlighted systematic discrepancies between experimental measurements and the calculated parameters needed to characterize reference balances. The piston fall rate is one such parameter, and an essential one in the calibration procedure. The aim of this thesis is to refine the methods for estimating the piston fall rate. This improves the characterization of the internal gap of the balance, allows the effective area of the gap to be determined more precisely, and consequently reduces the uncertainty on the pressure distortion coefficient, a key parameter in calibration by comparison. Until now, the model of the fluid flow in the balance was quasi one-dimensional: it treated the gap between the piston and the cylinder as formed by two parallel walls, which is an approximation. In this study, the equations of the fluid flow are modified to evaluate the influence of the model in an annular gap, and corrections due to the piston fall velocity are also taken into account. The structural deformations are computed using the finite element method. The experimental work concerns the 50, 200 and 1000 MPa piston-cylinder units of the Laboratoire National de Métrologie et d'Essais (LNE). A comparison between calculation and experiment is carried out, taking into account the variability of parameters such as the geometry and the fluid properties.
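For orientation, the contrast between the parallel-wall approximation and the annular-gap model can be illustrated with the standard lubrication-theory leak-flow formulas below; these are textbook results quoted for context, in our own notation, not equations taken from the thesis:

```latex
% Pressure-driven leak flow through a piston-cylinder gap of width h,
% engagement length L, fluid viscosity \mu, pressure difference \Delta p.
% Parallel-wall approximation (gap unrolled into a slot of width 2\pi R):
\[
  Q_{\text{parallel}} = \frac{2\pi R\, h^{3}}{12\,\mu L}\,\Delta p
\]
% Full concentric-annulus solution, piston radius R_1, cylinder bore R_2:
\[
  Q_{\text{annulus}} = \frac{\pi\,\Delta p}{8\,\mu L}
  \left[\, R_2^{4} - R_1^{4}
        - \frac{\bigl(R_2^{2}-R_1^{2}\bigr)^{2}}{\ln(R_2/R_1)} \right]
\]
```

The piston fall rate follows from the leak flow divided by the piston cross-section, which is why the choice of gap model feeds directly into the fall-rate estimate.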
683

Sur les barrières des systèmes non linéaires sous contraintes avec une application aux systèmes hybrides / On Barriers in Constrained Nonlinear Systems with an Application to Hybrid Systems

Esterhuizen, Willem 18 December 2015 (has links)
This thesis deals with the theory of barriers in input- and state-constrained nonlinear systems. Our main contribution is a generalisation to the case where the constraints are mixed, that is, where they depend on both the input and the state in a coupled way. Constraints of this type often appear in applications, as well as in constrained differentially flat systems. We prove a minimum-like principle that allows the construction of the barrier and the associated admissible set. Moreover, in case of intersection of some of the trajectories involved in this principle, we prove that such transversal intersection points are stopping points of the barrier. We demonstrate the utility of these theoretical contributions by finding the admissible set for an inverted pendulum on a cart with a non-rigid cable, the constraint being that the cable remains taut. This problem corresponds to the determination of potentially safe sets in hybrid systems.
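For readers unfamiliar with the setting, the objects named in the abstract can be stated compactly; the notation below is a standard formulation chosen for illustration and may differ from the thesis's precise definitions:

```latex
% Nonlinear system with mixed input-state constraints:
\[
  \dot{x} = f(x,u), \qquad u(t) \in U, \qquad
  g_i\bigl(x(t),u(t)\bigr) \le 0, \quad i = 1,\dots,p .
\]
% Admissible set: initial states from which some admissible input keeps
% every mixed constraint satisfied for all future time:
\[
  \mathcal{A} = \bigl\{\, x_0 \;:\; \exists\, u(\cdot)\ \text{admissible s.t. }
  g_i\bigl(x^{(x_0,u)}(t), u(t)\bigr) \le 0 \ \ \forall t \ge 0,\ \forall i \,\bigr\}.
\]
```

Roughly, the barrier is the part of the boundary of \(\mathcal{A}\) lying in the interior of the constraint set, and the minimum-like principle characterises the trajectories that generate it.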
684

The Sneetches

Schneider, Gregory Alan 12 1900 (has links)
The Sneetches is a theater piece for children based on the Dr. Seuss story The Sneetches (Random House, New York, 1961). It is scored for narrator, flute, B♭ clarinet, bassoon, violins I & II, viola, and cello, with optional staging. The staged version of The Sneetches requires two to six actors/dancers, appropriate scenery and props, and the active participation of children from the audience, preferably ages eight or under. The Sneetches is essentially through-composed. The overall form of the music is shaped primarily by the events portrayed in the narrative. Although individual subsections may have traditional forms, they should not be viewed as independent movements of a larger work, but rather as fragments of a whole.
685

An investigation of ensemble methods to improve the bias and/or variance of option pricing models based on Lévy processes

Steinki, Oliver January 2015 (has links)
This thesis introduces a novel theoretical option pricing ensemble framework to improve the bias and variance of option pricing models, especially those based on Lévy processes. In particular, we present a completely new, yet very general, theoretical framework to calibrate and combine several option pricing models using ensemble methods. This framework has four main steps: the general option pricing task, ensemble generation, ensemble pruning, and ensemble integration. Its modularity allows for a flexible implementation in terms of asset classes, base models, pricing techniques, and ensemble architecture.
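Since the four steps are only named, here is a toy Python sketch of one way such a pipeline could be instantiated; the stand-in base pricers, the squared-error pruning criterion, and the equal-weight integration are all illustrative assumptions rather than the framework's prescribed choices:

```python
import numpy as np

def ensemble_price(pricers, market_prices, option, keep=0.5):
    """Toy generation -> pruning -> integration pipeline.
    `pricers`: list of callables option -> price (the generated ensemble);
    `market_prices`: list of (option, observed price) calibration pairs."""
    # Pruning: rank base pricers by calibration error, keep the best ones.
    def calib_error(p):
        return np.mean([(p(o) - mp) ** 2 for o, mp in market_prices])
    ranked = sorted(pricers, key=calib_error)
    kept = ranked[: max(1, int(len(ranked) * keep))]
    # Integration: simple average of the surviving pricers' quotes.
    return np.mean([p(option) for p in kept])

# Illustrative base pricers (stand-ins for calibrated Levy-process models):
pricers = [lambda o, b=b: o["strike"] * b for b in (0.95, 1.00, 1.05, 1.30)]
calibration = [({"strike": 100.0}, 100.0)]
print(ensemble_price(pricers, calibration, {"strike": 100.0}))
```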
686

Ensemble Learning Method on Machine Maintenance Data

Zhao, Xiaochuang 05 November 2015 (has links)
In industry, many companies are facing the explosion of big data. With this much information stored, companies want to make sense of the data and use it for better decision making, especially for prediction. Considerable money can be saved and substantial revenue generated with the power of big data. When building statistical learning models for prediction, companies aim for models that are both efficient and highly accurate. After a learning model has been deployed to production, new data keep arriving, and the model has to be updated along with the data. Because of this, the model that performs best today will not necessarily perform as well tomorrow, which makes it hard to decide which algorithm should be used to build the learning model. This paper introduces a method that ensembles the information generated by two different classification algorithms as inputs to another learning model, in order to increase the final prediction power. The dataset used in this paper is NASA's Turbofan Engine Degradation data. There are 49 numeric features (X), and the response Y is binary, with 0 indicating that the engine is working properly and 1 indicating engine failure. The model's purpose is to predict whether the engine is going to pass or fail. The dataset is divided into a training set and a testing set. First, the training set is used twice, to build a support vector machine (SVM) model and a neural network model. Second, the trained SVM and neural network models take X of the training set as input to predict Y1 and Y2. Then Y1 and Y2 serve as inputs to build the penalized logistic regression model, which is the ensemble model here. Finally, the same steps are followed on the testing set to obtain the final prediction result. Model accuracy is measured as overall classification accuracy. The results show that the ensemble model has 92% accuracy. The prediction accuracies of the SVM, neural network, and ensemble models are compared to show that the ensemble model successfully captures the power of the two individual learning models.
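The abstract spells the pipeline out step by step, so a compact scikit-learn sketch is easy to give; synthetic data stands in for the NASA turbofan set, and all hyperparameters are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the 49-feature turbofan degradation data.
X, y = make_classification(n_samples=2000, n_features=49, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: train the two base learners on the training set.
svm = SVC(random_state=0).fit(X_tr, y_tr)
nn = MLPClassifier(max_iter=500, random_state=0).fit(X_tr, y_tr)

# Step 2: their predictions (Y1, Y2) become the meta-features.
meta_tr = np.column_stack([svm.predict(X_tr), nn.predict(X_tr)])
meta_te = np.column_stack([svm.predict(X_te), nn.predict(X_te)])

# Step 3: the penalized (L2) logistic regression is the ensemble model.
ensemble = LogisticRegression(penalty="l2", C=1.0).fit(meta_tr, y_tr)
print("ensemble accuracy:", ensemble.score(meta_te, y_te))
```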
687

Dynamic Committees for Handling Concept Drift in Databases (DCCD)

AlShammeri, Mohammed January 2012 (has links)
Concept drift refers to the problem caused by a change in the data distribution during data mining. It reduces the accuracy of the current model used to capture the underlying distribution of the concept to be discovered. A number of techniques have been introduced to address this issue in a supervised learning (or classification) setting, where the target concept (or class) to be learned is known. One of these techniques is ensemble learning, which uses multiple trained classifiers to obtain better predictions through some voting scheme. In a traditional ensemble, the underlying base classifiers are all of the same type. Recent research extends the idea of ensemble learning to committees, where a committee consists of diverse classifiers; this is the main difference between regular ensemble classifiers and committee learning algorithms. Committees are able to use diverse learning methods simultaneously and dynamically take advantage of the most accurate classifiers as the data change. In addition, some committees are able to replace members that perform poorly. This thesis presents two new algorithms that address concept drift. The first algorithm is designed to systematically introduce gradual and sudden concept drift scenarios into datasets. To save time and avoid excessive memory consumption, the Concept Drift Introducer (CDI) algorithm divides the drift scenarios into phases. The main advantage of using phases is that it allows us to produce a highly scalable concept drift detector that evaluates each phase instead of each individual drift scenario. We further designed a novel algorithm to handle concept drift. Our Dynamic Committee for Concept Drift (DCCD) algorithm uses a voted committee of hypotheses that vote on the best base classifier, based on predictive accuracy. The novelty of DCCD lies in the fact that we employ diverse heterogeneous classifiers in one committee in an attempt to maximize diversity. DCCD detects concept drift by using accuracy and weighs committee members by awarding one point to the most accurate member. The total loss in accuracy of each member is calculated at the end of each point of measurement, or phase, and the members' performance is evaluated to decide whether a member needs to be replaced. Moreover, DCCD detects the worst member in the committee and eliminates it using a weighting mechanism. Our experimental evaluation centers on the performance of DCCD on datasets of different sizes with different levels of gradual and sudden concept drift, and compares our algorithm to a state-of-the-art algorithm, the MultiScheme approach. The experiments indicate the effectiveness of our DCCD method under a number of diverse circumstances. The DCCD algorithm generally produces strong results, especially when the number of concept drifts in a dataset is large. Regarding dataset size, our results show that DCCD yields a steady improvement in performance on small datasets, while on medium and large datasets it performs comparably to, and often slightly better than, the MultiScheme technique. The experimental results also show that DCCD limits the loss in accuracy over time, regardless of dataset size.
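A loose Python sketch of the committee mechanics described above (heterogeneous members, one point to the most accurate member per phase, replacement of the worst member) might look as follows; the scoring and replacement details are our guesses, not the thesis's exact rules:

```python
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

class DynamicCommittee:
    """Sketch of a DCCD-style committee: heterogeneous members,
    per-phase scoring, and replacement of the weakest member."""

    def __init__(self, members):
        self.members = [clone(m) for m in members]
        self.points = [0] * len(self.members)

    def train_phase(self, X, y):
        for m in self.members:
            m.fit(X, y)

    def evaluate_phase(self, X, y):
        accs = [m.score(X, y) for m in self.members]
        # Award one point to the most accurate member, as in the abstract.
        self.points[accs.index(max(accs))] += 1
        return accs

    def replace_worst(self, fresh_member, X, y):
        accs = [m.score(X, y) for m in self.members]
        worst = accs.index(min(accs))
        self.members[worst] = clone(fresh_member).fit(X, y)
        self.points[worst] = 0

committee = DynamicCommittee(
    [DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier()])
```

A driver loop would call train_phase and evaluate_phase once per stream phase and invoke replace_worst when a member's accumulated accuracy loss crosses a threshold.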
688

Intelligent Adaptation of Ensemble Size in Data Streams Using Online Bagging

Olorunnimbe, Muhammed January 2015 (has links)
In this era of the Internet of Things and Big Data, a proliferation of connected devices continuously produces massive amounts of fast-evolving streaming data. There is a need to study the relationships in such streams for analytic applications such as network intrusion detection, fraud detection, and financial forecasting, amongst others. In this setting, it is crucial to create data mining algorithms that seamlessly adapt to temporal changes in data characteristics, called concept drifts, that occur in data streams. The resulting models should not only be highly accurate and able to swiftly adapt to changes; the data mining techniques should also be fast, scalable, and efficient in terms of resource allocation. It then becomes important to consider issues such as storage needs and memory utilization, especially when we aim to build personalized, near-instant models in a Big Data setting. This research focuses on mining a data stream with concept drift using an online bagging method, with consideration for memory utilization, taking an adaptive approach to resource allocation during the mining process. Specifically, we consider metalearning, in which the models of multiple classifiers are combined into an ensemble; this approach has been very successful for building accurate models against data streams, yet little work has been done to explore the interplay between accuracy, efficiency, and utility, and this research addresses that gap. We introduce an adaptive metalearning algorithm that takes advantage of the memory utilization cost of concept drift in order to vary the ensemble size during the data mining process, aiming to minimize memory usage while maintaining highly accurate models with high utility. We evaluated our method against a number of benchmark datasets and compared our results against the state of the art. Return on Investment (ROI) was used to evaluate the gain in performance in terms of accuracy relative to the time and memory invested. We aimed to achieve a high ROI without compromising the accuracy of the result, and our experimental results indicate that we achieved this goal.
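As a concrete illustration of varying the ensemble size during online bagging, here is a small Python sketch; the Poisson(1) instance weighting is the standard online-bagging device, while the accuracy thresholds driving the resizing are stand-ins for the thesis's ROI-driven adaptation:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

class AdaptiveOnlineBagging:
    """Online bagging (Oza-style Poisson(1) weighting) whose ensemble
    size grows under suspected drift and shrinks when accuracy is high,
    trading memory for accuracy. Resize thresholds are illustrative."""

    def __init__(self, classes, min_size=3, max_size=15):
        self.classes = classes
        self.min_size, self.max_size = min_size, max_size
        self.models = [GaussianNB() for _ in range(min_size)]
        self.recent = []  # rolling record of per-instance correctness

    def partial_fit(self, x, y, rng):
        x = x.reshape(1, -1)
        # Prequential evaluation: test on the instance before training.
        self.recent.append(int(self.predict(x) == y))
        self.recent = self.recent[-200:]
        for m in self.models:
            for _ in range(rng.poisson(1)):  # online bootstrap weight
                m.partial_fit(x, [y], classes=self.classes)
        self._resize()

    def predict(self, x):
        votes = [m.predict(x)[0] for m in self.models
                 if hasattr(m, "classes_")]
        return max(set(votes), key=votes.count) if votes else self.classes[0]

    def _resize(self):
        if len(self.recent) < 200:
            return
        acc = np.mean(self.recent)
        if acc < 0.7 and len(self.models) < self.max_size:
            self.models.append(GaussianNB())   # grow: possible drift
        elif acc > 0.9 and len(self.models) > self.min_size:
            self.models.pop()                  # shrink: save memory

rng = np.random.default_rng(0)
bag = AdaptiveOnlineBagging(classes=[0, 1])
for _ in range(500):
    x = rng.normal(size=2)
    bag.partial_fit(x, int(x.sum() > 0), rng)
print("models:", len(bag.models), "recent acc:", np.mean(bag.recent))
```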
689

An Ensemble Method for Large Scale Machine Learning with Hadoop MapReduce

Liu, Xuan January 2014 (has links)
We propose a new ensemble algorithm, the meta-boosting algorithm, which enables the original AdaBoost algorithm to improve the decisions made by different weak learners using the meta-learning approach. Better accuracy is achieved because the algorithm reduces both bias and variance. However, higher accuracy also brings higher computational complexity, especially on big data. We therefore propose a parallelized meta-boosting algorithm, Parallelized-Meta-Learning (PML), using the MapReduce programming paradigm on Hadoop. Experimental results on the Amazon EC2 cloud computing infrastructure show that PML reduces the computational cost enormously while retaining lower error rates than the results on a single computer. Since MapReduce has the inherent weakness that it cannot directly support iterations in an algorithm, our approach is a win-win method: it not only overcomes this weakness, but also secures good accuracy. A comparison between this approach and the contemporary algorithm AdaBoost.PL is also performed.
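Without assuming any particular Hadoop API, the map/reduce structure of such an approach can be simulated in plain Python; the threshold-stump "weak model" and the accuracy-weighted merge below are deliberately toy stand-ins for PML's actual boosting and combination steps:

```python
# Each mapper trains on its own data split; the reducer merges the
# per-split models into one weighted ensemble (illustrative merge rule).

def mapper(split):
    """Train a toy model on one split: a threshold stump plus its
    training accuracy, used later as the combination weight."""
    threshold = sum(x for x, _ in split) / len(split)
    acc = sum((x > threshold) == y for x, y in split) / len(split)
    return [("model", (threshold, acc))]

def reducer(key_values):
    """Combine mapper outputs into an accuracy-weighted predictor."""
    models = [v for _, v in key_values]
    def predict(x):
        score = sum(acc * (1 if x > thr else -1) for thr, acc in models)
        return score > 0
    return predict

splits = [[(1.0, False), (3.0, True)], [(0.5, False), (2.5, True)]]
key_values = [kv for s in splits for kv in mapper(s)]  # map phase
predict = reducer(key_values)                           # reduce phase
print(predict(2.8), predict(0.2))
```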
690

Inner Ensembles: Using Ensemble Methods in Learning Step

Abbasian, Houman January 2014 (has links)
A pivotal moment in machine learning research was the creation of an important new research area known as ensemble learning. In this work, we argue that ensembles are a very general concept, and though they have been widely used, they can be applied in more situations than they have been to date. Rather than using them only to combine the output of an algorithm, we can apply them to decisions made inside the algorithm itself, during the learning step. We call this approach Inner Ensembles. The motivation for developing Inner Ensembles was the opportunity to produce models with advantages similar to those of regular ensembles (accuracy and stability, for example), plus additional ones such as comprehensibility, simplicity, rapid classification, and a small memory footprint. The main contribution of this work is to demonstrate how broadly this idea can be applied, and to highlight its potential impact on all types of algorithms. To support our claim, we first provide a general guideline for applying Inner Ensembles to different algorithms. Then, using this framework, we apply them to two categories of learning methods, supervised and unsupervised: for the former we chose Bayesian networks, and for the latter K-Means clustering. Our results show that 1) the overall performance of Inner Ensembles is significantly better than that of the original methods, and 2) Inner Ensembles provide performance improvements similar to those of regular ensembles.
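To make the idea concrete for the unsupervised case, here is a sketch of K-Means in which every centroid update is itself an ensemble average over bootstrap resamples; this is our reading of "ensembles inside the learning step", not the thesis's implementation:

```python
import numpy as np

def inner_ensemble_kmeans(X, k, ensemble_size=5, iters=20, seed=0):
    """K-Means where each update step averages several centroid
    estimates computed on bootstrap resamples of the data, instead
    of committing to a single estimate per iteration."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        estimates = []
        for _ in range(ensemble_size):
            sample = X[rng.choice(len(X), len(X), replace=True)]  # bootstrap
            labels = np.argmin(
                np.linalg.norm(sample[:, None] - centroids, axis=2), axis=1)
            estimates.append(np.array([
                sample[labels == j].mean(axis=0) if np.any(labels == j)
                else centroids[j] for j in range(k)]))
        centroids = np.mean(estimates, axis=0)  # inner-ensemble combination
    return centroids

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
print(inner_ensemble_kmeans(X, k=2))
```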
