Global ETD Search

381	Learning Robust Support Vector Machine Classifiers With Uncertain Observations Bhadra, Sahely 03 1900 The central theme of the thesis is to study linear and nonlinear SVM formulations in the presence of uncertain observations. The main contribution of this thesis is to derive robust classifiers from partial knowledge of the underlying uncertainty. In the case of linear classification, a new bounding scheme based on the Bernstein inequality is proposed, which models interval-valued uncertainty in a less conservative fashion and hence is expected to generalize better than existing methods. Next, the potential of partial information, such as bounds on second-order moments along with support information, is explored. Bounds on second-order moments make the resulting classifiers robust to moment estimation errors. Uncertainty in the dataset leads to uncertainty in the kernel matrices. A novel distribution-free large deviation inequality is proposed, which handles uncertainty in kernels through co-positive programming in a chance-constraint setting. Although such formulations are NP-hard, in several cases of interest the problem reduces to a convex program. However, the independence assumption mentioned above is restrictive and may not always define a valid uncertain kernel. To alleviate this problem, an affine-set-based alternative is proposed, and using a robust optimization framework the resulting problem is posed as a minimax problem. In both cases, Chance Constraint Program or Robust Optimization (for nonlinear SVM), mirror descent algorithm (MDA)-like procedures have been applied. Support Vector Machines Machine Learning Robust Classifiers Robust Optimization Kernel Matrices - Uncertainty Chance Constraint Programming Interval-Valued Uncertainty Vector Machine Classifiers Kernel Matrix Robust Classification Robust Formulations Knowledge Uncertainty Mirror Descent Algorithm (MDA) Computer Science
382	Prediktor vlivu aminokyselinových substitucí na stabilitu proteinů / Predictor of the Effect of Amino Acid Substitutions on Protein Stability Flax, Michal January 2017 This work deals with predicting the effect of amino acid mutations on protein stability. The prediction is based on several machine learning methods. Mutations are classified as either increasing or decreasing protein stability. The application also predicts the magnitude of the change in Gibbs free energy caused by the mutation.
383	Rozpoznání displeje embedded zařízení / Embedded display recognition Novotný, Václav January 2018 This master's thesis deals with the use of machine learning methods in computer vision for classifying unknown images. The first part surveys the available machine learning methods, their limitations, and their suitability for this task. The second part describes how the training and testing galleries were created. In the practical part, a solution to the problem is proposed and then implemented. The resulting system is tested and evaluated.
384	Detekce, lokalizace a rozpoznání dopravních značek / Detection, Localization and Recognition of Traffic Signs Svoboda, Tomáš January 2011 This master's thesis deals with the localization, detection and recognition of traffic signs. It analyses ways of selecting image areas in which traffic signs may occur. It then describes the properties of different kinds of features used for traffic sign recognition, focusing on features based on the histogram of oriented gradients. Several possible classifiers are discussed, primarily the cascade of support vector machines, which is used in the resulting system. The thesis also describes the system implementation and the data sets for 5 types of traffic signs. Many experiments were carried out with the created system, and their results are very good. New datasets were acquired from approximately 9 hours of processed video sequences. There are about 13 500 images in these datasets.
385	Automatic Flight Maneuver Identification Using Machine Learning Methods Bodin, Camilla January 2020 This thesis proposes a general approach to solve the offline flight-maneuver identification problem using machine learning methods. The purpose of the study was to provide means for the aircraft professionals at the flight test and verification department of Saab Aeronautics to automate the procedure of analyzing flight test data. The suggested approach succeeded in generating binary classifiers and multiclass classifiers that identified six flight maneuvers of different complexity from real flight test data. The binary classifiers solved the problem of identifying one maneuver from flight test data at a time, while the multiclass classifiers solved the problem of identifying several maneuvers from flight test data simultaneously. To achieve these results, the difficulties that this time series classification problem entailed were simplified by using different strategies. One strategy was to develop a maneuver extraction algorithm that used handcrafted rules. Another strategy was to represent the time series data by statistical measures. There was also an issue of an imbalanced dataset, where one class far outweighed the others in number of samples. This was solved by using a modified oversampling method on the dataset that was used for training. Logistic Regression, Support Vector Machines with both linear and nonlinear kernels, and Artificial Neural Networks were explored, where the hyperparameters for each machine learning algorithm were chosen during model estimation by 4-fold cross-validation and by solving an optimization problem based on important performance metrics. A feature selection algorithm was also used during model estimation to evaluate how the performance changes depending on how many features were used. The machine learning models were then evaluated on test data consisting of 24 flight tests.
The results given by the test data set showed that the simplifications done were reasonable, but the maneuver extraction algorithm could sometimes fail. Some maneuvers were easier to identify than others and the linear machine learning models resulted in a poor fit to the more complex classes. In conclusion, both binary classifiers and multiclass classifiers could be used to solve the flight maneuver identification problem, and solving a hyperparameter optimization problem boosted the performance of the finalized models. Nonlinear classifiers performed the best on average across all explored maneuvers. Flight Aircraft Machine Learning Flight Dynamics Classification Supervised Learning Support Vector Machines Neural Networks Logistic Regression Feature Selection Recursive Feature Elimination Feature Representation k-fold cross-validation maneuvers flight maneuvers Control Engineering Reglerteknik
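The class-imbalance step mentioned in this abstract can be illustrated with a minimal sketch. It shows plain random oversampling, not the thesis's modified method, whose details the abstract does not give. The function name `random_oversample`, the class labels, and the toy feature values are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw (with replacement) as many extra samples as the class is missing.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: one statistical feature per extracted segment.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [2.0]]
y = ["level_flight"] * 5 + ["roll"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling of this kind is normally applied only to the training split, as the abstract describes, so that the test data keep their original class proportions.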
386	Aplicación de técnicas de análisis de regresión y aprendizaje automático para la estimación de sobre dilución en el método de Sub Level Stoping - Compañía Minera Condestable / Application of regression analysis and machine learning techniques for the estimation of over dilution in the Sub Level Stoping method - Compania Minera Condestable Penadillo Palomino, Cristina Tessa 20 March 2021 This research work aims to apply regression analysis and machine learning (ML) techniques to improve the estimation of over dilution in stopes mined by the Sub Level Stoping (SLS) method at Compania Minera Condestable (CMC), through the generation of regression equations and Python code for the ML techniques. To estimate over dilution, the reconciliations of stopes mined with the SLS method in the period 2017-2019 were analysed with the following techniques: Multiple Linear Regression Analysis (MLRA), Multiple Non-linear Regression Analysis (MNLRA), and machine learning methods such as Support Vector Machines (SVM) and Random Forests (RF). This allowed the results to be compared, at the predictive and technological level, with the O'Hara methodology currently applied at CMC for estimating over dilution of SLS stopes. The techniques used the following operational variables of the evaluated stopes: level, dip, density, burden, spacing, height, length, width, RQD, RMR and the tonne per metre of drilling (TMP) ratio, while the objective or dependent variable was over dilution. The analysis first showed that the MLRA and MNLRA regression techniques improved O'Hara's coefficient of determination R2 by 5.5% and 4.4% respectively. Then, with the machine learning tools, SVM and RF improved it by 0.3% and 18.5% respectively. The result was a reduction of the estimated cost difference obtained with the O'Hara methodology, related to the additional cost of loading and hauling broken dilution material. Thesis Multiple Linear Regression Analysis Support Vector Machines Random forests Underground mining
387	An Approach for Incremental Semi-supervised SVM Emara, Wael, Kantardzic, Mehmed, Karnstedt, Marcel, Sattler, Kai-Uwe, Habich, Dirk, Lehner, Wolfgang 11 May 2022 In this paper we propose an approach for incremental learning of semi-supervised SVM. The proposed approach makes use of the locality of radial basis function kernels to do local and incremental training of semi-supervised support vector machines. The algorithm introduces a sequential minimal optimization based implementation of the branch and bound technique for training semi-supervised SVM problems. The novelty of our approach lies in the introduction of incremental learning techniques to semi-supervised SVMs. info:eu-repo/classification/ddc/005 ddc:005
388	Machine Learning for Exploring State Space Structure in Genetic Regulatory Networks Thomas, Rodney H. 01 January 2018 Genetic regulatory networks (GRN) offer a useful model for clinical biology. Specifically, such networks capture interactions among genes, proteins, and other metabolic factors. Unfortunately, it is difficult to understand and predict the behavior of networks that are of realistic size and complexity. In this dissertation, behavior refers to the trajectory of a state, through a series of state transitions over time, to an attractor in the network. This project assumes asynchronous Boolean networks, implying that a state may transition to more than one attractor. The goal of this project is to efficiently identify a network's set of attractors and to predict the likelihood with which an arbitrary state leads to each of the network’s attractors. These probabilities will be represented using a fuzzy membership vector. Predicting fuzzy membership vectors using machine learning techniques may address the intractability posed by networks of realistic size and complexity. Modeling and simulation can be used to provide the necessary training sets for machine learning methods to predict fuzzy membership vectors. The experiments comprise several GRNs, each represented by a set of output classes. These classes consist of thresholds τ and ¬τ, where τ = [τ_low, τ_high]; state s belongs to class τ if the probability of its transitioning to attractor A lies in the range [τ_low, τ_high]; otherwise it belongs to class ¬τ. Finally, each machine learning classifier was trained with the training sets that were previously collected. The objective is to explore methods to discover patterns for meaningful classification of states in realistically complex regulatory networks. The research design took a GRN and a machine learning method as input and produced output class <Aτ> and its negation ¬<Aτ>.
For each GRN, attractors were identified, data was collected by sampling each state to create fuzzy membership vectors, and machine learning methods were trained to predict whether a state is in a healthy attractor or not. For T-LGL, SVMs had the highest accuracy in predictions (between 93.6% and 96.9%) and precision (between 94.59% and 97.87%). However, naive Bayesian classifiers had the highest recall (between 94.71% and 97.78%). This study showed that all experiments are extremely significant, with p-value < 0.0001. The contribution of this research is to help clinical biologists submit genetic states and get an initial result on their outcomes. For future work, this implementation could use other machine learning classifiers such as XGBoost or deep learning methods. Another suggestion is to develop methods that improve the performance of state transitions, allowing larger training sets to be sampled. asynchronous Boolean networks attractors Boolean networks cross-validation decision trees fuzzy basins fuzzy membership vectors fuzzy vectors genetic regulatory networks Markov Chain Monte Carlo naïve Bayesian classifiers support vector machines Computer Sciences
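The idea of a fuzzy membership vector in this abstract can be made concrete with a small sketch, under stated assumptions. The two-gene toggle-switch network, the function names, and the sample count are hypothetical and are not taken from the dissertation. Only fixed-point attractors are handled. Each asynchronous step updates one randomly chosen gene, and the membership vector is estimated as the fraction of simulated runs ending in each attractor.

```python
import random

# Hypothetical two-gene toggle switch: each gene represses the other.
RULES = (
    lambda s: int(not s[1]),  # gene 0 is on when gene 1 is off
    lambda s: int(not s[0]),  # gene 1 is on when gene 0 is off
)


def is_fixed_point(state):
    """A state is a fixed point if no gene's rule would change its value."""
    return all(rule(state) == state[i] for i, rule in enumerate(RULES))


def run_to_attractor(state, rng, max_steps=1000):
    """Asynchronously update one randomly chosen gene per step
    until a fixed point is reached."""
    for _ in range(max_steps):
        if is_fixed_point(state):
            return state
        i = rng.randrange(len(RULES))
        state = state[:i] + (RULES[i](state),) + state[i + 1:]
    return None  # cyclic attractors are not handled in this sketch


def fuzzy_membership(state, attractors, samples=2000, seed=0):
    """Estimate, by repeated simulation, the probability that `state`
    ends in each attractor."""
    rng = random.Random(seed)
    counts = {a: 0 for a in attractors}
    for _ in range(samples):
        end = run_to_attractor(state, rng)
        if end in counts:
            counts[end] += 1
    return [counts[a] / samples for a in attractors]


ATTRACTORS = [(1, 0), (0, 1)]
print(fuzzy_membership((1, 0), ATTRACTORS))  # already in the first attractor
print(fuzzy_membership((0, 0), ATTRACTORS))  # can reach either attractor
```

In this toy network the state (0, 0) reaches either attractor depending on which gene happens to update first, so its membership vector is split between them. This is the sense in which one state may transition to more than one attractor under asynchronous updating.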
389	Optimizing hydropathy scale to improve IDP prediction and characterizing IDPs' functions Huang, Fei January 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Intrinsically disordered proteins (IDPs) are flexible proteins without defined 3D structures. Studies show that IDPs are abundant in nature and actively involved in numerous biological processes. Two crucial subjects in the study of IDPs are analyzing IDPs’ functions and identifying them. We thus carried out three projects to better understand IDPs. In the 1st project, we propose a method that separates IDPs into different function groups. We used the CH-CDF plot approach, which is based on the combined use of two predictors and subclassifies proteins into 4 groups: structured, mixed, disordered, and rare. Studies show different structural biases for each group. The mixed class has more order-promoting residues and more ordered regions than the disordered class. In addition, the disordered class is highly active in mitosis-related processes, among others. Meanwhile, the mixed class is highly associated with signaling pathways, where having both ordered and disordered regions could possibly be important. The 2nd project is about identifying whether an unknown protein is entirely disordered. One of the earliest predictors for this purpose, the charge-hydropathy plot (C-H plot), exploited the charge and hydropathy features of the protein. Not only is this algorithm simple yet powerful, but its input parameters, charge and hydropathy, are also informative and readily interpretable. We found that using different hydropathy scales significantly affects the prediction accuracy. Therefore, we sought to identify a new hydropathy scale that optimizes the prediction. This new scale achieves an accuracy of 91%, a significant improvement over the original 79%. In our 3rd project, we developed a per-residue C-H IDP predictor, in which three hydropathy scales are optimized individually.
This is to account for the amino acid composition differences in three regions of a protein sequence (N, C terminus and internal). We then combined them into a single per-residue predictor that achieves an accuracy of 74% for per-residue predictions for proteins containing long IDP regions. Intrinsically disordered proteins Support vector machine Clustering Proteins -- Conformation -- Research Proteins -- Denaturation Protein folding -- Research Support vector machines Aggregation (Chemistry) Amino acids -- Analysis Cellular signal transduction Molecular biology -- Mathematics Algorithms
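The charge-hydropathy plot mentioned in this abstract can be sketched as follows. This is an illustration of the classical C-H classifier, not the optimized scale developed in the thesis. It assumes the Kyte-Doolittle hydropathy scale rescaled to [0, 1] and the commonly cited boundary line <R> = 2.785 <H> - 1.151 between disordered and ordered proteins. The function names and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed here as the baseline scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Side-chain charges at neutral pH (histidine treated as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def mean_hydropathy(seq):
    """Mean hydropathy with each residue value rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute net charge divided by sequence length."""
    return abs(sum(CHARGE.get(aa, 0) for aa in seq)) / len(seq)


def predicted_disordered(seq):
    """True if the protein lies above the assumed boundary line in the C-H plot."""
    boundary = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq) > boundary


# Hypothetical toy sequences: one highly charged, one highly hydrophobic.
print(predicted_disordered("EEEEEEEEEE"))
print(predicted_disordered("LLLLIIIIVV"))
```

Replacing the values in `KYTE_DOOLITTLE` with another scale changes only the hydropathy axis, which is why the choice of scale can shift proteins across the boundary and affect prediction accuracy.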
390	Získávání znalostí z objektově relačních databází / Knowledge Discovery in Object Relational Databases Chytka, Karel Unknown Date The goal of this master's thesis is to introduce the problem of knowledge discovery and the classification of object-relational data. It summarizes the problems connected with mining spatio-temporal data. The SVM kernel algorithm for data mining is described. The second part covers the implementation of the classification method, which is applied to data mining in the Caretaker trajectory database. The thesis also includes the implementation of an application for preprocessing spatio-temporal data, organizing them in a database, and presenting them.
