Global ETD Search

141	Vyhledávání obrazu na základě podobnosti / Image search using similarity measures Harvánek, Martin January 2014 Four methods based on low-level image features are implemented: circular sectors, color moments, color coherence vector, and Gabor filters. The methods were evaluated after their optimal parameters had been found. Optimal parameters were determined in RapidMiner by measuring the classification accuracy of learning operators with the cross-validation operator on the images. The implemented methods are evaluated by total average precision on the following image categories: ancient, beach, bus, dinosaur, elephant, flower, food, horse, mountain, and natives. A modification of the original circular sectors method (HSB color space plus the median as the statistical function) increases classification accuracy by 8%. A weighted combination of color moments, circular sectors, and Gabor filters gives the best total average precision of all implemented methods, 70.48%.
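As a rough illustration of one of the low-level features named in entry 141, the sketch below computes color moments (mean, standard deviation, and the cube root of the third central moment) per channel. It is not the thesis implementation: the function name `color_moments`, the pixel representation as a list of 3-tuples, and the choice of exactly three moments are assumptions made for this example.

```python
from math import copysign
from statistics import fmean


def color_moments(pixels):
    """Return [mean, std, skew] for each channel of a list of 3-tuples.

    `pixels` is a hypothetical flat list of (c1, c2, c3) values, e.g. HSB
    or RGB components. The result is a 9-element feature vector.
    """
    features = []
    for channel in zip(*pixels):
        mu = fmean(channel)
        second = fmean([(v - mu) ** 2 for v in channel])
        third = fmean([(v - mu) ** 3 for v in channel])
        std = second ** 0.5
        # Signed cube root so that negative skew stays negative.
        skew = copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mu, std, skew])
    return features


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Two images could then be compared by calling `l1_distance` on their feature vectors; a smaller distance means more similar color statistics under this simplified measure.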
142	Klasifikace malých nekódujících RNA / Classification of Small Noncoding RNAs Žigárdi, Tomáš January 2015 This master's thesis describes a tool designed and implemented for classifying plant microRNA without a genome. It uses properties of the mature and star sequences in microRNA duplexes. The implemented method first clusters the RNA sequences with CD-HIT, mainly to reduce their number. Selected representatives of each cluster are then classified using a support vector machine. Classification performance exceeds 96% (based on cross-validation using the training data).
143	Automatic Flight Maneuver Identification Using Machine Learning Methods Bodin, Camilla January 2020 This thesis proposes a general approach to solve the offline flight-maneuver identification problem using machine learning methods. The purpose of the study was to provide means for the aircraft professionals at the flight test and verification department of Saab Aeronautics to automate the procedure of analyzing flight test data. The suggested approach succeeded in generating binary classifiers and multiclass classifiers that identified six flight maneuvers of different complexity from real flight test data. The binary classifiers solved the problem of identifying one maneuver from flight test data at a time, while the multiclass classifiers solved the problem of identifying several maneuvers from flight test data simultaneously. To achieve these results, the difficulties that this time series classification problem entailed were simplified by using different strategies. One strategy was to develop a maneuver extraction algorithm that used handcrafted rules. Another strategy was to represent the time series data by statistical measures. There was also an issue of an imbalanced dataset, where one class far outweighed others in number of samples. This was solved by using a modified oversampling method on the dataset that was used for training. Logistic Regression, Support Vector Machines with both linear and nonlinear kernels, and Artificial Neural Networks were explored, where the hyperparameters for each machine learning algorithm were chosen during model estimation by 4-fold cross-validation and solving an optimization problem based on important performance metrics. A feature selection algorithm was also used during model estimation to evaluate how the performance changes depending on how many features were used. The machine learning models were then evaluated on test data consisting of 24 flight tests.
The results given by the test data set showed that the simplifications done were reasonable, but the maneuver extraction algorithm could sometimes fail. Some maneuvers were easier to identify than others and the linear machine learning models resulted in a poor fit to the more complex classes. In conclusion, both binary classifiers and multiclass classifiers could be used to solve the flight maneuver identification problem, and solving a hyperparameter optimization problem boosted the performance of the finalized models. Nonlinear classifiers performed the best on average across all explored maneuvers. Flight Aircraft Machine Learning Flight Dynamics Classification Supervised Learning Support Vector Machines Neural Networks Logistic Regression Feature Selection Recursive Feature Elimination Feature Representation k-fold cross-validation maneuvers flight maneuvers Control Engineering Reglerteknik
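Entry 143 mentions choosing hyperparameters by 4-fold cross-validation. The sketch below shows the general shape of that procedure under simplifying assumptions: folds are contiguous index ranges, and `score_fn` is a hypothetical callback that trains on the training indices and returns a validation score. The thesis's actual optimization problem and metrics are not reproduced here.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def select_hyperparameter(candidates, score_fn, n_samples, k=4):
    """Return (best_candidate, best_mean_score) over k validation folds.

    `score_fn(candidate, train_idx, val_idx)` is an assumed interface:
    it should fit a model with `candidate` and return a score where
    higher is better.
    """
    best, best_score = None, float("-inf")
    for candidate in candidates:
        scores = [
            score_fn(candidate, train_idx, val_idx)
            for train_idx, val_idx in k_fold_indices(n_samples, k)
        ]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best, best_score = candidate, mean_score
    return best, best_score
```

In practice the samples would usually be shuffled or stratified before splitting, especially with an imbalanced dataset such as the one described in the entry.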
144	Machine Learning Applications for Downscaling Groundwater Storage Changes Integrating Satellite Gravimetry and Other Observations Agarwal, Vibhor January 2021 No description available. Geographic Information Science Geography Remote Sensing Geophysical Geological Machine Learning GRACE Downscaling Central Valley North China Plain Random Forest Artificial Neural Network Groundwater Depletion Groundwater Storage Iterative forward modeling Leakage correction cross-validation
145	Online Non-linear Prediction of Financial Time Series Patterns da Costa, Joel 11 September 2020 We consider a mechanistic non-linear machine learning approach to learning signals in financial time series data. A modularised and decoupled algorithm framework is established and is proven on daily sampled closing time-series data for JSE equity markets. The input patterns are based on input data vectors of data windows preprocessed into a sequence of daily, weekly and monthly or quarterly sampled feature measurement changes (log feature fluctuations). The data processing is split into a batch processed step where features are learnt using a Stacked AutoEncoder (SAE) via unsupervised learning, and then both batch and online supervised learning are carried out on Feedforward Neural Networks (FNNs) using these features. The FNN output is a point prediction of measured time-series feature fluctuations (log differenced data) in the future (ex-post). Weight initializations for these networks are implemented with restricted Boltzmann machine pretraining and variance-based initializations. The validity of the FNN backtest results is shown under a rigorous assessment of backtest overfitting using both Combinatorially Symmetrical Cross Validation and Probabilistic and Deflated Sharpe Ratios. Results are further used to develop a view on the phenomenology of financial markets and the value of complex historical data under unstable dynamics. online learning feedforward neural network restricted Boltzmann machine variance weight initialization stacked autoencoder pattern prediction JSE non-linear financial time series backtest overfitting deflated Sharpe ratio probabilistic Sharpe ratio
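The "log feature fluctuations" in entry 145 are log differences of a series over several sampling horizons. A minimal sketch of that preprocessing step follows; the function names and the horizons of 1, 5, and 20 trading days (standing in for daily, weekly, and monthly) are assumptions for illustration and are not taken from the thesis.

```python
from math import log


def log_fluctuations(series, horizon=1):
    """Return log(x[t] / x[t - horizon]) for every valid t."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [log(series[t] / series[t - horizon]) for t in range(horizon, len(series))]


def multi_horizon_features(series, horizons=(1, 5, 20)):
    """Map each horizon (assumed: in trading days) to its log-difference series."""
    return {h: log_fluctuations(series, h) for h in horizons}
```

Because log differences add up over time, the sum of consecutive one-step values equals the single log difference over the longer horizon, which is one reason this representation is convenient for mixing sampling frequencies.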
146	Machine Learning for Exploring State Space Structure in Genetic Regulatory Networks Thomas, Rodney H. 01 January 2018 Genetic regulatory networks (GRN) offer a useful model for clinical biology. Specifically, such networks capture interactions among genes, proteins, and other metabolic factors. Unfortunately, it is difficult to understand and predict the behavior of networks that are of realistic size and complexity. In this dissertation, behavior refers to the trajectory of a state, through a series of state transitions over time, to an attractor in the network. This project assumes asynchronous Boolean networks, implying that a state may transition to more than one attractor. The goal of this project is to efficiently identify a network's set of attractors and to predict the likelihood with which an arbitrary state leads to each of the network’s attractors. These probabilities will be represented using a fuzzy membership vector. Predicting fuzzy membership vectors using machine learning techniques may address the intractability posed by networks of realistic size and complexity. Modeling and simulation can be used to provide the necessary training sets for machine learning methods to predict fuzzy membership vectors. The experiments comprise several GRNs, each represented by a set of output classes. These classes consist of thresholds τ and ¬τ, where τ = [τlow, τhigh]; state s belongs to class τ if the probability of its transitioning to attractor A belongs to the range [τlow, τhigh]; otherwise it belongs to class ¬τ. Finally, each machine learning classifier was trained with the training sets that were previously collected. The objective is to explore methods to discover patterns for meaningful classification of states in realistically complex regulatory networks. The research design took a GRN and a machine learning method as input and produced output class < Ατ > and its negation ¬ < Ατ >.
For each GRN, attractors were identified, data was collected by sampling each state to create fuzzy membership vectors, and machine learning methods were trained to predict whether a state is in a healthy attractor or not. For T-LGL, SVMs had the highest accuracy in predictions (between 93.6% and 96.9%) and precision (between 94.59% and 97.87%). However, naive Bayesian classifiers had the highest recall (between 94.71% and 97.78%). This study showed that all experiments were extremely significant, with p-value < 0.0001. The contribution of this research is to help clinical biologists submit genetic states and get an initial result on their outcomes. For future work, this implementation could use other machine learning classifiers such as xgboost or deep learning methods. Another suggestion is to develop methods that improve the performance of state transitions so that larger training sets can be sampled. asynchronous Boolean networks attractors Boolean networks cross-validation decision trees fuzzy basins fuzzy membership vectors fuzzy vectors genetic regulatory networks Markov Chain Monte Carlo naïve Bayesian classifiers support vector machines Computer Sciences
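To make the idea of a fuzzy membership vector in entry 146 concrete, the sketch below estimates one by repeated simulation of a tiny asynchronous Boolean network. Everything specific here is an assumption for illustration: the two-gene mutual-inhibition rules in `RULES`, the restriction to fixed-point attractors, and the function names. The dissertation's networks (such as T-LGL) and sampling method are not reproduced.

```python
import random

# Hypothetical toy network: gene 0 is on when gene 1 is off, and vice versa.
RULES = [lambda s: not s[1], lambda s: not s[0]]


def is_fixed_point(state):
    """A state is a fixed point if no single-gene update would change it."""
    return all(int(rule(state)) == value for rule, value in zip(RULES, state))


def run_to_attractor(state, rng, max_steps=1000):
    """Asynchronous updates: change one randomly chosen gene per step."""
    state = tuple(state)
    for _ in range(max_steps):
        if is_fixed_point(state):
            return state
        i = rng.randrange(len(state))
        updated = list(state)
        updated[i] = int(RULES[i](state))
        state = tuple(updated)
    return None  # no fixed point reached within max_steps


def fuzzy_membership(state, attractors, n_runs=2000, seed=0):
    """Estimate the probability of reaching each attractor from `state`."""
    rng = random.Random(seed)
    counts = {a: 0 for a in attractors}
    for _ in range(n_runs):
        end = run_to_attractor(state, rng)
        if end in counts:
            counts[end] += 1
    return [counts[a] / n_runs for a in attractors]


def in_class(probability, tau_low, tau_high):
    """True if the probability lies in the threshold range [tau_low, tau_high]."""
    return tau_low <= probability <= tau_high
```

In this toy network the state `(0, 0)` can reach either fixed point depending on which gene updates first, so its membership vector is split roughly evenly, while `(1, 0)` is already an attractor and has full membership in itself.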
147	Systemic Identification of Radiomic Features Resilient to Batch Effects and Acquisition Variations for Diagnosis of Active Crohn's Disease on CT Enterography Pattiam Giriprakash, Pavithran 23 August 2021 No description available. Biomedical Engineering Biomedical Research Biology Medical Imaging Radiology
148	Algorithmic Methods for Multi-Omics Biomarker Discovery Li, Yichao January 2018 No description available. Bioinformatics Computer Science Motif Diabetes Transcription Factor HiC Set Cover Machine Learning Ensemble Learning HbA1C Glycated Peptide Motif Discovery Motif Pair 3D Genome Organization DREAM challenge Python Data Analytics Hist1 Clustering Analysis Cross Validation
149	Accuracy and Reproducibility of Laboratory Diffuse Reflectance Measurements with Portable VNIR and MIR Spectrometers for Predictive Soil Organic Carbon Modeling Semella, Sebastian, Hutengs, Christopher, Seidel, Michael, Ulrich, Mathias, Schneider, Birgit, Ortner, Malte, Thiele-Bruhn, Sören, Ludwig, Bernard, Vohland, Michael 09 June 2023 Soil spectroscopy in the visible-to-near infrared (VNIR) and mid-infrared (MIR) is a cost-effective method to determine the soil organic carbon content (SOC) based on predictive spectral models calibrated to analytically determined SOC reference data. The degree to which uncertainty in reference data and spectral measurements contributes to the estimated accuracy of VNIR and MIR predictions, however, is rarely addressed and remains unclear, in particular for current handheld MIR spectrometers. We thus evaluated the reproducibility of both the spectral reflectance measurements with portable VNIR and MIR spectrometers and the analytical dry combustion SOC reference method, with the aim of assessing how varying spectral inputs and reference values impact the calibration and validation of predictive VNIR and MIR models. Soil reflectance spectra and SOC were measured in triplicate, the latter by different laboratories, for a set of 75 finely ground soil samples covering a wide range of parent materials and SOC contents. Predictive partial least-squares regression (PLSR) models were evaluated in a repeated, nested cross-validation approach with systematically varied spectral inputs and reference data, respectively. We found that SOC predictions from both VNIR and MIR spectra were equally highly reproducible on average and similar to the dry combustion method, but MIR spectra were more robust to calibration sample variation.
The contributions of spectral variation (ΔRMSE < 0.4 g·kg−1) and reference SOC uncertainty (ΔRMSE < 0.3 g·kg−1) to spectral modeling errors were small compared to the difference between the VNIR and MIR spectral ranges (ΔRMSE ~1.4 g·kg−1 in favor of MIR). For reference SOC, uncertainty was limited to the case of biased reference data appearing in either the calibration or validation. Given better predictive accuracy, comparable spectral reproducibility and greater robustness against calibration sample selection, the portable MIR spectrometer was considered overall superior to the VNIR instrument for SOC analysis. Our results further indicate that random errors in SOC reference values are effectively compensated for during model calibration, while biased SOC calibration data propagates errors into model predictions. Reference data uncertainty is thus more likely to negatively impact the estimated validation accuracy in soil spectroscopy studies where archived data, e.g., from soil spectral libraries, are used for model building, but it should be negligible otherwise. info:eu-repo/classification/ddc/620 ddc:620
150	Chemometric Applications To A Complex Classification Problem: Forensic Fire Debris Analysis Waddell, Erin 01 January 2013 Fire debris analysis currently relies on visual pattern recognition of the total ion chromatograms, extracted ion profiles, and target compound chromatograms to identify the presence of an ignitable liquid. This procedure is described in the ASTM International E1618-10 standard method. For large data sets, this methodology can be time consuming, and it is subjective: its accuracy depends upon the skill and experience of the analyst. This research aimed to develop an automated classification method for large data sets and investigated the use of the total ion spectrum (TIS). The TIS is calculated by taking an average mass spectrum across the entire chromatographic range and has been shown to contain sufficient information content for the identification of ignitable liquids. The TIS of ignitable liquids and substrates were compiled into model data sets. Substrates are defined as common building materials and household furnishings that are typically found at the scene of a fire and are, therefore, present in fire debris samples. Fire debris samples obtained from laboratory-scale and large-scale burns were also used. An automated classification method was developed using computational software that was written in-house. Within this method, a multi-step classification scheme was used to detect ignitable liquid residues in fire debris samples and assign these to the classes defined in ASTM E1618-10. Classifications were made using linear discriminant analysis, quadratic discriminant analysis (QDA), and soft independent modeling of class analogy (SIMCA). The model data sets were tested by cross-validation and used to classify fire debris samples. Correct classification rates were calculated for each data set.
Classifier performance metrics, including false positive rates, true positive rates, and precision, were also calculated for the first step of the classification scheme. The first step, which determines a sample to be positive or negative for ignitable liquid residue, is arguably the most important in the forensic application. Overall, the highest correct classification rates were achieved using QDA for the first step of the scheme and SIMCA for the remaining steps. In the first step of the classification scheme, correct classification rates of 95.3% and 89.2% were obtained using QDA to classify the cross-validation test set and fire debris samples, respectively. For this step, the cross-validation test set resulted in a true positive rate of 96.2%, a false positive rate of 9.3%, and a precision of 98.2%. The fire debris data set had a true positive rate of 82.9%, a false positive rate of 1.3%, and a precision of 99.0%. Correct classification rates of 100% were achieved for both data sets in the majority of the remaining steps, which used SIMCA for classification. The lowest correct classification rate, 69.2%, was obtained for the fire debris samples in one of the final steps in the classification scheme. In this research, the first statistically valid error rates for fire debris analysis have been developed through cross-validation of large data sets. The fire debris analyst can use the automated method as a tool for detecting and classifying ignitable liquid residues in fire debris samples. The error rates reduce the subjectivity associated with the current methods and provide a level of confidence in sample classification that does not currently exist in forensic fire debris analysis. Forensic science fire debris analysis gas chromatography mass spectrometry chemometrics multivariate statistics discriminant analysis principal components analysis (pca) error rates cross validation Chemistry
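The performance metrics quoted in entry 150 follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts mentioned afterwards are hypothetical, since the abstract reports only the resulting rates and not the underlying counts.

```python
def binary_rates(tp, fp, tn, fn):
    """Compute standard rates from confusion-matrix counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    return {
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
        "correct_classification_rate": (tp + tn) / (tp + fp + tn + fn),
    }
```

For example, with made-up counts of 90 true positives, 5 false positives, 95 true negatives, and 10 false negatives, the true positive rate is 90 / 100 and the false positive rate is 5 / 100. Precision depends on the ratio of positive to negative samples in the data set, which is why it can stay high even when the false positive rate is not small.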
