311

Species Discrimination and Monitoring of Abiotic Stress Tolerance by Chlorophyll Fluorescence Transients

MISHRA, Anamika January 2012 (has links)
Chlorophyll fluorescence imaging has become a versatile and standard tool in fundamental and applied plant research. This method captures time series of images of the chlorophyll fluorescence emission of whole leaves or plants under various illumination regimes, typically a combination of actinic light and saturating flashes. Several conventional chlorophyll fluorescence parameters have been recognized that have a physiological interpretation and are useful, e.g., for assessing plant health status and for early detection of biotic and abiotic stresses. Chlorophyll fluorescence imaging enables us to probe the performance of plants by visualizing physiologically relevant fluorescence parameters that report on the physiology and biochemistry of the leaves. Sometimes there is a need to find the most contrasting fluorescence features/parameters in order to quantify the stress response at a very early stage of the stress treatment. Conventional fluorescence analysis uses well-defined single images such as F0, Fp, Fm, Fs, or arithmetic combinations of basic images such as Fv/Fm, PSII, NPQ, qP. Therefore, although conventional fluorescence parameters have a physiological interpretation, they may not represent highly contrasting image sets. In order to detect the effect of stress treatments at a very early stage, advanced statistical techniques based on classifiers and feature selection methods have been developed to select highly contrasting chlorophyll fluorescence images out of the hundreds of captured images. We combined sets of highly performing images, resulting in images with very high contrast, so-called combinatorial imaging. The application of advanced statistical methods to chlorophyll fluorescence imaging data allows us to succeed in tasks where conventional approaches do not work. This thesis aims to explore the application of conventional chlorophyll fluorescence parameters as well as advanced statistical techniques of classifiers and feature selection methods for high-throughput screening. We demonstrate the applicability of the technique in discriminating three species of the same family, Lamiaceae, at a very early stage of their growth. Further, we show that chlorophyll fluorescence imaging can be used for measuring cold and drought tolerance of Arabidopsis thaliana and tomato plants, respectively, in a simulated high-throughput screening.
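A minimal sketch of the kind of feature selection step described above: ranking captured fluorescence frames by how well they separate control from stressed plants and combining the top-ranked frames into a simple contrast image. The array shapes, the synthetic data, and the F-statistic ranking are illustrative assumptions, not the thesis's exact classifiers or combinatorial imaging procedure.

    # Rank hypothetical fluorescence frames by class separability, then combine
    # the two most contrasting frames into a simple ratio image.
    import numpy as np
    from sklearn.feature_selection import f_classif

    rng = np.random.default_rng(0)
    n_plants, n_frames = 40, 200                      # hypothetical sizes
    frames = rng.normal(1.0, 0.1, (n_plants, n_frames))
    y = np.repeat([0, 1], n_plants // 2)              # 0 = control, 1 = stressed
    frames[y == 1, 50:60] *= 0.8                      # simulated stress response

    # Univariate class separability (F statistic) per captured frame.
    f_scores, _ = f_classif(frames, y)
    best = np.argsort(f_scores)[::-1][:2]             # two most contrasting frames

    # "Combinatorial" contrast: difference-to-sum ratio of the top two frames,
    # in the spirit of combining highly performing images.
    contrast = (frames[:, best[0]] - frames[:, best[1]]) / (
        frames[:, best[0]] + frames[:, best[1]])
    print("most contrasting frames:", best, "example contrast:", contrast[:3])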
312

Shoulder Keypoint-Detection from Object Detection

Kapoor, Prince 22 August 2018 (has links)
This thesis presents a detailed study of different Convolutional Neural Network (CNN) architectures that have helped computer vision researchers achieve state-of-the-art performance on classification, detection, segmentation, and other image analysis challenges. With the advent of deep learning, CNNs are used in almost all computer vision applications, so there is a pressing need to understand the details of these feature extractors and to examine the pros and cons of each one. For our experiments, we chose an object detection task using a model architecture that maintains a sweet spot between computational cost and accuracy; the architecture we used is the LSTM-Decoder. The model was evaluated with different CNN feature extractors to assess their pros and cons in various scenarios. The results obtained on different datasets show that the CNN plays a major role in obtaining higher accuracy, and we also achieved a comparable state-of-the-art accuracy on the Pedestrian Detection Dataset. As an extension of object detection, we implemented two different model architectures that find shoulder keypoints. The first idea is as follows: using the detected annotation from object detection, a small cropped image is generated and fed into a small cascade network trained to detect shoulder keypoints. The second strategy is to use the same object detection model and fine-tune its weights to predict shoulder keypoints. We have generated results for shoulder keypoint detection; this idea could be extended to full-body pose estimation by modifying the cascaded network for pose estimation, which is an important topic for future work on this thesis.
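A minimal sketch of the crop-then-regress cascade idea described above: each detected person box is cropped, resized, and passed to a small CNN that regresses two shoulder keypoints. The layer sizes, crop size, and function names are illustrative assumptions, not the thesis's actual cascade network.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ShoulderKeypointNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))
            self.head = nn.Linear(32, 4)   # (x, y) for left and right shoulder

        def forward(self, crops):          # crops: (N, 3, 64, 64)
            return self.head(self.features(crops).flatten(1))

    def crop_and_predict(image, boxes, model, size=64):
        """image: (3, H, W) tensor; boxes: list of (x1, y1, x2, y2) in pixels."""
        crops = []
        for x1, y1, x2, y2 in boxes:
            crop = image[:, y1:y2, x1:x2].unsqueeze(0)
            crops.append(F.interpolate(crop, (size, size), mode="bilinear",
                                       align_corners=False))
        return model(torch.cat(crops))     # keypoints relative to each crop

    model = ShoulderKeypointNet()
    img = torch.rand(3, 480, 640)
    print(crop_and_predict(img, [(100, 50, 220, 300)], model).shape)  # (1, 4)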
313

Fast Corn Grading System Verification and Modification

Smith, Leanna Marie 01 May 2012 (has links)
A fast corn grading system can replace the traditional method at unofficial corn grading locations. The initial design of the system showed that it can classify corn kernels with a high success rate. This study tested the robustness of the system on samples from different locations with different moisture contents. The experimental results were compared with the official grading results for 3 of the 6 samples. This study also tested the limitations of the segmentation algorithm. The results showed that 60 to 70 kernels in a 100 cm² area could be correctly segmented in a relatively short running time. Classification accuracy would improve with modifications to the system, including more training samples of damaged kernels, uniform illumination, color calibration, and improved weight approximation of the kernels.
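A minimal sketch of one common way to segment and count touching kernels in a binarized tray image, using a distance-transform watershed. The synthetic mask and parameter values are assumptions for illustration only; the thesis's segmentation algorithm is not reproduced here.

    import numpy as np
    from scipy import ndimage as ndi
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    # Hypothetical binary mask: two overlapping "kernels" on a 100x100 grid.
    yy, xx = np.mgrid[0:100, 0:100]
    binary = ((yy - 45) ** 2 + (xx - 40) ** 2 < 300) | \
             ((yy - 55) ** 2 + (xx - 65) ** 2 < 300)

    # Split touching kernels at the ridges of the distance transform.
    distance = ndi.distance_transform_edt(binary)
    coords = peak_local_max(distance, min_distance=10, labels=binary)
    markers = np.zeros_like(distance, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    labels = watershed(-distance, markers, mask=binary)
    print("kernels found:", labels.max())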
314

NOISE IMPACT REDUCTION IN CLASSIFICATION APPROACH PREDICTING SOCIAL NETWORKS CHECK-IN LOCATIONS

Jedari Fathi, Elnaz 01 May 2017 (has links)
Since August 2010, Facebook has been part of the self-reported positioning world through the check-in service it provides to its users. This service allows users to share their physical location using the GPS receiver in their mobile devices, such as a smartphone, tablet, or smartwatch. Over the years, large datasets of recorded check-ins have been collected as social networks have grown in popularity. Analyzing these check-in datasets reveals valuable information and patterns in users' check-in behavior as well as in places' check-in history. The analysis results can be used in several areas, including business planning and financial decisions, for instance providing location-based deals. In this thesis, we leverage novel data mining methodology to learn from big check-in data and predict the next check-in place based only on places' history, with no reference to individual users. To this end, we study a large Facebook check-in dataset. This dataset has a high level of noise in the location coordinates because it was collected from multiple sources, namely users' mobile devices. The research question is how a noise impact reduction technique can be leveraged to enhance the performance of the prediction model. We design our own noise handling mechanism to deal with feature noise. The predictive model is generated by the Random Forest classification algorithm in a shared-memory parallel environment. We show how the performance of the predictors is enhanced by minimizing noise impacts. The solution is a preprocessing feature noise cleansing approach, implemented in R, that runs fast on big check-in datasets.
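A minimal Python sketch of the general idea (the thesis implementation is in R): reduce feature noise by dropping check-ins far from each place's median coordinates, then fit a Random Forest on the cleansed data. The column names, threshold, and synthetic data are assumptions, not the thesis's exact cleansing design.

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    df = pd.DataFrame({
        "place_id": rng.integers(0, 5, 500),
        "lat": rng.normal(40.0, 0.01, 500),
        "lon": rng.normal(-88.0, 0.01, 500),
        "hour": rng.integers(0, 24, 500),
    })

    # Distance of each check-in from its place's median coordinates.
    med = df.groupby("place_id")[["lat", "lon"]].transform("median")
    dist = np.hypot(df["lat"] - med["lat"], df["lon"] - med["lon"])
    clean = df[dist < dist.quantile(0.95)]            # drop the noisiest 5%

    X, y = clean[["lat", "lon", "hour"]], clean["place_id"]
    model = RandomForestClassifier(n_estimators=100, n_jobs=-1).fit(X, y)
    print("training accuracy:", model.score(X, y))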
315

A credit scoring model based on classifiers consensus system approach

Ala'raj, Maher A. January 2016 (has links)
Managing customer credit is an important issue for every commercial bank; therefore, banks take great care when dealing with customer loans to avoid improper decisions that can lead to loss of opportunity or financial losses. Manual estimation of customer creditworthiness has become both time- and resource-consuming. Moreover, a manual approach is subjective (dependent on the bank employee who gives the estimation), which is why devising and implementing programming models that provide loan estimations is the only way of eradicating the 'human factor' in this problem. Such a model should recommend to the bank whether or not a loan should be given, or give a probability that the loan will be repaid. A number of models have been designed, but there is no ideal classifier among them, since each gives some percentage of incorrect outputs; this is a critical consideration when each percentage point of incorrect answers can mean millions of dollars of losses for large banks. Nevertheless, logistic regression (LR) remains the industry-standard tool for developing credit-scoring models. For this purpose, an investigation is carried out into the combination of the most efficient classifiers in the credit-scoring scope, in an attempt to produce a classifier that exceeds each of its components. In this work, a fusion model referred to as 'the Classifiers Consensus Approach' is developed, which performs considerably better than each of the single classifiers that constitute it. The difference between the consensus approach and the majority of other combiners lies in the fact that the consensus approach adopts a model of real expert-group behaviour during the process of finding the consensus (aggregate) answer. The consensus model is compared not only with single classifiers, but also with traditional combiners and with a rather complex combiner model known as the 'Dynamic Ensemble Selection' approach. As pre-processing techniques, stepwise data filtering (selecting training entries that fit the input data well and removing outliers and noisy data) and feature selection (removing useless and statistically insignificant features whose values are weakly correlated with the real quality of the loan) are used. These techniques significantly improve the results of the consensus approach. The results clearly show that the consensus approach is statistically better (with 95% confidence, according to the Friedman test) than any other single classifier or combiner analysed; this means that for similar datasets, there is a 95% guarantee that the consensus approach will outperform all other classifiers. The consensus approach gives not only the best accuracy, but also better AUC value, Brier score and H-measure for almost all datasets investigated in this thesis. Moreover, it outperformed logistic regression. Thus, the use of the consensus approach for credit scoring is justified and recommended for commercial banks. Along with the consensus approach, the dynamic ensemble selection approach is analysed; the results show that, under some conditions, the dynamic ensemble selection approach can rival the consensus approach. The strengths of the dynamic ensemble selection approach include its stability and high accuracy on various datasets.
The consensus approach, which is improved in this work, may be considered by banks whose data share the characteristics of the datasets used in this work, where its utilisation could decrease the level of mistakenly rejected loans of solvent customers and the level of mistakenly accepted loans that will never be repaid. Furthermore, the consensus approach is a notable step towards building a universal classifier that can fit data with any structure. Another advantage of the consensus approach is its flexibility: even if the input data changes for various reasons, the consensus approach can easily be re-trained and used with the same performance.
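A simplified illustration of combining heterogeneous credit-scoring classifiers. This is a generic weighted soft-voting stand-in on synthetic data, not the thesis's Classifiers Consensus Approach, which models expert-group behaviour when forming the aggregate answer.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    # Synthetic imbalanced "credit" data: class 1 plays the role of defaults.
    X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8, 0.2],
                               random_state=0)
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

    base = [LogisticRegression(max_iter=1000),
            RandomForestClassifier(n_estimators=200, random_state=0),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)]

    probs, weights = [], []
    for clf in base:
        clf.fit(X_tr, y_tr)
        p = clf.predict_proba(X_va)[:, 1]
        probs.append(p)
        weights.append(roc_auc_score(y_va, p))        # weight by validation AUC

    weights = np.array(weights) / np.sum(weights)
    combined = np.average(np.column_stack(probs), axis=1, weights=weights)
    print("single AUCs:", [round(roc_auc_score(y_va, p), 3) for p in probs])
    print("combined AUC:", round(roc_auc_score(y_va, combined), 3))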
316

Improved shrunken centroid method for better variable selection in cancer classification with high throughput molecular data

Xukun, Li January 1900 (has links)
Master of Science / Department of Statistics / Haiyan Wang / Cancer type classification with high-throughput molecular data has received much attention, and many methods have been published in this area. One of them is PAM (the nearest shrunken centroid algorithm), which is simple and efficient and can give very good prediction accuracy. A problem with PAM is that it selects too many genes, some of which may have no influence on cancer type. One reason for this is that PAM assumes all genes have an identical distribution and uses a common threshold parameter for gene selection. This may not hold in reality, since expression levels of different genes can have very different distributions due to complicated biological processes. We propose a new method aimed at improving the ability of PAM to select informative genes. Keeping informative genes while reducing false positive variables can lead to more accurate classification results and help to pinpoint target genes for further studies. To achieve this goal, we introduce a variable-specific test based on an Edgeworth expansion to select informative genes. We apply this test to each gene and select genes based on the result of the test, so that a large number of genes are excluded. Afterward, soft thresholding with cross-validation can be applied to decide a common threshold value. Simulations and a real application show that our method can reduce irrelevant information and select the informative genes more precisely. The simulation results give more insight into where the newly proposed procedure improves accuracy, especially when the dataset is skewed or unbalanced. The method can be applied to a broad range of molecular data, including, for example, lipidomic data from mass spectrometry, copy number data from genomics, and eQTL analysis with GWAS data. We expect the proposed method to help life scientists accelerate discoveries with high-throughput data.
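A minimal numpy sketch of PAM-style soft thresholding of class centroids, and of the general idea of making the threshold gene-specific rather than common. The per-gene thresholds below are random placeholders for illustration; they are not the Edgeworth-expansion test proposed in the thesis.

    import numpy as np

    rng = np.random.default_rng(0)
    n_genes, classes = 1000, (0, 1)
    X = rng.normal(0.0, 1.0, (60, n_genes))            # 60 samples x 1000 genes
    y = np.repeat(classes, 30)
    X[y == 1, :20] += 1.5                              # 20 truly informative genes

    overall = X.mean(axis=0)
    s = X.std(axis=0) + 1e-6                           # crude per-gene scale

    def shrunken_centroid(delta):
        """Soft-threshold the standardized class-vs-overall centroid differences."""
        out = {}
        for k in classes:
            d = (X[y == k].mean(axis=0) - overall) / s
            out[k] = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)
        return out

    common = shrunken_centroid(0.5)                    # one threshold for all genes
    per_gene = shrunken_centroid(rng.uniform(0.3, 0.8, n_genes))  # gene-specific
    print("genes kept (common threshold):", int(np.sum(common[1] != 0)))
    print("genes kept (per-gene thresholds):", int(np.sum(per_gene[1] != 0)))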
317

Automating Fixture Setups Based on Point Cloud Data & CAD Model

January 2016 (has links)
Abstract: Metal castings are selectively machined based on dimensional control requirements. To ensure that all the finished surfaces are fully machined, each as-cast part needs to be measured and then adjusted optimally in its fixture. The topics of this thesis address two parts of this process: data translation and feature-fitting of clouds of points measured on each cast part. For the first, a CAD model of the finished part must be communicated to the machine shop for performing various machining operations on the metal casting. The data flow must include GD&T specifications along with other special notes that may need to be communicated to the machinist. Current data exchange among various digital applications is limited to the translation of CAD geometry alone via STEP AP203. Therefore, an algorithm is developed to read, store and translate the data from a CAD file (for example SolidWorks, CREO) to a standard, machine-readable format (ACIS format - *.sat). Second, the geometry of cast parts varies from piece to piece, and hence fixture set-up parameters for each part must be adjusted individually. To predictively determine these adjustments, the datum surfaces and to-be-machined surfaces are scanned individually and the point clouds are reduced to feature fits. The scanned data are stored as separate point cloud files. The labels associated with the datum and to-be-machined (TBM) features are extracted from the *.sat file. These labels are then matched with the file names of the point cloud data to identify the data for the respective features. The point cloud data and the CAD model are then used to fit the appropriate features (features at maximum material condition (MMC) for datums and features at least material condition (LMC) for TBM features) using the existing normative feature fitting (nFF) algorithm. Once the feature fitting is complete, a global datum reference frame (GDRF) is constructed based on the locating method that will be used to machine the part. The locating method is extracted from a fixture library that specifies the type of fixturing used to machine the part. All entities are transformed from their local coordinate systems into the GDRF. The nominal geometry, fitted features, and GD&T information are then stored in a neutral file format called the Constraint Tolerance Feature (CTF) Graph. The final outputs are used to identify the locations of the critical features on each part, and these are used to establish the adjustments for its setup prior to machining, in another module that is not part of this thesis. / Dissertation/Thesis / Master's Thesis Mechanical Engineering 2016
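A minimal sketch of one elementary feature-fitting step of the kind assumed by the workflow above: a least-squares plane fit (centroid plus normal from an SVD) to the point cloud of a scanned datum face. The synthetic cloud is an assumption, and this plain least-squares fit does not perform the MMC/LMC fitting of the thesis's nFF algorithm.

    import numpy as np

    def fit_plane(points):
        """points: (N, 3) array; returns (centroid, unit normal) of the LS plane."""
        centroid = points.mean(axis=0)
        _, _, vt = np.linalg.svd(points - centroid)
        normal = vt[-1]                                # direction of least variance
        return centroid, normal / np.linalg.norm(normal)

    # Hypothetical noisy scan of a nominally flat datum face.
    rng = np.random.default_rng(0)
    xy = rng.uniform(0, 50, (500, 2))
    z = 0.02 * xy[:, 0] + rng.normal(0, 0.05, 500)     # slight tilt plus scan noise
    cloud = np.column_stack([xy, z])

    c, n = fit_plane(cloud)
    print("centroid:", np.round(c, 3), "normal:", np.round(n, 3))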
318

Novel Methods of Biomarker Discovery and Predictive Modeling using Random Forest

January 2017 (has links)
Abstract: Random forest (RF) is a popular and powerful technique that can be used for classification, regression, and unsupervised clustering. In its original form, introduced by Leo Breiman, RF is used as a predictive model to generate predictions for new observations. Recent research has proposed several RF-based methods for feature selection and for generating prediction intervals, but they are limited in their applicability and accuracy. In this dissertation, RF is applied to build a predictive model for a complex dataset and is used as the basis for two novel methods, one for biomarker discovery and one for generating prediction intervals. First, a biodosimetry model is developed using RF to determine the absorbed radiation dose from gene expression measured in blood samples of potentially exposed individuals. To improve the prediction accuracy of the biodosimetry, day-specific models were built to deal with the day interaction effect, and a technique of nested modeling was proposed. The nested models can fit this complex data with large variability and non-linear relationships. Second, a panel of biomarkers was selected using a data-driven feature selection method as well as by hand, considering prior knowledge and other constraints. To incorporate domain knowledge, a method called Know-GRRF was developed based on guided regularized RF. This method incorporates domain knowledge as a penalty term to regulate the selection of candidate features in RF. It adds flexibility to data-driven feature selection and can improve the interpretability of models. Know-GRRF showed significant improvement in cross-species prediction when cross-species correlation was used to guide the selection of biomarkers. The method can also compete with existing methods when intrinsic data characteristics are used as an alternative to domain knowledge in simulated datasets. Lastly, a novel non-parametric method, RFerr, was developed to generate prediction intervals using RF regression. This method is widely applicable to any predictive model and was shown to have better coverage and precision than existing methods on the real-world radiation dataset, as well as on benchmark and simulated datasets. / Dissertation/Thesis / Doctoral Dissertation Biomedical Informatics 2017
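A minimal sketch of building a crude prediction interval from the spread of individual tree predictions in a Random Forest regressor. This only illustrates the general notion of an RF-based interval on synthetic data; it is not the RFerr method developed in the dissertation.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    per_tree = np.stack([t.predict(X_te) for t in rf.estimators_])  # (n_trees, n)

    # 5th-95th percentile of per-tree predictions as a naive interval.
    lo, hi = np.percentile(per_tree, [5, 95], axis=0)
    coverage = np.mean((y_te >= lo) & (y_te <= hi))
    print("mean interval width:", round(float(np.mean(hi - lo)), 2),
          "empirical coverage:", round(float(coverage), 3))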
319

Non-Linear Classification as a Tool for Predicting Tennis Matches

Hostačný, Jakub January 2018 (has links)
Charles University, Faculty of Social Sciences, Institute of Economic Studies. Master's thesis: Non-Linear Classification as a Tool for Predicting Tennis Matches. Author: Bc. Jakub Hostacny. Supervisor: RNDr. Matus Baniar. Academic Year: 2017/2018. Abstract: In this thesis, we examine the prediction accuracy and the betting performance of four machine learning algorithms applied to men's tennis matches: penalized logistic regression, random forest, boosted trees, and artificial neural networks. To do so, we employ 40,310 ATP matches played during 1/2001-10/2016 and 342 input features. As for the prediction accuracy, our models outperform current state-of-the-art models for both non-grand-slam (69%) and grand slam matches (79%). Concerning the overall accuracy rate, all model specifications beat backing the better-ranked player, while the majority also surpass backing the bookmaker's favourite. As far as the betting performance is concerned, we develop six profitable betting strategies for betting on favourites applied to non-grand-slam matches, with ROI ranging from 0.8% to 6.5%. We also identify ten profitable betting strategies for betting on favourites applied to grand slam matches, with ROI fluctuating between 0.7% and 9.3%. We beat both benchmark rules - backing the better-ranked player as well as backing the bookmaker's...
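A minimal sketch of how a flat-stake betting strategy's ROI can be evaluated from model probabilities and bookmaker odds. The threshold rule, the synthetic probabilities and odds, and the simulated outcomes are purely illustrative; they are not the strategies or data used in the thesis.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000
    p_model = rng.uniform(0.5, 0.9, n)          # model's win probability for favourite
    odds = 1.0 / (p_model - rng.normal(0.03, 0.02, n))   # synthetic decimal odds
    won = rng.random(n) < p_model               # simulated match outcomes

    def roi(threshold, stake=1.0):
        """Flat-stake ROI of backing the favourite when p_model exceeds threshold."""
        bets = p_model > threshold
        if not bets.any():
            return 0.0
        returns = np.where(won[bets], stake * odds[bets], 0.0)
        staked = stake * bets.sum()
        return (returns.sum() - staked) / staked

    for t in (0.6, 0.7, 0.8):
        print(f"threshold {t:.1f}: ROI {roi(t):+.2%}")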
320

Automatic Detection of Dust Devil Tracks on the Surface of Mars

Statella, Thiago. January 2012 (has links)
Advisor: Erivaldo Antônio da Silva / Co-advisor: Pedro Miguel Berardo Duarte Pina / Committee: Ana Lucia Bezerra Candeias / Committee: João Rodrigues Tavares Júnior / Committee: José Roberto Nogueira / Committee: Maurício Araújo Dias / Abstract: Dust devils are vortices caused by unstable wind convection processes near planetary surfaces, due to solar heating. Many researchers have been studying Martian dust devils in an attempt to better understand the phenomenon. Generally, the research fields comprise mechanical and numerical simulation of dust devils in laboratories, methodologies for recognition of dust devil plumes from rovers on the Martian surface, and detection of plumes and tracks in orbital images. Despite the number of papers on the subject, none of them addresses the automatic detection of dust devil tracks, an important issue as the number of images acquired grows at a rate greater than the human capability to analyze them. This thesis describes a novel method to detect Martian dust devil tracks automatically. The dataset comprises 200 images (90 MOC and 110 HiRISE) distributed over the Aeolis, Noachis, Argyre, Eridania and Hellas regions. The method is strongly based on Mathematical Morphology and uses transformations such as morphological area opening and closing, morphological path closing and Otsu's method for automatic image binarization, among others. The method was applied to the dataset and results were compared... (Complete abstract: click electronic access below) / Doctorate
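A minimal sketch of the kind of morphological pipeline described above: an area closing suppresses small dark structures, so its difference with the original highlights dark, track-like features, which are then binarized with Otsu's method. The synthetic image and the area threshold are assumptions; the thesis's full method (including path closing) is more involved.

    import numpy as np
    from skimage.morphology import area_closing
    from skimage.filters import threshold_otsu

    # Synthetic grayscale "orbital image" with a dark diagonal track.
    rng = np.random.default_rng(0)
    img = rng.normal(0.6, 0.05, (256, 256))
    rr = np.arange(256)
    cc = np.clip(rr // 2 + 60, 0, 254)
    img[rr, cc] -= 0.25
    img[rr, cc + 1] -= 0.25
    img = np.clip(img, 0, 1)

    # Area closing fills dark structures smaller than the area threshold, so the
    # difference with the original is bright where the dark track was.
    closed = area_closing(img, area_threshold=2000)
    residue = closed - img

    mask = residue > threshold_otsu(residue)
    print("candidate track pixels:", int(mask.sum()))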
