21

Detection of Urban Damage Using Remote Sensing and Machine Learning Algorithms: Revisiting the 2010 Haiti Earthquake

Cooner, Austin Jeffrey 19 December 2016 (has links)
Remote sensing continues to be an invaluable tool in earthquake damage assessments and emergency response. This study evaluates the effectiveness of multilayer feedforward neural networks, radial basis neural networks, and Random Forests in detecting earthquake damage caused by the 2010 Port-au-Prince, Haiti 7.0 moment magnitude (Mw) event. Additionally, textural and structural features including entropy, dissimilarity, Laplacian of Gaussian, and rectangular fit are investigated as key variables for high spatial resolution imagery classification. Our findings show that each of the algorithms achieved nearly a 90% kernel density match using the United Nations Operational Satellite Applications Programme (UNITAR/UNOSAT) dataset as validation. The multilayer feedforward network achieved an error rate below 40% in detecting damaged buildings. Spatial features of texture and structure were far more important in algorithmic classification than spectral information, highlighting the potential for future implementation of machine learning algorithms that use panchromatic or pansharpened imagery alone. / Master of Science
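
Some of the texture and structure measures named above (entropy, GLCM dissimilarity, Laplacian of Gaussian) can be computed per image patch and fed to a random forest. Below is a minimal sketch of that idea; the synthetic patches, window size, and parameter values are illustrative assumptions, not the thesis's actual pipeline.

```python
# Sketch: GLCM texture features plus Laplacian of Gaussian as inputs to a
# random forest. Patch size, sigma, and the synthetic data are assumptions.
import numpy as np
from scipy.ndimage import gaussian_laplace
from skimage.feature import graycomatrix, graycoprops
from skimage.measure import shannon_entropy
from sklearn.ensemble import RandomForestClassifier

def texture_features(patch):
    """Entropy, GLCM dissimilarity, and Laplacian-of-Gaussian response
    for one 8-bit grayscale image patch."""
    glcm = graycomatrix(patch, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return [
        shannon_entropy(patch),                                # texture entropy
        graycoprops(glcm, "dissimilarity")[0, 0],              # GLCM dissimilarity
        gaussian_laplace(patch.astype(float), sigma=2).std(),  # LoG structure
    ]

rng = np.random.default_rng(0)
# Hypothetical 32x32 panchromatic patches with binary damage labels.
patches = rng.integers(0, 256, size=(200, 32, 32), dtype=np.uint8)
labels = rng.integers(0, 2, size=200)

X = np.array([texture_features(p) for p in patches])
clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, labels)
print(dict(zip(["entropy", "dissimilarity", "log_std"],
               clf.feature_importances_.round(3))))
```
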
22

Applications and extensions of Random Forests in genetic and environmental studies

Michaelson, Jacob 20 December 2010 (has links)
Transcriptional regulation refers to the molecular systems that control the concentration of mRNA species within the cell. Variation in these controlling systems is not only responsible for many diseases, but also contributes to the vast phenotypic diversity in the biological world. There are powerful experimental approaches to probe these regulatory systems, and the focus of my doctoral research has been to develop and apply effective computational methods that exploit these rich data sets more completely. First, I present a method for mapping genetic regulators of gene expression (expression quantitative trait loci, or eQTL) using Random Forests. This approach allows for flexible modeling and feature selection, and results in eQTL that are more biologically supportable than those mapped with competing methods. Next, I present a method that finds interactions between genes that in turn regulate the expression of other genes. This is accomplished by finding recurring decision motifs in the forest structure that represent dependencies between genetic loci. Third, I present a method that uses distributional differences in eQTL data to establish the regulatory roles of genes relative to other disease-associated genes. Using this method, we found that genes that are master regulators of other disease genes are more likely to be consistently associated with the disease in genetic association studies. Finally, I present a novel application of Random Forests to determine the mode of regulation of toxin-perturbed genes, using time-resolved gene expression. The results demonstrate a new approach to supervised weighted clustering of gene expression data.
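
As a rough illustration of the eQTL-mapping idea in the first contribution, the sketch below ranks markers by random-forest importance for a single transcript. The genotype coding, sample sizes, and planted signal are hypothetical assumptions, not the thesis's actual pipeline.

```python
# Minimal sketch: rank genetic markers by how much a random forest uses
# them to explain one transcript's expression. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n_samples, n_snps = 120, 300
genotypes = rng.integers(0, 3, size=(n_samples, n_snps))  # 0/1/2 allele dosage
# Hypothetical transcript regulated by SNP 42 plus noise.
expression = 1.5 * genotypes[:, 42] + rng.normal(0, 1, n_samples)

rf = RandomForestRegressor(n_estimators=1000, max_features="sqrt",
                           random_state=1).fit(genotypes, expression)
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("candidate eQTL (SNP indices):", top)  # SNP 42 should rank first
```
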
23

Risk Factors for Suicidal Behaviour Among Canadian Civilians and Military Personnel: A Recursive Partitioning Approach

Rusu, Corneliu 05 April 2018 (has links)
Background: Suicidal behaviour is a major public health problem that has not abated over the past decade. Adopting machine learning algorithms that allow for combining risk factors, which may increase the predictive accuracy of models of suicidal behaviour, is one promising avenue toward effective prevention and treatment. Methods: We used the Canadian Community Health Survey – Mental Health and the Canadian Forces Mental Health Survey to build conditional inference random forest models of suicidal behaviour in the Canadian general population and the Canadian Armed Forces. We generated risk algorithms for suicidal behaviour in each sample. We performed within- and between-sample validation and reported the corresponding performance metrics. Results: Only a handful of variables were important in predicting suicidal behaviour in the Canadian general population and the Canadian Armed Forces. Each model's performance on within-sample validation was satisfactory, with moderate to high sensitivity and high specificity, while performance on between-sample validation was conditional on the size and heterogeneity of the training sample. Conclusion: Applying conditional inference random forest methodology to large, nationally representative mental health surveys has the potential to generate models of suicidal behaviour that not only reflect its complex nature but also capture most true positive cases.
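
The study fits conditional inference random forests (typically R's party/partykit cforest); scikit-learn has no direct equivalent, so the sketch below substitutes a plain random forest purely to illustrate the within- versus between-sample validation step. The survey data, feature counts, and outcome rule are hypothetical.

```python
# Sketch of within-sample vs. between-sample validation with sensitivity
# and specificity, using a plain random forest as a stand-in for a
# conditional inference forest. All data are synthetic assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(2)

def make_survey(n):  # stand-in for CCHS-MH / CFMHS records
    X = rng.normal(size=(n, 10))
    y = (X[:, 0] + rng.normal(0, 1, n) > 1.5).astype(int)
    return X, y

X_civ, y_civ = make_survey(2000)  # "general population" sample
X_mil, y_mil = make_survey(500)   # "armed forces" sample

rf = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                            random_state=2).fit(X_civ, y_civ)
for name, X, y in [("within-sample", X_civ, y_civ),
                   ("between-sample", X_mil, y_mil)]:
    pred = rf.predict(X)
    print(name,
          "sensitivity=%.2f" % recall_score(y, pred),
          "specificity=%.2f" % recall_score(y, pred, pos_label=0))
```
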
24

Explorando associações entre sarcopenia, obesidade e osteoporose : estudo com pacientes da atenção primária em saúde / Exploring associations among sarcopenia, obesity and osteoporosis: a study with primary health care patients

Delacosta, Thais Cristina January 2019 (has links)
Advisor: Henrique Luiz Monteiro / Abstract: Body composition assessment is a resource used for the detection, prevention and treatment of diseases related to changes in the pattern and distribution of body tissues. Osteopenia/osteoporosis, sarcopenia and obesity overlap, creating combinations of other tissue disorders, with osteosarcopenic obesity being the most multifaceted. The practice of physical activity can prevent or treat several metabolic diseases and intervene positively in the functional capacity of adults and the elderly. However, there are few intervention studies for the population diagnosed with sarcopenic or osteosarcopenic obesity, and the use and application of the information provided by dual-energy X-ray absorptiometry (DXA) remain poorly explored. Aim: to analyze the components of body composition and explore the association between sarcopenia, obesity, osteoporosis and the physical activity habits of men and women being treated in primary health care, and to explore the information that DXA provides and the behavior of indicators for predicting obesity, sarcopenia and their combinations. Methodology: cross-sectional study with patients aged 50 years or older, users of primary health care in the city of Bauru-SP. DXA was used for body composition analysis. Interviews about patient characteristics, habitual physical activity, purchasing power, schooling, smoking and alcohol consumption were conducted. Results: the 206 patients evaluated were 66.9 ± 7 years old, with a predominance of females (81.6%). The prevalence of two or... (Complete abstract: click electronic access below) / Master
25

Machine learning approaches for assessing moderate-to-severe diarrhea in children < 5 years of age, rural western Kenya 2008-2012

Ayers, Tracy L 13 May 2016 (has links)
Worldwide, diarrheal disease is a leading cause of morbidity and mortality in children less than five years of age. Incidence and disease severity remain the highest in sub-Saharan Africa. Kenya has an estimated 400,000 severe diarrhea episodes and 9,500 diarrhea-related deaths per year in children. Current statistical methods for estimating etiological and exposure risk factors for moderate-to-severe diarrhea (MSD) in children are constrained by the inability to assess a large number of parameters, owing to limitations of sample size, complex relationships, correlated predictors, and model assumptions of linearity. This dissertation examines machine learning statistical methods to address weaknesses associated with traditional logistic regression models. The studies presented here investigate data from a 4-year, prospective, matched case-control study of MSD among children less than five years of age in rural Kenya, from the Global Enteric Multicenter Study. Three machine learning approaches were used to examine associations with MSD: the least absolute shrinkage and selection operator (LASSO), classification trees, and random forests. A principal finding in all three studies was that machine learning approaches are useful and feasible to implement in epidemiological studies. All provided additional information and understanding of the data beyond what logistic regression models alone could offer. The results from all three machine learning approaches were supported by comparable logistic regression results, indicating their usefulness as epidemiological tools. This dissertation offers an exploration of methodological alternatives that should be considered more frequently in diarrheal disease epidemiology, and in public health in general.
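
A minimal sketch of the three approaches named above next to a logistic baseline, with synthetic data standing in for the GEMS case-control records; the feature count, outcome rule, and hyperparameters are assumptions.

```python
# Sketch: compare LASSO (L1-penalized logistic regression), a
# classification tree, and a random forest against plain logistic
# regression by cross-validated AUC. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 40))                 # exposure/etiology features
y = (X[:, :3].sum(axis=1) + rng.normal(0, 1, 600) > 0).astype(int)  # MSD status

models = {
    "LASSO":  LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
    "tree":   DecisionTreeClassifier(max_depth=4, random_state=3),
    "forest": RandomForestClassifier(n_estimators=500, random_state=3),
    "logit":  LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:7s} AUC = {auc:.3f}")
```
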
26

The Effectiveness of a Random Forests Model in Detecting Network-Based Buffer Overflow Attacks

Julock, Gregory Alan 01 January 2013 (has links)
Buffer overflows are a common type of network intrusion attack that continues to plague the networked community. Unfortunately, this type of attack is not well detected with current data mining algorithms. This research investigated the use of Random Forests, an ensemble technique that creates multiple decision trees and then lets the trees vote on the classification. The research investigated Random Forests' effectiveness in detecting buffer overflows compared to other data mining methods such as CART and Naïve Bayes. Random Forests was used for variable reduction, cost-sensitive classification was applied, and each method's detection performance was compared and reported along with the receiver operating characteristics. The experiment showed that Random Forests outperformed CART and Naïve Bayes in classification performance. Using a technique to identify the most important buffer overflow variables, Random Forests was also able to improve its buffer overflow classification performance.
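
The workflow described — random-forest variable reduction, cost-sensitive classification, and a ROC comparison against CART and Naïve Bayes — might look roughly like the sketch below; the synthetic features, cost weights, and outcome rule are assumptions, not the study's data.

```python
# Sketch: reduce variables by random-forest importance, weight the attack
# class more heavily (cost-sensitive), and compare ROC AUC across models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(3000, 41))                 # hypothetical traffic features
y = (X[:, 5] - X[:, 17] + rng.normal(0, 1, 3000) > 2).astype(int)  # attack?
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=4)

# Variable reduction: keep the 10 most important features.
rf_full = RandomForestClassifier(n_estimators=300, random_state=4).fit(X_tr, y_tr)
keep = np.argsort(rf_full.feature_importances_)[::-1][:10]

models = {
    "RF (reduced, cost-sensitive)": RandomForestClassifier(
        n_estimators=300, class_weight={0: 1, 1: 5}, random_state=4),
    "CART": DecisionTreeClassifier(random_state=4),
    "NaiveBayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_tr[:, keep], y_tr)
    score = roc_auc_score(y_te, model.predict_proba(X_te[:, keep])[:, 1])
    print(f"{name}: AUC = {score:.3f}")
```
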
27

Metody konstrukce klasifikátorů vhodných pro segmentaci zákazníků / Construction of classifiers suitable for segmentation of clients

Hricová, Jana January 2013 (has links)
Title: Construction of classifiers suitable for segmentation of clients Author: Bc. Jana Hricová Department: Department of Probability and Mathematical Statistics Supervisor: prof. RNDr. Jaromír Antoch, CSc., Department of Probability and Mathematical Statistics Abstract: The master's thesis discusses classification, a family of methods within data analysis. It presents classification methods used to construct tree-like classifiers suitable for customer segmentation. The core methodology discussed is CART (Classification and Regression Trees), followed by ensemble methods that use historical data to construct classification and regression forests, namely Bagging, Boosting, Arcing and Random Forests. The described methods were applied to real data from the field of customer segmentation and to simulated data, both processed with RStudio software. Keywords: classification, tree-like classifiers, random forests
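
The thesis works in R/RStudio; the Python sketch below merely mirrors the ensemble comparison it describes (a single CART tree versus Bagging, Boosting, and Random Forests) on stand-in segmentation data.

```python
# Sketch: cross-validated accuracy of one CART tree vs. three ensemble
# methods on hypothetical customer-segmentation features.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=15, n_informative=5,
                           random_state=5)  # hypothetical customer features

models = {
    "CART":         DecisionTreeClassifier(random_state=5),
    "Bagging":      BaggingClassifier(DecisionTreeClassifier(),
                                      n_estimators=100, random_state=5),
    "Boosting":     AdaBoostClassifier(n_estimators=100, random_state=5),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=5),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:12s} accuracy = {acc:.3f}")
```
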
28

Machine learning for materials science

Rouet-Leduc, Bertrand January 2017 (has links)
Machine learning is a branch of artificial intelligence that uses data to automatically build inferences and models designed to generalise and make predictions. In this thesis, the use of machine learning in materials science is explored for two different problems: the optimisation of gallium nitride optoelectronic devices, and the prediction of material failure in the setting of laboratory earthquakes. Light-emitting diodes based on III-nitride quantum wells have become ubiquitous as a light source, owing to their direct band-gap that covers UV, visible and infra-red light, and their very high quantum efficiency. This efficiency originates from most electronic transitions across the band-gap leading to the emission of a photon. At high currents, however, this efficiency sharply drops. In chapters 3 and 4, simulations are shown to provide an explanation for experimental results, shedding new light on this drop in efficiency. Chapter 3 provides a simple and yet accurate model that explains the experimentally observed beneficial effect that silicon doping has on light-emitting diodes. Chapter 4 provides a model for the experimentally observed detrimental effect that certain V-shaped defects have on light-emitting diodes. These results pave the way for the association of simulations with detailed multi-microscopy. In chapters 5 to 7, it is shown that machine learning can leverage the use of device simulations by replacing, in a targeted and efficient way, the very labour-intensive tasks of making sure that the numerical parameters of the simulations lead to convergence and that the physical parameters reproduce experimental results. It is then shown that machine learning coupled with simulations can find optimal light-emitting diode structures that have a greatly enhanced theoretical efficiency. These results demonstrate the power of machine learning for leveraging and automating the exploration of device structures in simulations. Material failure is a very broad problem encountered in a variety of fields, ranging from engineering to Earth sciences. The phenomenon stems from complex and multi-scale physics, and failure experiments can provide a wealth of data that can be exploited by machine learning. In chapter 8, it is shown that by recording the acoustic waves emitted during the failure of a laboratory fault, an accurate predictive model can be built. The machine learning algorithm that is used retains the link with the physics of the experiment, and a new signal is thus discovered in the sound emitted by the fault. This new signal announces an upcoming laboratory earthquake and is a signature of the stress state of the material. These results show that machine learning can help discover new signals in experiments where the amount of data is very large, and demonstrate a new method for the prediction of material failure.
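
For the laboratory-earthquake chapter, the core idea is that statistical features of short acoustic windows predict time-to-failure. The sketch below illustrates this with a synthetic signal whose variance grows as failure approaches; it is an assumption-laden stand-in, not the thesis's data or model.

```python
# Sketch: predict time-to-failure from rolling statistics of an acoustic
# signal with a random forest. The signal is synthetic: its variance is
# assumed to grow as failure approaches (standing in for the "new signal").
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(6)
n_windows, window = 400, 1024
time_to_failure = np.linspace(8.0, 0.0, n_windows)        # seconds to slip
signal = [rng.normal(0, 1 + 2 / (t + 0.1), window) for t in time_to_failure]

def window_features(w):
    return [w.var(), np.abs(w).max(),                     # variance, peak amplitude
            ((w - w.mean()) ** 4).mean() / w.var() ** 2]  # kurtosis

X = np.array([window_features(w) for w in signal])
rf = RandomForestRegressor(n_estimators=300, random_state=6)
rf.fit(X[:300], time_to_failure[:300])                    # train on earlier windows
print("held-out R^2:", round(rf.score(X[300:], time_to_failure[300:]), 3))
```
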
29

Modelos computacionais prognósticos de lesões traumáticas do plexo braquial em adultos / Prognostic computational models for traumatic brachial plexus injuries in adults

Abud, Luciana de Melo e 20 June 2018 (has links)
Studies of prognosis refer to the prediction of the course of a disease in patients and are employed by health professionals in order to improve patients' recovery chances and quality. Under a computational perspective, the creation of a prognostic model is a classification task that aims to identify to which class (within a predefined set of classes) a new sample belongs. The goal of this project is the creation of prognostic models for traumatic injuries of the brachial plexus, a network of nerves that innervates the upper limbs, using data from adult patients with this kind of injury. The data come from the Neurology Institute Deolindo Couto (INDC) of Rio de Janeiro Federal University (UFRJ) and are characterized by dozens of clinical features collected by means of electronic questionnaires. With these prognostic models we intended to automatically identify possible predictors of the course of brachial plexus injuries. Decision trees are classifiers frequently used for the creation of prognostic models, since they are a transparent technique that produces results that can be clinically examined and interpreted. Random forests are a technique that uses a set of decision trees to determine the final classification result and can significantly improve a model's accuracy and generalization, yet they are still not commonly used for the creation of prognostic models. In this project we explored the use of random forests for that purpose, as well as the use of interpretation methods for the resulting models, since model transparency is an important aspect in clinical domains. Model assessment was carried out by means of methods suited to small sample sizes, since the available prognostic data refer to only 44 patients from the INDC. Additionally, we adapted the random forests technique to handle missing data, which are frequent among the data used in this project. Four prognostic models were created, one for each recovery goal: absence of pain, and satisfactory strength evaluated over shoulder abduction, elbow flexion and external shoulder rotation. The models' accuracies were estimated between 77% and 88%, calculated through the leave-one-out cross-validation method. These models will evolve with the inclusion of new data from new patients arriving at the INDC, and they will be used as part of a clinical decision support system, with the purpose of predicting a patient's recovery considering his or her clinical characteristics.
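
A minimal sketch of the evaluation described above: leave-one-out cross-validation of a random forest on a 44-patient cohort, with simple median imputation standing in for the thesis's own missing-value adaptation. The data, feature count, and outcome are hypothetical.

```python
# Sketch: leave-one-out accuracy of a random forest on a small cohort
# with ~10% missing values. Median imputation is a stand-in assumption,
# not the thesis's adapted handling of missing data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
X = rng.normal(size=(44, 12))             # 44 patients, clinical features
X[rng.random(X.shape) < 0.1] = np.nan     # ~10% missing answers
y = (rng.random(44) > 0.5).astype(int)    # recovery outcome (yes/no)

model = make_pipeline(SimpleImputer(strategy="median"),
                      RandomForestClassifier(n_estimators=500, random_state=7))
acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy = {acc:.2f}")
```
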
30

Evaluate Probe Speed Data Quality to Improve Transportation Modeling

Rahman, Fahmida 01 January 2019 (has links)
Probe speed data are widely used to calculate performance measures for quantifying state-wide traffic conditions. Estimating accurate performance measures requires adequate speed data observations. However, probe vehicles reporting the speed data may not be available at all times on every road segment. Agencies need to develop a good understanding of the adequacy of these reported data before using them in different transportation applications. This study systematically assesses the quality of the probe data by proposing a method that determines the minimum sample rate needed to check data adequacy. The minimum sample rate is defined as the minimum amount of speed data required for a segment to keep the speed estimates within a defined error range. The proposed method adopts a bootstrapping approach to determine the minimum sample rate within a pre-defined acceptance level. Applying the method to the speed data yields a minimum sample rate of 10% for Kentucky's roads; this cut-off value helps to identify the segments where data availability exceeds the minimum. This study also shows two applications of the minimum sample rates resulting from the bootstrapping. First, the results are used to identify the geometric and operational factors that contribute to the minimum sample rate of a facility. Using a random forests regression model as a tool, functional class, section length, and speed limit are found to be significant variables for uninterrupted facilities; for interrupted facilities, signal density, section length, speed limit, and intersection density are the significant variables. Second, the speed data associated with these segments are applied to improve free-flow speed estimation in the traditional model.
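
The bootstrapping procedure described above might be sketched as follows: for each candidate sample rate, resample one segment's probe speeds and check how often the estimate stays within a defined error range of the full-sample mean. The tolerance, candidate rates, and synthetic speeds are illustrative assumptions.

```python
# Sketch: bootstrap a segment's speed observations at several sample rates
# and report the share of estimates within a tolerance of the full mean.
import numpy as np

rng = np.random.default_rng(8)
speeds = rng.normal(62, 6, size=2000)   # one segment's probe speeds (mph)
full_mean = speeds.mean()
tolerance = 2.0                         # acceptable error, mph (assumption)
n_boot = 1000

for rate in (0.02, 0.05, 0.10, 0.20):
    n = max(2, int(rate * len(speeds)))
    boot_means = np.array([
        rng.choice(speeds, size=n, replace=True).mean()
        for _ in range(n_boot)
    ])
    # Share of bootstrap estimates within the defined error range.
    ok = np.mean(np.abs(boot_means - full_mean) <= tolerance)
    print(f"sample rate {rate:4.0%}: {ok:.0%} of estimates within "
          f"±{tolerance} mph")
```
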
