Global ETD Search

161	Mapeamento digital de solos e o mapa de solos como ferramenta para classificação de aptidão de uso das terras / Digital soil mapping and soil map as a tool for classification of land suitability Höfig, Pedro January 2014 In Brazil, soil mapping of the entire national territory is a standing demand of research institutions and planning agencies, because it is an important tool for planning the rational occupation of land. Digital Soil Mapping (DSM) has emerged as an alternative that makes soil surveys more feasible by using relief-related information to map soils. This study aims to test DSM methodologies with extrapolation to a physiographically similar area, and to reclassify the pedological map generated by DSM into an agricultural land suitability map and compare it with the interpretive map derived from the conventional soil map. Given the scarcity of existing data for the Encosta do Sudeste region of Rio Grande do Sul, the work was carried out in Sentinela do Sul and Cerro Grande do Sul. The DSM used decision trees (DT) as predictive models, testing a single general model for the whole area and also the combined use of two prediction models. Since DSM normally maps soil classes and properties, and its use for generating agricultural land suitability maps is unknown, the working hypothesis is that such maps can be created by reclassifying the soil map generated by DSM. The combined DT models were more accurate and reproduced the conventional soil map better. The extrapolation to the municipality of Cerro Grande do Sul proved efficient. When classifying agricultural land suitability, the agreement between the conventional map and the predicted maps was greater than the agreement between the soil maps, although the land suitability system needs adjustments to reflect local conditions. Agricultural suitability Digital mapping Land use Soil classification Sentinela do Sul (RS) Cerro Grande do Sul (RS) Decision trees Soil survey
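The finding that agreement is higher for suitability maps than for soil maps can be illustrated with a small sketch. It is not the thesis's procedure: the lookup table, the cell values, and the function names `reclassify` and `agreement` are all hypothetical, chosen only to show that a many-to-one reclassification can turn some soil-class disagreements into suitability-class agreements.

```python
from typing import Dict, List

# Hypothetical lookup from soil class to land suitability class.
# The assignments are invented for illustration, not taken from the thesis.
SUITABILITY: Dict[str, str] = {
    "Argissolo": "good",
    "Cambissolo": "regular",
    "Planossolo": "regular",
    "Neossolo": "restricted",
}


def reclassify(soil_map: List[str], lookup: Dict[str, str]) -> List[str]:
    """Map every cell's soil class to its suitability class."""
    return [lookup[soil_class] for soil_class in soil_map]


def agreement(map_a: List[str], map_b: List[str]) -> float:
    """Fraction of cells on which two equally sized maps agree."""
    if not map_a or len(map_a) != len(map_b):
        raise ValueError("maps must be non-empty and of equal size")
    matches = sum(a == b for a, b in zip(map_a, map_b))
    return matches / len(map_a)


# Toy maps flattened to lists of cells (hypothetical data).
conventional = ["Argissolo", "Cambissolo", "Planossolo", "Neossolo", "Argissolo"]
predicted = ["Argissolo", "Planossolo", "Cambissolo", "Neossolo", "Cambissolo"]

soil_agreement = agreement(conventional, predicted)
suitability_agreement = agreement(
    reclassify(conventional, SUITABILITY),
    reclassify(predicted, SUITABILITY),
)
print(soil_agreement, suitability_agreement)
```

In this toy case the two maps agree on 2 of 5 cells as soil maps but on 4 of 5 cells after reclassification, because Cambissolo and Planossolo fall into the same assumed suitability class.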
162	Uma abordagem para a indução de árvores de decisão voltada para dados de expressão gênica / An Approach for the Induction of Decision Trees Focused on Gene Expression Data Pedro Santoro Perez 18 April 2012 Gene expression studies have been of great importance, enabling the development of new therapies, diagnostic tests, and drugs, and the understanding of a wide range of biological processes. Nevertheless, these studies face several obstacles: a very large number of genes, of which usually only a few are relevant to the problem at hand; noise in the data; and many others. This research project consists of the study of decision tree induction algorithms; the definition of a methodology capable of handling gene expression data using decision trees; and the implementation of that methodology as algorithms that can extract knowledge from this kind of data. Decision tree induction searches for relevant characteristics in the data that allow a concept to be modeled precisely, but it is also concerned with the comprehensibility of the generated model, which helps specialists discover new knowledge, something important in the medical and biological fields. On the other hand, such inducers are relatively unstable: small changes in the training data can produce very different models. This is one of the problems addressed in this Master's project. The main problem addressed, however, is the behavior of these inducers on high-dimensional data, specifically gene expression data: irrelevant attributes harm learning, and many models with similar performance may be generated. Several techniques were explored to tackle these problems, but the study concentrated on two: windowing, the most thoroughly explored technique, for which this project proposed a series of modifications aimed at improving its performance; and lookahead, which builds each node of the tree taking subsequent steps of the induction process into account. For windowing, the study explored the pruning of the trees generated during intermediate steps of the algorithm; the use of the estimated error in place of the training error; weighting of the error computed during induction according to the size of the current window; and the use of classification confidence to decide which examples to use when updating the current window. For lookahead, a one-step version was implemented: to make the decision in the current iteration, the inducer takes into account the information gain ratio of the next step. The results, especially for performance measures based on the complexity and comprehensibility of the induced models, show that the proposed algorithms outperform classical tree induction algorithms. Bioinformatics Decision Trees Gene Expression Lookahead Machine Learning Windowing
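The lookahead variant described above relies on the information gain ratio as its split criterion. As a point of reference, here is a minimal sketch of the standard gain ratio for one categorical attribute (information gain divided by the split information). It does not reproduce the thesis's lookahead or windowing algorithms; the function names and the toy expression data are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain of splitting on `feature`, divided by split information."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    partitions: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information is the entropy of the feature values themselves.
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical discretized expression levels of one gene and sample classes.
expression = ["high", "high", "low", "low"]
classes = ["tumor", "tumor", "normal", "normal"]
print(gain_ratio(expression, classes))
```

A one-step lookahead inducer would, under this reading, evaluate such a score for the candidate splits of the following step before committing to the current split; how the thesis combines the two steps is not specified in the abstract.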
163	Novel Learning-Based Task Schedulers for Domain-Specific SoCs January 2020 This Master’s thesis includes the design, on-chip integration, and evaluation of a set of imitation learning (IL)-based scheduling policies: deep neural network (DNN) and decision tree (DT). We first developed IL-based scheduling policies for heterogeneous systems-on-chips (SoCs). Then, we tested these policies using a system-level domain-specific system-on-chip simulation framework [11]. Finally, we transformed them into efficient code using a cloud engine [1] and implemented them on a user-space emulation framework [61] on a Unix-based SoC. IL is one area of machine learning (ML) and a useful method for training artificial intelligence (AI) models by imitating the decisions of an expert or Oracle that knows the optimal solution. The primary focus of this thesis is to adapt an ML model to work on-chip and optimize the resource allocation for a set of domain-specific wireless and radar systems applications. Evaluation results with four streaming applications from the wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle expert with more than 97% accuracy and 1.20× faster execution time. The models have been implemented as an add-on, making them easy to port to other SoCs. / Dissertation/Thesis / Masters Thesis Computer Engineering 2020 Computer engineering Electrical engineering decision trees deep neural networks domain-specific systems-on-chip (DSSoC) imitation learning scheduling
164	Dolování dat / Data Mining Stehno, David January 2013 The aim of the thesis was to study and describe the CRISP-DM data mining methodology. A prediction was performed on a collected database of calls to a call center, following the CRISP-DM methodology. In the modeling phase, four different methods were tested: k-NN, a neural network, linear regression, and a support vector machine. The importance of the input attributes for further prediction was evaluated based on different selections. The results and findings may support more accurate forecasts in the future, not only of the number of calls but also of other indicators relevant to the call center.
165	Natural Language Explanation Model for Decision Trees Silva, Jesús, Hernández Palma, Hugo, Niebles Núñez, William, Ruiz-Lazaro, Alex, Varela, Noel 07 January 2020 This study describes a model of explanations in natural language for classification decision trees. The explanations include global aspects of the classifier and local aspects of the classification of a particular instance. The proposal is implemented in the ExpliClas open source Web service [1], which in its current version operates on trees built with Weka and data sets with numerical attributes. The feasibility of the proposal is illustrated with two example cases, where the detailed explanation of the respective classification trees is shown. Decision trees Web services Classification decision Classification trees Global aspects Natural language explanations Natural languages Numerical attributes
166	Minska risk för vindskador i granbestånd – hur fungerar ett verktyg för riskanalys i praktiken / Reducing the risk of wind damage in spruce forest stands – evaluating a practical tool Wimarson, Anders January 2021 Strong winds cause major damage to Swedish forestry and society. It is therefore important to be able to identify the stands that have a high probability of suffering such damage. This requires a simple tool with which stands can be assessed using the equipment and knowledge available on Swedish forest holdings. This study evaluates and tests a tool developed by Olofsson & Blennow (2005). The results show that the tool works and is user-friendly. Of 90 assessments examined, 23% indicated a high probability of storm damage on the studied estate in northern Halland. The study also shows the importance of using current data and working with high accuracy when producing stand data. The most important parameters for assessing the probability were stand edge height and the height-to-diameter (HD) ratio. Wind damage spruce risk analysis decision trees climate change Forest Science
167	A Comparison of Machine Learning Techniques to Predict University Rates Park, Samuel M. 06 September 2019 No description available. Mathematics Statistics
168	Digital Education Resource Mining for Decision Support AL Fanah, Muna M.S. January 2021 Education has become a competitive and challenging domain, both nationally and internationally, in terms of the quality, visibility, and experience of academic delivery, affecting institutions, applicants, and regulatory bodies. Data are now more widely available for general and public use and play an increasingly significant role in decision support for education topics. For example, world university rankings (WUR) such as Quacquarelli Symonds (QS), the Center for World University Rankings (CWUR), and Times Higher Education (Times), as well as national university rankings (e.g. the Guardian newspaper Best UK Universities and the Complete University Guide league tables), have published their data for many years and are increasingly used in such decision-making processes by institutions and the general public. University rankings e-learners Classification Prediction Decision Trees Model-based clustering Markov Chains Education Resource mining Decision support
169	Development and validation of clinical prediction models to diagnose acute respiratory infections in children and adults from Canadian Hutterite communities. Vuichard Gysin, Danielle January 2016 Acute respiratory infections (ARI) caused by influenza and other respiratory viruses affect millions of people annually. Although usually self-limiting, a more complicated or severe course may occur in previously healthy people but is more likely in individuals with underlying illnesses. The most common viral agent is rhinovirus, whereas influenza is less frequent but is well known to cause winter epidemics. In primary care, rapid diagnosis of influenza virus infections is essential in order to provide treatment. Clinical presentations vary among the different pathogens but may overlap and may also depend on host factors. Predictive models have been developed for influenza, but study results may be biased because only individuals presenting with fever were included. Most of these models have not been adequately validated, and their predictive power, therefore, is likely overestimated. The main objective of this thesis was to compare different mathematical models for the derivation of clinical prediction rules in individuals presenting with symptoms of ARI, to better distinguish between influenza, influenza A subtypes, and entero-/rhinovirus-related illness in children and adults, and to evaluate model performance by using data-splitting for internal validation. Data from a completed prospective cluster-randomized trial for the indirect effect of influenza vaccination in children of Hutterite communities served as the basis of my thesis. There were a total of 3288 first episodes per season of ARI in 2202 individuals and 321 (9.8%) influenza-positive events over three influenza seasons (2008-2011). The data set was divided into children under 18 years and adults. 
Both data sets were randomly split by subjects into a derivation population (2/3 of the dataset) and a validation population (1/3 of the dataset). All predictive models were developed in the derivation sets. Demographic factors and the classical symptoms of ARI were evaluated with logistic regression and Cox proportional hazard models, using forward stepwise selection and applying robust estimators to account for non-independent data, and by means of recursive partitioning. The beta coefficients of the independent predictors were used to develop different point scores. These scores were then tested in the validation groups, and performance between validation and derivation sets was compared using receiver operating characteristic (ROC) curves. We determined sensitivities and specificities, positive and negative predictive values, and likelihood ratios at different cut-points which could reflect test and treatment thresholds. Fever, chills, and cough were the most important predictors in children, whereas chills and cough but not fever were most predictive of influenza virus infection in adults. Performance of the individual models was moderate, with areas under the receiver operating characteristic curves between 0.75 and 0.80 for the main outcome, influenza A or B virus infection. There was no statistically significant difference in performance between the derivation and validation sets for the main outcome. The results show that various mathematical models have similar discriminative ability to distinguish influenza from other respiratory viruses. The scores could assist clinicians in their decision-making. However, performance of the models was slightly overestimated due to potential clustering of data, and the results would first need to be validated in a different population before application in clinical practice. / Thesis / Master of Science (MSc) / Every year, millions of people are attacked by "the flu" or the common cold. 
Certain signs and symptoms apparently discriminate better between the common cold and the flu. However, the decision between starting a simple symptom-oriented treatment, treating empirically for influenza, or ordering a rapid diagnostic test that has only moderate sensitivity and specificity can be challenging. This thesis, therefore, aims to help physicians in their decision-making process by developing simple scores and decision trees for the diagnosis of influenza versus non-influenza respiratory infections. Data from a completed trial for the indirect effect of influenza vaccination in children of Hutterite communities served as the basis of my thesis. There were a total of 3288 first seasonal episodes of ARI in 2202 individuals and 321 (9.8%) influenza-positive events over three influenza seasons (2008-2011). The data set was divided into children under 18 years and adults. Both data sets were split into a derivation set and a validation set (the holdout group). Different mathematical models were applied to the derivation set, and demographic factors as well as the classical symptoms of ARI were evaluated. The scores generated from the most important factors that remained in the model were then tested in the validation group, and performance between validation and derivation sets was compared. Accuracy was determined at different cut-points which could reflect test and treatment thresholds. Fever, chills, and cough were the most important predictors in children, whereas chills and cough but not fever were most predictive of influenza virus infection in adults. Performance of the individual models was moderate for the main outcome, influenza A or B virus infection. There was no statistically significant difference in performance between the derivation and validation sets for the main outcome. The results show that various mathematical models have similar discriminative ability to distinguish influenza from other respiratory viruses. 
The scores could assist clinicians in their decision-making. However, the results would first need to be validated in a different population before application in clinical practice. Prediction models Recursive partitioning Generalized Estimating Equations Cox Proportional Hazard models Acute respiratory infections Score Influenza Decision trees
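The accuracy measures named in this abstract (sensitivity, specificity, predictive values, and likelihood ratios at a chosen cut-point) follow directly from a 2×2 table of score result against infection status. The sketch below uses the standard definitions; the function name `diagnostic_accuracy`, the point scores, and the outcomes are hypothetical and are not data from the thesis. It assumes every cell combination needed for a ratio is non-empty.

```python
from typing import Dict, Sequence


def diagnostic_accuracy(
    scores: Sequence[float], has_influenza: Sequence[bool], cut_point: float
) -> Dict[str, float]:
    """Accuracy measures when a score >= cut_point counts as test-positive."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_influenza):
        positive = score >= cut_point
        if positive and diseased:
            tp += 1
        elif positive and not diseased:
            fp += 1
        elif not positive and diseased:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical point scores and laboratory-confirmed influenza status.
scores = [0, 1, 1, 2, 2, 3, 3, 3]
influenza = [False, False, True, False, True, False, True, True]
result = diagnostic_accuracy(scores, influenza, cut_point=2)
print(result)
```

Repeating the call with different values of `cut_point` gives the trade-off between sensitivity and specificity that an ROC curve summarizes.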
170	Modeling Nonignorable Missingness with Response Times Using Tree-based Framework in Cognitive Diagnostic Models Yang, Yi January 2023 As testing moves from paper-and-pencil to computer-based assessment, response accuracy (RA) and response time (RT) together offer the potential to improve the performance evaluation and ability estimation of test takers. Most joint models that use RAs and RTs simultaneously assume an IRT model for the RA measurement at the lower level, among which the hierarchical speed-accuracy (SA) model proposed by van der Linden (2007) is the most prevalent in the literature. Zhan et al. (2017) extended the SA model to cognitive diagnostic modeling (CDM) by proposing the hierarchical joint response and times DINA (JRT-DINA) model, but little is known about its generalizability in the presence of missing data. Large-scale assessments are used in educational effectiveness studies to quantify educational achievement, and in them the amount of item nonresponse is not negligible (Pohl et al., 2012; Pohl et al., 2019; Rose et al., 2017; Rose et al., 2010) due to lack of proficiency, lack of motivation, and/or lack of time. Treating unplanned missingness as ignorable leads to biased sample-based estimates of item and person parameters (R. J. A. Little & Rubin, 2020; Rubin, 1976); therefore, in the past few decades, intensive efforts have been focused on nonignorable missingness (Glas & Pimentel, 2008; Holman & Glas, 2005; Pohl et al., 2019; Rose et al., 2017; Rose et al., 2010; Ulitzsch et al., 2020a, 2020b). However, the great majority of these methods were limited in item nonresponse types and/or model complexity until J. Lu and Wang (2020) incorporated the mixture cure-rate model (Lee & Ying, 2015) and the tree-based IRT framework (Debeer et al., 2017), which has a built-in behavior process for item nonresponses and thus introduces no additional latent propensity parameters into the joint model. 
Nevertheless, these approaches were discussed within the IRT framework, and the traditional measurement models could not provide cognitive diagnostic information about attribute mastery. This dissertation first postulates the CDMTree model, an extension of the tree-based RT process model in CDM, and then explores its efficacy through a real data analysis using PISA 2012 computer-based assessment of mathematics data. The follow-up simulation study compares the proposed model to the JRT-DINA model under multiple conditions to deal with various types of nonignorable missingness, i.e. both omitted items (OIs) and not-reached items (NRIs) due to time limits. A fully Bayesian approach is used for the estimation of the model with the Markov chain Monte Carlo (MCMC) method. Cognitive Diagnostic Battery Education--Computer-assisted instruction Decision trees Machine learning Bayesian statistical decision theory Monte Carlo method
