  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
331

Applications of machine learning to agricultural land values: prediction and causal inference

Er, Emrah January 1900 (has links)
Doctor of Philosophy / Department of Agricultural Economics / Nathan P. Hendricks / This dissertation focuses on the prediction of agricultural land values and the effects of water rights on land values, using machine learning algorithms and hedonic pricing methods. I predict agricultural land values with several machine learning algorithms, including ridge regression, the least absolute shrinkage and selection operator (LASSO), random forests, and extreme gradient boosting. To analyze the causal effect of water right seniority on agricultural land values, I use the double-selection LASSO technique. The second chapter presents the data used in the dissertation. A unique set of parcel sales from the Property Valuation Division of Kansas constitutes the backbone of the estimation data. Along with the parcel sales data, I collected detailed basis, water, tax, soil, weather, and urban influence data. This chapter provides a detailed explanation of the various data sources and variable construction processes. The third chapter presents different machine learning models for predicting irrigated agricultural land prices in Kansas. Researchers and policymakers use different models and data sets for price prediction, and recently developed machine learning methods have the power to improve the predictive ability of the estimated models. In this chapter I estimate several machine learning models for predicting agricultural land values in Kansas. Results indicate that the predictive power of the machine learning methods is stronger than that of standard econometric methods: the median absolute error is 0.1312 for the extreme gradient boosting estimation versus 0.6528 for the simple OLS model. The fourth chapter examines whether water right seniority is capitalized into irrigated agricultural land values in Kansas. Using a unique data set of irrigated agricultural land sales, I analyze the causal effect of water right seniority on agricultural land values.
A possible concern when estimating hedonic models is omitted variable bias, so I use double-selection LASSO regression, exploiting its variable selection properties to mitigate this bias. I also estimate generalized additive models to analyze the nonlinearities that may exist. Results show that water rights have a positive impact on irrigated land prices in Kansas: an additional year of water right seniority causes irrigated land value to increase by nearly $17 per acre. Further analysis also suggests a nonlinear relationship between seniority and agricultural land prices.
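To illustrate the family of shrinkage estimators the third chapter compares, here is a minimal numpy sketch of closed-form ridge regression (lam = 0 recovers OLS), evaluated with median absolute error, the metric reported above. The data are simulated stand-ins; the dissertation's actual features come from Kansas parcel records.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for parcel-level features (soil, water, weather, ...);
# the real data are Kansas parcel sales, not simulated draws.
n, p = 200, 10
X = rng.normal(size=(n, p))
true_beta = rng.normal(size=p)
y = X @ true_beta + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    """Closed-form ridge estimator: (X'X + lam*I)^(-1) X'y; lam = 0 gives OLS."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def median_abs_error(y, y_hat):
    """Median absolute prediction error, the metric quoted in the abstract."""
    return float(np.median(np.abs(y - y_hat)))

b_ols = ridge_fit(X, y, lam=0.0)
b_ridge = ridge_fit(X, y, lam=5.0)
print("OLS   MAE:", median_abs_error(y, X @ b_ols))
print("ridge MAE:", median_abs_error(y, X @ b_ridge))
```

In practice one would tune `lam` by cross-validation; the point here is only the closed-form estimator and the error metric.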
332

Geometry and uncertainty in deep learning for computer vision

Kendall, Alex Guy January 2019 (has links)
Deep learning and convolutional neural networks have become the dominant tools for computer vision. These techniques excel at learning complicated representations from data using supervised learning. In particular, image recognition models now out-perform human baselines under constrained settings. However, the science of computer vision aims to build machines which can see, which requires models that can extract richer information than recognition from images and video. In general, applying these deep learning models beyond recognition to other problems in computer vision is significantly more challenging. This thesis presents end-to-end deep learning architectures for a number of core computer vision problems: scene understanding, camera pose estimation, stereo vision, and video semantic segmentation. Our models outperform traditional approaches and advance the state of the art on a number of challenging computer vision benchmarks. However, these end-to-end models are often not interpretable and require enormous quantities of training data. To address this, we make two observations: (i) we do not need to learn everything from scratch, because we know a lot about the physical world, and (ii) we cannot know everything from data, so our models should be aware of what they do not know. This thesis explores these ideas using concepts from geometry and uncertainty. Specifically, we show how to improve end-to-end deep learning models by leveraging the underlying geometry of the problem. We explicitly model concepts such as epipolar geometry to learn with unsupervised learning, which improves performance. Secondly, we introduce ideas from probabilistic modelling and Bayesian deep learning to understand uncertainty in computer vision models. We show how to quantify different types of uncertainty, improving safety for real world applications.
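A common way to quantify the different types of uncertainty mentioned above is to average several stochastic forward passes (e.g. Monte Carlo dropout) and decompose the predictive entropy into an aleatoric (data noise) and an epistemic (model) component. This numpy sketch uses simulated logits in place of a real network.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax over the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1):
    """Shannon entropy in nats, with a small epsilon for stability."""
    return -(p * np.log(p + 1e-12)).sum(axis=axis)

rng = np.random.default_rng(1)
# T stochastic forward passes for one input, K classes (simulated logits)
T, K = 50, 3
logits = rng.normal(loc=[2.0, 0.0, -1.0], scale=0.7, size=(T, K))
probs = softmax(logits)

p_mean = probs.mean(axis=0)        # predictive distribution
total = entropy(p_mean)            # total predictive uncertainty
aleatoric = entropy(probs).mean()  # expected entropy: data-noise proxy
epistemic = total - aleatoric      # mutual information: model uncertainty
print(p_mean, total, aleatoric, epistemic)
```

Because entropy is concave, the epistemic term is guaranteed non-negative; it shrinks as the stochastic passes agree with one another.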
333

Applications of econometrics and machine learning to development and international economics

Hersh, Jonathan Samuel 07 November 2018 (has links)
In the first chapter, I explore whether features derived from high resolution satellite images of Sri Lanka are able to predict poverty or income in local areas. I extract from satellite imagery area-specific indicators of economic well-being, including the number of cars, type and extent of crops, length and type of roads, roof extent and roof type, building height, and number of buildings. Estimated models are able to explain between 60 and 65 percent of the village-specific variation in poverty and in the average level of log income. The second chapter investigates the effects of preferential trade programs such as the U.S. African Growth and Opportunity Act (AGOA) on the direction of African countries’ exports. While these programs intend to promote African exports, textbook models of trade suggest that such asymmetric tariff reductions could divert African exports from other destinations to the tariff-reducing economy. I examine the import patterns of 177 countries and estimate the diversion effect using a triple-difference estimation strategy, which exploits time variation in the product and country coverage of AGOA. I find no evidence of systematic trade diversion within Africa, but do find evidence of diversion from other industrialized destinations, particularly for apparel products. In the third chapter I apply three model selection methods (Lasso regularized regression, Bayesian Model Averaging, and Extreme Bound Analysis) to candidate variables in a gravity model of trade. I use a panel dataset of 198 countries covering the years 1970 to 2000, and find that the model selection methods flag many fewer variables as robust than the null hypothesis rejection methodology from ordinary least squares suggests.
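The triple-difference logic of the second chapter can be illustrated with hypothetical group means: difference imports over time, then across covered versus uncovered products, then across the tariff-reducing destination versus another destination. All numbers below are invented for illustration; the actual estimation uses trade flows for 177 importers.

```python
# Hypothetical mean log-import values, keyed by
# (destination, product coverage, period). "US" is the AGOA tariff-reducing
# destination; "EU" stands in for an alternative destination.
m = {
    ("US", "covered", "pre"): 1.00, ("US", "covered", "post"): 1.60,
    ("US", "other",   "pre"): 1.00, ("US", "other",   "post"): 1.10,
    ("EU", "covered", "pre"): 1.00, ("EU", "covered", "post"): 0.95,
    ("EU", "other",   "pre"): 1.00, ("EU", "other",   "post"): 1.05,
}

def diff(dest, prod):
    """First difference: post-period mean minus pre-period mean."""
    return m[(dest, prod, "post")] - m[(dest, prod, "pre")]

# Double difference within each destination (covered vs. other products),
# then difference across destinations: the triple-difference estimate.
ddd = (diff("US", "covered") - diff("US", "other")) \
    - (diff("EU", "covered") - diff("EU", "other"))
print(ddd)
```

In the chapter this contrast is estimated in a regression with fixed effects rather than from raw cell means, but the identifying variation is the same.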
334

The OGCleaner: Detecting False-Positive Sequence Homology

Fujimoto, Masaki Stanley 01 June 2017 (has links)
Within bioinformatics, phylogenetics is the study of the evolutionary relationships between different species and organisms. The genetic revolution has caused an explosion in the amount of raw genomic information available to scientists for study. While available data have grown explosively, analysis methods have lagged behind. A key task in phylogenetics is identifying homology clusters. Current methods rely on heuristics based on pairwise sequence comparison to identify homology clusters. We propose the Orthology Group Cleaner (the OGCleaner) as a method for cluster-level verification of putative homology clusters, enabling higher-quality phylogenetic tree reconstruction.
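As a toy illustration of the pairwise-comparison heuristics the OGCleaner is meant to improve upon, the sketch below greedily clusters sequences by fractional identity. This is not the OGCleaner's algorithm; the identity measure and the 0.8 threshold are arbitrary stand-ins for a real alignment score.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences
    (a crude stand-in for a real pairwise alignment score)."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_clusters(seqs, threshold=0.8):
    """Greedy single-linkage clustering on pairwise identity: join a sequence
    to the first cluster containing a close-enough member."""
    clusters = []
    for s in seqs:
        for c in clusters:
            if any(identity(s, t) >= threshold for t in c):
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters

seqs = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "TTTTCCCA"]
print(greedy_clusters(seqs))
```

Heuristics like this can merge non-homologous sequences or split homologous ones, which is exactly the false-positive problem cluster-level verification targets.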
335

Creating and Automatically Grading Annotated Questions

Wood, Alicia Crowder 01 September 2016 (has links)
We have created a question type that allows teachers to easily create questions, provides an intuitive user experience for students taking them, and reduces the time it currently takes teachers to grade and provide feedback to students. This question type, an "annotated" question, allows teachers to test students' knowledge in a particular subject area by having students "annotate," or mark, text and video sources to answer questions. Through user testing we determined that, overall, the interface and the implemented system decrease the time it would take a teacher to grade annotated quiz questions. However, there are some limitations in the way students answered text annotated questions that would require a rewrite of the user interface and system design to decrease the grading time even further for teachers.
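One plausible way to automate grading of annotated answers is to score each answer-key span by how well some student-highlighted span overlaps it. The spans, threshold, and scoring rule below are invented for illustration and are not claimed to match the thesis's actual implementation.

```python
def overlap(a, b):
    """Length of the intersection of two half-open [start, end) spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def score_annotation(student_spans, key_spans, threshold=0.5):
    """Count a key span as correct if some student span covers at least
    `threshold` of it; return the fraction of key spans matched.
    (Hypothetical rule, for illustration only.)"""
    hits = 0
    for k in key_spans:
        k_len = k[1] - k[0]
        if any(overlap(s, k) / k_len >= threshold for s in student_spans):
            hits += 1
    return hits / len(key_spans)

key = [(10, 20), (40, 60)]        # character offsets in the source text
student = [(12, 22), (100, 110)]  # one good highlight, one miss
print(score_annotation(student, key))
```

Video annotations could be scored the same way by treating timestamps rather than character offsets as the span endpoints.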
336

Local properties and rupture characteristics of thoracic aortic aneurysm tissue

Luo, Yuanming 01 May 2018 (has links)
Ascending thoracic aortic aneurysms (ATAAs) are focal dilatations in the aorta that are prone to rupture or dissection. Currently, the clinically used indicator of rupture risk is the diameter; however, it has been demonstrated that the diameter alone may not properly predict the risk. To evaluate the rupture risk, one must look into the local mechanical conditions at the rupture site and understand how rupture is triggered in the tissue, which is a layered fibrous medium. A challenge facing experimental studies of ATAA rupture is that the ATAA tissue is highly heterogeneous; experimental protocols that operate under the premise of tissue homogeneity will have difficulty delineating the heterogeneous properties. In general, rupture initiates at the location where the micro-structure starts to break down; consequently, it is more meaningful to investigate the local conditions at the rupture site. In this work, a combined experimental and computational method was developed and employed to characterize wall stress, strain, and property distributions in harvested ATAA samples at sub-millimeter resolution. The results show that all tested samples exhibit a significant degree of heterogeneity in their mechanical properties. Large inter-subject variability is also observed. A heterogeneous anisotropic finite-strain hyperelastic model was introduced to describe the tissue, and the distributions of the material parameters were identified. The elastic energy stored in the tissue was computed. It was found that the tissue fractures preferentially in the direction of the highest stiffness, generating orifices that are locally transverse to the peak stiffness direction. The rupture appears to initiate at the position that absorbed the highest energy. Machine learning was used to classify the curves at rupture and non-rupture locations, using features including material properties and geometric characteristics of the curves.
The work showed that the rupture and non-rupture states can indeed be classified using pre-rupture response features. Support vector machine (SVM) and random forest algorithms were employed to provide insight into the importance of the features. Guided by the importance scores provided by the random forest, the rupture groups were interrogated, and some strong correlations between the strength and the response features were revealed. In particular, it was found that the strength correlates strongly with the tension at the point where the curvature of the total tension-strain curve attains its maximum, which occurs early in the response.
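The curvature-based feature highlighted above, the point where the tension-strain curve's curvature attains its maximum, can be located numerically with finite differences. The response curve below is a synthetic stand-in for measured ATAA data, not an actual experimental curve.

```python
import numpy as np

# Hypothetical stiffening-then-plateau tension-strain response; real curves
# come from sub-millimeter-resolution tests on harvested ATAA samples.
strain = np.linspace(0.0, 0.4, 200)
tension = np.tanh(8.0 * strain)

# Curvature of y(x): |y''| / (1 + y'^2)^(3/2), via finite differences
dy = np.gradient(tension, strain)
d2y = np.gradient(dy, strain)
curvature = np.abs(d2y) / (1.0 + dy**2) ** 1.5

i = int(np.argmax(curvature))
print("max curvature at strain", strain[i], "tension", tension[i])
```

The tension value at that index is the candidate feature; in the study, its correlation with measured strength is what makes it a useful predictor.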
337

STREAMLINING CLINICAL DETECTION OF ALZHEIMER’S DISEASE USING ELECTRONIC HEALTH RECORDS AND MACHINE LEARNING TECHNIQUES

Unknown Date (has links)
Alzheimer’s disease is typically detected using a combination of cognitive-behavioral assessment exams and interviews of both the patient and a family member or caregiver, administered and interpreted by a trained physician. This procedure, while standard in medical practice, can be time consuming and expensive for both the patient and the diagnostician, especially because proper training is required to interpret the collected information and determine an appropriate diagnosis. The use of machine learning techniques to augment diagnostic procedures has previously been examined in a limited capacity, but to date no research has examined real-world medical applications of predictive analytics for health records and cognitive exam scores. This dissertation examines the efficacy of detecting cognitive impairment due to Alzheimer’s disease using machine learning, including multi-modal neural network architectures, with a real-world clinical dataset used to determine the accuracy and applicability of the generated models. An in-depth analysis of each type of data (e.g. cognitive exams, questionnaires, demographics) as well as the cognitive domains examined (e.g. memory, attention, language) is performed to identify the most useful targets; cognitive exams and questionnaires were found to be the most useful features, and short-term memory, attention, and language the most important cognitive domains. In an effort to reduce medical costs and streamline procedures, optimally predictive and efficient groups of features were identified and selected; the best performing and most economical group contained only three questions and one cognitive exam component, producing an accuracy of 85%. The most effective diagnostic scoring procedure was also examined, with simple threshold counting based on medical documentation identified as the most useful.
Overall, the predictive analysis found that Alzheimer’s disease can be detected most accurately using a bimodal multi-input neural network model with separated cognitive domains and questionnaires, achieving a detection accuracy of 88% on the real-world testing set, and that analyzing domains separately significantly improves model efficacy compared to models that combine them. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2019. / FAU Electronic Theses and Dissertations Collection
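A minimal sketch of the "threshold counting" scoring idea described above: flag impairment when enough exam components fall below their documented cutoffs. The component names, thresholds, and cutoff here are invented placeholders, not the dissertation's actual criteria.

```python
# Hypothetical component thresholds and decision cutoff (illustration only).
THRESHOLDS = {"recall": 5, "clock_draw": 3, "fluency": 11}
CUTOFF = 2  # number of sub-threshold components needed to flag impairment

def flag_impairment(scores, thresholds=THRESHOLDS, cutoff=CUTOFF):
    """Count components scoring below their threshold; flag if the count
    reaches the cutoff."""
    below = sum(scores[name] < t for name, t in thresholds.items())
    return below >= cutoff

print(flag_impairment({"recall": 3, "clock_draw": 2, "fluency": 14}))  # two below
print(flag_impairment({"recall": 6, "clock_draw": 4, "fluency": 14}))  # none below
```

The appeal of such a rule is transparency: unlike a neural network, each flag can be traced to named components, which matters in clinical documentation.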
338

DATA-DRIVEN MULTISCALE PREDICTION OF MATERIAL PROPERTIES USING MACHINE LEARNING ALGORITHMS

Moonseop Kim (7326788) 16 October 2019 (has links)
The objective of this study is to combine molecular dynamics (MD) simulations and machine learning so that they complement each other. The study proceeds in four steps.

First, for the theory part of molecular dynamics, empirical potentials are developed for silicon nanowires. Many-body empirical potentials have been developed over the last three decades, and with the advance of supercomputers these potentials are expected to be even more useful over the next three. Atomistic calculations using empirical potentials can be particularly useful in understanding the structural aspects of Si or Si-H systems; however, the parameters of existing empirical potentials contain many errors. We propose a novel technique for understanding and constructing interatomic potentials with an emphasis on parameter fitting, in which the relationship between material properties and potential parameters is explained. The input database was obtained from density functional theory (DFT) calculations with the Vienna ab initio simulation package (VASP), using the projector augmented-wave method within the generalized gradient approximation. The DFT data are used in the fitting process to guarantee compatibility within the context of multiscale modeling.

Second, for the application part of the MD simulations, this research focused on the enhancement of mechanical properties using MEAM potentials. For instance, Young’s modulus, ultimate tensile strength, true strain, true stress, and the stress-strain relationship were calculated for nanosized Cu precipitates produced by quenching and partitioning (Q&P) processing and for nanosized Fe3C-strengthened ultrafine-grained (UFG) ferritic steel. For the stress-strain relationship, the simulated structure, defined with a constant number of particles, constant energy, and constant volume (the NVE ensemble), is pulled in the y-direction, perpendicular to the boundary interface, to increase strain. The strain is increased a specified number of times in a loop, and the stress is calculated at each point of the simulation loop.

Third, building on the MD simulations, machine learning and peridynamics are applied to the prediction of disk damage patterns. Peridynamics is a nonlocal extension of classical continuum mechanics and, like the MD model, is nonlocal. FEM, in particular, is based on partial differential equations, but partial derivatives do not exist on crack and damage surfaces. To overcome this problem, peridynamics, which is based on integral equations, was used, remedying deficiencies in the modeling of deformation discontinuities. In this study, (i) in the forward problem, given images of damage and cracks, crack patterns are predicted using trained data and compared to the true solutions obtained by varying the x and y hitting coordinates on the disk; (ii) in the inverse problem, given images of damage and cracks, the corresponding hitting location, indenter velocity, and indenter size are predicted using trained data. Furthermore, we performed regression analysis on the images of the crack patterns with neural processes to predict the crack patterns. In the regression problem, plotting the variance over epochs confirms that the variance decreases as the number of epochs increases through the neural processes. The result of the training therefore gradually improves, with the variance ranging from 0 to 0.035. The most critical point of this study is that neural processes make accurate predictions even if the training data are missing or insufficient. The results show that when the context points are set to 10, 100, 300, and 784, with training information deliberately omitted at 10, 100, and 300 context points, the predictions differ when the number of context points is significantly lower. However, when comparing the results for 100 and 784 context points, the predictions are very similar to each other because of the Gaussian processes underlying the neural processes. Therefore, if the training data are trained through neural processes, missing information in the training data can be supplemented to predict the results.

Finally, we applied deep learning to various data beyond the MD simulation data: cryo-EM images and line trip (LT) data from power systems. Deep learning was applied to reduce the effort of selecting high-quality particles; this study proposes a learning framework using deep learning, with the ultimate goal of freeing users from manually selecting high-quality particles. For predicting the line trip data and detecting bad data, we chose to analyze the frequency signal, because the frequency changes suddenly in the power system due to events such as a generator trip, line trip, or load shedding in large power systems.
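The strain loop described in the second step, incrementing strain a fixed number of times and computing stress at each point, can be sketched schematically. A toy linear-elastic law stands in for the actual MEAM/NVE molecular-dynamics force evaluation, and the modulus and increment values are invented.

```python
# Schematic of the stress-strain loop: strain is incremented a specified
# number of times and stress is recorded at each step.
E = 200.0          # hypothetical elastic modulus (GPa), illustration only
d_strain = 0.001   # strain increment per loop iteration
n_steps = 50

strain, history = 0.0, []
for _ in range(n_steps):
    strain += d_strain   # pull in the y-direction
    stress = E * strain  # in the MD run this is the virial stress from NVE
    history.append((strain, stress))

print(history[-1])
```

In the actual simulations the stress at each step comes from the interatomic forces of the MEAM potential under the NVE ensemble, so the recorded curve is nonlinear rather than the straight line this toy law produces.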
339

Regularized Discriminant Analysis: A Large Dimensional Study

Yang, Xiaoke 28 April 2018 (has links)
In this thesis, we study the performance of general regularized discriminant analysis (RDA) classifiers. The data used for analysis are assumed to follow a Gaussian mixture model with different means and covariances. RDA offers a rich class of regularization options, covering as special cases the regularized linear discriminant analysis (RLDA) and regularized quadratic discriminant analysis (RQDA) classifiers. We analyze RDA under the double asymptotic regime, in which the data dimension and the training size grow proportionally. This double asymptotic regime allows the application of fundamental results from random matrix theory. Under the double asymptotic regime and some mild assumptions, we show that the asymptotic classification error converges to a deterministic quantity that depends only on the data statistical parameters and dimensions. This result not only reveals mathematical relations between the misclassification error and the class statistics, but can also be leveraged to select the optimal parameters that minimize the classification error, thus yielding the optimal classifier. Validation on synthetic data shows good agreement with our theoretical findings. We also construct a general consistent estimator to approximate the true classification error when the underlying statistics are unknown. We benchmark the performance of our proposed consistent estimator against classical estimators on synthetic data. The observations demonstrate that the general estimator outperforms the others in terms of mean squared error (MSE).
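The RDA regularization family interpolates between QDA-like per-class covariances and an LDA-like pooled, identity-shrunk covariance. The numpy sketch below follows the general spirit of such estimators; the parameter names `alpha` and `gamma` are my own labels, not necessarily the thesis's notation.

```python
import numpy as np

def rda_covariances(covs, priors, alpha, gamma):
    """Regularized class covariances: shrink each class covariance toward the
    pooled one (alpha), then toward a scaled identity (gamma). alpha=0 keeps
    QDA-style class covariances; alpha=1, gamma=0 gives LDA's pooled one."""
    pooled = sum(p * S for p, S in zip(priors, covs))
    out = []
    for S in covs:
        Sa = (1 - alpha) * S + alpha * pooled
        d = Sa.shape[0]
        out.append((1 - gamma) * Sa + gamma * (np.trace(Sa) / d) * np.eye(d))
    return out

def classify(x, means, covs, priors):
    """Gaussian quadratic discriminant score per class; return the argmax."""
    scores = []
    for mu, S, p in zip(means, covs, priors):
        d = x - mu
        scores.append(-0.5 * (np.linalg.slogdet(S)[1]
                              + d @ np.linalg.solve(S, d)) + np.log(p))
    return int(np.argmax(scores))

means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]
priors = [0.5, 0.5]
regs = rda_covariances(covs, priors, alpha=0.5, gamma=0.1)
print(classify(np.array([0.2, -0.1]), means, regs, priors))  # near class 0
print(classify(np.array([2.9, 3.2]), means, regs, priors))   # near class 1
```

The thesis's contribution is characterizing the classification error of this family as both the dimension and the sample size grow, and choosing `alpha`/`gamma`-style parameters to minimize that asymptotic error.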
340

An architecture for situated learning agents

Mitchell, Matthew Winston, 1968- January 2003 (has links)
Abstract not available
