  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Weighing Machine Learning Algorithms for Accounting RWISs Characteristics in METRo: A comparison of Random Forest, Deep Learning & kNN

Landmér Pedersen, Jesper January 2019 (has links)
The Model of the Environment and Temperature of Roads (METRo), a numerical model for forecasting road conditions, solves the energy balance at the road surface and computes the temperature evolution of roads. It does so through a numerical modelling system that combines observations from Road Weather Information Stations (RWIS) with meteorological forecasts. While METRo provides tools for correcting errors at each station, such as those caused by regional differences or microclimates, this thesis proposes machine learning as a supplement to the METRo forecasts for capturing station characteristics. Controlled experiments compared four regression algorithms (recurrent neural network, dense neural network, random forest, and k-nearest neighbour) at predicting the squared deviation of METRo-forecasted road surface temperatures. The results reveal that the models using the random forest algorithm yielded the most reliable predictions of METRo deviations. However, the study also shows the promise of neural networks and the possible advantage of the seasonal adjustments they could offer.
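One of the four compared regressors, k-nearest neighbour, can be sketched in a few lines. The sketch below is a minimal illustration only: the features, targets, and k are invented, not the thesis's actual RWIS data or tuned parameters.

```python
import math

def knn_regress(train_X, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest
    training points under Euclidean distance. A toy stand-in for the
    kNN regressor compared in the thesis."""
    dists = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k

# Hypothetical station features (air temp, dew point, wind speed) mapped
# to the squared deviation of a forecast road-surface temperature.
X = [(1.0, -2.0, 3.0), (0.5, -1.5, 2.0), (5.0, 2.0, 1.0), (4.5, 1.0, 0.5)]
y = [0.9, 1.1, 0.2, 0.4]
print(knn_regress(X, y, (0.8, -1.8, 2.5), k=2))  # mean of the two nearest cases
```

The prediction is simply the average of the nearest neighbours' targets; the thesis's recurrent and dense networks would replace this lookup with a learned mapping.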
52

Artificial intelligence and Machine learning : a diabetic readmission study

Forsman, Robin, Jönsson, Jimmy January 2019 (has links)
The maturing of Artificial intelligence provides great opportunities for healthcare, but it also comes with new challenges. For Artificial intelligence to be adequate, a comprehensive analysis of the data is necessary, along with testing the data on multiple algorithms to determine which is most appropriate. In this study, a data set was gathered consisting of patients who either were or were not readmitted to hospital within 30 days of being admitted. The data were then analyzed and compared across different algorithms to determine the most appropriate one to use.
53

Fraud or Not?

Åkerblom, Thea, Thor, Tobias January 2019 (has links)
This paper uses statistical learning to examine and compare three statistical methods with the aim of predicting credit card fraud. The methods compared are Logistic Regression, K-Nearest Neighbour, and Random Forest. They are estimated on a data set of nearly 300,000 credit card transactions, with classification of fraud as the outcome variable, to determine their performance. The three models have different properties and advantages. The K-NN model performed best in this paper but has the disadvantage that it predicts the outcome accurately without explaining the data. Random Forest explains the variables but performs less precisely. The Logistic Regression model appears to be unfit for this specific data set.
54

Recognizing Combustion Variability for Control of Gasoline Engine Exhaust Gas Recirculation using Information from the Ion Current

Holub, Anna, Liu, Jie January 2006 (has links)
The ion current measured from the spark plug in a spark ignited combustion engine is used as basis for analysis and control of the combustion variability caused by exhaust gas recirculation. Methods for extraction of in-cylinder pressure information from the ion current are analyzed in terms of reliability and processing efficiency. A model for the recognition of combustion variability using this information is selected and tested on both simulated and car data.
56

Classification Of Forest Areas By K Nearest Neighbor Method: Case Study, Antalya

Ozsakabasi, Feray 01 June 2008 (has links) (PDF)
Among the various remote sensing methods that can be used to map forest areas, the K Nearest Neighbor (KNN) supervised classification method is becoming increasingly popular for creating forest inventories in some countries. In this study, the utility of the KNN algorithm is evaluated for forest/non-forest/water stratification. Antalya is selected as the study area. The data used are composed of Landsat TM and Landsat ETM satellite images, acquired in 1987 and 2002, respectively, SRTM 90 meters digital elevation model (DEM) and land use data from the year 2003. The accuracies of different modifications of the KNN algorithm are evaluated using Leave One Out, which is a special case of K-fold cross-validation, and traditional accuracy assessment using error matrices. The best parameters are found to be Euclidean distance metric, inverse distance weighting, and k equal to 14, while using bands 4, 3 and 2. With these parameters, the cross-validation error is 0.009174, and the overall accuracy is around 86%. The results are compared with those from the Maximum Likelihood algorithm. KNN results are found to be accurate enough for practical applicability of this method for mapping forest areas.
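The best-performing configuration reported above (inverse distance weighting with Euclidean distance) can be sketched as a weighted-vote kNN classifier. This is a schematic only: the pixel values, classes, and k below are invented, not Landsat reflectances, and the study's actual k was 14.

```python
import math
from collections import defaultdict

def idw_knn_classify(samples, labels, query, k=14):
    """Classify `query` by its k nearest samples, with each neighbour's
    vote weighted by inverse Euclidean distance, as in the parameter
    combination the study found best."""
    neighbours = sorted(
        (math.dist(s, query), lab) for s, lab in zip(samples, labels)
    )[:k]
    votes = defaultdict(float)
    for d, lab in neighbours:
        votes[lab] += 1.0 / (d + 1e-12)  # guard against zero distance
    return max(votes, key=votes.get)

# Toy pixels as (band4, band3, band2) values for a forest/non-forest/water
# stratification.
pixels = [(60, 30, 25), (62, 32, 27), (10, 12, 30), (12, 10, 28), (80, 70, 65)]
classes = ["forest", "forest", "water", "water", "non-forest"]
print(idw_knn_classify(pixels, classes, (61, 31, 26), k=3))  # -> forest
```

Because votes decay with distance, a query pixel surrounded by two close forest samples outvotes a single distant water sample even when all three fall inside k.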
57

Time-Series Classification: Technique Development and Empirical Evaluation

Yang, Ching-Ting 31 July 2002 (has links)
Many interesting applications involve decision prediction based on a time-series sequence or a set of time-series sequences, which are referred to as time-series classification problems. Past classification analysis research predominantly focused on constructing a classification model from training instances whose attributes are atomic and independent. Direct application of traditional classification analysis techniques to time-series classification problems requires the transformation of time-series data into non-time-series attributes by applying statistical operations (e.g., average, sum). However, such statistical transformation often results in information loss. In this thesis, we proposed the Time-Series Classification (TSC) technique, based on the nearest neighbor classification approach. The result of empirical evaluation showed that the proposed time-series classification technique had better performance than the statistical-transformation-based approach.
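The contrast between the two approaches can be made concrete with a 1-nearest-neighbour classifier that compares whole sequences pointwise. This is a simplified sketch of the nearest-neighbour idea, not the thesis's actual TSC technique, and the sequences are invented: the rising and falling examples share the same mean, which is exactly the information a mean/sum transformation would discard.

```python
import math

def nn_ts_classify(train_seqs, labels, query):
    """1-nearest-neighbour classification of an equal-length time series,
    using pointwise Euclidean distance so temporal shape is preserved."""
    best = min(
        (math.dist(seq, query), lab) for seq, lab in zip(train_seqs, labels)
    )
    return best[1]

# Rising and falling toy sequences with identical means.
seqs = [(1, 2, 3, 4), (0, 2, 3, 5), (4, 3, 2, 1), (5, 3, 2, 0)]
labels = ["rising", "rising", "falling", "falling"]
print(nn_ts_classify(seqs, labels, (1, 2, 4, 5)))  # -> rising
```

Averaging any of these sequences yields the same value, so a statistical-transformation-based classifier could not separate the classes, while the pointwise distance does.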
58

Nearest Neighbor Foreign Exchange Rate Forecasting with Mahalanobis Distance

Pathirana, Vindya Kumari 01 January 2015 (has links)
Foreign exchange (FX) rate forecasting has been a challenging area of study in the past. Various linear and nonlinear methods have been used to forecast FX rates. As currency data are nonlinear and highly correlated, forecasting through nonlinear dynamical systems is becoming more relevant. The nearest neighbor (NN) algorithm is one of the most commonly used nonlinear pattern recognition and forecasting methods, and it outperforms the available linear forecasting methods for high-frequency foreign exchange data. The basic idea behind the NN approach is to capture the local behavior of the data by selecting instances with similar dynamic behavior. Only the k histories most relevant to the present dynamical structure are used to predict the future; for this reason, the NN algorithm is also known as the k-nearest neighbor (k-NN) algorithm, where k is the number of chosen neighbors. In the k-nearest neighbor forecasting procedure, similar instances are identified through a distance function. Since the forecasts depend entirely on the chosen nearest neighbors, the distance plays a key role in the k-NN algorithm, and choosing an appropriate distance can improve the performance of the algorithm significantly. The distance most commonly used for k-NN forecasting in the past has been the Euclidean distance. Because of possible correlation among vectors at different time frames, distances that treat the components as uncorrelated, such as the Euclidean distance, are not very appropriate for foreign exchange data. Since the Mahalanobis distance captures these correlations, we suggest using it in the selection of neighbors. In the present study, we used five foreign currencies, among the most traded, to compare the performance of the k-NN algorithm with the traditional Euclidean and absolute distances against its performance with the proposed Mahalanobis distance.
The performances were compared in two ways: (i) forecast accuracy and (ii) transformation of the forecasts into a more effective technical trading rule. The results were obtained with real FX trading data and show that the method introduced in this work outperforms the other popular methods. Furthermore, we conducted a thorough investigation of the optimal parameter choice under different distance measures. We adopted the concept of distance-based weighting in the NN and compared the performance with that of the traditional unweighted NN forecasting algorithm. Time series forecasting methods such as the autoregressive integrated moving average (ARIMA) process are widely used in many areas of time series analysis. We compared the performance of the proposed Mahalanobis-distance-based k-NN forecasting procedure with that of a traditional ARIMA-based forecasting algorithm. In this case the forecasts were also transformed into a technical trading strategy to create buy and sell signals, and the two methods were evaluated for their forecasting accuracy and trading performance. Multi-step-ahead forecasting is an important aspect of time series forecasting. Although many researchers claim that the k-nearest neighbor forecasting procedure outperforms linear forecasting methods for financial time series data, the available work in the literature supports this claim only for one-step-ahead forecasting. One of our goals in this work was to improve FX trading with multi-step-ahead forecasting. A popular multi-step-ahead forecasting strategy was adopted to obtain forecasts more than one day ahead, and we performed a comparative study of single-step-ahead and multi-step-ahead trading strategies using five foreign currency data sets with the Mahalanobis-distance-based k-nearest neighbor algorithm.
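The core of the proposed neighbor selection is the Mahalanobis distance, sqrt((u-v)^T S^{-1} (u-v)) for covariance matrix S. A dependency-free sketch of the 2-D case follows; the covariance matrix is inverted by hand, and the vectors and covariances are invented rather than taken from the FX data used in the dissertation.

```python
import math

def mahalanobis_2d(u, v, cov):
    """Mahalanobis distance between two 2-D vectors given a 2x2
    covariance matrix. Real delay vectors would need the full estimated
    covariance, but the 2x2 case keeps the sketch self-contained."""
    d0, d1 = u[0] - v[0], u[1] - v[1]
    (a, b), (c, e) = cov
    det = a * e - b * c
    # Inverse of [[a, b], [c, e]] is (1/det) * [[e, -b], [-c, a]],
    # so (u-v)^T S^{-1} (u-v) expands to the quadratic form below.
    q = (e * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return math.sqrt(q)

# With the identity covariance the distance reduces to Euclidean,
# which serves as a quick sanity check.
print(mahalanobis_2d((1.0, 2.0), (4.0, 6.0), ((1.0, 0.0), (0.0, 1.0))))  # 5.0
# A strong positive correlation shrinks distances along the correlated
# direction, which is why it can pick better neighbors than Euclidean.
print(mahalanobis_2d((0.0, 0.0), (1.0, 1.0), ((1.0, 0.9), (0.9, 1.0))))
```

When the components are uncorrelated with unit variance the two metrics coincide; the advantage claimed in the dissertation appears precisely when the delay-vector components are correlated.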
59

Using machine learning techniques to simplify mobile interfaces

Sigman, Matthew Stephen 19 April 2013 (has links)
This paper explores how known machine learning techniques can be applied in unique ways to simplify software and therefore dramatically increase its usability. As software has increased in popularity, its complexity has increased in lockstep, to a point where it has become burdensome. By shifting the focus from the software to the user, great advances can be achieved by way of simplification. The example problem used in this report is well known: suggest local dining choices tailored to a specific person based on known habits and those of similar people. By analyzing past choices and applying likely probabilities, assumptions can be made to reduce user interaction, allowing the user to realize the benefits of the software faster and more frequently. This is accomplished with Java Servlets, Apache Mahout machine learning libraries, and various third party resources to gather dimensions on each recommendation.
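The recommendation step can be sketched as a tiny user-based collaborative filter. The report itself uses Apache Mahout's libraries for this; the plain-Python version below, with made-up users, restaurants, and ratings, only illustrates the similarity-weighted scoring idea.

```python
import math

def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def recommend(ratings, user):
    """Score places the user has not tried by similarity-weighted ratings
    from other users, then return the top-scoring one."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for place, rating in their.items():
            if place not in ratings[user]:
                scores[place] = scores.get(place, 0.0) + sim * rating
    return max(scores, key=scores.get)

ratings = {
    "alice": {"taqueria": 5, "diner": 2},
    "bob":   {"taqueria": 5, "diner": 1, "ramen": 5},
    "carol": {"taqueria": 1, "diner": 5, "buffet": 4},
}
print(recommend(ratings, "alice"))  # -> ramen
```

Because bob's tastes align closely with alice's, his highly rated restaurant dominates the score, which is the "similar people" behavior the paper relies on to cut down user interaction.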
60

Predicting gene–phenotype associations in humans and other species from orthologous and paralogous phenotypes

Woods, John Oates, III 21 February 2014 (has links)
Phenotypes and diseases may be related to seemingly dissimilar phenotypes in other species by means of the orthology of underlying genes. Such "orthologous phenotypes," or "phenologs," are examples of deep homology, and one member of the orthology relationship may be used to predict candidate genes for its counterpart. (There exists evidence of "paralogous phenotypes" as well, but validation is non-trivial.) In Chapter 2, I demonstrate the utility of including plant phenotypes in our database, and provide as an example the prediction of mammalian neural crest defects from an Arabidopsis thaliana phenotype, negative gravitropism defective. In the third chapter, I describe the incorporation of additional phenotypes into our database (including chicken, zebrafish, E. coli, and new C. elegans datasets). I present a method, developed in coordination with Martin Singh-Blom, for ranking predicted candidate genes by way of a k nearest neighbors naïve Bayes classifier drawing phenolog information from a variety of species. The fourth chapter relates to a computational method and application for identifying shared and overlapping pathways which contribute to phenotypes. I describe a method for rapidly querying a database of phenotype-gene associations for Boolean combinations of phenotypes which yields improved predictions. This method offers insight into the divergence of orthologous pathways in evolution. I demonstrate connections between breast cancer and zebrafish methylmercury response (through oxidative stress and apoptosis); human myopathy and plant red light response genes, minus those involved in water deprivation response (via autophagy); and holoprosencephaly and an array of zebrafish phenotypes. In the first appendix, I present the SciRuby Project, which I co-founded in order to bring scientific libraries to the Ruby programming language. I describe the motivation behind SciRuby and my role in its creation.
Finally, in Appendix B, I discuss the first beta release of NMatrix, a dense and sparse matrix library for the Ruby language, which I developed in part to facilitate and validate rapid phenolog searches. In this work, I describe the concept of phenologs as well as the development of the necessary computational tools for discovering phenotype orthology relationships, for predicting associated genes, and for statistically validating the discovered relationships and predicted associations.
