431

SLA violation prediction : a machine learning perspective

Askari Hemmat, Reyhane 10 1900 (has links)
No description available.
432

Three Essays on the Economics of Food, Health, and Consumer Behavior

Panchalingam, Thadchaigeni 01 October 2021 (has links)
No description available.
433

Prediktor vlivu aminokyselinových substitucí na stabilitu proteinů / Predictor of the Effect of Amino Acid Substitutions on Protein Stability

Flax, Michal January 2017 (has links)
This thesis deals with predicting the influence of amino acid mutations on protein stability. The prediction is based on several machine learning methods. Protein mutations are classified as either increasing or decreasing protein stability, and the application also predicts the magnitude of the change in Gibbs free energy caused by the mutation.
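Read as a pipeline, the abstract describes a classification head (stabilizing vs. destabilizing) paired with a regression head for the magnitude of the Gibbs free energy change. A minimal sketch of that setup, using random forests on synthetic mutation descriptors (the features and data are placeholders, not the author's code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical mutation descriptors (e.g. hydrophobicity change, volume change, ...)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                          # 500 mutations, 10 features
ddg = X[:, 0] * 1.5 + rng.normal(scale=0.5, size=500)   # synthetic ddG values
y_class = (ddg > 0).astype(int)                         # 1 = stabilizing, 0 = destabilizing

X_tr, X_te, ddg_tr, ddg_te, y_tr, y_te = train_test_split(
    X, ddg, y_class, test_size=0.2, random_state=0)

# Classification head: does the mutation increase or decrease stability?
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
# Regression head: magnitude of the Gibbs free energy change
reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, ddg_tr)

print("classification accuracy:", clf.score(X_te, y_te))
print("ddG regression R^2:", reg.score(X_te, ddg_te))
```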
434

Analýza 3D CT obrazových dat se zaměřením na detekci a klasifikaci specifických struktur tkání / Analysis of 3D CT image data aimed at detection and classification of specific tissue structures

Šalplachta, Jakub January 2017 (has links)
This thesis deals with the segmentation and classification of paraspinal muscle and subcutaneous adipose tissue in 3D CT image data, so that these tissues can subsequently serve as internal calibration phantoms for measuring the bone mineral density (BMD) of a vertebra. The chosen methods were tested and then evaluated in terms of classification correctness and overall suitability for the subsequent BMD calculation. The algorithms were tested in the Matlab® programming environment on a purpose-built patient database containing the lumbar spines of twelve patients. The thesis also contains a theoretical review of bone mineral density measurement and of segmentation and classification methods, as well as a description of the practical part of the work.
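As a rough illustration of the internal-calibration idea (segmented adipose and muscle tissue acting as reference densities when converting CT numbers to BMD), the sketch below uses simple Hounsfield-unit thresholds and a two-point linear calibration; the HU ranges and reference densities are illustrative assumptions, not values from the thesis:

```python
import numpy as np

def segment_by_hu(ct_slice, lo, hi):
    """Return a boolean mask of voxels whose HU value falls in [lo, hi]."""
    return (ct_slice >= lo) & (ct_slice <= hi)

# Synthetic CT slice in Hounsfield units (placeholder data)
ct = np.random.default_rng(1).normal(loc=0, scale=200, size=(256, 256))

# Illustrative HU ranges (assumed, not from the thesis)
fat_mask = segment_by_hu(ct, -190, -30)      # subcutaneous adipose tissue
muscle_mask = segment_by_hu(ct, 30, 150)     # paraspinal muscle

# Two-point internal calibration: map the mean HU of the two tissues to
# assumed reference equivalent densities, then convert a vertebral ROI
# to an approximate BMD value with the resulting linear relation.
hu_ref = np.array([ct[fat_mask].mean(), ct[muscle_mask].mean()])
rho_ref = np.array([-50.0, 40.0])            # hypothetical reference densities (mg/cm^3)
slope, intercept = np.polyfit(hu_ref, rho_ref, 1)

vertebra_roi_mean_hu = 180.0                 # placeholder ROI value
bmd = slope * vertebra_roi_mean_hu + intercept
print(f"estimated BMD: {bmd:.1f} mg/cm^3")
```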
435

Acceleration Strategies of Markov Chain Monte Carlo for Bayesian Computation / Stratégies d'accélération des algorithmes de Monte Carlo par chaîne de Markov pour le calcul Bayésien

Wu, Chang-Ye 04 October 2018 (has links)
MCMC algorithms are difficult to scale because they need to sweep over the whole data set at each iteration, which prohibits their application in big data settings. Roughly speaking, all scalable MCMC algorithms can be divided into two categories: divide-and-conquer methods and subsampling methods. The aim of this project is to reduce the computing time induced by complex or large likelihood functions.
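Of the two families mentioned, subsampling methods are the simpler to sketch: a naive minibatch Metropolis-Hastings step evaluates the likelihood on a random subset of the data and rescales it, trading exactness for per-iteration cost. The toy Gaussian example below illustrates the idea only and is not one of the algorithms developed in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100_000)   # large synthetic data set
n, m = len(data), 1_000                               # full size vs. minibatch size

def subsampled_log_lik(theta, batch):
    # Gaussian log-likelihood on the minibatch, rescaled by n/m
    return (n / len(batch)) * np.sum(-0.5 * (batch - theta) ** 2)

theta, step = 0.0, 0.05
samples = []
for _ in range(5_000):
    batch = rng.choice(data, size=m, replace=False)
    prop = theta + step * rng.normal()
    # Accept/reject using the noisy, subsampled likelihood estimate
    log_alpha = subsampled_log_lik(prop, batch) - subsampled_log_lik(theta, batch)
    if np.log(rng.uniform()) < log_alpha:
        theta = prop
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[1_000:]))
```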
436

Classification of a Sensor Signal Attained By Exposure to a Complex Gas Mixture

Sher, Rabnawaz Jan January 2021 (has links)
This thesis was carried out in collaboration with a private company, DANSiC AB, and extends research work started by DANSiC AB in 2019 to classify a source. The task is to classify a source into two classes, with higher sensitivity required for one class because it has greater importance. The data provided for this thesis are based on sensor measurements over different temperature cycles. The data are high-dimensional and are expected to exhibit measurement drift. Principal component analysis (PCA) is used for dimensionality reduction, and "differential", "relative" and "fractional" drift compensation techniques are used to compensate for the drift in the data. A comparative study was performed using three classification algorithms: linear discriminant analysis (LDA), the naive Bayes classifier (NB) and random forest (RF). The highest accuracy achieved was 59%, with random forest observed to perform better than the other classifiers. / This work was done with DANSiC AB in collaboration with Linköping University.
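The described pipeline (drift compensation, PCA for dimensionality reduction, then a comparison of LDA, naive Bayes and random forest) could be prototyped along these lines; the data are synthetic and the "differential" compensation is approximated as subtracting a per-cycle baseline, which is an assumption rather than the thesis's exact formulation:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 200))          # 400 temperature cycles, 200 sensor readings
y = rng.integers(0, 2, size=400)         # two source classes

# "Differential" drift compensation, approximated here as removing each
# cycle's baseline reading (an assumption, not the thesis's exact formula).
X_comp = X - X[:, :1]

for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("NB", GaussianNB()),
                  ("RF", RandomForestClassifier(n_estimators=200, random_state=0))]:
    model = make_pipeline(PCA(n_components=10), clf)
    scores = cross_val_score(model, X_comp, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```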
437

Rule-based analysis of throughfall kinetic energy to evaluate biotic and abiotic factor thresholds to mitigate erosive power

Goebes, Philipp, Schmidt, Karsten, Stumpf, Felix, von Oheimb, Goddert, Scholten, Thomas, Härdtle, Werner, Seitz, Steffen 17 September 2019 (has links)
Below vegetation, throughfall kinetic energy (TKE) is an important factor in expressing the potential of rainfall to detach soil particles and thus in predicting soil erosion rates. TKE is affected by many biotic (e.g. tree height, leaf area index) and abiotic (e.g. throughfall amount) factors because of changes in raindrop size and velocity. However, studies modelling TKE with a high number of those factors are lacking. This study presents a new approach to model TKE. We used 20 biotic and abiotic factors to evaluate thresholds of those factors that can mitigate TKE and thus decrease soil erosion. Using these thresholds, an optimal set of biotic and abiotic factors was identified to minimize TKE. The model approach combined recursive feature elimination, random forest (RF) variable importance and classification and regression trees (CARTs). TKE was determined using 1405 splash cup measurements during five rainfall events in a subtropical Chinese tree plantation with five-year-old trees in 2013. Our results showed that leaf area, tree height, leaf area index and crown area are the most prominent vegetation traits to model TKE. To reduce TKE, the optimal set of biotic and abiotic factors was a leaf area lower than 6700 mm², a tree height lower than 290 cm combined with a crown base height lower than 60 cm, a leaf area index smaller than 1, more than 47 branches per tree and using single tree species neighbourhoods. Rainfall characteristics, such as amount and duration, further classified high or low TKE. These findings are important for the establishment of forest plantations that aim to minimize soil erosion in young succession stages using TKE modelling.
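The modelling chain described (recursive feature elimination, random-forest variable importance, then a regression tree whose split points expose thresholds) can be outlined roughly as follows; the feature names and data are placeholders, not the study's measurements:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
cols = ["leaf_area", "tree_height", "LAI", "crown_area", "rain_amount", "rain_duration"]
X = pd.DataFrame(rng.normal(size=(300, len(cols))), columns=cols)
tke = 2.0 * X["leaf_area"] - 1.5 * X["tree_height"] + rng.normal(size=300)  # synthetic TKE

# Step 1: recursive feature elimination driven by a random forest
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rfe = RFE(rf, n_features_to_select=4).fit(X, tke)
selected = X.columns[rfe.support_]

# Step 2: variable importance on the retained features
rf.fit(X[selected], tke)
print(dict(zip(selected, rf.feature_importances_.round(2))))

# Step 3: a shallow regression tree whose split points act as factor thresholds
cart = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X[selected], tke)
print(export_text(cart, feature_names=list(selected)))
```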
438

A study in alcohol : A comparison of data mining methods for identifying binge drinking risk factors in university students / En studie i alkohol : En jämförelse av dataminingmetoder för identifieringen av bakomliggande riskfaktorer hos universitetsstudenter

Lamprou, Sokrates January 2021 (has links)
Hazardous alcohol consumption is an issue that affects many university students today. Alcohol consumption tends to have a negative impact on both mental and physical health and can lead to severe alcohol addiction later in life. This study investigates which background factors cause binge drinking by collecting and analysing data from Linköping University. The results were analysed with data mining techniques: decision trees, random forest, and logistic regression. Logistic regression was the most reliable method for predicting binge drinking, with an accuracy of 86.50%, precision of 92.64% and recall of 90.96%. The findings also showed that participation in student events together with higher weekly alcohol consumption predicted binge drinking. Additional risk factors were the amount of time students spent with their friends and their activity in their program section (program association). The results suggest that the student culture not only influences alcohol consumption but also induces habits of binge drinking.
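The reported comparison (decision tree, random forest and logistic regression scored by accuracy, precision and recall) corresponds to a routine classification workflow; a minimal sketch on invented survey-style predictors, not the study's dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(0)
# Hypothetical predictors: weekly consumption, event participation, time with friends, ...
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=600) > 0).astype(int)  # binge drinking yes/no

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("random forest", RandomForestClassifier(n_estimators=200, random_state=0)),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(name,
          f"acc={accuracy_score(y_te, pred):.2f}",
          f"prec={precision_score(y_te, pred):.2f}",
          f"rec={recall_score(y_te, pred):.2f}")
```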
439

Automatic Patent Classification

Yehe, Nala January 2020 (has links)
Patents have great research value and are also beneficial to the industrial, commercial, legal and policymaking communities. Effective analysis of patent literature can reveal important technical details and relationships, explain business trends, suggest novel industrial solutions, and inform crucial investment decisions. Patent documents should therefore be analysed carefully so that the value of patents can be exploited. Patent analysts generally need a certain degree of expertise in several research fields, including information retrieval, data processing, text mining, field-specific technology, and business intelligence. In practice it is difficult to find and train such an analyst in a relatively short time so that he or she meets the requirements of multiple disciplines. Patent classification is also crucial in processing patent applications, because it enables patent texts to be managed and maintained better and more flexibly. In recent years the number of patents worldwide has increased dramatically, which makes the design of an automatic patent classification system very important: such a system can replace time-consuming manual classification and thus provide patent analysis managers with an effective way of managing patent texts. This thesis designs a patent classification system based on data mining methods and machine learning techniques and uses the KNIME software to conduct a comparative analysis of different machine learning methods applied to different parts of a patent. The purpose of the thesis is to use text data processing methods and machine learning techniques to classify patents automatically. It consists of two main parts: data preprocessing and the application of machine learning techniques. The research questions are: which part of a patent performs best as input data for automatic classification, and which of the implemented machine learning algorithms performs best regarding the classification of IPC keywords? The thesis uses design science research as its method and the KNIME platform to apply the machine learning techniques, which include decision tree, XGBoost linear, XGBoost tree, SVM, and random forest. The implementation comprises data collection, data preprocessing, feature word extraction, and the application of classification techniques. A patent document consists of several parts, such as the description, abstract, and claims; these three groups of input data are fed separately to the models and their performance is compared. Based on the results of these three experiments, we suggest using the description part in the classification system, because it shows the best performance for English patent text classification; the abstract can serve as an auxiliary input for classification, whereas classification based on the claims part, proposed by some scholars, did not achieve good performance in our research. In addition, the BoW and TF-IDF methods can be used together to extract feature words efficiently, and the SVM and XGBoost techniques showed the best performance in our automatic patent classification system.
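The thesis itself works in KNIME, but the same BoW/TF-IDF feature extraction and classifier comparison can be sketched in Python; the sample patent texts and section labels below are invented placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Invented example texts (description parts) with IPC-style section labels
docs = ["a rotor blade assembly for a wind turbine",
        "method for encrypting network packets",
        "pharmaceutical composition comprising an antibody",
        "gearbox lubrication system for vehicles",
        "machine learning model for image recognition",
        "vaccine formulation and delivery device"] * 20
labels = ["F", "H", "A", "F", "G", "A"] * 20

for name, clf in [("SVM", LinearSVC()),
                  ("random forest", RandomForestClassifier(n_estimators=200, random_state=0))]:
    # TF-IDF over a bag-of-words vocabulary feeds each classifier
    model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    scores = cross_val_score(model, docs, labels, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```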
440

News Value Modeling and Prediction using Textual Features and Machine Learning / Modellering och prediktion av nyhetsvärde med textattribut och maskininlärning

Lindblom, Rebecca January 2020 (has links)
News value assessment has always been part of the news media industry and is today often done in real time without any documentation. Editors take many different qualitative aspects into consideration when deciding which news stories will make it to the front page. This thesis explores how the complex news value assessment process can be translated into a quantitative model, and how those news values can be predicted effectively using machine learning and NLP. Two models for news value were constructed, and the correlation between modeled and manual news values was measured; the more complex model gave a higher correlation. For prediction, different types of textual features are extracted, Random Forest and SVM are used, and the predictions are evaluated with accuracy, F1-score, RMSE, and MAE. Random Forest shows the best results for all metrics on all datasets, with the best result on the largest dataset, probably because the smaller datasets have a less even class distribution.
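The prediction setup described (textual features, Random Forest versus SVM, evaluated with RMSE and MAE on the modeled news values) can be sketched as follows, with invented headlines and scores standing in for the thesis data:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Invented headlines and news-value scores (placeholders, not the thesis data)
texts = ["major earthquake hits coastal city",
         "local bakery wins regional award",
         "government announces new tax reform",
         "celebrity spotted at downtown cafe"] * 30
scores = np.array([5.0, 2.0, 4.0, 1.0] * 30) + np.random.default_rng(0).normal(scale=0.3, size=120)

X_tr, X_te, y_tr, y_te = train_test_split(texts, scores, test_size=0.25, random_state=0)

for name, reg in [("random forest", RandomForestRegressor(n_estimators=200, random_state=0)),
                  ("SVM", SVR())]:
    # Text features via TF-IDF, then a regressor over the news-value score
    model = make_pipeline(TfidfVectorizer(), reg).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"{name}: RMSE={rmse:.2f}  MAE={mean_absolute_error(y_te, pred):.2f}")
```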
