71 |
Prediction of Delivered and Ideal Specific Impulse using Random Forest Models and Parsimonious Neural Networks
Peter Joseph Salek (12455760)
29 April 2022
Development of complex aerospace systems often takes decades of research and testing. High-performing propellants are important to the success of rocket propulsion systems. Development and testing of new propellants can be expensive and dangerous, and full-scale tests are often required to understand the performance of new propellants. Many industries have started using data science tools to learn from previous work and conduct smarter tests. Materials scientists have started using these tools to speed up the development of new materials. The same tools can be used to speed up the development and design of better propellants. I approach the development of new solid propellants in two steps: prediction of delivered performance from tests available in the literature, and prediction of ideal performance using physics-based models. Random Forest models are used to correlate the ideal performance to the delivered performance of a propellant based on the composition and motor properties. I use Parsimonious Neural Networks (PNNs) to learn interpretable models for the ideal performance of propellants. I find that the available open-literature data are too biased for the models to learn from, and I discover families of interpretable models that predict the ideal performance of propellants.
|
72 |
Regression and time estimation in the manufacturing industry
Bjernulf, Walter
January 2023
In this thesis an analysis is performed on operation times for different sized products in a manufacturing company. The thesis will introduce and summarise most of the theory needed to perform regression and also cover a worked example where three different regression models are learned, evaluated and analysed. Conformal prediction, which at the moment is a hot topic in machine learning, will also be introduced and will be used in the worked example.
|
73 |
Maskininlärning för att förutspå churn baserat på diskontinuerlig beteendedata / Machine learning to predict churn based on discontinuous behavioral data
Öbom, Anton; Bratteby, Adrian
January 2017
This report examines the fields of machine learning and digital marketing, using machine learning as a tool to predict churn in a new domain: companies that do not track their customers extensively, i.e., where behaviour data is discontinuous. To predict churn, relatively simple out-of-the-box models, such as support vector machines and random forests, are used to achieve an acceptable outcome. To be on par with the models used for churn prediction in subscription-based services, this report concludes that more research has to be done using more effective evaluation metrics. Finally, the report presents how these discoveries can be commercialized and the business-related benefits of churn prediction for the employer, Sellpy. / Denna rapport handlar om att utforska fälten maskininlärning och digital marknadsföring, genom att använda maskininlärning som ett redskap för att förutspå churn i en typ av företag med diskontinuerlig beteendedata. För att förutspå churn finns relativt simpla "out of the box"-modeller, som support vector machines och random forests, som används för att nå acceptabla resultat. För att nå liknande resultat som i arbeten där churn utförs på kontinuerlig beteendedata konstaterar denna rapport att framtida arbeten forska på vilka utvärderingsmetriker som är mest lämpade. I rapporten presenteras också hur dessa upptäckter kan kommersialiseras och hur företaget Sellpy kan tjäna på att förutspå churn.
|
74 |
Why we need a token-based typology: A case study of analytic and lexical causatives in fifteen European languages
Levshina, Natalia
26 January 2023
This paper investigates variation of lexical and analytic causatives in
15 European languages from the Germanic, Romance, and Slavic genera based
on a multilingual parallel corpus of film subtitles. Using typological parameters
of variation of causatives from the literature, this study tests which parameters
are relevant for the choice between analytic and lexical causatives in the sample
of languages. The main research question is whether the variation is constrained
by one semantic dimension, namely, the conceptual integration of the causing
and caused events, as suggested by previous research on iconicity in language,
or whether several different semantic and syntactic factors are at play. To answer
this question, I use an exploratory multivariate technique for categorical data
(Multiple Correspondence Analysis with supplementary points) and conditional
random forests, a nonparametric regression and classification method. The study
demonstrates the importance of corpus data in testing typological hypotheses.
|
75 |
Predicting basketball performance based on draft pick: A classification analysis
Harmén, Fredrik
January 2022
In this thesis, we will look to predict the performance of a basketball player coming into the NBA depending on where the player was picked in the NBA draft. This will be done by testing different machine learning models on data from the previous 35 NBA drafts and then comparing the models in order to see which model had the highest accuracy of classification. The machine learning methods used are Linear Discriminant Analysis, K-Nearest Neighbors, Support Vector Machines and Random Forests. The results show that the method with the highest accuracy of classification was Random Forests, with an accuracy of 42%.
|
76 |
Credit Scoring Based on Behavioural Data / Kreditvärdering baserat på beteendedata
Bouvin, Daniel; Hamberg, Erik
January 2022
Credit modelling has traditionally been done by credit institutes based on financial data about the individuals requesting the credit. While this has been sufficient in lowering risk in developed economies with plenty of financial data, it is inefficient in developing economies and fails to reach the unbanked population. As this both limits many responsible consumers from getting access to credit and limits companies from reaching paying customers, it is evident that new strategies for credit modelling are needed. This paper explores the use of behavioural data, gathered from users of Klarna’s app, for credit modelling. The models are based on the machine learning algorithms logistic regression, random forests, neural networks, and gradient boosted decision trees. In this study, models were trained on Swedish data in multiple timespans and tested in different timespans and countries. The results show that modelling on the data points developed in this study is effective and suggest that, in certain cases, the models can be used to make predictions in new and unknown markets by training on similar markets. / Kreditvärderingar har traditionellt sätt utförts av kreditinstitut baserat på existerande finansiella data kring personen i fråga som ansöker om kredit. Denna metod har varit framgångsrik i att minimera risk inom utvecklade ekonomier där finansiella data har varit tillgänglig. Metoden har varit mindre framgångsrik i utvecklingsekonomier och misslyckas att utvärdera befolkningar som saknar finansiella tjänster. Då detta problem begränsar många pålitliga konsumenter att få tillgång till kredit och samtidigt begränsar företagen att nå ut till möjliga betalande kunder, blir det viktigt att ta fram nya strategier för att utvärdera kredit. Denna uppsats utforskar möjligheten att modellera kreditvärdighet baserat på användarbeteende med hjälp av data från Klarnas shopping app.
Modellerna är baserade på maskininlärningsalgoritmerna logistisk regression, Random Forests, neurala nätverk och gradient boosted decision trees. I denna studie tränas modellerna på olika tidsspann inom den svenska marknaden och testas på olika tidsspann och marknader. Resultaten från studien visar att det går med hjälp av beteende data från Klarnas app att, under olika omständigheter, förutspå kreditvärdighet i framtiden och på olika marknader.
|
77 |
Exploring the motivational antecedents of Nepalese learners of L2 English
Schmidtke-Bode, Karsten; Kachel, Gregor
19 June 2024
This paper is the first to examine the motivational disposition of Nepalese
learners of L2 English. Based on an adapted version of the questionnaire in
(Kormos, Judit & Kata Csizér. 2008. Age-related differences in motivation of learning
English as a foreign language: Attitudes, selves, and motivated behavior. Language
Learning 58. 327–355. Doi:10.1111/j.1467-9922.2008.00443.x.), we test the robustness
and culture-specific applicability of well-known motivational antecedents to this
learner population, and we investigate how the effects of these antecedents are
mediated by the learners’ gender, age and regional aspects of the educational
setting. In doing so, we offer novel ways of analyzing the data: Firstly, we employ
random forests and conditional inference trees for assessing the relative importance
of motivational antecedents. Secondly, we complement the traditional ‘scale-based
approach’, which focuses on holistic constructs like the ‘Ideal L2 Self’, with an ‘item-based approach’ that highlights more specific components of such scales. The
results are interpreted with reference to the L2 Motivational Self System (Dörnyei,
Zoltán. 2005. The psychology of the language learner: Individual differences in second
language acquisition. Mahwah, NJ: Lawrence Erlbaum) and to previous studies on
other Asian populations of L2 learners.
|
78 |
Spatial random forests for brain lesions segmentation in MRIs and model-based tumor cell extrapolation / Forêts aléatoires spatiales pour la segmentation de lésions cérébrales et l'estimation de densités cellulaires dans les images par résonance magnétique
Geremia, Ezequiel
30 January 2013
La grande quantité de données issues des l'imagerie médicale contribue au succès des méthodes supervisées pour l'annotation sémantique des images. Notre étude porte sur la détection de lésions cérébrales dans les images par résonance magnétique (IRMs) en utilisant un outil générique et efficace: les forêts aléatoires. Trois contributions majeures se distinguent. D'abord, la segmentation des lésions cérébrales, essentielle pour établir diagnostics, pronostics et le traitement. La conception d'une forêt aléatoire intégrant le contexte spatial cible particulièrement la segmentation automatique de lésions de sclérose en plaques et des gliomes dans les IRMs. La méthode intègre l'information multi-séquences des IRMs, les atlas de répartition des tissus. Deuxième contribution : l'estimation de la densité de cellules tumorales à partir des IRMs. Une méthode de couplage de modèles génératifs et discriminatifs est conçue pour apprendre la densité de cellules tumorales latente à partir de modélisations associées à des images synthétiques. Le modèle génératif est un simulateur bio-physiologique de croissance tumorale en libre accès. Le modèle discriminatif est une forêt aléatoire pour la régression multi-variée de la densité de cellules tumorales à partir des IRMs. Enfin, nous présentons les “forêts aléatoires spatialement adaptables” regroupant les avantages des approches multi-échelles avec ceux de forêts aléatoires, avec une application aux scénarios de classification et de segmentation précédemment cités. Une évaluation quantitative des méthodes proposées sur des bases de données annotées et librement accessibles démontre des résultats comparables à l'état de l'art. / The large size of the datasets produced by medical imaging protocols contributes to the success of supervised discriminative methods for semantic labelling of images. 
Our study makes use of a general and efficient emerging framework, discriminative random forests, for the detection of brain lesions in multi-modal magnetic resonance images (MRIs). The contribution is three-fold. First, we focus on the segmentation of brain lesions, which is an essential task for diagnosis, prognosis and therapy planning. A context-aware random forest is designed for the automatic multi-class segmentation of MS lesions and of low-grade and high-grade gliomas in MR images. It uses multi-channel MRIs, prior knowledge on tissue classes, and symmetrical and long-range spatial context to discriminate lesions from background. Then, we investigate the promising perspective of estimating the brain tumor cell density from MRIs. A generative-discriminative framework is presented to learn the latent and clinically unavailable tumor cell density from model-based estimations associated with synthetic MRIs. The generative model is a validated and publicly available biophysiological tumor growth simulator. The discriminative model builds on multi-variate regression random forests to estimate the voxel-wise distribution of tumor cell density from input MRIs. Finally, we present the “Spatially Adaptive Random Forests”, which merge the benefits of multi-scale and random forest methods, and apply them to the previously cited classification and regression settings. Quantitative evaluations of the proposed methods are carried out on publicly available labeled datasets and demonstrate state-of-the-art performance.
|
79 |
Aplikace umělé inteligence v řízení kreditních rizik / Artificial Intelligence Approach to Credit Risk
Říha, Jan
January 2016
This thesis focuses on the application of artificial intelligence techniques in credit risk management. Moreover, these modern tools are compared with the current industry standard, Logistic Regression. We introduce the theory underlying Neural Networks, Support Vector Machines, Random Forests and Logistic Regression. In addition, we present a methodology for the statistical and business evaluation and comparison of the aforementioned models. We find that models based on the Neural Network approach (specifically the Multi-Layer Perceptron and the Radial Basis Function Network) outperform Logistic Regression in both the standard statistical metrics and the business metrics. The performance of the Random Forest and Support Vector Machines is not satisfactory, and these models do not prove to be superior to Logistic Regression in our application.
|
80 |
Spatial analysis of factors influencing long-term stress and health of grizzly bears (Ursus arctos) in Alberta, Canada
Bourbonnais, Mathieu Louis
04 September 2013
A primary focus of wildlife research is to understand how habitat conditions and human activities impact the health of wild animals. External factors, both natural and anthropogenic, that impact the ability of an animal to acquire food and build energy reserves have important implications for reproductive success, avoidance of predators, and the ability to withstand disease and periods of food scarcity. In the analyses presented here, I quantify the impacts of habitat quality and anthropogenic disturbance on indicators of health for individuals in a threatened grizzly bear population in Alberta, Canada.
The first analysis relates spatial patterns of hair cortisol concentrations, a promising indicator of long-term stress in mammals, measured from 304 grizzly bears to a variety of continuous environmental variables representative of habitat quality (e.g., crown closure, landcover, and vegetation productivity), topographic conditions (e.g., elevation and terrain ruggedness), and anthropogenic disturbances (e.g., roads, forest harvest blocks, and oil and gas well-sites). Hair cortisol concentration point data were integrated with continuous variables by creating a stress surface for male and female bears using kernel density estimation validated through bootstrapping. The relationships between hair cortisol concentrations for males and females and environmental variables were quantified using random forests, and landscape-scale stress levels for both genders were predicted based on the observed relationships. Low female stress levels were found to correspond with regions with high levels of anthropogenic disturbance and activity. High female stress levels were associated primarily with high-elevation parks and protected areas. Conversely, low male stress levels were found to correspond with parks and protected areas, and spatially limited moderate to high stress levels were found in regions with greater anthropogenic disturbance. Of particular concern for conservation is the observed relationship between low female stress and sink habitats, which have high mortality rates and high energetic costs.
Extending the first analysis, the second portion of this research examined the impacts of scale-specific habitat selection and relationships between biology, habitat quality, and anthropogenic disturbance on body condition in 85 grizzly bears represented using a body condition index. Habitat quality and anthropogenic variables were represented at multiple scales using isopleths of a utilization distribution calculated using kernel density estimation for each bear. Several hypotheses regarding the influence of biology, habitat quality, and anthropogenic disturbance on body condition, quantified using linear mixed-effects models, were evaluated at each habitat selection scale using the small-sample Akaike Information Criterion. Biological factors were influential at all scales, as males had higher body condition than females, and body condition increased with age for both genders. At the scale of most concentrated habitat selection, the biology and habitat quality hypothesis had the greatest support and had a positive effect on body condition. A component of biology, the influence of long-term stress, which had a negative impact on body condition, was most pronounced within the biology and habitat quality hypothesis at this scale. As the scale of habitat selection was represented more broadly, support for the biology and anthropogenic disturbance hypothesis increased. Anthropogenic variables of particular importance were distance decay to roads, density of secondary linear features, and density of forest harvest areas, which had a negative relationship with body condition. Management efforts aimed to promote landscape conditions beneficial to grizzly bear health should focus on promoting habitat quality in core habitat and limiting anthropogenic disturbance within larger grizzly bear home ranges. / Graduate / 0768 / 0463 / 0478 / mathieub@uvic.ca
|