251 |
Multivariat dataanalys för att undersöka temperaturmätningar av pelletskulor : En studie för LKAB med syftet att undersöka temperaturvariationer mellan två mätsystem / Study of temperature measurements using multivariate data analysis : A study written on behalf of LKAB with the purpose of examining temperatures from two systems that measure temperatures. Vikman, Hanna, January 2018
LKAB (Luossavaara-Kiirunavaara AB) is a high-technology mining and minerals group whose main business is mining and processing iron ore. Most of the iron ore is made into pellets. This work was carried out at LKAB's research and education department (Forskning och Utbildning) in Malmberget. This bachelor's thesis deals with temperature data from two different measurement systems in the pellet production process. One of the systems is well established and currently forms the basis for pellet production. It has limitations, however, since the temperature is not measured in direct connection with the pellets being formed. It is desirable to measure the temperature close to the product, and researchers at LKAB have therefore developed a new measurement system that does so. The methods used in the thesis are principal component analysis (PCA), Partial Least Squares (PLS) and multivariate regression analysis. The purpose of the work is to examine the measurement systems and possible relationships between them. PLS is used to investigate whether temperatures in the newly developed measurement system can be predicted from temperatures in the existing system. The study showed no relationship between the measurement systems. The result cannot be generalized, however, since according to experts at LKAB there should be a relationship between the newly developed and the existing measurement systems. The result may be due to the design of the data set and an assumption made about the speed of production. Furthermore, the results indicate that the heat distribution in the process is not even, which is a significant finding for LKAB, since uneven heat distribution can negatively affect product quality and production equipment. The results also showed that one of the measurement points in the new system does not work correctly. The findings suggest that the newly developed measurement system can be an exceptional tool for LKAB in their pellet production, since it provides temperature data continuously from the entire production process. There is, however, a need to further develop the new measurement system so that its position in the production process can be identified. Such an implementation would facilitate further studies, which are strongly recommended.
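To make the PLS step concrete, a minimal sketch of predicting the responses of one measurement system from another with scikit-learn's PLSRegression is given below. This is not LKAB's actual pipeline or data; the sensor counts, temperature ranges, and number of latent components are illustrative assumptions only.

```python
# Sketch: predict temperatures in a hypothetical "new" system from a
# hypothetical "existing" system using Partial Least Squares regression.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(850, 30, size=(500, 12))   # existing system: 12 sensors (made-up data)
Y = rng.normal(900, 25, size=(500, 4))    # new system: 4 measurement points (made-up data)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=0)

pls = PLSRegression(n_components=3)       # number of latent components is a tuning choice
pls.fit(X_train, Y_train)
Y_pred = pls.predict(X_test)

print("R^2 per response:", r2_score(Y_test, Y_pred, multioutput="raw_values"))
```

With real sensor data, low R^2 values per response would indicate weak predictive ability, which is how a "no relationship" conclusion like the one in the abstract can be assessed.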
|
252 |
Effekter av svält, krig och epidemi : En studie i överlevnadsanalys / The effects of famines, war and epidemics : A study in survival analysis. Öhman, Oscar; Boman, Alexander, January 2018
The Finnish War of 1808-1809, the famine years of the late 1860s, and the Spanish flu of 1918-1919 were events with catastrophic consequences for the population in the Umeå and Skellefteå regions. The purpose of this study is to investigate which of these events had the greatest effect on mortality among the inhabitants of these areas. Individuals were divided into four groups by sex and age. For each group, suitable regression methods from survival analysis were applied. The results showed that the Finnish War had the greatest effect on all four groups.
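As an illustration of the kind of survival-analysis regression the abstract refers to, the following sketch fits a Cox proportional hazards model with the lifelines library. The data, variable names (duration, death, war, famine, flu), and the coding of the crisis exposures are hypothetical, not the thesis' actual setup.

```python
# Sketch: Cox regression comparing mortality effects of three crisis periods.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "duration": rng.exponential(20, n),   # observed survival time (years), made-up
    "death": rng.integers(0, 2, n),       # 1 = died, 0 = censored
    "war": rng.integers(0, 2, n),         # exposed to the 1808-1809 war period
    "famine": rng.integers(0, 2, n),      # exposed to the 1860s famine years
    "flu": rng.integers(0, 2, n),         # exposed to the 1918-1919 flu
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="death")
cph.print_summary()   # hazard ratios show which exposure raises mortality most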
|
253 |
Clustering using k-means algorithm in multivariate dependent models with factor structure. Dineff, Dimitris, January 2020
No description available.
|
254 |
Methods for handling missing values : A simulation study comparing imputation methods for missing values on a Poisson distributed explanatory variable. Bengtsson, Fanny; Lindblad, Klara, January 2021
No description available.
|
255 |
Calibration of Probabilistic Predictive Models. Widmann, David, January 2020
Predicting unknown and unobserved events is a common task in many domains. Mathematically, the uncertainties arising in such prediction tasks can be described by probabilistic predictive models. Ideally, the model estimates of these uncertainties allow us to distinguish between uncertain and trustworthy predictions. This distinction is particularly important in safety-critical applications such as medical image analysis and autonomous driving. For the probabilistic predictions to be meaningful and to allow this differentiation, they should neither be over- nor underconfident. Models that satisfy this property are called calibrated. In this thesis we study how one can measure, estimate, and statistically reason about the calibration of probabilistic predictive models. In Paper I we discuss existing approaches for evaluating calibration in multi-class classification. We mention potential pitfalls and suggest hypothesis tests for the statistical analysis of model calibration. In Paper II we propose a framework of calibration measures for multi-class classification. It captures common existing measures and includes a new kernel calibration error based on matrix-valued kernels. For the kernel calibration error consistent and unbiased estimators exist and asymptotic hypothesis tests for calibration can be derived. Unfortunately, by construction the framework is limited to prediction problems with finite discrete target spaces. In Paper III we use a different approach to develop a more general framework of calibration errors that applies to any probabilistic predictive model and is not limited to classification. We show that it coincides with the framework presented in Paper II for multi-class classification. Based on scalar-valued kernels, we generalize the kernel calibration error, its estimators, and hypothesis tests to all probabilistic predictive models. For real-valued regression problems we present empirical results.
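As a concrete example of one common existing calibration measure for multi-class classification (not the kernel calibration error proposed in the thesis), a binned expected calibration error (ECE) can be sketched as follows; the binning scheme and toy data are illustrative.

```python
# Sketch: binned expected calibration error for a multi-class classifier.
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """probs: (n, k) predicted class probabilities; labels: (n,) true classes."""
    confidences = probs.max(axis=1)              # top-class confidence
    predictions = probs.argmax(axis=1)
    accuracies = (predictions == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap           # weight gap by bin frequency
    return ece

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=200)      # toy predictions over 3 classes
labels = rng.integers(0, 3, size=200)
print("ECE:", expected_calibration_error(probs, labels))
```

A perfectly calibrated model has ECE close to zero; over- or underconfident models produce larger values, which is the kind of distinction the measures in the thesis formalize and test statistically.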
|
256 |
Treatment-mediator interaction when estimating causal effects / Interaktion mellan exponering och mediator vid skattning av kausala effekter. Wallmark, Joakim, January 2020
In a variety of disciplines, causal mediation analysis is routinely conducted by researchers to examine the roles of variables which lie in the causal paths between the treatment and outcome variables. These effects are commonly estimated using a parametric model approach, where one fits regression models for the mediator and outcome variables. The estimated coefficients from the regression models are then used to estimate the direct and indirect effects. When taking this approach, two potential sources of bias are unobserved confounding and model misspecification. In this thesis, the focus lies on unobserved mediator-outcome confounding and model misspecification in which an existing treatment-mediator interaction is excluded from the outcome model. We compare and evaluate the magnitude of the bias resulting from these sources in different scenarios through simulations. The results show that, in the worst cases, both sources of bias can result in severely biased effect estimators. It is hard to draw an overarching conclusion about which source results in a larger bias in general, as it is highly dependent on the scenario at hand. In addition to the above-mentioned bias evaluation, we introduce a statistical test with the goal of aiding researchers contemplating whether or not to include an interaction term in the outcome model. The test is based upon the fact that different definitions of the direct and indirect effects result in different effect estimates when an interaction is present. In an attempt to improve the significance level accuracy of the test for smaller samples, we compute p-values based on inverted bootstrap confidence intervals. Simulations show that using these bootstrap methods does improve the accuracy of a chosen significance level in many situations compared to relying on asymptotic normality of the test statistic. Despite this, our proposed test performs worse than more standard test methods, such as a t-test for the regression coefficient, in most examined scenarios.
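A minimal sketch of the regression-based estimation described above, with a treatment-mediator interaction in the outcome model, might look as follows. The simulated data, variable names, and the use of the standard natural direct/indirect effect formulas for linear models are assumptions for illustration, not the thesis' own code.

```python
# Sketch: regression-based mediation analysis with a treatment-mediator interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
t = rng.integers(0, 2, n)                        # binary treatment
m = 0.5 + 0.8 * t + rng.normal(0, 1, n)          # mediator model
y = 1.0 + 0.4 * t + 0.6 * m + 0.3 * t * m + rng.normal(0, 1, n)
df = pd.DataFrame({"t": t, "m": m, "y": y})

med_fit = smf.ols("m ~ t", data=df).fit()
out_fit = smf.ols("y ~ t + m + t:m", data=df).fit()

b0, b1 = med_fit.params["Intercept"], med_fit.params["t"]
th1, th2, th3 = out_fit.params["t"], out_fit.params["m"], out_fit.params["t:m"]

# Natural direct and indirect effects (t = 1 vs t = 0) for linear models with
# interaction, assuming no unobserved confounding:
nde = th1 + th3 * b0           # mediator held at its level under control
nie = (th2 + th3) * b1         # effect transmitted through the mediator
print(f"NDE = {nde:.2f}, NIE = {nie:.2f}")
```

Dropping the t:m term from the outcome model and reusing these formulas is one way to see, in simulation, how excluding an existing interaction biases the estimated effects.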
|
257 |
Hellinger Distance-based Similarity Measures for Recommender Systems / Hellinger distance-baserad similaritetsmått for rekommendationsystem. Goussakov, Roma, January 2020
Recommender systems are used in online sales and e-commerce to recommend potential items/products for customers to buy based on their previous buying preferences and related behaviours. Collaborative filtering is a popular computational technique that has been used worldwide for such personalized recommendations. Of the two forms of collaborative filtering, neighbourhood-based and model-based, the neighbourhood-based approach is more popular and relatively simple. It relies on the idea that a certain item might be of interest to a given customer (active user) if either the user appreciated similar items in the buying space, or the item is appreciated by similar users (neighbours). To implement this idea, different kinds of similarity measures are used. This thesis sets out to compare different user-based similarity measures and to define meaningful measures based on the Hellinger distance, which is a metric on the space of probability distributions. Data from the popular MovieLens database are used to show the effectiveness of different Hellinger distance-based measures compared to other popular measures such as Pearson correlation (PC), cosine similarity, constrained PC and JMSD. The performance of the different similarity measures is then evaluated using the mean absolute error, the root mean squared error and the F-score. From the results, no evidence was found to claim that the Hellinger distance-based measures performed better than the more popular similarity measures for the given dataset.
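One plausible way to turn the Hellinger distance into a user similarity is sketched below, under the assumption that each user is represented by the empirical distribution of their ratings; the exact construction used in the thesis may differ.

```python
# Sketch: Hellinger distance-based similarity between two users' rating distributions.
import numpy as np

def rating_distribution(ratings, levels=5):
    """Empirical probability of each rating level (1..levels) for one user."""
    counts = np.bincount(np.asarray(ratings) - 1, minlength=levels)
    return counts / counts.sum()

def hellinger_distance(p, q):
    """H(P, Q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, bounded in [0, 1]."""
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2)

def hellinger_similarity(p, q):
    return 1.0 - hellinger_distance(p, q)        # larger value = more similar

user_a = rating_distribution([5, 4, 5, 3, 4, 5])  # toy rating histories
user_b = rating_distribution([1, 2, 1, 3, 2, 2])
print("similarity:", hellinger_similarity(user_a, user_b))
```

Because the Hellinger distance is bounded in [0, 1], the resulting similarity is directly comparable across user pairs, unlike correlation-based measures that can be undefined for users with constant ratings.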
|
258 |
N-sphere Clustering / N-sfär klustring. Pahmp, Oliver, January 2020
This thesis introduces n-sphere clustering, a new method of cluster analysis akin to agglomerative hierarchical clustering. It relies on expanding n-spheres around each observation until they intersect, and then clusters observations based on these intersections, the distances between the spheres, and the density of observations. Many commonly used clustering methods struggle when clusters have more complex shapes; the aim of n-sphere clustering is to provide a method that functions reasonably well regardless of the shape of the clusters. Accuracy is shown to be low, particularly when clusters overlap, and extremely sensitive to noise. The time complexity of the algorithm is prohibitively high for large datasets, further limiting its potential use.
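A highly simplified illustration of the expanding-sphere idea is sketched below: a common radius is grown around every point and two points are linked once their spheres intersect. The actual algorithm additionally uses intersection and density information that is not reproduced here, so this is only a rough analogue, with made-up data.

```python
# Sketch: link points whose spheres of radius r intersect (distance < 2r),
# then take connected components as clusters.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def sphere_clusters(X, radius):
    d = squareform(pdist(X))                     # pairwise Euclidean distances
    adjacency = csr_matrix(d < 2 * radius)       # spheres of radius r intersect
    n_clusters, labels = connected_components(adjacency, directed=False)
    return n_clusters, labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
print(sphere_clusters(X, radius=0.6))            # two well-separated blobs
```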
|
259 |
Conditional mean variables: A method for estimating latent linear relationships with discretized observations. Berggren, Mathias, January 2020
No description available.
|
260 |
A small sample study of some sandwich estimators to handle heteroscedasticity. Westman, Viking, January 2021
This simulation study investigates heteroscedasticity-consistent (HC) covariance matrix estimation using the sandwich method in relatively small samples. The different estimators are evaluated on how accurately they produce confidence intervals around a fixed, true coefficient, in the presence of random sampling and both homo- and heteroscedasticity. Standard errors are also collected to further analyse the coefficient estimates. All of the HC estimators seemed to overadjust in most homoscedastic cases, creating intervals that clearly exceeded their nominal specifications, while the standard procedure that assumes homoscedasticity produced the intervals most consistent with those specifications. In the presence of heteroscedasticity, the comparative accuracy of the HC estimators improved and they were often better than the non-robust error estimator, with the exception of the intercept, for which they all heavily underestimated the confidence intervals. In turn, the constant-variance estimator was subject to a larger mean error for that parameter, the intercept. While previous studies make clear that sandwich estimation can lead to more accurate results, it was rarely much better than, and sometimes strictly worse than, the non-robust constant-variance errors provided by OLS estimation. The conclusion is to stay cautious when applying HC estimators to a model, and to test whether they in fact improve the areas where heteroscedasticity presents an issue.
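As a concrete illustration, the following sketch fits an OLS model with statsmodels and compares conventional standard errors to HC3 sandwich standard errors under heteroscedasticity; the simulated data and sample size are illustrative only, not the study's actual design.

```python
# Sketch: conventional vs. HC3 sandwich standard errors in a small sample.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50                                           # small sample
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 0.5 * x, n)    # error variance grows with x
X = sm.add_constant(x)

ols_fit = sm.OLS(y, X).fit()                     # assumes constant error variance
hc3_fit = sm.OLS(y, X).fit(cov_type="HC3")       # sandwich (HC3) covariance

print("conventional SEs:", ols_fit.bse)
print("HC3 sandwich SEs:", hc3_fit.bse)
print("HC3 95% CI for slope:", hc3_fit.conf_int()[1])
```

Repeating such a simulation many times and recording how often the intervals cover the true coefficients is the kind of coverage comparison the study describes.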
|