281 |
Predicting consultation durations in a digital primary care setting
Åman, Agnes January 2018
The aim of this thesis is to develop a method to predict consultation durations in a digital primary care setting and thereby create a tool for designing a more efficient scheduling system in primary care. The ultimate purpose of the work is to contribute to a reduction in waiting times in primary care. Even though no actual scheduling system was implemented, four machine learning models were implemented and compared to see whether any of them performed better than the others. The input data used in this study was a combination of patient and doctor features. The patient features consisted of information extracted from digital symptom forms filled out by the patient before a video consultation with a doctor. These features were combined with the doctor's speed, defined as the doctor's average consultation duration over his or her previous meetings. The output was defined as the length of the video consultation, including administrative work done by the doctor before and after the meeting. One of the objectives of this thesis was to investigate whether the relationship between input and output was linear or non-linear. The problem was also formulated both as a regression problem and as a classification problem, and the two formulations were compared in terms of achieved accuracy. The models chosen for this study were linear regression, linear discriminant analysis, and the multi-layer perceptron implemented for both regression and classification. After performing a statistical t-test and a two-way ANOVA test, it was concluded that no significant difference could be detected between the models' performances. However, since linear regression is the least computationally heavy, it was suggested for future use until another model is shown to achieve better performance. Limitations such as too few models being tested and flaws in the data set were identified, and further research is encouraged. Studies implementing an actual scheduling system using the methodology presented in the thesis are recommended as a topic for future research. / The purpose of this thesis is to evaluate different tools for predicting the length of a doctor's appointment, thereby making it possible to create more efficient scheduling in primary care and so reduce waiting times for patients. Although no actual scheduling system has been proposed in this thesis, four machine learning models have been implemented and compared. One purpose of this was to see whether it was possible to conclude that any of the models gave better results than the others. The input data used in this study consisted partly of symptom data collected from symptom forms filled out by the patient before a video meeting with a digital care provider. This data was combined with the doctor's average meeting time in his or her previously held meetings. The output was defined as the length of a video meeting plus the time the doctor needed for administrative work before and after the meeting itself. One of the goals of this study was to investigate whether the relationship between input and output is linear or non-linear. Another goal was to formulate the problem both as a regression problem and as a classification problem, in order to see which formulation gave the best results. The models implemented in this study are linear regression, linear discriminant analysis, and neural networks implemented for both regression and classification. After a statistical t-test and a two-way ANOVA analysis, it could be concluded that none of the four models studied performed significantly better than any of the others. Since linear regression is simpler and requires less computing capacity than the other models, the conclusion is that linear regression can be recommended for future use until it has been shown that another model gives better results. The limitations identified in the study include that only four models were implemented and that the data used has certain shortcomings. Future studies that include more models and better data have therefore been proposed. In addition, future studies are encouraged in which an actual scheduling system is implemented using the methodology proposed in this study.
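
As an illustration of the doctor's-speed feature described in this abstract, the following is a minimal sketch, not the thesis's implementation. It assumes consultations arrive in chronological order as (doctor, duration) pairs and computes each doctor's average over strictly earlier meetings, then fits a one-variable least-squares line. The function names and the data layout are hypothetical.

```python
from collections import defaultdict


def doctor_speed_feature(consultations):
    """Return, for each consultation, the doctor's mean duration over
    strictly earlier consultations (None when there is no history).

    `consultations` is a chronologically ordered list of
    (doctor_id, duration_minutes) pairs; this layout is an assumption.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    speeds = []
    for doctor_id, duration in consultations:
        if counts[doctor_id] == 0:
            speeds.append(None)  # no previous meetings to average over
        else:
            speeds.append(totals[doctor_id] / counts[doctor_id])
        # Update the running totals only after emitting the feature,
        # so the current consultation never leaks into its own input.
        totals[doctor_id] += duration
        counts[doctor_id] += 1
    return speeds


def fit_simple_ols(x, y):
    """Ordinary least squares for y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Updating the running totals after the feature is emitted keeps the feature restricted to previous meetings, which matches the definition given in the abstract.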
|
282 |
Analysis of Macroeconomic Variables Affecting International Tourism Consumption in Sweden / Analys av Makroekomomiska Variabler som Påverkar Internationell Turismkonsumption i Sverige
Lee, Jun Ho, Mattar, Noel January 2019
There is an evident trend of growing tourism in the world, and tourism in Sweden is gaining more economic and social attention. The main purpose of this thesis is to discover which macroeconomic variables contribute to the annual international tourism income in Sweden. A multiple linear regression approach over the period 1978-2017 is used for the analysis. The final results show that GDP is the major macroeconomic factor driving the annual international tourism income in Sweden across all time periods. The NOK-SEK exchange rate seems to be another relevant variable over the long term from 1978 to 2017, but not over shorter periods. The USD-SEK exchange rate and the unemployment rate hold no significant relevance to international tourism consumption in Sweden in any period. The devaluation of the Swedish krona in 1992 did not change the relationship between these variables and the response variable. However, these results may be unstable due to the limited number of observations used in the analysis, and we therefore recommend other regression approaches, such as panel data regression, for this subject. / There is a noticeable trend of growing tourism worldwide. Tourism into Sweden is receiving more and more economic and social attention. The purpose of this work is to find which macroeconomic variables contribute to the annual revenues from international tourism in Sweden. Multiple linear regression over the period 1978-2017 is used for the analysis. The final result shows that GDP is the dominant macroeconomic factor driving the annual revenues from international tourism in Sweden, regardless of time period. The NOK/SEK exchange rate appears to be significant in the long run, from 1978 to 2017, but not during the shorter time periods. The USD/SEK exchange rate and unemployment are both non-significant variables for international tourism consumption in Sweden over all time periods. The devaluation of the Swedish krona in 1992 did not change the relationship between the latter variables and the y-variable. However, these results may be unstable because of the limited number of observations used in the analysis, and we therefore recommend other regression models for this subject, such as panel data regression.
|
283 |
Multiples for Valuation Estimates of Life Science Companies in Sweden / Multiplar för värdering av Life Science Företag i Sverige
Ernstsson, Hampus, Börjes Liljesvan, Max January 2019
Market multiples are a common and simple tool for estimating corporate value. They can express temporal dynamics and differences between markets, industries and firms. Despite their practical usefulness, some critical problems remain and continue to be debated. This thesis investigates whether there exist characteristics that explain market capitalization by market multiples within the life science industry in Sweden. The approach follows well-known theory of multiple linear regression analysis. The results indicated a linear relationship only between the market cap and the R&D expenditures of a company. This does not mean that the other explanatory variables have no effect on market cap, only that no linear relationship could be statistically proven. / Valuation multiples are a common and simple tool for approximating company value. They can describe temporal dynamics and differences between markets, industries and companies. Despite their practical usefulness, there are some critical problems that are still debated. This thesis investigates whether there are any characteristics that explain market value by means of valuation multiples within the life science industry in Sweden. The approach follows well-known theory of multiple linear regression analysis. The results showed that there is a relationship only between market value and a company's research and development expenditures. This does not mean that the other variables have no effect on market value, but rather that there is no linear relationship that can be statistically demonstrated.
|
284 |
The development and analysis of a computationally efficient data driven suit jacket fit recommendation system
Bogdanov, Daniil January 2017
In this master's thesis we design and analyze a data-driven suit jacket fit recommendation system which aims to guide shoppers in the process of assessing garment fit over the web. The system is divided into two stages. In the first stage we analyze labelled customer data, train supervised learning models so as to be able to predict optimal suit jacket dimensions of unseen shoppers, and determine appropriate models for each suit jacket dimension. In stage two the recommendation system uses the results from stage one and sorts a garment collection from best fit to least fit. The sorted collection is what the fit recommendation system is to return. In this thesis we propose a particular design of stage two that aims to reduce the complexity of the system, but at the cost of reduced quality of the results. The trade-offs are identified and weighed against each other. The results in stage one show that simple supervised learning models with linear regression functions suffice when the independent and dependent variables align at particular landmarks on the body. If style preferences are also to be incorporated into the supervised learning models, non-linear regression functions should be considered so as to account for the increased complexity. The results in stage two show that the complexity of the recommendation system can be made independent of the complexity of how fit is assessed. And as technology enables more advanced ways of assessing garment fit, such as 3D body scanning techniques, the proposed design for reducing the complexity of the recommendation system allows highly complex techniques to be used without affecting the responsiveness of the system at run time. / In this master's thesis we design and analyze a data-driven recommendation system for suit jackets with the goal of guiding online shoppers in assessing fit over the internet. The system is divided into two stages. In the first stage we analyze labelled data and train models to produce predictions of optimal jacket measurements for shoppers the system has not previously been exposed to. In stage two the recommendation system takes the result from stage one and sorts the garment collection from best to worst fit. The sorted collection is what the system is intended to return. In this work we propose a specific design of stage two with the goal of reducing the complexity of the system, at a cost in the accuracy of the results. Advantages and disadvantages are identified and weighed against each other. The results in stage one show that simple models with linear regression functions suffice when the independent and dependent variables coincide at specific points on the body. If style preferences are also to be incorporated into these models, non-linear regression functions should be considered to account for the added complexity. The results in stage two show that the complexity of the recommendation system can be made independent of the complexity of how fit is assessed. And as technology enables ever more advanced ways of assessing fit, such as 3D scanning techniques, more complex techniques can be used without affecting the response time of the system at run time.
|
285 |
Batch and Online Implicit Weighted Gaussian Processes for Robust Novelty Detection
Ramirez, Padron Ruben 01 January 2015
This dissertation aims mainly at obtaining robust variants of Gaussian processes (GPs) that do not require using non-Gaussian likelihoods to compensate for outliers in the training data. Bayesian kernel methods, and in particular GPs, have been used to solve a variety of machine learning problems, equating or exceeding the performance of other successful techniques. That is the case of a recently proposed approach to GP-based novelty detection that uses standard GPs (i.e. GPs employing Gaussian likelihoods). However, standard GPs are sensitive to outliers in training data, and this limitation carries over to GP-based novelty detection. This limitation has been typically addressed by using robust non-Gaussian likelihoods. However, non-Gaussian likelihoods lead to analytically intractable inferences, which require using approximation techniques that are typically complex and computationally expensive. Inspired by the use of weights in quasi-robust statistics, this work introduces a particular type of weight functions, called here data weighers, in order to obtain robust GPs that do not require approximation techniques and retain the simplicity of standard GPs. This work proposes implicit weighted variants of batch GP, online GP, and sparse online GP (SOGP) that employ weighted Gaussian likelihoods. Mathematical expressions for calculating the posterior implicit weighted GPs are derived in this work. In our experiments, novelty detection based on our weighted batch GPs consistently and significantly outperformed standard batch GP-based novelty detection whenever data was contaminated with outliers. Additionally, our experiments show that novelty detection based on online GPs can perform similarly to batch GP-based novelty detection. Membership scores previously introduced by other authors are also compared in our experiments.
|
286 |
The Design of GLR Control Charts for Process Monitoring
Xu, Liaosa 27 February 2013
Generalized likelihood ratio (GLR) control charts are investigated for two types of statistical process control (SPC) monitoring problems.
The first part of this dissertation considers the problem of monitoring a normally distributed process variable when a special cause may produce a time varying linear drift in the mean. The design and application of a GLR control chart for drift detection is investigated. The GLR drift chart does not require specification of any tuning parameters by the practitioner, and has the advantage that, at the time of the signal, estimates of both the change point and the drift rate are immediately available. An equation is provided to accurately approximate the control limit. The performance of the GLR drift chart is compared to other control charts such as a standard CUSUM chart and a CUSCORE chart designed for drift detection. We also compare the GLR chart designed for drift detection to the GLR chart designed for sustained shift detection since both of them require only a control limit to be specified. In terms of the expected time for detection and in terms of the bias and mean squared error of the change-point estimators, the GLR drift chart has better performance for a wide range of drift rates relative to the GLR shift chart when the out-of-control process is truly a linear drift.
The second part of the dissertation considers the problem of monitoring a linear functional relationship between a response variable and one or more explanatory variables (a linear profile). The design and application of GLR control charts for this problem are investigated. The likelihood ratio test of the GLR chart is generalized over the regression coefficients, the variance of the error term, and the possible change-point. The performance of the GLR chart is compared to various existing control charts. We show that the overall performance of the GLR chart is much better than that of the other options in detecting a wide range of shift sizes. The existing control charts designed for certain shifts that may be of particular interest have several chart parameters that need to be specified by the user, which makes the design of such control charts more difficult. The GLR chart is very simple to design, as it is invariant to the choice of design matrix and the values of the in-control parameters. Therefore there is only one design parameter (the control limit) that needs to be specified. In particular, the GLR chart can be constructed with a sample size of n=1 at each sampling point, a setting in which other charts cannot be applied. Another advantage of the GLR chart is its built-in diagnostic aids, which provide estimates of both the change-point and the values of the linear profile parameters. / Ph. D.
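
To make the drift-detection idea from the first part of this abstract concrete, the following is a minimal sketch under stated assumptions, not the dissertation's chart design. It assumes independent normal observations with known in-control mean `mu0` and standard deviation `sigma0`, and a mean that drifts linearly at an unknown rate after an unknown change point. Maximizing the log-likelihood ratio over the drift rate for each candidate change point gives the closed form used below. The function name is hypothetical, and the control limit (which the dissertation approximates with an equation not reproduced here) is left to the caller.

```python
def glr_drift_statistic(xs, mu0, sigma0):
    """GLR statistic for a linear drift in the mean of normal data.

    For a candidate change point tau (the number of in-control
    observations), the post-change mean is mu0 + beta * d for the d-th
    observation after tau. Maximizing over beta gives
        (sum d * z)^2 / (2 * sigma0^2 * sum d^2),  z = x - mu0,
    and the statistic is the maximum of this over tau.

    Returns (statistic, estimated change point, estimated drift rate).
    """
    best_stat, best_tau, best_beta = 0.0, None, 0.0
    for tau in range(len(xs)):
        sum_dz = 0.0  # sum of d * (x - mu0) over post-change points
        sum_dd = 0.0  # sum of d^2 over post-change points
        for d, x in enumerate(xs[tau:], start=1):
            sum_dz += d * (x - mu0)
            sum_dd += d * d
        stat = sum_dz ** 2 / (2.0 * sigma0 ** 2 * sum_dd)
        if stat > best_stat:
            best_stat, best_tau, best_beta = stat, tau, sum_dz / sum_dd
    return best_stat, best_tau, best_beta
```

A chart built on this would signal when the statistic exceeds a chosen control limit, at which point the returned change-point and drift-rate estimates are immediately available, as the abstract notes. This sketch rescans all candidate change points at every call; a practical chart would likely bound that search with a window.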
|
287 |
Formulering av HPWS i ideella föreningar : En studie om implementationsmöjligheter av HPWS i en svensk esportförening
Linnarsson, Rasmus January 2023
This study aims to examine and identify possibilities and prerequisites for implementing a High Performance Work System (HPWS) in a Swedish non-profit association’s operations. The essay assumes a set of frameworks in the form of HPWS practices defined in previous research in order to determine and evaluate the implementation possibilities in the association. The study takes place in “SRL Spelförening”, a non-profit association which conducts welfare processes in the form of esports in the Swedish grassroots scene. During its lifespan the association has undergone large changes in all aspects of its operation, not least the structure of its work processes, which has left many parts of the workflow unorganized and unstructured. This study analyzes the current state of the association’s processes with a quantitative interview approach focused on a set of work-related variables. The data generated is then used as a control tool for determining the association’s prevailing prerequisites for HPWS. The data was also applied in a linear regression model that generated correlation coefficients among the work-related variables, information which in turn is used to determine the association’s individual attitude to HPWS. The analysis indicates that the association generally meets the HPWS requirements cited in earlier research. Some aspects, however, were found to fall below the bar and therefore require changes in line with earlier, similar HPWS-related cases. Furthermore, the correlation results show that the overall attitude of the association’s workers towards HPWS was positive, with a higher appreciation of an HPWS related to higher engagement and lower turnover intention.
|
288 |
Modeling Organic Installs in a Free-to-Play Game / Modellering av organiska nedladdningar i ett Free-to-Play Spel.
Prudhomme, Maxime January 2022
The Free-To-Play industry relies on a large inflow of new players that may result in future gross bookings. Consequently, acquiring new players organically is crucial to a game's health, especially as they have no direct associated acquisition cost. In addition, forecasting helps business planning, as future gross bookings result from those new installs. This thesis investigates methods such as linear regression, Ridge and Lasso regularization, time-series analysis, and Prophet to forecast the inflow of organic installs and to understand the factors affecting it. Using data from 3 games on two platforms and in 15 countries, it investigates the differences in behavior observed across segments. The thesis first focuses on a specific segment by modeling the inflow of organic installs for game number 17 on iOS in the United States of America. On this segment, the best model is the Lasso model using, among others, a Prophet model as a variable. However, generalization to all segments is difficult. On average, exponential decay over time is the best way to forecast the future inflow of organic installs, as it shows the most consistent performance across all segments. / The Free-To-Play industry depends on a large inflow of new players, who may then generate future revenue. To ensure a game's continued health, it is therefore crucial to acquire new players organically. This is especially important as it involves no acquisition cost. Since future revenue depends on new installs, forecasting is of great use in business planning. This thesis uses methods such as linear regression, Ridge, Lasso regularization, time-series analysis, and Prophet to forecast the inflow of organic installs and to understand which factors affect this inflow. Using data from three games on two platforms and in 15 countries, differences in behavior across segments are investigated. This thesis focuses on a specific segment by modeling the inflow of organic installs for game number 17 on iOS in the USA. For this segment the Lasso model, which among other things uses the Prophet model as a variable, is the best. However, it is difficult to transfer the conclusions to other segments. Instead, it is better to assume an exponential decline over time when forecasting future inflows of organic installs, as this gives more consistent results across all segments.
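
The exponential-decay baseline mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact formulation used in the thesis, so the version below is an assumption: it models installs at day t as a * exp(b * t) and estimates a and b by least squares on the logarithm of the observed counts. The function names are hypothetical, and the fit requires strictly positive counts.

```python
import math


def fit_exponential_decay(daily_installs):
    """Fit installs(t) ~= a * exp(b * t) by least squares on log(installs).

    `daily_installs` holds strictly positive counts for t = 0, 1, 2, ...
    Returns (a, b); b is negative when the inflow decays.
    """
    n = len(daily_installs)
    ts = list(range(n))
    logs = [math.log(v) for v in daily_installs]
    mean_t = sum(ts) / n
    mean_log = sum(logs) / n
    # Slope and intercept of the straight line fitted to (t, log installs).
    b = sum((t - mean_t) * (l - mean_log) for t, l in zip(ts, logs)) / sum(
        (t - mean_t) ** 2 for t in ts
    )
    a = math.exp(mean_log - b * mean_t)
    return a, b


def forecast_installs(a, b, t):
    """Forecast the install count at time index t from the fitted curve."""
    return a * math.exp(b * t)
```

Fitting on the log scale keeps the estimate a plain linear regression, at the cost of weighting relative rather than absolute errors.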
|
289 |
Regularization: Stagewise Regression and Bagging
Ehrlinger, John M. 31 March 2011
No description available.
|
290 |
Astrostatistics: Statistical Analysis of Solar Activity from 1939 to 2008
Yousef, Mohammed A. 10 April 2014
No description available.
|