601 |
Price Prediction of Vinyl Records Using Machine Learning Algorithms
Johansson, David January 2020
Machine learning algorithms have been used for price prediction in several application areas. Examples include real estate, the stock market, tourist accommodation, electricity, art, cryptocurrencies, and fine wine. Studies commonly evaluate the accuracy of predictions and compare different algorithms, such as Linear Regression or Neural Networks. There is a thriving global second-hand market for vinyl records, but research on price prediction in this area is very limited. The purpose of this project was to build on existing knowledge of price prediction in general in order to evaluate some aspects of price prediction for vinyl records. That included investigating the achievable level of accuracy and comparing the efficiency of algorithms. A dataset of 37,000 vinyl record samples was created with data from the Discogs website, and multiple machine learning algorithms were applied in a controlled experiment. Among the conclusions drawn from the results were that the Random Forest algorithm generally produced the strongest results, that results can vary substantially between artists or genres, and that a large share of the predictions had a good level of accuracy, but that a relatively small number of large errors had a considerable effect on the overall results.
|
602 |
Time series monitoring and prediction of data deviations in a manufacturing industry
Lantz, Robin January 2020
An automated manufacturing plant relies on many interacting moving parts and sensors. These sensors generate complex multidimensional data in the production environment that are difficult to interpret and in which patterns are hard to find. This project provides tools for gaining a deeper understanding of the production data of Swedsafe, a company operating in automated manufacturing, and demonstrates the potential of this multidimensional data. The project mainly consists of predicting deviations from predefined threshold values in Swedsafe’s production data. Machine learning is well suited to finding relationships in complex datasets, and supervised classification is used to predict deviations from threshold values in the data. An investigation is conducted to identify the classifier that performs best on Swedsafe's production data. The sliding window technique is used to handle the time series data. Apart from predicting deviations, the project also includes an implementation of live graphs that give an easy overview of the production data. Steady production with stable process values is important, so the ability to monitor and predict events in the production environment can benefit other manufacturing companies as well and is therefore not specific to Swedsafe. The best-performing classifier tested in this project was the Random Forest classifier. The Multilayer Perceptron did not perform well on Swedsafe’s data, but further investigation of recurrent neural networks using LSTM neurons is recommended. During the project, a web-based application displaying the sensor data in live graphs was also developed.
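The sliding window step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name `sliding_windows`, the window width, the threshold, and the sensor readings are all assumptions made up for the example. Each window of consecutive readings becomes one feature vector, and its label records whether the next reading exceeds the threshold.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], width: int, threshold: float
) -> Tuple[List[List[float]], List[int]]:
    """Turn a 1-D series into (window, label) pairs for a classifier.

    Each window holds `width` consecutive readings. Its label is 1 when the
    reading immediately after the window exceeds `threshold`, otherwise 0.
    (Hypothetical helper; the labelling rule is an assumption.)
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    windows: List[List[float]] = []
    labels: List[int] = []
    # The last usable window ends one step before the final reading,
    # because that final reading is needed as the label source.
    for start in range(len(series) - width):
        windows.append(list(series[start:start + width]))
        labels.append(1 if series[start + width] > threshold else 0)
    return windows, labels


# Made-up sensor readings with a single spike above the threshold.
readings = [1.0, 1.2, 1.1, 3.5, 1.0, 0.9]
X, y = sliding_windows(readings, width=3, threshold=3.0)
print(X)
print(y)
```

The resulting `X` and `y` could then be passed to any supervised classifier, such as a Random Forest.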
|
603 |
Les variables associées à la collaboration interprofessionnelle dans les équipes interdisciplinaires de santé mentale / Variables associated with interprofessional collaboration in interdisciplinary mental health teams
Ndibu, Muntu Keba Kebe 08 1900
Plusieurs études ont montré que la collaboration interprofessionnelle (CIP) produit des retombées positives pour les usagers, les professionnels de la santé et les organisations de soins. Cependant, les chercheurs estiment que son adoption dans les organisations et les services de santé est insuffisante. Cette situation conduit à des conflits souvent nuisibles entre les professionnels, à des erreurs médicales, à une augmentation des coûts de soins de santé et à des taux de mortalité élevés. Il existe un besoin de recherche pour identifier les variables associées à la CIP, particulièrement dans le domaine de la santé mentale (SM). La présente thèse vise à combler les lacunes susmentionnées et à permettre d’approfondir les connaissances que nous avons à l’heure actuelle sur la CIP.
Trois cent quinze (315) professionnels œuvrant dans les équipes interdisciplinaires de soins primaires (N=101) et spécialisés (N=214) de SM, localisées dans quatre réseaux locaux de services (RLS) du Québec, ont participé à l’étude. Plusieurs variables reconnues comme étant fortement associées à la CIP dans la littérature scientifique du domaine de la santé ont été prises en compte et catégorisées dans un cadre conceptuel inspiré du modèle de Bronstein (2003). Trois objectifs spécifiques ont été fixés, et chacun a fait l’objet d’un article scientifique.
Le premier article visait à identifier les variables associées à la CIP dans les équipes interdisciplinaires de SM implantées dans les RLS. Des analyses de régression linéaire ont été effectuées. Cinq variables liées aux caractéristiques interpersonnelles (l’engagement affectif envers l'équipe, le climat d'équipe, l’autonomie de l'équipe, le partage et l’intégration des connaissances), une variable liée au rôle professionnel (l’identification multifocale) et une autre liée aux caractéristiques personnelles (l’âge) étaient associées à la CIP.
Le deuxième article visait à identifier les profils de professionnels de la SM selon leurs perceptions de la CIP ainsi que les variables associées pouvant les différencier. À l'aide de l’analyse typologique, quatre profils de professionnels en SM ont été identifiés. Deux profils présentaient un niveau élevé de perception de la CIP, un profil présentait un niveau moyen et un autre présentait un niveau faible. Le support organisationnel, la participation à la prise de décisions, la confiance mutuelle, l’engagement affectif envers l’équipe, les croyances aux bénéfices de la collaboration interdisciplinaire, le partage et l’intégration des connaissances étaient associés aux profils ayant des scores élevés de la CIP.
Enfin, le troisième article a porté sur la comparaison des variables associées à la CIP selon le contexte de soins, à savoir : les soins primaires de SM (SP-SM) et les services spécialisés. Deux modèles de régression multivariée ont été réalisés, et ont permis d’identifier les variables significativement associées à chacun des contextes. Il s’agit du partage des connaissances pour les équipes de SP-SM, du soutien organisationnel et de l’âge pour les services spécialisés.
Au regard de ce qui précède, des recommandations ont été formulées à l’intention des gestionnaires des services de SM, aux CSSS et organisations de soins. / Studies have shown that interprofessional collaboration (IPC) has a positive impact on service users, health professionals and healthcare organizations. However, researchers believe that the adoption of IPC in organizations and health services is insufficient, leading to conflict among professionals, medical errors, increased costs of care and higher mortality rates. While IPC has emerged over the past several years as a best practice, research is needed to identify variables associated with IPC, particularly in mental health (MH) which has received relatively little attention. The present thesis aims to fill these gaps and to deepen the present state of knowledge about IPC, particularly in the MH field.
Three hundred and fifteen (315) MH professionals working in interdisciplinary primary care teams (N = 101) and specialized MH teams (N = 214) located in four Quebec local service networks (RLS) participated in the study. Many of the variables recognized as strongly associated with IPC in the health sciences literature were integrated and categorized within a conceptual framework inspired by the Bronstein model (2003). Three specific study objectives were established, each the subject of a scientific article.
The first article aimed to identify variables associated with IPC in interdisciplinary MH teams. Linear regression analyses were performed. Five variables related to interpersonal characteristics (emotional commitment to the team, team climate, team autonomy, knowledge sharing and integration), one variable related to professional role (multifocal identification) and another related to personal characteristics (age) were associated with IPC.
The second article aimed to identify profiles of MH professionals according to their perception of IPC as well as other distinguishing variables. Using Cluster Analysis, four profiles of MH professionals were identified. Two profiles had high levels of IPC, one profile an average level, and the other profile a low level of IPC. Organizational support, participation in decision-making, mutual trust, emotional commitment to the team, belief in the benefits of IPC, knowledge sharing, and knowledge integration were associated with the profiles that revealed high IPC scores. By contrast, team conflicts were associated with the profile of MH professionals with the lowest IPC score.
Finally, the third article focused on a comparison of IPC-related variables by care setting: primary health care (PHC) and specialized MH care. These two contexts of care differ in terms of their activities, the clients served, the actors involved in episodes of care and the roles of team members. Two multivariate regression models were fitted, identifying the following variables as significantly associated with each of the care settings: knowledge sharing for PHC teams, and organizational support and age for specialized MH teams.
Considering the above, recommendations have been made to managers, health and social service centers and care organizations for promoting IPC in interdisciplinary MH teams.
|
604 |
Vybrané transformace náhodných veličin užívané v klasické lineární regresi / Selected random variables transformations used in classical linear regression
Tejkal, Martin January 2017
Classical linear regression and the hypothesis tests derived from it rest on the assumption that the dependent variables are normally distributed with equal variances. When the normality assumptions are violated, transformations of the dependent variables are usually applied. The first part of this thesis deals with variance-stabilizing transformations. Considerable attention is given to random variables with Poisson and negative binomial distributions, for which generalized variance-stabilizing transformations containing additional parameters in the argument are studied. Optimal values of these parameters are determined. The aim of the second part of the thesis is to compare the transformations presented in the first part with other frequently used transformations. The comparison is carried out within the analysis of variance by testing the hypothesis of equal means of p independent random samples using the F test. In this part, the properties of the F test are first studied under the assumption of equal and of unequal variances across the samples. The power functions of the F test are then compared for p samples from the Poisson distribution transformed by the square-root, logarithmic and Yeo-Johnson transformations, and for samples from the negative binomial distribution transformed by the inverse hyperbolic sine, logarithmic and Yeo-Johnson transformations.
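To make the idea of a variance-stabilizing transformation with an extra parameter in the argument concrete, the sketch below computes the exact variance of 2·sqrt(X + c) for a Poisson variable X by summing over its probability mass function. The family sqrt(x + c) and the constant c = 3/8 (the classical Anscombe choice) are used here as a well-known illustration; whether they match the generalized transformations and optimal parameters studied in the thesis is an assumption. The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def transformed_variance(lam: float, c: float) -> float:
    """Variance of 2*sqrt(X + c) when X ~ Poisson(lam).

    The sum is truncated far in the upper tail, where the remaining
    probability mass is negligible.
    """
    k_max = int(lam + 20.0 * math.sqrt(lam)) + 50
    mean = 0.0
    mean_sq = 0.0
    for k in range(k_max + 1):
        p = poisson_pmf(k, lam)
        t = 2.0 * math.sqrt(k + c)
        mean += p * t
        mean_sq += p * t * t
    return mean_sq - mean * mean


# The raw Poisson variance equals lam, so it grows with the mean; after the
# transformation it stays close to 1 for every lam tried here.
for lam in (10.0, 50.0, 200.0):
    plain = transformed_variance(lam, 0.0)
    shifted = transformed_variance(lam, 3.0 / 8.0)
    print(f"lam={lam:6.1f}  c=0: {plain:.4f}  c=3/8: {shifted:.4f}")
```

Comparing the two columns shows how a suitable shift c brings the variance closer to a constant, which is the kind of parameter choice the abstract refers to.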
|
605 |
Kalibrace nepřímých metod pro zjišťování vlastností alkalicky aktivovaných betonů / Calibration of indirect methods for measurement of properties of alkali activated concretes
Vrba, Pavel January 2014
This thesis develops calibration relations for determining the cube compressive strength and the dynamic and static elastic moduli of alkali-activated concrete by non-destructive methods. Alkali-activated concrete is regarded as a new material in civil engineering. Its properties differ from those of ordinary concrete based on Portland cement, so modifying the common calibration relations appears necessary. Fresh concrete was produced at the concrete plant of ŽPSV a.s., Uherský Ostroh, in three mixtures, each yielding 18 cubes and 3 prisms. The samples were tested with Schmidt impact hammers (type L, type N and SilverSchmidt PC-N) and by ultrasound at 6 time points, three specimens at a time. The cube compressive strength was then determined. The static elastic modulus was determined at 28 days. The results are calibration relations for determining the development of compressive strength and modulus of elasticity for each method and for their combination.
|
606 |
Approximation of Terrain Data Utilizing Splines
Tomek, Peter January 2012
To optimize flight trajectories at very low altitude, terrain features must be captured very precisely. Fast and efficient evaluation of terrain data is therefore very important, because the time needed for the optimization must be as short as possible. Moreover, gradient-based methods are used to optimize the flight trajectory, so the function approximating the terrain data must be continuous up to a certain order of derivative. A very promising method for approximating terrain data is the application of multivariate simplex polynomials. The aim of this thesis is to implement a function that evaluates given terrain data at specified points, together with the gradient, using multivariate splines. The program should evaluate multiple points at once and should work in $n$-dimensional space.
|
607 |
Development of a Software Reliability Prediction Method for Onboard European Train Control System
Longrais, Guillaume Pierre January 2021
Software prediction is a complex area as there are no accurate models to represent reliability throughout the use of software, unlike hardware reliability. In the context of the software reliability of on-board train systems, ensuring good software reliability over time is all the more critical given the current density of rail traffic and the risk of accidents resulting from a software malfunction. This thesis proposes to use soft computing methods and historical failure data to predict the software reliability of on-board train systems. For this purpose, four machine learning models (Multi-Layer Perceptron, Imperialist Competitive Algorithm Multi-Layer Perceptron, Long Short-Term Memory Network and Convolutional Neural Network) are compared to determine which has the best prediction performance. We also study the impact of having one or more features represented in the dataset used to train the models. The performance of the different models is evaluated using the Mean Absolute Error, Mean Squared Error, Root Mean Squared Error and the R Squared. The report shows that the Long Short-Term Memory Network is the best performing model on the data used for this project. It also shows that datasets with a single feature achieve better prediction. However, the small amount of data available to conduct the experiments in this project may have impacted the results obtained, which makes further investigations necessary. / Att förutsäga programvara är ett komplext område eftersom det inte finns några exakta modeller för att representera tillförlitligheten under hela programvaruanvändningen, till skillnad från hårdvarutillförlitlighet. När det gäller programvarans tillförlitlighet i fordonsbaserade tågsystem är det ännu viktigare att säkerställa en god tillförlitlighet över tiden med tanke på den nuvarande tätheten i järnvägstrafiken och risken för olyckor till följd av ett programvarufel. 
I den här avhandlingen föreslås att man använder mjuka beräkningsmetoder och historiska data om fel för att förutsäga programvarans tillförlitlighet i fordonsbaserade tågsystem. För detta ändamål jämförs fyra modeller för maskininlärning (Multi-Layer Perceptron, Imperialist Competitive Algorithm Multi-Layer Perceptron, Long Short-Term Memory Network och Convolutional Neural Network) för att fastställa vilken som har den bästa förutsägelseprestandan. Vi undersöker också effekten av att ha en eller flera funktioner representerade i den datamängd som används för att träna modellerna. De olika modellernas prestanda utvärderas med hjälp av medelabsolut fel, medelkvadratfel, rotmedelkvadratfel och R-kvadrat. Rapporten visar att Long Short-Term Memory Network är den modell som ger bäst resultat på de data som använts för detta projekt. Den visar också att dataset med en enda funktion ger bättre förutsägelser. Den lilla mängd data som fanns tillgänglig för att genomföra experimenten i detta projekt kan dock ha påverkat de erhållna resultaten, vilket gör att ytterligare undersökningar är nödvändiga.
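The four evaluation metrics named in the abstract above (Mean Absolute Error, Mean Squared Error, Root Mean Squared Error and R squared) follow standard definitions, sketched here in plain Python. The function name `regression_metrics` and the sample values are made up for illustration and are not taken from the thesis.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Return MAE, MSE, RMSE and R^2 for paired observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when y_true is constant")
    ss_res = mse * n  # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up observations and predictions.
scores = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(scores)
```

MAE and RMSE are in the units of the target, while R squared is unitless, which is one reason studies often report several of them together.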
|
608 |
Radar based tank level measurement using machine learning : Agricultural machines / Nivåmätning av tank med radar sensorer och maskininlärning
Thorén, Daniel January 2021
Agriculture is becoming more dependent on computerized solutions to make the farmer’s job easier. The big step that many companies are working towards is fully autonomous vehicles that work the fields. To that end, the equipment fitted to said vehicles must also adapt and become autonomous. Making this equipment autonomous takes many incremental steps, one of which is developing an accurate and reliable tank level measurement system. In this thesis, a system for tank level measurement in a seed planting machine is evaluated. Traditional systems use load cells to measure the weight of the tank; however, these types of systems are expensive to build and cumbersome to repair. They also add a lot of weight to the equipment, which increases the fuel consumption of the tractor. Thus, this thesis investigates the use of radar sensors together with a number of Machine Learning algorithms. Fourteen radar sensors are fitted to a tank at different positions, data is collected, and a preprocessing method is developed. Then, the data is used to test the following Machine Learning algorithms: Bagged Regression Trees (BG), Random Forest Regression (RF), Boosted Regression Trees (BRT), Linear Regression (LR), Linear Support Vector Machine (L-SVM), Multi-Layer Perceptron Regressor (MLPR). The model with the best 5-fold cross-validation scores was Random Forest, closely followed by Boosted Regression Trees. A robustness test, using 5 previously unseen scenarios, revealed that the Boosted Regression Trees model was the most robust. The radar position analysis showed that 6 sensors together with the MLPR model gave the best RMSE scores. In conclusion, the models performed well on this type of system, which shows that they might be a competitive alternative to load cell based systems.
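The 5-fold cross-validation scoring mentioned in the abstract above can be illustrated with a small stdlib-only sketch. A one-variable least-squares line stands in for the tree ensembles and other models compared in the thesis, and the folds are contiguous rather than shuffled; both are simplifying assumptions, and all names and data below are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_rmse(
    xs: Sequence[float], ys: Sequence[float], k: int = 5
) -> List[float]:
    """Fit on k-1 folds, score RMSE on the held-out fold, once per fold."""
    scores: List[float] = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in fold]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores


# Made-up, exactly linear data: every held-out fold should score close to zero.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
fold_scores = cross_validated_rmse(xs, ys, k=5)
print(fold_scores)
```

In practice the samples would usually be shuffled before splitting, and the per-fold scores averaged to compare models.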
|
609 |
Bid Forecasting in Public Procurement / Budgivningsmodeller i offentliga upphandlingar
Stiti, Karim, Yape, Shih Jung January 2019
Public procurement amounts to a significant part of Sweden's GDP. Nevertheless, it is an overlooked sector characterized by low digitization and inefficient competition, where bids are not submitted based on proper mathematical tools. This thesis seeks to create a structured approach to bidding in cleaning services by determining factors affecting the participation and pricing decisions of potential buyers. Furthermore, we assess price prediction by comparing multiple linear regression (MLR) models to support vector regression (SVR). In line with previous research in the construction sector, we find that several factors, such as project duration, location and type of contract, significantly affect the participation decision in the cleaning sector. One notable deviation is that we do not find contract size to have an impact on the pricing decision. Surprisingly, the performance of MLR is comparable to that of the more advanced SVR models. Stochastic dominance tests on price performance conclude that experienced bidders perform better than their inexperienced counterparts and that companies place more competitive bids in lowest-price tenders than in economically most advantageous tenders (EMAT), indicating that EMAT tenders are regarded as unstructured. However, no significant evidence is found that larger actors perform better in bidding than smaller companies. / Offentliga upphandlingar utgör en signifikant del av Sveriges BNP. Trots detta är det en förbisedd sektor som karakteriseras av låg digitalisering och ineffektiv konkurrens där bud läggs baserat på intuition snarare än matematiska modeller. Denna avhandling ämnar skapa ett strukturerat tillvägagångssätt för budgivning inom städsektorn genom att bestämma faktorer som påverkar deltagande och prissättning. Vidare undersöker vi prisprediktionsmodeller genom att jämföra multipel linjära regressionsmodeller med en maskininlärningsmetod benämnd support vector regression.
I enlighet med tidigare forskning i byggindustrin finner vi att flera faktorer som typ av kontrakt, projekttid och kontraktsplats har en statistisk signifikant påverkan på deltagande i kontrakt i städindustrin. En anmärkningsvärd skillnad är att kontraktsvärdet inte påverkar prissättning som tidigare forskning visat i andra områden. För prisprediktionen är det överraskande att den enklare linjära regressionsmodellen presterar jämlikt till den mer avancerade maskininlärningsmodellen. Stokastisk dominanstest visar att erfarna företag har en bättre precision i sin budgivning än mindre erfarna företag. Därtill lägger företag överlag mer konkurrenskraftiga bud i kontrakt där kvalitetsaspekter tas i beaktning utöver priset. Vilket kan indikera att budgivare upplever dessa kontrakt som mindre strukturerade. Däremot finner vi ingen signifikant skillnad mellan större och mindre företag i denna bemärkning.
|
610 |
Feature selection in short-term load forecasting / Val av attribut vid kortvarig lastprognos för energiförbrukning
Söderberg, Max Joel, Meurling, Axel January 2019
This paper investigates the correlation between energy consumption 24 hours ahead and the features used for predicting energy consumption. The features originate from three categories: weather, time and previous energy consumption. The correlations are calculated using Pearson correlation and mutual information. The most highly correlated features were those representing previous energy consumption, followed by temperature and month. Two identical feature sets containing all attributes were obtained by ranking the features according to correlation. (In this report, the words "attribute" and "feature" are used interchangeably.) Three feature sets were created manually. The first set contained seven attributes representing previous energy consumption over the course of the seven days prior to the day of prediction. The second set consisted of weather and time attributes. The third set consisted of all attributes from the first and second sets. These sets were then compared on different machine learning models. It was found that the set containing all attributes and the set containing previous energy attributes yielded the best performance for each machine learning model. / I denna rapport undersöks korrelation och betydelsen av olika attribut för att förutspå energiförbrukning 24 timmar framåt. Attributen härstammar från tre kategorier: väder, tid och tidigare energiförbrukning. Korrelationerna tas fram genom att utföra Pearson Correlation och Mutual Information. Detta resulterade i att de högst korrelerade attributen var de som representerar tidigare energiförbrukning, följt av temperatur och månad. Två identiska attributmängder erhölls genom att ranka attributen över korrelation. Tre attributmängder skapades manuellt. Den första mängden innehöll sju attribut som representerade tidigare energiförbrukning, en för varje dag, sju dagar innan datumet för prognosen av energiförbrukning. Den andra mängden bestod av väder- och tidsattribut.
Den tredje mängden bestod av alla attribut från den första och andra mängden. Dessa mängder jämfördes sedan med hjälp av olika maskininlärningsmodeller. Resultaten visade att mängden med alla attribut och den med tidigare energiförbrukning gav bäst resultat för samtliga modeller.
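Ranking features by their Pearson correlation with the target, as described in the abstract above, can be sketched as follows. The feature names and all numbers are invented toy values chosen only to show the mechanics; they are not data or results from the report, and mutual information is left out to keep the example short.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_features(
    features: Dict[str, Sequence[float]], target: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort features by the absolute value of their correlation with the target."""
    scored = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy data (hypothetical): energy consumption 24 hours ahead and three features.
target = [10.0, 12.0, 14.0, 16.0, 18.0]
features = {
    "previous_energy": [9.0, 11.0, 13.0, 15.0, 17.0],
    "temperature": [5.0, 3.0, 4.0, 1.0, 2.0],
    "month": [1.0, 1.0, 1.0, 1.0, 2.0],
}
ranking = rank_features(features, target)
for name, r in ranking:
    print(f"{name}: {r:+.3f}")
```

Sorting by absolute value keeps strongly negative correlations, such as temperature in this toy set, near the top of the ranking.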
|