Global ETD Search

331	Magnificent beasts of the Milky Way: Hunting down stars with unusual infrared properties using supervised machine learning Ahlvind, Julia January 2021 (has links) The significant increase in astronomical data necessitates new strategies and developments for analysing large amounts of information, a task that is no longer efficient to do by hand. Supervised machine learning is one such modern strategy. In this work, we apply the classification technique to Gaia+2MASS+WISE data to explore the use of supervised machine learning on large astronomical archives. The idea is to create an algorithm that recognises entries with unusual infrared properties which could be interesting for follow-up observations. The programming is done in MATLAB and the training of the algorithms in MATLAB's Classification Learner application. The Gaia, 2MASS and WISE catalogues contain ~10^9, 5×10^8 and 7×10^8 entries, respectively (The European Space Agency 2019, Skrutskie et al. 2006, R. M. Cutri IPAC/Caltech). The algorithms search through a sample from these archives consisting of 765266 entries, corresponding to objects within 500 pc. The project resulted in a list of 57 entries with unusual infrared properties, out of which 8 targets showed none of the four common features that provide a natural physical explanation for the unconventional energy distribution. After more comprehensive studies of the aforementioned targets, we deem further studies and observations necessary for 2 of the 8 targets (Nr.1 and Nr.8 in table 3) to establish their true nature. The results demonstrate the applicability of machine learning in astronomy and suggest a sample of intriguing targets for further studies. / In astronomy, large amounts of data are collected continuously, and the volume grows rapidly every year. 
As a result, manual analysis of the data is becoming less and less worthwhile, and new strategies and methods are needed that can analyse large data volumes faster. One such strategy is supervised machine learning. In this work we use a supervised machine learning technique called classification. We apply the classification technique to data from the three large astronomical catalogues Gaia+2MASS+WISE to explore the use of this technique on large astronomical archives. The idea is to create an algorithm that identifies objects with unconventional infrared properties that may be interesting for further observations and analysis. These unusual objects are expected to have lower emission in the optical wavelength range and higher emission in the infrared than is usually observed for a star. The programming is done in MATLAB and the training of the algorithms in MATLAB's Classification Learner application. The algorithms search through a data sample of 765266 objects from the Gaia+2MASS+WISE catalogues. These catalogues contain ~10^9, 5×10^8 and 7×10^8 objects, respectively (The European Space Agency 2019, Skrutskie et al. 2006, R. M. Cutri IPAC/Caltech). The limited dataset searched by the algorithms corresponds to objects within a radius of 500 pc. Many of the objects that the algorithms identified as "unusual" in fact appear to be nebulous objects. The natural explanation for their infrared excess is the surrounding dust, which gives rise to thermal radiation in the infrared. To eliminate this type of object and focus the search on more unconventional objects, the programs were modified. One of the main changes was to introduce a third class consisting of stars enshrouded in dust, which we call the "YSO" class. 
Another change that improved the results was to include the coordinates in the training as well as in the final classification, and thus in the identification of interesting candidates. These adjustments reduced the proportion of nebulous objects in the class of "unusual" objects identified by the algorithms. The project resulted in a list of 57 objects with unusual infrared properties. 8 of these objects showed none of the four common features that can provide a natural explanation for their excess of infrared radiation. These features are: nebulous surroundings or detected dust, variability, Hα emission, or maser radiation. After further investigation of the 8 aforementioned objects, we consider that 2 of them need further observations and analysis to establish their true nature (Nr.1 and Nr.8 in table 3). The infrared radiation is thus not easily explained for these 2 objects. The interesting objects found, together with the other results of the machine learning, show that the classification technique in machine learning is useful on large astronomical datasets. machine learning supervised machine learning support vector machine photometry stars stellar physics astronomical archives Physical Sciences Fysik Astronomy, Astrophysics and Cosmology Astronomi, astrofysik och kosmologi
332	Adaptive Anomaly Detection for Large IoT Datasets with Machine Learning and Transfer Learning Negus, Andra Stefania January 2020 (has links) As more IoT devices enter the market it becomes increasingly important to develop reliable and adaptive ways of dealing with the data they generate. These must address data quality and reliability. Such solutions could benefit both the device producers and their customers who, as a result, could receive faster and better customer support services. Thus, this project's goal is twofold. First, it is to identify faulty data points generated by such devices. Second, it is to evaluate whether the knowledge gained from available/known sensors and appliances is transferable to other sensors on similar devices. This would make it possible to evaluate the behaviour of new appliances as soon as they are first switched on, rather than after sufficient data from them has been collected. This project uses time series data from three appliances: washing machine, washer&dryer and refrigerator. For these, two solutions are developed and tested: one for categorical and another for numerical variables. Categorical variables are analysed using the Average Value Frequency and the pure frequency of state-transition methods. Due to the limited number of possible states, the pure frequency proves to be the better solution, and the knowledge gained is transferred from the source device to the target one, with moderate success. Numerical variables are analysed using a One-class Support Vector Machine pipeline, with very promising results. Further, learning and forgetting mechanisms are developed to allow for the pipelines to adapt to changes in appliance patterns of behaviour. This includes a decay function for the numerical variables solution. Interestingly, the different weights for the source and target have little to no impact on the quality of the classification. 
/ As new IoT devices enter the market, it becomes increasingly important to develop reliable and adaptable ways of handling the data they generate. These should address data quality and reliability. Such solutions can benefit both appliance manufacturers and their customers, who as a result can get faster and better customer support services. This project thus has two goals. The first is to identify faulty data points generated by such devices. The second is to evaluate whether knowledge from available or known sensors and appliances can be transferred to other sensors on similar devices. This would make it possible to evaluate the behaviour of new appliances as soon as they are first switched on, rather than after sufficient data from them has been collected. This project uses time series data from three appliances: a washing machine, a washer-dryer and a refrigerator. For these, two solutions are developed and tested: one for categorical variables and another for numerical variables. The categorical variables are analysed with two methods: Average Value Frequency and the pure frequency of state transitions. Because of the limited number of possible states, the pure frequency proves to be the better solution, and the knowledge gained is transferred from the source device to the target, with moderate success. The numerical variables are analysed using a One-class Support Vector Machine pipeline, with very promising results. Further, learning and forgetting mechanisms are developed to let the pipelines adapt to changes in the appliance's behaviour patterns. This includes a decay function for the numerical-variable solution. Interestingly, the different weights for the source and target have little or no effect on the quality of the classification. 
machine learning transfer learning internet of things anomaly detection time series one class support vector machine svm Computer and Information Sciences Data- och informationsvetenskap
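The state-transition frequency idea in entry 332 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the state names, the 0.05 threshold, and the function names are assumptions made for illustration. Transition frequencies are learned from a source appliance and then used to flag rare or unseen transitions in a target appliance's sequence.

```python
from collections import Counter


def transition_frequencies(states):
    """Relative frequency of each (previous, next) state pair in a sequence."""
    pairs = list(zip(states, states[1:]))
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()}


def flag_rare_transitions(reference, observed, threshold=0.05):
    """Indices in `observed` whose incoming transition is rare or unseen in `reference`.

    `threshold` is a hypothetical cut-off, not a value taken from the thesis.
    """
    freqs = transition_frequencies(reference)
    flagged = []
    for i in range(1, len(observed)):
        pair = (observed[i - 1], observed[i])
        if freqs.get(pair, 0.0) < threshold:
            flagged.append(i)
    return flagged


# Hypothetical washing-machine cycle used as the source (reference) device.
source = ["idle", "wash", "rinse", "spin", "idle"] * 20
# Hypothetical target sequence with two transitions never seen in the source.
target = ["idle", "wash", "rinse", "spin", "idle", "spin", "wash"]

print(flag_rare_transitions(source, target))
```

Reusing the source frequencies on the target sequence mirrors the knowledge-transfer setting the abstract describes; with few possible states the frequency table stays small, which is one plausible reason a plain frequency count can work well.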
333	Maskininlärning: avvikelseklassificering på sekventiell sensordata. En jämförelse och utvärdering av algoritmer för att klassificera avvikelser i en miljövänlig IoT produkt med sekventiell sensordata Heidfors, Filip, Moltedo, Elias January 2019 (has links) A company has developed an environmentally friendly IoT product with sequential sensor data and wants to use machine learning to classify anomalies in the sensor data. Over the years, several well-functioning classification algorithms have been developed, but no single algorithm works best for every problem. The purpose of this work was therefore to investigate, compare and evaluate different classifiers within supervised machine learning to find out which classifier gives the highest accuracy in classifying anomalies in the type of IoT product the company has developed. Through a literature study we first identified which classifiers have commonly been used and have worked well in earlier scientific work with similar applications. We decided to further compare and evaluate Random Forest, the Naïve Bayes classifier and Support Vector Machines. We then created a dataset of 513 examples that we used for training and validation of each classifier. The results showed that Random Forest had considerably higher accuracy, at 95.7%, than the Naïve Bayes classifier (81.5%) and Support Vector Machines (78.6%). The conclusion is that Random Forest, at 95.7%, gives high enough accuracy for the company to use the machine learning model to improve its product. The results also indicate that Random Forest is the best-performing supervised classifier for this work's specific classification problem, but that even higher accuracy may be achievable with other techniques such as unsupervised or semi-supervised machine learning. 
/ A company has developed an environmentally friendly IoT device with sequential sensor data and wants to use machine learning to classify anomalies in its data. Over the years, several well-functioning classification algorithms have been developed. However, no algorithm is optimal for every problem. The purpose of this work was therefore to investigate, compare and evaluate different classifiers within supervised machine learning to find out which classifier gives the best accuracy in classifying anomalies in the kind of IoT device that the company has developed. Through a literature review we first identified which classifiers are commonly used and have worked well in related work with similar purposes and applications. We decided to further compare and evaluate Random Forest, Naïve Bayes and Support Vector Machines. We created a dataset of 513 examples that we used for training and evaluation of each classifier. The results showed that Random Forest had superior accuracy, at 95.7%, compared to Naïve Bayes (81.5%) and Support Vector Machines (78.6%). The conclusion of this work is that Random Forest, at 95.7%, gives high enough accuracy for the company to make good use of the machine learning model. The results also indicate that Random Forest is the best supervised classifier for this thesis's specific classification problem, but that even higher accuracy may be achievable with other techniques such as unsupervised or semi-supervised machine learning. Machine learning Supervised learning Classifying algorithms Classifiers Random Forest Naïve bayes Support vector machine Sensor data Sequential data Engineering and Technology Teknik och teknologier
334	Litteraturstudie: Tillämpningen av maskininlärning vid algoritmisk handel Larsson, Therése, Paradis, Karl January 2019 (has links) We conduct a literature study in which we study and analyse publications on machine learning in combination with algorithmic trading. In this study we investigate which types of data and which machine learning techniques have been shown to be applicable in algorithmic trading systems. For our literature study we use peer-reviewed publications from credible databases. The results show that there are mainly three types of data of importance for algorithmic trading: historical price data, technical indicators, and the type of data used in fundamental analysis. Historical price data often appears to be used as a basis that is then processed into other types of data. The most common example is technical indicators, which often occur as a data source in algorithmic trading systems. We also find a number of machine learning techniques that earlier publications have demonstrated to be applicable to algorithmic trading. Publications show that a machine learning technique called SVM (support vector machine) can be applied to technical indicators and also to the analysis of news headlines. We also find publications that demonstrate the application of two types of neural networks, classification networks and regression networks. These are used to generate trade signals in an algorithmic trading system. In our study we also find an application of evolutionary machine learning that is used to approximate a solution to the optimal order execution problem. We also discuss an economic incentive that works against academic openness and the publication of new discoveries in the field. It exists because favourable results can be financially beneficial to withhold. 
/ We conduct a literature review in which we study and analyze publications in the area of machine learning in combination with algorithmic trading. In this study we investigate which types of data and which machine learning techniques have been shown to be applicable to systems used for algorithmic trading. For our literature review we use peer-reviewed publications from trustworthy databases. The results show that mainly three types of data are relevant for algorithmic trading. These are financial data quotes, technical indicators and the types of data that are relevant for fundamental analysis. Financial data quotes often seem to be used as a basis for later processing into other types of data. The most common example of this is technical indicators, which are frequently used as a source of data in systems for algorithmic trading. We also find a number of machine learning techniques that previous publications have demonstrated to be applicable to algorithmic trading. Publications show that a machine learning technique called SVM (support vector machine) can be applied to technical indicators as well as to the analysis of news headlines. We also find publications that demonstrate the application of two types of neural networks, classification networks and regression networks. These are used to generate trade signals in an algorithmic trading system. In our study we also find an application of evolutionary machine learning that is used to approximate an optimal solution to the order execution problem. Moreover, we discuss a financial incentive that disadvantages academic openness and the publication of new discoveries in the relevant area of research. This financial incentive exists because advantageous results may be financially beneficial to withhold. 
maskininlärning algoritmisk handel litteraturstudie tekniska indikatorer orderexekveringsproblem support vector machine historisk prisdata efficient-market hypotesen VaR neurala nätverk Engineering and Technology Teknik och teknologier
335	Sentimentanalys av svenskt aktieforum för att förutspå aktierörelse / Sentiment analysis of Swedish stock trading forum for predicting stock market movement Ouadria, Michel Sebastian, Ciobanu, Ann-Stephanie January 2020 (has links) This study investigates the possibility of predicting stock movement on a daily basis using sentiment analysis of posts from a Swedish stock forum. Sentiment analysis is used to find subjectivity in the form of emotions (sentiment) in text. Text data was extracted from a Swedish stock forum to predict the movement of the related stock. All data was aggregated over a fixed period of two years. The study used machine learning to train three machine learning models on text data and stock data. The results showed no clear correlation between sentiment and stock movement. Furthermore, the results of earlier work in the field were not reproduced. The highest accuracy achieved with the models was calculated at 64%. / The present study examines the possibility of predicting stock movement on a daily basis with sentiment analysis of posts in a Swedish stock trading forum. Sentiment analysis is used to find subjectivity in the form of emotions (sentiment) in text. Text data was extracted from a stock forum to predict the movement of the related share. All data was aggregated within a fixed period of two years. The analysis uses machine learning to train three machine learning models on text data and stock data. The results showed no clear correlation between sentiment and stock movement. Furthermore, the study could not reproduce the accuracy reported in previous work in the field. The highest accuracy achieved with the models was calculated at 64%. 
Sentiment analysis Stock market Machine Learning Support Vector Machine Naive Bayes Extreme Gradient Boosting Sentimentanalys Aktiemarknad Maskininlärning Stödvektormaskin Naive Bayes Extreme Gradient Boosting Computer and Information Sciences Data- och informationsvetenskap
336	Performance evaluation of security mechanisms in Cloud Networks Kannan, Anand January 2012 (has links) Infrastructure as a Service (IaaS) is a cloud service provisioning model which largely focuses on data centre provisioning of computing and storage facilities. The networking aspects of IaaS beyond the data centre are a limiting factor preventing communication services that are sensitive to network characteristics from adopting this approach. Cloud networking is a new technology which integrates network provisioning with the existing cloud service provisioning models, thereby completing the cloud computing picture by addressing the networking aspects. In cloud networking, shared network resources are virtualized and provisioned to customers and end-users on demand in an elastic fashion. This technology allows various kinds of optimization, e.g., reducing latency and network load. Further, it allows service providers to provision network performance guarantees as a part of their service offering. However, this new approach introduces new security challenges. Many of these security challenges are addressed in the CloNe security architecture. This thesis presents a set of potential techniques for securing different resources in a cloud network environment which are not addressed in the existing CloNe security architecture. The thesis begins with a holistic view of cloud networking, as described in the Scalable and Adaptive Internet Solutions (SAIL) project, along with its proposed architecture and security goals. This is followed by an overview of the problems that need to be solved and some of the different methods that can be applied to solve parts of the overall problem, specifically a comprehensive, tightly integrated, and multi-level security architecture, a key management algorithm to support the access control mechanism, and an intrusion detection mechanism. For each method or set of methods, the respective state of the art is presented. 
Additionally, experiments to understand the performance of these mechanisms are evaluated on a simple cloud network test bed. The proposed key management scheme uses a hierarchical key management approach that provides fast and secure key updates when member join and member leave operations are carried out. Experiments show that the proposed key management scheme enhances security and increases availability and integrity. A newly proposed genetic-algorithm-based feature selection technique has been employed for effective feature selection. Fuzzy SVM has been used on the data set for effective classification. Experiments have shown that the proposed genetic feature selection algorithm reduces the number of features and hence decreases the classification time, while improving the detection accuracy of the fuzzy SVM classifier by minimizing the conflicting rules that may confuse the classifier. The main advantages of this intrusion detection system are the reduction in false positives and increased security. / Infrastructure as a Service (IaaS) is a cloud service model that mainly focuses on providing a data centre for processing and storing data. The networking aspects of IaaS outside the data centre are a limiting factor that prevents sensitive communication services from adopting this technology. Cloud networking is a new technology that integrates network provisioning with existing cloud service models, thereby completing the cloud computing picture by addressing the networking aspect. In cloud networking, shared network resources are virtualized and provisioned to customers and end users on demand in a flexible manner. This technology allows various kinds of possibilities, e.g. reducing latency and network load. Further, it gives service providers a way to offer network performance guarantees as part of their service offering. 
However, this new approach introduces new security challenges, for example VM migration over public networks. Many of these security challenges are addressed in the CloNe security architecture. This report presents a set of potential techniques for securing different resources in a cloud network environment that are not addressed in the existing CloNe security architecture. The report begins with a holistic view of cloud networking as described in the Scalable and Adaptive Internet Solutions (SAIL) project, together with its proposed architecture and security goals. This is followed by an overview of the problems that must be solved and some of the different methods that can be applied to solve parts of the overall problem. In particular, it covers a comprehensive, tightly integrated, multi-level security architecture, a key management algorithm that supports the access control mechanism, and an intrusion detection mechanism. For each method or set of methods, the respective state of the art is presented. In addition, experiments to understand the performance of these mechanisms have been evaluated on a simple cloud network test bed. The proposed key management scheme uses a hierarchical key management approach that provides fast and secure key updates when members join and leave. Experimental results show that the proposed key management scheme increases security as well as availability and integrity. A newly proposed genetic-algorithm-based feature selection technique has been used for effective feature selection. Fuzzy SVM has been used on the data set for effective classification. Experiments have shown that the proposed genetic feature selection algorithm reduces the number of features and thereby the classification time, while improving the detection accuracy of the fuzzy SVM classifier by minimizing the conflicting rules that may confuse the classifier. 
The main advantages of this intrusion detection system are the reduction in false positives and increased security. Cloud Networks Key management mobile agent telco cloud open flow Intrusion Detection System (IDS) Genetic Algorithm (GA) Fuzzy Support Vector Machine (FSVM) tenfold cross validation Communication Systems Kommunikationssystem
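Entry 336 mentions a hierarchical key management approach with fast key updates on member join and leave, without giving details. One common way such hierarchies are organised is a binary key tree in which each member holds the keys on the path from its leaf to the root. The Python sketch below assumes that layout (it is not taken from the thesis) and lists which shared keys would need refreshing when one member joins or leaves; the heap-style node numbering and the function name are illustrative assumptions.

```python
def keys_to_refresh(member, n_members):
    """Internal key nodes on the path from a member's leaf up to the root.

    Heap-style numbering (an assumption of this sketch): the root is node 1,
    internal key nodes are 1..n_members-1, and member i sits at leaf
    n_members + i.
    """
    if not 0 <= member < n_members:
        raise ValueError("member index out of range")
    node = n_members + member
    path = []
    while node > 1:
        node //= 2          # move to the parent key node
        path.append(node)
    return path


# With 8 members, a change affecting member 5 touches three shared keys.
print(keys_to_refresh(5, 8))
```

Under this assumed layout the number of keys to refresh grows with the logarithm of the group size rather than with the group size itself, which is the usual argument for why a hierarchy makes key updates fast.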
337	Low Cost Open Source Modal Virtual Environment Interfaces Using Full Body Motion Tracking and Hand Gesture Recognition Marangoni, Matthew J. 25 May 2013 (has links) No description available. Computer Science Computer Engineering HCI Kinect accelerometer VR OSS gestural interface gesture recognition SVM support vector machine modal interface body tracking virtual environment interaction
338	Using Data Science and Predictive Analytics to Understand 4-Year University Student Churn Whitlock, Joshua Lee 01 May 2018 (has links) (PDF) The purpose of this study was to discover factors about first-time freshmen that began at one of the six 4-year universities in the former Tennessee Board of Regents (TBR) system, transferred to any other institution after their first year, and graduated with a degree or certificate. These factors would be used with predictive models to identify these students prior to their initial departure. Thirty-four variables about students and the institutions that they attended and graduated from were used to perform principal component analysis to examine the factors involved in their decisions. A subset of 18 variables about these students in their first semester were used to perform principal component analysis and produce a set of 4 factors that were used in 5 predictive models. The 4 factors of students who transferred and graduated elsewhere were “Institutional Characteristics,” “Institution’s Focus on Academics,” “Student Aptitude,” and “Student Community.” These 4 factors were combined with the additional demographic variables of gender, race, residency, and initial institution to form a final dataset used in predictive modeling. The predictive models used were a logistic regression, decision tree, random forest, artificial neural network, and support vector machine. All models had predictive power beyond that of random chance. The logistic regression and support vector machine models had the most predictive power, followed by the artificial neural network, random forest, and decision tree models respectively. student retention transfer students data mining support vector machine artificial neural network logistic regression Educational Leadership Higher Education Administration
339	Spatio-temporal Traffic Flow Prediction Gebresilassie, Mesele Atsbeha January 2017 (has links) Advances in computational intelligence and computational power and the explosion of traffic data continue to drive the development and use of Intelligent Transport Systems and smart mobility applications. As one of the fundamental components of Intelligent Transport Systems, traffic flow prediction research has been advancing from classical statistical and time-series based techniques to data-driven methods mainly employing data mining and machine learning algorithms. However, a significant number of traffic flow prediction studies have overlooked the impact of road network topology on traffic flow. Thus, the main objective of this research is to show that traffic flow prediction problems are affected not only by temporal trends of flow history but also by road network topology, by developing prediction methods in the spatio-temporal domain. In this study, time-series operators and data mining techniques are used by defining five partially overlapping relative temporal offsets to capture temporal trends in sequences of non-overlapping history windows defined on a stream of historical traffic flow records. To develop prediction models, two sets of modeling approaches based on Linear Regression and Support Vector Machine for Regression are proposed. In the modeling process, an orthogonal linear transformation of input data using Principal Component Analysis is employed to avoid potential problems of multicollinearity and the curse of dimensionality. Moreover, to incorporate the impact of road network topology on the traffic flow of individual road segments, a distance decay function based on shortest-path network distance is used to compute the weights of neighboring road segments, following the principle of the First Law of Geography. 
Accordingly, (a) Linear Regression on Individual Sensors (LR-IS), (b) Joint Linear Regression on Set of Sensors (JLR), (c) Joint Linear Regression on Set of Sensors with PCA (JLR-PCA) and (d) Spatially Weighted Regression on Set of Sensors (SWR) models are proposed. To achieve robust non-linear learning, Support Vector Machine for Regression (SVMR) based models are also proposed. Thus, (a) SVMR for Individual Sensors (SVMR-IS), (b) Joint SVMR for Set of Sensors (JSVMR), (c) Joint SVMR for Set of Sensors with PCA (JSVMR-PCA) and (d) Spatially Weighted SVMR (SWSVMR) models are proposed. All the models are evaluated using the data sets from the 2010 IEEE ICDM international contest, acquired from the Traffic Simulation Framework (TSF) developed on the basis of the Nagel-Schreckenberg model. Taking the competition's best solutions as a benchmark (although different sets of validation data might have been used), and based on k-fold cross validation, all the proposed models in this study except SVMR-IS provide higher prediction accuracy in terms of RMSE. The models that incorporated all neighboring sensors' data into the learning process indicate the existence of potential interdependence among interconnected road segments. The spatially weighted SVMR model (SWSVMR) revealed that road network topology has a clear impact on traffic flow, shown by the varying and improved prediction accuracy of road segments that have more neighbors in close proximity. However, the linear regression based models have shown a slightly low coefficient of determination, which points to the use of non-linear learning methods. The results of this study also imply that the approaches adopted for feature construction are effective and that the spatial weighting scheme designed is realistic. Hence, road network topology is an intrinsic characteristic of traffic flow, and prediction models should take it into consideration. 
ITS principal component analysis spatio-temporal traffic flow spatially weighted regression traffic flow prediction support vector machine for regression Engineering and Technology Teknik och teknologier
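The spatial weighting step in entry 339 (weights for neighboring road segments from a distance decay function over shortest-path network distance) can be sketched in Python as follows. The abstract does not state the form of the decay function, so the exponential form, the `bandwidth` parameter, and the toy road graph are all assumptions made for illustration.

```python
import heapq
import math


def shortest_path_distances(graph, source):
    """Dijkstra: network distance from `source` to every reachable segment."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbor, math.inf):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def decay_weights(graph, target, bandwidth=1.0):
    """Normalised weights exp(-d / bandwidth) for the neighbors of `target`.

    The exponential form and `bandwidth` are assumptions of this sketch.
    """
    dist = shortest_path_distances(graph, target)
    raw = {seg: math.exp(-d / bandwidth) for seg, d in dist.items() if seg != target}
    total = sum(raw.values())
    return {seg: w / total for seg, w in raw.items()} if total else {}


# Hypothetical road network: segment -> {neighboring segment: link length}.
roads = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 4.0, "B": 1.0},
}
weights = decay_weights(roads, "A")
print(weights)
```

Nearer segments receive larger weights, in line with the First Law of Geography cited in the abstract; the weights could then scale each neighbor's flow history before it enters a regression model.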
340	Injury Prediction in Elite Ice Hockey using Machine Learning / Riskanalys och Prediktion av Skador i Elitishockey med Maskininlärning Staberg, Pontus, Häglund, Emil, Claesson, Jakob January 2018 (has links) Sports clubs are always searching for innovative ways to improve performance and obtain a competitive edge. Sports analytics today is focused primarily on evaluating metrics thought to be directly tied to performance. Injuries indirectly decrease performance and cost substantially in terms of wasted salaries. Existing sports injury research mainly focuses on correlating one specific feature at a time to the risk of injury. This paper provides a multidimensional approach to non-contact injury prediction in Swedish professional ice hockey by applying machine learning to historical data. Several features are correlated simultaneously to injury probability. The project's aim is to create an injury-predicting algorithm which ranks the different features based on how they affect the risk of injury. The paper also discusses the business potential and strategy of a start-up aiming to provide a solution for predicting injury risk through statistical analysis. / Sports clubs are constantly looking for innovative ways to improve performance and gain competitive advantages. Today, data analysis in sports focuses mainly on evaluating metrics believed to be directly correlated with performance. Injuries indirectly lower performance and cost significantly in wasted player salaries. Earlier studies of sports injuries mainly focus on correlating one metric at a time with injury. This report gives a multidimensional approach to predicting injuries in Swedish elite ice hockey by applying machine learning to historical data. Several attributes are correlated simultaneously to obtain an injury probability. 
The aim of this report is to create an algorithm that predicts injuries and also ranks different attributes based on how they affect injury risk. The report also discusses the business opportunities for such a solution and how a potential start-up should position itself in the market. Sports analytics computer science machine learning ice hockey non-contact injuries predictive analytics support vector machine random forest SHL Computer and Information Sciences Data- och informationsvetenskap
