191 |
Inomhuspositionering med bredbandig radio
Gustavsson, Oscar; Miksits, Adam. January 2019 (has links)
This report evaluates whether a higher-dimensional fingerprint vector increases the accuracy of an indoor localisation algorithm. Many solutions use a Received Signal Strength Indicator (RSSI) to estimate a position. It was studied whether using the Channel State Information (CSI), i.e. the channel’s frequency response, is beneficial for accuracy. The localisation algorithm estimates the position of a new measurement by comparing it to previous measurements using k-Nearest Neighbour (k-NN) regression. The mean power was used as RSSI and 100 samples of the frequency response as CSI. Reduction of the dimension of the CSI vector with statistical moments and Principal Component Analysis (PCA) was also tested. No improvement in accuracy could be observed from using a fingerprint vector of higher dimension than RSSI. A standardised Euclidean or Mahalanobis distance measure in the k-NN algorithm seemed to perform better than Euclidean distance, and taking the logarithm of the frequency response samples before doing any calculations also seemed to improve accuracy. / I denna rapport utvärderas huruvida data av högre dimension ökar noggrannheten hos en algoritm för inomhuspositionering. Många lösningar använder en indikator för mottagen signalstyrka (RSSI) för att skatta en position. Det studerades om användningen av kanalens fysikaliska tillstånd (CSI), det vill säga kanalens frekvenssvar, är fördelaktig för noggrannheten. Positioneringsalgoritmen skattar positionen för en ny mätning genom att jämföra den med tidigare mätningar med k-Nearest Neighbour (k-NN)-regression. Medeleffekten användes som RSSI och 100 sampel av frekvenssvaret som CSI. Reducering av CSI-vektorns dimension med statistiska moment och Principalkomponentanalys (PCA) testades. En förbättring av noggrannheten kunde inte observeras genom att använda data med högre dimension än RSSI. Ett standardiserat Euklidiskt avståndsmått eller Mahalanobisavstånd i k-NN-algoritmen verkade prestera bättre än Euklidiskt avstånd. Att ta logaritmen av frekvenssvarets sampel innan andra beräkningar gjordes verkade också förbättra noggrannheten.
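To make the fingerprinting pipeline concrete, the following is a minimal sketch in Python with invented data; it is not the thesis code. It shows log-transformed CSI fingerprints, an optional PCA reduction, and k-NN regression with a standardised Euclidean distance.

```python
# Minimal sketch of the k-NN fingerprinting idea (illustrative only, not the authors' code).
# Assumed setup: each fingerprint is a vector of log frequency-response samples (CSI),
# and each reference fingerprint has a known 2-D position.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
ref_csi = rng.rayleigh(size=(200, 100))        # 200 reference points, 100 frequency samples
ref_pos = rng.uniform(0, 10, size=(200, 2))    # known (x, y) positions of the references
new_csi = rng.rayleigh(size=(5, 100))          # new measurements to localise

# Take the logarithm of the frequency-response samples before any other processing.
ref_feat, new_feat = np.log(ref_csi), np.log(new_csi)

# Optional reduction of the CSI dimension with PCA (the report also tried statistical moments).
pca = PCA(n_components=5).fit(ref_feat)
ref_feat, new_feat = pca.transform(ref_feat), pca.transform(new_feat)

# Standardised Euclidean distance; 'mahalanobis' with VI=inv(cov) is the other variant tested.
dists = cdist(new_feat, ref_feat, metric="seuclidean", V=ref_feat.var(axis=0))

# k-NN regression: average the positions of the k closest fingerprints.
k = 3
nearest = np.argsort(dists, axis=1)[:, :k]
estimated_pos = ref_pos[nearest].mean(axis=1)
print(estimated_pos)
```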
|
192 |
Prediktion av efterfrågan i filmbranschen baserat på maskininlärning
Liu, Julia; Lindahl, Linnéa. January 2018 (has links)
Machine learning is a central technology in data-driven decision making. In this study, machine learning is investigated in the context of demand forecasting in the motion picture industry, from the film exhibitors’ perspective. More specifically, it is investigated to what extent the technology can assist in estimating public interest, in terms of revenue levels, for unreleased movies. Three machine learning models are implemented with the aim of forecasting cumulative revenue levels during the opening weekend of movies released in Sweden between 2010 and 2017. The forecast is based on ten attributes, ranging from public online user-generated data to specific movie characteristics such as production budget and cast. The results indicate that the choice of attributes and models in this study was not optimal for the Swedish market, as the values obtained for the relevant precision metrics were inadequate, albeit with valid underlying reasons. / Maskininlärning är en central teknik i datadrivet beslutsfattande. I den här rapporten utreds maskininlärning i sammanhanget av efterfrågeprediktion i filmbranschen från biografers perspektiv. Närmare bestämt undersöks det i vilken utsträckning tekniken kan bistå uppskattning av publikintresse i termer av intäkter vad gäller osläppta filmer hos biografer. Tre maskininlärningsmodeller implementeras i syfte att göra en prognos på kumulativa intäktsnivåer under premiärhelgen för filmer vilka hade premiär 2010-2017 i Sverige. Prognostiseringen baseras på varierande attribut som sträcker sig från publik användargenererad data på nätet till filmspecifika variabler så som produktionsbudget och uppsättning av skådespelare. De erhållna resultaten visar att valen av attribut och modeller inte var optimala på den svenska marknaden då erhållna precisionsmått från modellerna antog låga värden, med relevanta underliggande skäl.
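A hedged sketch of the kind of forecasting pipeline such a study implies is shown below; the attributes, the model choice and the data are placeholders and do not represent the thesis setup.

```python
# Hedged sketch of a demand-forecasting pipeline of the kind described above.
# The attributes, model choice and data below are placeholders, not the thesis setup.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_movies = 400
# Example attributes: production budget, cast popularity score, online-buzz volume, ...
X = rng.uniform(size=(n_movies, 10))
# Synthetic opening-weekend revenue with some dependence on the attributes plus noise.
y = 1e6 * (2 * X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n_movies)).clip(min=0.1)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```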
|
193 |
Identifying the beginning of a kayak race using velocity signal data
Kvedaraite, Indre. January 2023 (has links)
A kayak is a small watercraft that moves over the water, propelled by a person sitting inside the hull and paddling with a double-bladed paddle. While kayaking can be casual, it is also a competitive sport, featured in races and even the Olympic Games. It is therefore important to be able to analyse athletes’ performance during a race. To study races better, some kayaking teams and organizations have attached sensors to their kayaks. These sensors record various data, which is later used to generate performance reports. To generate such reports, however, the coach must manually pinpoint the beginning of the race, because the sensors collect data before the actual race begins, which may include practice runs, warm-up sessions, or simply standing and waiting. Identifying the race start and the race sequence in the data is tedious and time-consuming work that could be automated. This project proposes an approach to identify kayak races from velocity signal data with the help of a machine learning algorithm. The proposed approach combines several techniques: signal preprocessing, a machine learning algorithm, and a programmatic approach. Three machine learning algorithms were evaluated for detecting the race sequence: Support Vector Machine (SVM), k-Nearest Neighbour (kNN), and Random Forest (RF). SVM outperformed the other algorithms with an accuracy of 95%. A programmatic approach was proposed to identify the start time of the race, with an average error of 0.24 seconds. The proposed approach was used in an implemented web-based application with a user interface that lets coaches automatically detect the beginning of a kayak race and the race signal sequence.
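The two-stage idea, an SVM that labels windows of the velocity signal followed by a programmatic rule that places the race start, could look roughly like the sketch below (synthetic data and simplified features; not the thesis implementation).

```python
# Illustrative sketch of the two-stage approach described above (not the thesis code):
# an SVM flags windows of the velocity signal as "race" or "not race", and a simple
# programmatic rule then places the race start at the first sustained run of race windows.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

def windows(signal, size=50, step=25):
    idx = range(0, len(signal) - size, step)
    return np.array([signal[i:i + size] for i in idx]), np.array([i for i in idx])

def features(win):
    # Simple per-window features; the real preprocessing is more elaborate.
    return np.column_stack([win.mean(axis=1), win.std(axis=1), win.max(axis=1)])

# Synthetic training data: idle paddling (low velocity) vs racing (high, steady velocity).
idle = rng.normal(1.0, 0.3, size=(100, 50))
race = rng.normal(4.5, 0.4, size=(100, 50))
X = features(np.vstack([idle, race]))
y = np.array([0] * 100 + [1] * 100)
clf = SVC(kernel="rbf").fit(X, y)

# A recorded session: warm-up followed by the race, sampled at an assumed 10 Hz.
session = np.concatenate([rng.normal(1.0, 0.3, 3000), rng.normal(4.5, 0.4, 2000)])
win, starts = windows(session)
pred = clf.predict(features(win))

# Programmatic start detection: first window that begins a sustained block of race labels.
sustained = next(i for i in range(len(pred) - 3) if pred[i:i + 4].all())
print("estimated race start at sample", starts[sustained], "=", starts[sustained] / 10, "s")
```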
|
194 |
Data mining inom tillverkningsindustrin : En fallstudie om möjligheten att förutspå kvalitetsutfall i produktionslinjer
Janson, Lisa; Mathisson, Minna. January 2021 (has links)
I detta arbete har en fallstudie utförts på Volvo Group i Köping. I takt med övergången till industri 4.0 ökar möjligheterna att använda maskininlärning som ett verktyg i analysen av industriell data och vidareutvecklingen av industriproduktionen. Detta arbete syftar till att undersöka möjligheten att förutspå kvalitetsutfall vid sammanpressning av nav och huvudaxel. Metoden innefattar implementering av tre maskininlärningsmodeller samt evaluering av deras prestation i förhållande till varandra. Vid applicering av modellerna på monteringsdata från fabriken erhölls ett bristfälligt resultat, vilket indikerar att det utifrån de inkluderade variablerna inte är möjligt att förutspå kvalitetsutfallet. Orsakerna till resultatet granskades, och slutsatsen blev att det förmodligen berodde på att modellerna var oförmögna att finna samband i datan eller att det inte fanns något samband i datasetet. För att avgöra vilken av dessa två faktorer som var avgörande skapades ett fabricerat dataset där tre nya variabler introducerades. De fabricerade värdena på dessa variabler skapades på ett sådant sätt att det fanns syntetisk kausalitet mellan två av variablerna och kvalitetsutfallet. Vid applicering av modellerna på den fabricerade datan lyckades samtliga modeller identifiera det syntetiska sambandet. Utifrån det drogs slutsatsen att det bristfälliga resultatet inte berodde på modellernas prestation, utan på att det inte fanns något samband i datasetet bestående av verklig monteringsdata. Det här bidrog till bedömningen att om spårbarheten på komponenterna ökar i framtiden, i kombination med att fler maskiner i produktionslinjen genererar data till ett sammankopplat system, skulle denna studie kunna utföras igen, men med fler variabler och ett större dataset. Support vector machine var den modell som presterade bäst, givet de prestationsmått som användes i denna studie. Det faktum att modellerna som inkluderats i den här studien lyckades identifiera sambandet i datan, när det fanns vetskap om att sambandet existerade, motiverar användandet av dessa modeller i framtida studier. Avslutningsvis kan det konstateras att med förbättrad spårbarhet och en alltmer uppkopplad fabrik finns det möjlighet att använda maskininlärningsmodeller som komponenter i större system för att kunna uppnå effektiviseringar. / As the adoption of Industry 4.0 proceeds, the possibilities of using machine learning as a tool for further development of industrial production become increasingly profound. In this paper, a case study has been conducted at Volvo Group in Köping in order to investigate the possibility of predicting quality outcomes in the compression of hub and mainshaft. In this study, three different machine learning models were implemented and compared with each other. A dataset containing data from Volvo’s production site in Köping was utilized when training and evaluating the models. However, the low evaluation scores acquired from this indicate that the quality outcome of the compression could not be predicted given solely the variables included in that dataset. Therefore, a dataset containing three additional variables, consisting of fabricated values and a known causality between two of the variables and the quality outcome, was also utilized. The purpose of this was to investigate whether the poor evaluation metrics resulted from a non-existent pattern between the included variables and the quality outcome, or from the models not being able to find the pattern.
The performance of the models, when trained and evaluated on the fabricated dataset, indicates that the models were in fact able to find the pattern that was known to exist. Support vector machine was the model that performed best, given the evaluation metrics chosen in this study. Consequently, if the traceability of the components were to be enhanced in the future, and an additional number of machines in the production line were to transmit production data to a connected system, it would be possible to conduct the study again with additional variables and a larger dataset. The fact that the models included in this study succeeded in finding patterns in the dataset when such patterns were known to exist motivates the use of the same models in future studies. Furthermore, it can be concluded that with enhanced traceability of the components and a larger number of machines transmitting production data to a connected system, machine learning models could be utilized as components in larger business monitoring systems in order to achieve efficiencies.
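The fabricated-data check described above can be illustrated with a small sketch; the variable names and the synthetic causal relation below are invented for illustration and do not correspond to the real assembly data.

```python
# Hedged sketch of the fabricated-data check described above (variable names and the
# synthetic relation are invented for illustration; the real assembly data differs).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
n = 2000
press_force = rng.normal(50, 5, n)      # process variables with no injected signal
press_time = rng.normal(2.0, 0.2, n)
# Three fabricated variables, two of which are given a known synthetic causal link
# to the quality outcome; the third is pure noise.
hub_diameter = rng.normal(30, 0.05, n)
shaft_diameter = rng.normal(30, 0.05, n)
noise_var = rng.normal(0, 1, n)
quality_ok = (hub_diameter - shaft_diameter > 0).astype(int)   # synthetic causality

X = np.column_stack([press_force, press_time, hub_diameter, shaft_diameter, noise_var])
model = make_pipeline(StandardScaler(), SVC())
print("cross-validated accuracy:", cross_val_score(model, X, quality_ok, cv=5).mean())
# High accuracy here shows the model can recover a relation when one truly exists, so a
# poor score on the real data points to missing signal rather than model failure.
```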
|
195 |
Efficient Algorithms for Data Mining with Federated Databases
Young, Barrington R. St. A. 03 July 2007
No description available.
|
196 |
A Parallel Algorithm for Query Adaptive, Locality Sensitive Hash Search
Carraher, Lee A. 17 September 2012
No description available.
|
197 |
Statistics of Quantum Energy Levels of Integrable Systems and a Stochastic Network Model with Applications to Natural and Social Sciences
Ma, Tao. 18 October 2013
No description available.
|
198 |
Predicting basketball performance based on draft pick : A classification analysis
Harmén, Fredrik. January 2022 (has links)
In this thesis, we look to predict the performance of a basketball player coming into the NBA, depending on where the player was picked in the NBA draft. This is done by testing different machine learning models on data from the previous 35 NBA drafts and comparing the models to see which has the highest classification accuracy. The machine learning methods used are Linear Discriminant Analysis, k-Nearest Neighbors, Support Vector Machines, and Random Forests. The results show that the method with the highest classification accuracy was Random Forests, with an accuracy of 42%.
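A sketch of this kind of model comparison is given below, using synthetic stand-in features; the real study uses draft and performance data from 35 NBA drafts, and the class definitions here are hypothetical.

```python
# Sketch of the model comparison described above, on synthetic stand-in data
# (the real features come from 35 years of NBA draft and performance statistics).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(4)
n = 600
draft_pick = rng.integers(1, 61, n)                      # pick number 1-60
college_stats = rng.normal(size=(n, 3))                  # placeholder pre-draft features
X = np.column_stack([draft_pick, college_stats])
# Placeholder performance class (e.g. bust / role player / starter), loosely tied to pick.
y = np.digitize(-draft_pick + 10 * rng.normal(size=n), bins=[-40, -15])

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "kNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))
```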
|
199 |
Investigating the performance of matrix factorization techniques applied on purchase data for recommendation purposes
Holländer, John. January 2015 (has links)
Automated systems for producing product recommendations to users are a relatively new area within the field of machine learning. Matrix factorization techniques have been studied to a large extent on data consisting of explicit feedback such as ratings, but to a lesser extent on implicit feedback data consisting of, for example, purchases. The aim of this study is to investigate how well matrix factorization techniques perform compared to other techniques when used for producing recommendations based on purchase data. We conducted experiments on data from an online bookstore as well as an online fashion store, running algorithms on the data and using evaluation metrics to compare the results. We present results showing that for many types of implicit feedback data, matrix factorization techniques are inferior to various neighborhood- and association-rules techniques for producing product recommendations. We also present a variant of a user-based neighborhood recommender system algorithm (UserNN), which in all our tests outperformed both the matrix factorization algorithms and the k-nearest neighbors algorithm in both accuracy and speed. Depending on the dataset used, the UserNN achieved a precision approximately 2-22 percentage points higher than those of the matrix factorization algorithms, and 2 percentage points higher than the k-nearest neighbors algorithm. The UserNN also outperformed the other algorithms in speed, with running times 3.5-5 times lower than those of the k-nearest neighbors algorithm, and several orders of magnitude lower than those of the matrix factorization algorithms.
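As a hedged illustration of the neighbourhood approach that performed well here, the sketch below implements a generic user-based k-nearest-users recommender on a binary purchase matrix; it is not the thesis' UserNN algorithm, whose exact similarity and weighting scheme are not given in the abstract.

```python
# Hedged sketch of a user-based neighbourhood recommender on implicit purchase data.
# This is a generic k-nearest-users variant, not the UserNN algorithm from the thesis.
import numpy as np

rng = np.random.default_rng(5)
purchases = (rng.random((50, 200)) < 0.05).astype(float)   # users x items, 1 = purchased

def recommend(user, k=10, n_items=5):
    # Cosine similarity between the target user and all other users.
    norms = np.linalg.norm(purchases, axis=1) * np.linalg.norm(purchases[user]) + 1e-12
    sims = purchases @ purchases[user] / norms
    sims[user] = -np.inf                                    # exclude the user themself
    neighbours = np.argsort(sims)[-k:]
    # Score items by how often the nearest neighbours bought them, weighted by similarity.
    scores = sims[neighbours] @ purchases[neighbours]
    scores[purchases[user] > 0] = -np.inf                   # do not re-recommend owned items
    return np.argsort(scores)[-n_items:][::-1]

print("items recommended for user 0:", recommend(0))
```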
|
200 |
IMPROVING THE UTILIZATION AND PERFORMANCE OF SPECIALIZED GPU CORES
Aaron M Barnes (20767127). 26 February 2025 (has links)
Specialized hardware accelerators are becoming increasingly common to provide application performance gain despite the slowing trend of transistor scaling. Accelerators can adapt to the compute and data dependency patterns of an application to fully exploit the parallelism of the application and reduce data movement. However, specialized hardware is often limited by the application it was tailored to, which can lead to idle or inactive silicon in computations that do not match the exact patterns it was designed for. In this work I study two cases of GPU specialization and techniques that can be used to improve performance in a broader domain of applications.

First, I examine the effects of GPU core partitioning, a trend in contemporary GPUs to sub-divide core components to reduce area and energy overheads. Core partitioning is essentially a specialization of the hardware towards balanced applications, wherein the intra-core connectivity provides minimal benefit but takes up valuable on-chip area. I identify four orthogonal performance effects of GPU core sub-division, two of which have significant impact in practice: a bottleneck in the read operand stage caused by the reduced number of collector units and register banks allocated to each sub-core, and an instruction issue imbalance across sub-core schedulers caused by a simple round robin assignment of threads to sub-cores. To alleviate these issues I propose a Register Bank Aware (RBA) warp scheduler, which uses feedback from current register bank contention to inform thread scheduling decisions, and a hashed sub-core work scheduler to prevent pathological issue imbalances caused by round robin scheduling. I rigorously evaluate these designs in simulation and show they are able to capture 81% of the performance lost due to core subdivision. Further, I evaluate my techniques using synthesis tools and find that RBA is able to achieve performance equivalent to doubling the number of operand Collector Units (CUs) per sub-core with only a 1% increase in area and power.

Second, I study the inclusion of specialized ray tracing accelerator cores on GPUs. Specialized ray-tracing acceleration units have become a common feature in GPU hardware, enabling real-time ray-tracing of complex scenes for the first time. The ray-tracing unit accelerates the traversal of a hierarchical tree data structure called a bounding volume hierarchy to determine whether rays have intersected triangle primitives. Hierarchical search algorithms are a fundamental software pattern common in many important domains, such as recommendation systems and point cloud registration, but are difficult for GPUs to accelerate because they are characterized by extensive branching and recursion. The ray-tracing unit overcomes these limitations with specialized hardware to traverse hierarchical data structures efficiently, but is mired by a highly specialized graphics API, which is not readily adaptable to general-purpose computation. In this work I present the Hierarchical Search Unit (HSU), a flexible datapath to accelerate a more general class of hierarchical search algorithms, of which ray-tracing is one. I synthesize a baseline ray-intersection datapath and maximize functional unit reuse while extending the ray-tracing unit to support additional computations and a more general set of instructions. I demonstrate that the unit can improve the performance of three hierarchical search data structures in approximate nearest neighbors search algorithms and a B-tree key-value store index. For a minimal extension to the existing unit, HSU improves the state-of-the-art GPU approximate nearest neighbor implementation by an average of 24.8% using the GPU's general computing interface.
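The issue-imbalance argument behind the hashed sub-core work scheduler can be illustrated with a toy calculation (a sketch only; neither the simulator nor the actual hash function from the thesis): when the heavy warps recur with the same stride as the number of sub-cores, round-robin piles them onto one scheduler, while a simple hash spreads them out.

```python
# Toy numeric illustration (not the thesis' simulator or its actual hash) of how a hashed
# warp-to-sub-core assignment avoids the pathological issue imbalance that round-robin
# produces when the heavy warps recur with the same stride as the number of sub-cores.
import numpy as np

NUM_SUBCORES = 4
warp_ids = np.arange(64)
# Pathological workload: every 4th warp issues ten times as many instructions.
issue_count = np.where(warp_ids % NUM_SUBCORES == 0, 1000, 100)

def max_over_mean(subcore_of):
    load = np.bincount(subcore_of, weights=issue_count, minlength=NUM_SUBCORES)
    return load.max() / load.mean()

round_robin = warp_ids % NUM_SUBCORES                  # all heavy warps land on sub-core 0
hashed = (warp_ids ^ (warp_ids >> 2)) % NUM_SUBCORES   # simple XOR-fold spreads them out

print("round-robin imbalance (max/mean):", max_over_mean(round_robin))
print("hashed assignment imbalance (max/mean):", max_over_mean(hashed))
```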
|