41 |
Optimalizace trasy při revizích elektrospotřebičů / Route optimalization of inspectory technician
Rusín, Michal. January 2008
The objective of this thesis is to optimize the route of an inspection technician. The travelling salesman problem, the vehicle routing problem, and its modifications are described. The problem was solved using three heuristics: the nearest neighbour algorithm, the savings method, and the insertion method.
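The nearest neighbour algorithm named in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the stop coordinates are hypothetical, straight-line (Euclidean) distance is assumed in place of real travel distances, and the function names are chosen only for this example.

```python
import math


def nearest_neighbour_route(points, start=0):
    """Build a tour by always moving to the closest unvisited point."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    route = [start]
    while unvisited:
        last = points[route[-1]]
        # Ties are broken by the lower index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        unvisited.remove(nxt)
        route.append(nxt)
    return route


def route_length(points, route):
    """Length of the closed tour, including the return leg to the start."""
    return sum(
        math.dist(points[route[i]], points[route[(i + 1) % len(route)]])
        for i in range(len(route))
    )


# Hypothetical stops; index 0 stands for the technician's base.
stops = [(0, 0), (5, 0), (1, 0), (1, 3), (5, 3)]
route = nearest_neighbour_route(stops)
print(route, route_length(stops, route))
```

The heuristic is greedy: it is fast and simple, but the tour it returns is not guaranteed to be the shortest one, which is presumably why the thesis compares it with the savings and insertion methods.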
|
42 |
Detektion och klassificering av äppelmognad i hyperspektrala bilder / Detection And Classification Of Apple Ripening In Hyperspectral Images
Andersson, Fanny; Furugård, Anna. January 2021
This work presents a non-destructive method for detecting and classifying the ripeness of apples using hyperspectral images. Determining the ripeness of apples is of interest to, among others, apple growers and juice producers for storage and processing. Apple ripeness is also of interest in plant breeding. Determining ripeness today requires cutting into the fruit, a so-called destructive method. Hyperspectral images are currently used in areas such as agriculture, environmental monitoring, and military reconnaissance. / The thesis work was carried out at the Department of Science and Technology (ITN), Faculty of Science and Engineering, Linköping University.
|
43 |
Sledování pohybu objektů v obrazovém signálu / Tracking the movement of objects in the video signal
Šidó, Balázs. January 2017
This master's thesis focuses on tracking the movement of multiple objects. It describes two filter implementations that are essentially based on the principle of the Kalman filter. Both implementations track multiple objects based on knowledge of the positions of all objects in each frame. The first implementation is a mixed version of the Global and Standard nearest neighbour filters. The second implementation is built on a probabilistic approach to the association process. The final chapter provides a comparison between these filters and the Basic filter. The algorithms were implemented in Java.
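The difference between standard and global nearest neighbour association, which the abstract above combines, can be illustrated with a small sketch. It is written in Python rather than the Java used in the thesis, the positions are hypothetical, and the global variant uses brute-force search over assignments purely for readability; it assumes there are at least as many measurements as tracks.

```python
import math
from itertools import permutations


def standard_nn(predicted, measurements):
    """Each track independently takes its closest measurement.

    Two tracks may end up claiming the same measurement.
    """
    return [
        min(range(len(measurements)), key=lambda j: math.dist(p, measurements[j]))
        for p in predicted
    ]


def global_nn(predicted, measurements):
    """One-to-one assignment with the smallest total distance (brute force)."""
    best = min(
        permutations(range(len(measurements)), len(predicted)),
        key=lambda perm: sum(
            math.dist(p, measurements[j]) for p, j in zip(predicted, perm)
        ),
    )
    return list(best)


# Hypothetical predicted track positions (e.g. from a Kalman prediction step)
# and the measurements detected in the current frame.
predicted = [(0.0, 0.0), (1.0, 0.0)]
measurements = [(0.9, 0.0), (2.5, 0.0)]

print(standard_nn(predicted, measurements))
print(global_nn(predicted, measurements))
```

In this example both tracks are closest to the first measurement, so the standard rule assigns it twice, whereas the global rule resolves the conflict by minimising the summed distance. A practical implementation would replace the brute-force search with an assignment algorithm, since the number of permutations grows factorially.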
|
44 |
Credit Scoring using Machine Learning Approaches
Chitambira, Bornvalue. January 2022
This project explores machine learning approaches that are used in credit scoring. In this study we consider consumer credit scoring rather than corporate credit scoring. Our focus is on methods currently used in practice by banks, such as logistic regression and decision trees, and we compare their performance against machine learning approaches such as support vector machines (SVM), neural networks and random forests. In our models we address important issues such as dataset imbalance, model overfitting and calibration of model probabilities. The six machine learning methods we study are support vector machines, logistic regression, k-nearest neighbour, artificial neural networks, decision trees and random forests. We implement these models in Python and analyse their performance on a credit dataset with 30,000 observations from Taiwan, extracted from the University of California Irvine (UCI) machine learning repository.
|
45 |
Forecasting hourly electricity consumption for sets of households using machine learning algorithms
Linton, Thomas. January 2015
To address inefficiency, waste, and the negative consequences of electricity generation, companies and government entities are looking to behavioural change among residential consumers. To drive behavioural change, consumers need better feedback about their electricity consumption. A monthly or quarterly bill provides the consumer with almost no useful information about the relationship between their behaviours and their electricity consumption. Smart meters are now widely deployed in developed countries and are capable of providing electricity consumption readings at an hourly resolution, but this data is mostly used as a basis for billing and not as a tool to assist the consumer in reducing their consumption. One component required to deliver innovative feedback mechanisms is the capability to forecast hourly electricity consumption at the household scale. The work presented in this thesis is an evaluation of the effectiveness of a selection of kernel-based machine learning methods at forecasting the hourly aggregate electricity consumption of differently sized sets of households. The thesis demonstrates that k-Nearest Neighbour Regression and Gaussian Process Regression are the most accurate methods within the constraints of the problem considered. In addition to accuracy, the advantages and disadvantages of each machine learning method are evaluated, and a simple comparison of each algorithm's computational performance is made. / To address inefficiency, waste, and the negative consequences of electricity production, companies and authorities want to see behavioural change among household consumers. To bring about behavioural change, consumers need better feedback on their electricity consumption. The current feedback in a monthly or quarterly bill gives the consumer almost no useful information about how their behaviours relate to their consumption. Smart meters are now found everywhere in developed countries and can provide a wealth of information about the consumption of homes, but this data is mainly used as a basis for billing and not as a tool to help consumers reduce their consumption. One component required to deliver innovative feedback mechanisms is the ability to forecast electricity consumption at the household scale. The work presented in this thesis is an evaluation of the accuracy of a selection of kernel-based machine learning methods for forecasting the aggregate consumption of differently sized sets of households. The work in this thesis shows that "k-Nearest Neighbour Regression" and "Gaussian Process Regression" are the most accurate methods within the constraints of the problem. In addition to accuracy, the advantages, disadvantages, and performance of each machine learning method are evaluated.
|
46 |
Power Studies of Multivariate Two-Sample Tests of Comparison
Siluyele, Ian John. January 2007
Masters of Science / Multivariate two-sample tests provide a means to test the match between two multivariate distributions. Although many tests exist in the literature, relatively little is known about the relative power of these procedures. The studies reported in the thesis contrast the effectiveness, in terms of power, of seven such tests in a Monte Carlo study. The relative power of the tests was investigated against location, scale, and correlation alternatives. Samples were drawn from bivariate exponential, normal and uniform populations. Results from the power studies show that no single test is the most powerful in all situations. The use of particular test statistics is recommended for specific alternatives. A supplementary non-parametric graphical procedure, such as the Depth-Depth plot, can be recommended for diagnosing possible differences between the multivariate samples if the null hypothesis is rejected. As an example of the utility of the procedures for real data, the multivariate two-sample tests were applied to photometric data of twenty galactic globular clusters. The results of the analyses support the recommendations associated with specific test statistics.
|
47 |
Evaluating Random Forest and k-Nearest Neighbour Algorithms on Real-Life Data Sets / Utvärdering av slumpmässig skog och k-närmaste granne algoritmer på verkliga datamängder
Salim, Atheer; Farahani, Milad. January 2023
Computers can be used to classify various types of data, for example to filter email messages, detect computer viruses, or detect diseases. This thesis explores two classification algorithms, random forest and k-nearest neighbour, to understand how accurately and how quickly they classify data. A literature study was conducted to identify the various prerequisites and to find suitable data sets. Five data sets (leukemia, credit card, heart failure, mushrooms and breast cancer) were gathered and classified by each algorithm. A train split and a 4-fold cross-validation were used for each data set. The Rust library SmartCore, which included numerous classification methods and tools, was used to perform the classification. The results indicated that the train split gave better classification results than 4-fold cross-validation. However, it could not be determined whether any attributes of a data set affect the classification accuracy. Random forest achieved the best classification results on the heart failure and leukemia data sets, whilst k-nearest neighbour achieved the best results on the remaining three. In general, the classification results of the two algorithms were similar. The execution time of random forest depended on the number of trees in the "forest": a greater number of trees resulted in a longer execution time. In contrast, a higher k value did not increase the execution time of k-nearest neighbour. It was also found that data sets with only binary values (0 and 1) ran much faster than data sets with arbitrary values when using random forest. A larger number of instances in a data set also led to a longer execution time for random forest, even with a small number of features. The same applied to k-nearest neighbour, but there the number of features also affected the execution time, since time is needed to compute distances between data points.
Random forest achieved the fastest execution time on the credit card and mushrooms data sets, whilst k-nearest neighbour executed faster on the remaining three data sets. The difference in execution time between the algorithms varied considerably, depending on the parameter value chosen for each algorithm. / Computers can be used to classify various types of data, e.g. to filter email messages, detect computer viruses, detect diseases, etc. This thesis explores two classification algorithms, random forest and k-nearest neighbour, to understand how accurately and how quickly they classify data. A literature study was carried out to identify the various prerequisites and to find suitable data sets. Five different data sets, leukemia, credit card, heart failure, mushrooms and breast cancer, were collected and classified by each algorithm. A train split and a 4-fold cross-validation were used for each data set. The Rust library SmartCore, which included many classification methods and tools, was used to perform the classification. The collected results showed that using the train split gave better classification results than 4-fold cross-validation. However, it could not be established whether any attributes of a data set affect classification accuracy. Random forest achieved the best classification results on the two data sets heart failure and leukemia, while k-nearest neighbour achieved the best classification results on the remaining three data sets. In general, the classification results of the two algorithms were similar. Based on the results, the execution time of random forest depended on the number of trees in the "forest", with a larger number of trees resulting in a longer execution time. By contrast, a higher k value did not increase the execution time of k-nearest neighbour. It was also found that data sets with only binary values (0 and 1) run much faster than data sets with arbitrary values when random forest is used. The number of instances in a data set also leads to a longer execution time for random forest despite a small number of features. The same held for k-nearest neighbour, but the number of features also affected the execution time, since time is needed to compute distances between data points. Random forest achieved the fastest execution time on the two data sets credit card and mushrooms, while k-nearest neighbour executed faster on the remaining three data sets. The difference in execution time between the algorithms varied considerably, depending on the parameter value chosen for each algorithm.
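The observation above that k-nearest neighbour execution time grows with the number of instances and features, but not noticeably with k, follows from how a brute-force classifier works. The sketch below is a plain Python illustration under that brute-force assumption, not the SmartCore code used in the thesis; the training points and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # One distance is computed per training instance, and each distance
    # touches every feature, so the work grows with both counts.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Changing k only changes how many of the sorted entries are counted.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data with two classes.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["edible", "edible", "edible", "poisonous", "poisonous", "poisonous"]

print(knn_predict(train_x, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_x, train_y, (5.1, 5.0), k=3))
```

Libraries may use tree-based or other index structures instead of a full scan, in which case the scaling behaviour can differ from this sketch.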
|
48 |
Bedömning av elevuppsatser genom maskininlärning / Essay Scoring for Swedish using Machine Learning
Dyremark, Johanna; Mayer, Caroline. January 2019
Grading currently takes up a large share of teachers' working time, and there is considerable inconsistency between assessments made by different teachers. This study aims to investigate what accuracy an automated assessment model can achieve. Three machine learning models for classification, Linear Discriminant Analysis, K-Nearest Neighbor and Random Forest, are trained and tested with five-fold cross-validation on essays from national tests in Swedish. The classification is based on language- and form-related attributes, including word- and character-based length measures, similarity to texts of varying degrees of formality, and grammar-related measures. This results in a maximum quadratic weighted kappa value of 0.4829 and exact agreement with expert-assigned grades in 57.53% of cases. These results were achieved by a model based on Linear Discriminant Analysis and show a higher correlation with expert-assigned grades than an ordinary teacher achieves. Despite ongoing digitalization in the school system, a number of obstacles remain before fully machine-learning-based assessment can be realized, such as users' attitudes towards the technology, ethical dilemmas, and the technology's difficulties in understanding semantics. Partially integrated automatic grading does, however, have the potential to identify essays that need double marking, which can increase agreement in large-scale tests at a low cost. / Today, a large part of a teacher's workload consists of essay scoring, and there is large variability between teachers' gradings. This report aims to examine what accuracy can be achieved with an automated essay scoring system for Swedish. The following three machine learning models for classification are trained and tested with 5-fold cross-validation on essays from Swedish national tests: Linear Discriminant Analysis, K-Nearest Neighbour and Random Forest. Essays are classified based on 31 language-structure-related attributes such as token-based length measures, similarity to texts with different levels of formality, and use of grammar. The results show a maximal quadratic weighted kappa value of 0.4829 and a grading identical to the expert's assessment in 57.53% of all tests. These results were achieved by a model based on Linear Discriminant Analysis and showed higher inter-rater reliability with expert grading than a local teacher. Despite ongoing digitalization within the Swedish educational system, a number of obstacles prevent complete automation of essay scoring, such as users' attitudes, ethical issues and the current technology's difficulties in understanding semantics. Nevertheless, a partial integration of automatic essay scoring has the potential to effectively identify essays suitable for double grading, which can increase the consistency of large-scale tests at a low cost.
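The quadratic weighted kappa reported in the abstract above measures agreement between two sets of ordinal grades, penalising disagreements by the squared distance between the grades. A minimal sketch of the standard formula is given below; it assumes grades are encoded as integers from 0 to `num_classes - 1`, and the example ratings are hypothetical, not data from the thesis.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Agreement between two ordinal ratings: 1 is perfect, 0 is chance level."""
    n = len(rater_a)
    # Confusion matrix of observed rating pairs.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [
        sum(observed[i][j] for i in range(num_classes)) for j in range(num_classes)
    ]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: zero on the diagonal, one at maximal distance.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Count expected if the two raters were independent.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    # Undefined (division by zero) if every rating falls in a single class.
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades on a four-step scale encoded as 0..3.
expert = [0, 1, 2, 3, 2, 1]
model = [0, 1, 2, 2, 2, 0]
print(quadratic_weighted_kappa(expert, model, num_classes=4))
```

Because near misses cost less than distant ones, a model that is often one grade off can still score a reasonably high kappa, which is why the abstract reports it alongside the share of exactly matching grades.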
|
49 |
Simulating ADS-B vulnerabilities by imitating aircrafts : Using an air traffic management simulator / Simulering av ADS-B sårbarheter genom imitering av flygplan : Med hjälp av en flyglednings simulator
Boström, Axel; Börjesson, Oliver. January 2022
Air traffic communication is one of the most vital systems for air traffic management controllers. It is used every day to allow millions of people to travel safely and efficiently across the globe. However, many of the systems considered industry standard are used without any encryption or authentication, which makes them vulnerable to various wireless attacks. This thesis investigates vulnerabilities in an air traffic management system called ADS-B. The structure and theory behind this system are described, as well as the reasons why ADS-B is unencrypted. Two attacks are then implemented and performed in an open-source air traffic management simulator called openScope. ADS-B data from these attacks is gathered and combined with actual ADS-B data from genuine aircraft. The collected data is cleaned and used for machine learning, where three different algorithms are applied to detect attacks. Based on our findings, where two of the three machine learning algorithms were able to detect 99.99% of the attacks, we propose that machine learning algorithms be used to improve ADS-B security. We also think that educating air traffic controllers on how to detect and handle attacks is an important part of the future of air traffic management.
|
50 |
MIMO block-fading channels with mismatched CSI
Asyhari, A. Taufiq; Guillen i Fabregas, A. 23 August 2014
We study transmission over multiple-input multiple-output (MIMO) block-fading channels with imperfect channel state information (CSI) at both the transmitter and receiver. Specifically, based on mismatched decoding theory for a fixed channel realization, we investigate the largest achievable rates with independent and identically distributed inputs and a nearest neighbor decoder. We then study the corresponding information outage probability in the high signal-to-noise ratio (SNR) regime and analyze the interplay between estimation error variances at the transmitter and at the receiver to determine the optimal outage exponent, defined as the high-SNR slope of the outage probability plotted on a log-log scale against the SNR. We demonstrate that despite operating with imperfect CSI, power adaptation can offer substantial gains in terms of outage exponent. / A. T. Asyhari was supported in part by the Yousef Jameel Scholarship, University of Cambridge, Cambridge, U.K., and the National Science Council of Taiwan under grant NSC 102-2218-E-009-001. A. Guillén i Fàbregas was supported in part by the European Research Council under ERC grant agreement 259663 and the Spanish Ministry of Economy and Competitiveness under grant TEC2012-38800-C03-03.
|