Global ETD Search

Return to search

Flight Sorting Algorithm Based on Users’ Behaviour

The model predicts the best flight order and recommend best flight to users. The thesis could be divided into the following three parts: Feature choosing, data-preprocessing, and various algorithms experiment. For feature choosing, besides the original information of flight itself, we add the user’s selection status into our model, which the flight class is, together with children or not. In the data preprocessing stage, data cleaning is used to process incomplete and repeated data. Then a normalization method removes the noise in the data. After various balancing processing, the class-imbalance data is corrected best with SMOTE method. Based on our existing data, I choose the classification model and Sequential ranking algorithm. Use price, direct flight or not, travel time, etc. as features, and click or not as label. The classification algorithms I used includes Logistic Regression, Gradient Boosting, KNN, Decision Tree, Random Forest, Gaussian Process Classifier, Gaussian NB Bayesian and Quadratic Discriminant Analysis. In addition, we also adopted Sequential ranking algorithm. The results show that Random Forest-SMOTE performs best with AUC of ROC=0.94, accuracy=0.8998. / Modellen förutsäger den bästa flygordern och rekommenderar bästa flyg till användarna. Avhandlingen kan delas in i följande tre delar: Funktionsval, databehandling och olika algoritms experiment. För funktionsval, förutom den ursprungliga informationen om själva flygningen, lägger vi till användarens urvalsstatus i vår modell, vilken flygklassen är , tillsammans med barn eller inte. Datarengöring används för att hantera dubbletter och ofullständiga data. Därefter tar en normaliserings metod bort bruset i data. Efter olika balanserings behandlingar är SMOTE-metoden mest lämplig för att korrigera klassobalans flyg data. Baserat på våra befintliga data väljer jag klassificerings modell och sekventiell ranknings algoritm. Använd pris, direktflyg eller inte, restid etc. som funktioner, och klicka eller inte som etikett. Klassificerings algoritmerna som jag använde inkluderar Logistic Regression, Gradient Boost, KNN, Decision Tree, Random Forest, Gaussian Process Classifier, Gaussian NB Bayesian and Quadratic Discriminant Analysis. Dessutom antog vi också Sequential ranking algoritm. Resultaten visar att Random Forest-SMOTE presterar bäst med AUC för ROC = 0.94, noggrannhet = 0.8998.

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-294132

Ranking systems

recommendation systems

rekommendationssystem

maskininlärning

djupinlärning

databalans

Elektroteknik och elektronik

Identifer	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-294132
Date	January 2021
Creators	Ben, Qingyan
Publisher	KTH, Skolan för elektroteknik och datavetenskap (EECS)
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TRITA-EECS-EX ; 2021:119

Page generated in 0.0331 seconds

Flight Sorting Algorithm Based on Users’ Behaviour

Description

Links & Downloads

Tags

Additional Fields