Global ETD Search

41	Modeling Melodic Accents in Jazz Solos / Modellering av melodiska accenter i jazzsolon Berrios Salas, Misael January 2023 This thesis examines how accurately accents in jazz solos, specifically the sound level, can be modeled. A better understanding of the structure of jazz solos can provide a way of pedagogically presenting differences between music styles and even between performers. Some studies have tried to model perceived accents in different music styles, that is, how listeners perceive some tones as accentuated and more important than others. Other studies have looked at how the sound level correlates with other attributes of the tone. But to our knowledge, no other study has modeled actual accents within jazz solos, nor has any used such a large amount of training data. The training data is a set of 456 solos from the Weimar Jazz Database, which contains tone data and metadata from monophonic solos performed on multiple instruments. The features used for the training algorithms are features obtained from the software Director Musices, created at the Royal Institute of Technology in Sweden; features obtained from the software "melfeature", created at the University of Music Franz Liszt Weimar in Germany; and features built upon tone data or solo metadata from the Weimar Jazz Database. A comparison between these is made. Three learning algorithms are used: Multiple Linear Regression (MLR), Support Vector Regression (SVR), and eXtreme Gradient Boosting (XGBoost). The first two are simpler regression models, while the last is an award-winning tree boosting algorithm. In the tests, XGBoost achieved the highest accuracy when combining all available features except those removed because they did not improve the accuracy. The accuracy was around 27% with a high standard deviation. 
This indicates that prediction quality varied considerably between solos: some had an accuracy of about 67%, while for others not a single tone in the entire solo was predicted correctly. As a general model, the accuracy is too low for practical use. Either the methods were not optimal, or jazz solos differ too much for a general pattern to be found. / Detta examensarbete undersöker hur väl man kan modellera accenter i jazzsolos, mer specifikt ljudnivån. En bredare förståelse för strukturen i jazzsolos kan ge ett sätt att pedagogiskt presentera skillnaderna mellan olika musikstilar och även mellan olika artister. Andra studier har försökt modellera uppfattade accenter inom olika musikstilar. Det vill säga, modellera hur åhörare upplever vissa toner som accentuerade och viktigare än andra. Andra studier har undersökt hur ljudnivån är korrelerad till andra attribut hos tonen. Men såvitt vi vet, så finns det inga andra studier som modellerar faktiska accenter inom jazzsolos, eller som haft samma stora mängd träningsdata. Träningsdatan som använts är ett set av 456 solos tagna från Weimar Jazz Database. Databasen innehåller data på toner och metadata från monofoniska solos genomförda med olika instrument. Särdragen som använts för träningsalgoritmerna är särdrag erhållna från mjukvaran Director Musices skapad på Kungliga Tekniska Högskolan i Sverige; särdrag erhållna från mjukvaran ”melfeature” skapad på University of Music Franz Liszt Weimar i Tyskland; och särdrag skapade utifrån datat i Weimar Jazz Database. En jämförelse mellan dessa har också gjorts. Tre inlärningsalgoritmer har använts, Multiple Linear Regression (MLR), Support Vector Regression (SVR), och eXtreme Gradient Boosting (XGBoost). De första två är enklare regressionsalgoritmer, medan den senare är en prisbelönt trädförstärkningsalgoritm. 
Testen resulterade i att eXtreme Gradient Boosting (XGBoost) skapade en modell med högst noggrannhet givet alla tillgängliga särdrag som träningsdata minus vissa särdrag som tagits bort då de inte förbättrar noggrannheten. Den erhållna noggrannheten låg på runt 27% med en hög standardavvikelse. Detta pekar på att det finns stora skillnader mellan att förutsäga ljudnivån mellan de olika solin. Vissa solin gav en noggrannhet på runt 67% medan andra erhöll inte en endaste ljudnivå korrekt i hela solot. Men som en generell modell är noggrannheten för låg för att användas i praktiken. Antingen är de valda metoderna inte de bästa, eller så är jazzsolin för olika för att hitta ett generellt mönster som går att förutsäga. Accents Jazz Solo Support Vector Regression (SVR) eXtreme Gradient Boosting (XGBoost) Multiple Linear Regression (MLR) Dynamic Accenter Jazz Solos Support Vector Regression (SVR) eXtreme Gradient Boosting (XGBoost) Multiple Linear Regression (MLR) Dynamisk Computer and Information Sciences Data- och informationsvetenskap
42	Using Machine Learning to Detect Customer Acquisition Opportunities and Evaluating the Required Organizational Prerequisites Malmberg, Olle, Zhou, Bobby January 2019 This paper investigates whether machine learning can identify users who are about to change service provider. The Consumer Decision Journey is believed to be a better model than traditional funnel models for depicting the processes that consumers go through leading up to a purchase. Analytical and operational Customer Relationship Management are presented as possible fields where such implementations can be useful. Based on previous studies, Random Forest and XGBoost were chosen for further evaluation because of their generally high performance. The final results were produced by an iterative process of data processing, feature selection, model training, and model testing. A literature review and unstructured and semi-structured interviews with the employer, Growth Hackers Sthlm, were used as complementary methods to gain a wider perspective on the state of the art of ML implementations. The final results showed that Random Forest could identify the sought-after (positive) users, while XGBoost was inferior to Random Forest at distinguishing between the positive and negative classes. An implementation of such a model could support and benefit an organization's customer acquisition operations. However, the most important organizational prerequisites, the data infrastructure and the level of AI and machine learning integration in the organization's culture, need to be considered before such an implementation. / I det här arbetet undersöks huruvida det är möjligt att identifiera ett beteende bland användare som innebär att användaren snart ska byta tillhandahållare av tjänst med hjälp av maskininlärning. 
Målet är att kunna bidra till ett maskininlärningsverktyg i kundförvärvningssyfte, såsom analytical och operational Customer Relationship Management. Det sökta beteendet i rapporten utgår från modellen ”the Consumer Decision Journey”. I modellen beskrivs fyra faser där fas två innebär att konsumenten aktivt söker samt är mer mottaglig för information kring köpet. Genom tidigare studier och handledning av uppdragsgivare valdes algoritmerna RandomForest och XGBoost som huvudsakliga algoritmer som skulle testas. Resultaten producerades genom en iterativ process. Det första steget var att städa data. Därefter valdes parametrar och viktades. Sedan testades algoritmerna mot testdata och utvärderades. Detta gjordes i loopar tills förbättringar endast var marginella. De slutliga resultaten visade att framförallt Random Forest kunde identifiera ett beteende som innebär att en användare är i fas 2, medan XGBoost presterade sämre när det kom till att urskilja bland positiva och negativa användare. Dock fångade XGBoost fler positiva användare än vad Random Forest gjorde. I syfte att undersöka de organisatoriska förutsättningarna för att implementera maskininlärning och AI gjordes litteraturstudier och uppdragsgivaren intervjuades kontinuerligt. De viktigaste förutsättningarna fastställdes till två kategorier, datainfrastruktur och hur väl AI och maskininlärning är integrerat i organisationens kultur. Consumer decision journey Machine Learning Organizational prerequisites User activity Random forest XGBoost Användaraktivitet Consumer decision journey Customer relationship management Kundförvärv Maskininlärning Organisatoriska förutsättningar Random forest XGBoost Computer and Information Sciences Data- och informationsvetenskap
43	Segmentation and Valuation in Stockholm Housing Market : Spatial Continuous and Discontinuous Submarkets Evaluating by Hedonic Price Model and XGBoost Model / Segmentering och värdering på Stockholms bostadsmarknad : Rumsliga kontinuerliga och diskontinuerliga delmarknader som utvärderas med hedonisk prismodell och XGBoost-modell Sun, Xianglin January 2023 Housing market segmentation could provide a reference for more targeted policymaking and investment strategies. Although there have been many studies, there is no consistent method for delineating submarkets, owing to a lack of theoretical support and to subjective evaluation. In this paper, two market segmentation methods are introduced. The continuous spatial segmentation divides properties into submarkets according to their coordinates, while the discontinuous spatial segmentation creates submarkets according to the variable with the most significant impact on the price index, which is the construction year of the properties. Two valuation methods, the hedonic price model and the XGBoost regression model, are applied to evaluate the overall Stockholm housing market and the created submarkets. The results showed that both market segmentation methods could improve valuation prediction accuracy compared to valuation on the overall Stockholm housing market. The non-spatial continuous market segmentation approach delivers a greater improvement in valuation accuracy but also has greater volatility. As for the two valuation models, no single valuation method is superior in every market context. / Segmenteringen av bostadsmarknaden skulle kunna utgöra en referens för mer målinriktade politiska beslut och investeringsstrategier. Även om det har gjorts många studier finns det inga konsekventa metoder för att avgränsa delmarknader på grund av brist på teoretiskt stöd och subjektiv utvärdering. I detta dokument presenteras två metoder för marknadssegmentering. 
Den kontinuerliga rumsliga segmenteringen delar in fastigheter i delmarknader utifrån deras koordinater, medan den diskontinuerliga rumsliga segmenteringen skapar delmarknader utifrån den variabel som har störst inverkan på prisindexet, vilket är fastigheternas byggnadsår. Två värderingsmetoder, den hedoniska prismodellen och XGBoost-regressionsmodellen, används för att utvärdera Stockholms bostadsmarknad och den skapade marknaden. Resultaten visade att båda marknadssegmenteringsmetoderna kunde förbättra värderingens prediktionsnoggrannhet jämfört med värderingen under den övergripande bostadsmarknaden i Stockholm. Den icke-rumsliga kontinuerliga marknadssegmenteringsmetoden ger större förbättringar i värderingsnoggrannheten men har också större volatilitet. Vad gäller de två värderingsmodellerna kan ingen enskild värderingsmetod vara helt fördelaktig i något marknadssammanhang. "housing market segmentation" " spatial continuity" "hedonic price model" "XGBoost model" "rumslig kontinuitet" "hedonisk prismodell" "XGBoost modell" Civil Engineering Samhällsbyggnadsteknik
44	Evaluating machine learning models for time series forecasting in smart buildings / Utvärdera maskininlärningsmodeller för tidsserieprognos inom smarta byggnader Balachandran, Sarugan, Perez Legrand, Diego January 2023 Temperature regulation in buildings can be tricky and expensive. A common problem when heating buildings is that an unnecessary amount of energy is supplied. This waste of energy is often caused by a faulty regulation system. This thesis presents a machine learning approach, using time series data, to predict the energy supply needed to keep the indoor temperature at around 21 degrees Celsius. The machine learning models LSTM, Ensemble LSTM, AT-LSTM, ARIMA, and XGBoost were used for this project. The validation showed that the Ensemble LSTM model gave the most accurate predictions, with a Mean Absolute Error of 22486.79 Wh and a Symmetric Mean Absolute Percentage Error of 5.41%, and it was the model used for comparison with the current system. Based on the performance of the different models, the conclusion is that machine learning can be a useful tool for predicting the energy supply. However, other complex factors need more attention before the model can be evaluated properly. / Temperaturreglering i byggnader kan vara knepigt och dyrt. Ett vanligt problem vid uppvärmning av byggnader är att det tillförs onödigt mycket energi. Detta energispill orsakas oftast av ett felaktigt regleringssystem. Denna rapport studerar möjligheten att, med hjälp av tidsseriedata, kunna träna olika maskininlärningmodeller för att förutsäga den energitillförsel som behövs för att hålla inomhustemperaturen runt 21 grader Celsius. Maskininlärningsmodellerna LSTM, Ensemble LSTM, AT-LSTM, ARIMA och XGBoost användes för detta projekt. 
Valideringen visade att ensemble LSTM-modellen gav den mest exakta förutsägelserna med Mean Absolute Error på 22486.79 (Wh) och Symmetric Mean Absolute Percentage Error på 5.41% och var modellen som användes för att jämföra med det befintliga systemet. Från modellernas prestation är slutsatsen att maskininlärning kan vara ett användbart verktyg för att förutsäga energitillförseln. Men å andra sidan finns det andra komplexa faktorer som bör tas hänsyn till så att modellen kan evalueras på ett bättre sätt. Machine learning supervised learning time series forecasting neural networks LSTM Ensemble LSTM AT-LSTM ARIMA XGBoost Maskininlärning övervakad inlärning tidsserie-prognostisering neurala nätverk LSTM Ensemble LSTM AT-LSTM ARIMA XGBoost Computer Engineering Datorteknik
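The two error measures reported in entry 44, Mean Absolute Error and Symmetric Mean Absolute Percentage Error, can be sketched as follows. This is a minimal illustration, not the thesis's code: the abstract does not say which sMAPE variant was used, so the common form with the mean of |actual| and |predicted| in the denominator is an assumption, and the function names and sample values are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same unit as the data (e.g. Wh)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed variant: mean of |a| and |p| as denominator)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = (abs(a) + abs(p)) / 2
        if denominator != 0:  # a pair of zeros contributes no error
            total += abs(a - p) / denominator
    return 100 * total / len(actual)


# Hypothetical hourly energy values in Wh, for illustration only.
actual_wh = [100.0, 200.0, 300.0]
predicted_wh = [110.0, 190.0, 330.0]

mae_value = mean_absolute_error(actual_wh, predicted_wh)
smape_value = smape(actual_wh, predicted_wh)
print(f"MAE = {mae_value:.2f} Wh, sMAPE = {smape_value:.2f} %")
```

MAE keeps the unit of the target, which is why the abstract reports it in Wh, while sMAPE is scale-free and easier to compare across buildings or periods.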
45	Forecasting Daily Supermarkets Sales with Machine Learning / Dagliga Försäljningsprognoser för Livsmedel med Maskininlärning Fredén, Daniel, Larsson, Hampus January 2020 Improved sales forecasts for individual products in retail stores can have a positive effect both environmentally and economically. Historically, these forecasts have been made through a combination of statistical measurements and experience. However, with the increased computational power of modern computers, there has been growing interest in applying machine learning to this problem. The aim of this thesis was to use two years of sales data, yearly calendar events, and weather data to investigate which machine learning method could forecast sales best. The investigated methods were XGBoost, ARIMAX, LSTM, and Facebook Prophet. Overall, the XGBoost and LSTM models performed best, with a lower mean absolute value and symmetric mean percentage absolute error than the other models. However, Facebook Prophet performed best with regard to root mean squared error and mean absolute error during the holiday season, indicating that it was the best model for the holidays. The LSTM model could, however, adapt quickly during the holiday season, which improved its performance. Furthermore, including weather data did not improve the models significantly, and in some cases it worsened the results. Thus, the results are inconclusive but indicate that the best model depends on the time period and goal of the forecast. / Förbättrade försäljningsprognoser för individuella produkter inom detaljhandeln kan leda till både en miljömässig och ekonomisk förbättring. Historiskt sett har dessa utförts genom en kombination av statistiska metoder och erfarenhet. Med den ökade beräkningskraften hos dagens datorer har intresset för att applicera maskininlärning på dessa problem ökat. 
Målet med detta examensarbete är därför att undersöka vilken maskininlärningsmetod som kunde prognostisera försäljning bäst. De undersökta metoderna var XGBoost, ARIMAX, LSTM och Facebook Prophet. Generellt presterade XGBoost och LSTM modellerna bäst då dem hade ett lägre mean absolute value och symmetric mean percentage absolute error jämfört med de andra modellerna. Dock, gällande root mean squared error hade Facebook Prophet bättre resultat under högtider, vilket indikerade att Facebook Prophet var den bäst lämpade modellen för att förutspå försäljningen under högtider. Dock, kunde LSTM modellen snabbt anpassa sig och förbättrade estimeringarna. Inkluderingen av väderdata i modellerna resulterade inte i några markanta förbättringar och gav i vissa fall även försämringar. Övergripande, var resultaten tvetydiga men indikerar att den bästa modellen är beroende av prognosens tidsperiod och mål. Statistics applied mathematics machine learning retail industry time-series forecasts neural network XGBoost ARIMA ARIMAX Facebook Prophet Prophet LSTM Statistik tillämpad matematik maskininlärning livsmedelsindustrin prognostisering tidsserier neurala nätverk XGBoost ARIMA ARIMAX Facebook Prophet Prophet LSTM Mathematics Matematik
46	A Study of an Iterative User-Specific Human Activity Classification Approach Fürderer, Niklas January 2019 Applications for sensor-based human activity recognition use the latest algorithms for the detection and classification of everyday human activities, for both online and offline use cases. The insights generated by those algorithms can then be used in a wide range of applications such as safety, fitness tracking, localization, personalized health advice, and improved child and elderly care. For an algorithm to perform well, a significant amount of annotated data from a specific target audience is required. However, a satisfactory data collection process is cost- and labor-intensive. It may also be infeasible for specific target groups, as aging affects motion patterns and behaviors. One main challenge in this application area lies in identifying relevant changes over time while being able to reuse previously annotated user data. The accurate detection of those user-specific patterns and movement behaviors therefore requires individual and adaptive classification models for human activities. The goal of this degree project is to compare the performance of several supervised classifiers when trained and tested with the new iterative user-specific human activity classification approach described in this report. A qualitative and quantitative data collection process was applied. The tree-based classification algorithms Decision Tree, Random Forest, and XGBoost were tested on custom datasets divided into three groups. The datasets contained labeled motion data from 21 volunteers, recorded with wrist-worn sensors. Computed across all datasets, the average performance measured in recall increased by 5.2% (using a simulated leave-one-subject-out cross evaluation) for algorithms trained via the described approach compared to a random non-iterative approach. 
/ Sensorbaserad aktivitetsigenkänning använder sig av det senaste algoritmerna för detektion och klassificering av mänskliga vardagliga aktiviteter, både i upp- och frånkopplat läge. De insikter som genereras av algoritmerna kan i ett nästa steg användas inom en mängd nya applikationer inom områden så som säkerhet, träningmonitorering, platsangivelser, personifierade hälsoråd samt inom barn- och äldreomsorgen. För att en algoritm skall uppnå hög prestanda krävs en inte obetydlig mängd annoterad data, som med fördel härrör från den avsedda målgruppen. Dock är datainsamlingsprocessen kostnads- och arbetsintensiv. Den kan dessutom även vara orimlig att genomföra för vissa specifika målgrupper, då åldrandet påverkar rörelsemönster och beteenden. En av de största utmaningarna inom detta område är att hitta de relevanta förändringar som sker över tid, samtidigt som man vill återanvända tidigare annoterad data. För att kunna skapa en korrekt bild av det individuella rörelsemönstret behövs därför individuella och adaptiva klassificeringsmodeller. Målet med detta examensarbete är att jämföra flera olika övervakade klassificerares (eng. supervised classifiers) prestanda när dem tränats med hjälp av ett iterativt användarspecifikt aktivitetsklassificeringsmetod, som beskrivs i denna rapport. En kvalitativ och kvantitativ datainsamlingsprocess tillämpades. Trädbaserade klassificeringsalgoritmerna Decision Tree, Random Forest samt XGBoost testades utifrån specifikt skapade dataset baserade på 21 volontärer, som delades in i tre grupper. Data är baserad på rörelsedata från armbandssensorer. Beräknat över samtlig data, ökade den genomsnittliga sensitiviteten med 5.2% (simulerad korsvalidering genom utelämna-en-individ) för algoritmer tränade via beskrivna metoden jämfört med slumpvis icke-iterativ träning. 
human activity recognition classification random forest xgboost decision tree iterative learning approach user-specific aktivitetsigenkänning övervakade klassificerares random forest xgboost beslutsträd iterativt lärometod användarspecifik Computer and Information Sciences Data- och informationsvetenskap
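The leave-one-subject-out evaluation and the recall metric mentioned in entry 46 can be illustrated with a small sketch. It shows the generic technique only; the thesis describes its own "simulated" variant, which the abstract does not detail. The function names, subject identifiers, and labels below are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def recall(true_labels, predicted_labels, positive):
    """Share of samples of class `positive` that were predicted as `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == positive and p == positive)
    false_negatives = sum(1 for t, p in pairs if t == positive and p != positive)
    found = true_positives + false_negatives
    return true_positives / found if found else 0.0


# Hypothetical sample-to-subject assignment: five samples from three volunteers.
subjects = ["s1", "s1", "s2", "s2", "s3"]
folds = list(leave_one_subject_out(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)

# Hypothetical predictions for one fold.
example_recall = recall(["walk", "walk", "sit"], ["walk", "sit", "sit"], "walk")
```

Splitting by subject rather than by sample keeps every recording of the held-out volunteer out of training, so the score reflects performance on a person the model has not seen.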
47	Evaluation of Machine Learning Methods for Time Series Forecasting on E-commerce Data / Utvärdering av Maskininlärningsmodeller för tidsserie-prognotisering på e-handels data Abrahamsson, Peter, Ahlqvist, Niklas January 2022 Within demand forecasting, and specifically within the field of e-commerce, the provided data often contains erratic behaviours that are difficult to explain. This contradicts the common assumptions of classical approaches to time series analysis. Yet classical and naive approaches are still commonly used. Machine learning could be used to alleviate such problems. This thesis evaluates four models together with the Swedish fintech company QLIRO AB: an MLR (Multiple Linear Regression) model, a classic Box-Jenkins model (SARIMAX), an XGBoost model, and an LSTM (Long Short-Term Memory) network. The provided data consists of aggregated total daily reservations by e-merchants within the Nordic market from 2014. Some data preprocessing was required, and a smoothed version of the data set was created for comparison. Each model was constructed according to its specific requirements but with similar feature engineering. Evaluation was then made on a monthly level with a forecast horizon of 30 days during 2021. The results show that MLR and XGBoost provide the most consistent results, with the added benefit of being easy to use. After these two, the LSTM network showed the best results for November and December on the original data set but the worst overall. Yet it performed well on the smoothed data set, where it was comparable to the first two. SARIMAX was the worst-performing of all the models considered in this thesis and was not as easy to implement. / Inom efterfrågeprognoser, och specifikt inom området e-handel, innehåller den tillhandahållna informationen ofta oberäkneliga beteenden som är svåra att förklara. 
Detta motsäger vanliga antaganden inom tidsserier som används för de mer klassiska tillvägagångssätten. Ändå är klassiska och naiva metoder fortfarande vanliga. Maskininlärning skulle kunna användas för att lindra sådana problem. Detta examensarbete utvärderar fyra modeller tillsammans med det svenska fintechföretaget QLIRO AB. Mer specifikt en MLR-modell (Multiple Linear Regression), en klassisk Box-Jenkins-modell (SARIMAX), en XGBoost-modell och ett LSTM-nätverk (Long Short-Term Memory). Den tillhandahållna informationen består av aggregerade dagliga reservationer från e-handlare inom den nordiska marknaden från 2014. Viss dataförbehandling krävdes och en utjämnad version av datamängden skapades för jämförelse. Varje modell konstruerades enligt deras specifika krav men med liknande feature engineering. Utvärderingen gjordes sedan på månadsnivå med en prognoshorisont på 30 dagar under 2021. Resultaten visar att både MLR och XGBoost ger de mest pålitliga resultaten tillsammans med fördelar som att vara lätta att använda. Efter dessa visar LSTM-nätverket de bästa resultaten för november och december på den ursprungliga datamängden men sämst totalt sett. Ändå visar den god prestanda på den utjämnade datamängden och var sedan jämförbar med de två första modellerna. SARIMAX var den sämst presterande av alla jämförda modeller och inte lika lätt att implementera. Thesis Time Series Machine Learning E-commerce Demand Forecasting Multiple Linear Regression SARIMAX XGBoost LSTM Model Evaluation Examensarbete tidsserier maskininlärning e-handel efterfrågeprognoser multipel linjär regression SARIMAX XGBoost LSTM modellutvärdering Other Mathematics Annan matematik
48	Predicting House Prices on the Countryside using Boosted Decision Trees / Förutseende av huspriser på landsbygden genom boostade beslutsträd Revend, War January 2020 This thesis evaluates the feasibility of supervised learning models for predicting house prices in the countryside of southern Sweden. It is essential for mortgage lenders to have accurate housing valuation algorithms, and the current model offered by Booli is not accurate enough when valuing residences in the countryside. Different types of boosted decision trees were implemented to address this issue, and their performance was compared to that of traditional machine learning methods, in order to find the best model with regard to relevant evaluation metrics such as root-mean-squared error (RMSE) and mean absolute percentage error (MAPE). The implemented models were ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost, and LightGBM. All of these models were benchmarked against Booli's current housing valuation algorithm, which is based on a k-NN model. The results indicated that LightGBM is the best of the tested models, as it had the best overall performance with respect to the chosen evaluation metrics. Compared to the benchmark, LightGBM performed better overall, with an RMSE of 0.330 against 0.358 for the Booli model, indicating that boosted decision trees have the potential to improve the predictive accuracy of residence prices in the countryside. / Denna uppsats ämnar utvärdera genomförbarheten hos olika övervakade inlärningsmodeller för att förutse huspriser på landsbygden i Södra Sverige. 
Det är viktigt för bostadslånsgivare att ha noggranna algoritmer när de värderar bostäder, den nuvarande modellen som Booli erbjuder har dålig precision när det gäller värderingar av bostäder på landsbygden. Olika typer av boostade beslutsträd implementerades för att ta itu med denna fråga och deras prestanda jämfördes med traditionella maskininlärningsmetoder. Dessa olika typer av övervakad inlärningsmodeller implementerades för att hitta den bästa modellen med avseende på relevanta prestationsmått som t.ex. root-mean-squared error (RMSE) och mean absolute percentage error (MAPE). De övervakade inlärningsmodellerna var ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost, and LightGBM. Samtliga algoritmers prestanda jämförs med Boolis nuvarande bostadsvärderingsalgoritm, som är baserade på en k-NN modell. Resultatet från denna uppsats visar att LightGBM modellen är den optimala modellen för att värdera husen på landsbygden eftersom den hade den bästa totala prestandan med avseende på de utvalda utvärderingsmetoderna. LightGBM modellen jämfördes med Booli modellen där prestandan av LightGBM modellen var i överlag bättre, där LightGBM modellen hade ett RMSE värde på 0.330 jämfört med Booli modellen som hade ett RMSE värde på 0.358. Vilket indikerar att det finns en potential att använda boostade beslutsträd för att förbättra noggrannheten i förutsägelserna av huspriser på landsbygden. Machine Learning Predicting House Prices Shrinkage Methods Random Forest Decision Tree AdaBoost Gradient Boosting LightGBM CatBoost XGBoost Maskininlärning Förutseende av Huspriser Krympningsmetoder Random Forest Beslutsträd AdaBoost Gradient Boosting LightGBM CatBoost XGBoost Probability Theory and Statistics Sannolikhetsteori och statistik
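As a rough illustration of the comparison in entry 48, the sketch below computes an RMSE on toy data and the relative reduction between the two RMSE values quoted in the abstract (0.358 for the Booli benchmark, 0.330 for LightGBM), which works out to about 7.8%. The function names and the toy price values are hypothetical, and the abstract does not state the scale on which RMSE was measured.

```python
import math


def rmse(actual, predicted):
    """Root-mean-squared error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` error is lower than `baseline` error."""
    return (baseline - candidate) / baseline


# Toy values, for illustration only.
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# RMSE scores quoted in the abstract: Booli benchmark vs. LightGBM.
reduction = relative_reduction(0.358, 0.330)
print(f"toy RMSE = {toy_rmse:.4f}, relative reduction = {reduction:.1%}")
```

Because RMSE squares each error before averaging, a few badly mispriced houses weigh more heavily than they would under MAPE, which is one reason the thesis reports both.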
49	Hantering av brandväggsregler med generativ AI: möjligheter och utmaningar / Managing firewall rules with generative AI: opportunities and challenges El Khadam, Youssef, Yusuf, Ahmed Adan January 2024 Brandväggar är en kritisk komponent i nätverkssäkerhet som kontrollerar och filtrerar nätverkstrafik för att skydda mot obehörig åtkomst och cyberhot. Effektiv hantering av brandväggsregler är avgörande för att säkerställa att ett nätverk fungerar smidigt och säkert. I stora företagsnätverk som Scania kan hanteringen av dessa regler bli komplex och resurskrävande, vilket kan leda till duplicerade och överlappande regler som försämrar systemets prestanda. Detta examensarbete undersöker tillämpningen av generativ artificiell intelligens (GAI) och maskininlärning för att hantera och optimera brandväggsregler, med fokus på identifiering och hantering av duplicerade och överlappande regler. Problemställningen adresserar de växande utmaningarna med att underhålla effektiva brandväggsregler i stora företagsnätverk som Scania. Genom att implementera och utvärdera en prototyp baserad på XGBoost, utforskar arbetet potentialen hos AI-tekniker för att förbättra hanteringen och säkerheten av nätverkstrafik. Resultaten visar att AI kan spela en kritisk roll i automatiseringen av processer för upptäckt och korrigering av felaktiga regler, vilket bidrar till ökad nätverkssäkerhet och optimerad resursanvändning. Studien bekräftar att användningen av AI inom brandväggshantering erbjuder betydande fördelar, men lyfter också fram behovet av fortsatt forskning för att adressera säkerhetsutmaningar relaterade till AI-lösningar. / Firewalls are a critical component of network security, controlling and filtering network traffic to protect against unauthorized access and cyber threats. Effective management of firewall rules is essential to ensure that a network operates smoothly and securely. 
In large enterprise networks like Scania, managing these rules can become complex and resource-intensive, leading to duplicate and overlapping rules that degrade system performance and security. This thesis investigates the application of generative AI (GAI) and machine learning to manage and optimize firewall rules, focusing on the identification and handling of duplicate and overlapping rules. The problem addresses the growing challenges of maintaining effective firewall rules in large enterprise networks like Scania. By implementing and evaluating a prototype based on XGBoost, this work explores the potential of AI techniques to improve the management and security of network traffic. The results demonstrate that AI can play a critical role in automating the processes for detecting and correcting faulty rules, contributing to increased network security and optimized resource usage. The study confirms that the use of AI in firewall management offers significant benefits but also highlights the need for further research to address security challenges related to AI solutions. firewall generative AI firewall rules network security XGBoost AI in cybersecurity duplicated rules machine learning rule management systems brandvägg generativ AI brandväggsregler nätverkssäkerhet XGBoost AI i cybersäkerhet duplicerade regler maskininlärning regelhanteringssystem
50	Comparative analysis of XGBoost, MLP and LSTM techniques for the problem of predicting fire brigade Interventions Cerna Ñahuis, Selene Leya January 2019 Advisor: Anna Diva Plasencia Lotufo / Abstract: Many environmental, economic and societal factors are leading fire brigades to be increasingly solicited, and, as a result, they face an ever-increasing number of interventions, most of the time with constant resources. On the other hand, these interventions are directly related to human activity, which is itself predictable: swimming pool drownings occur in summer, while road accidents due to ice storms occur in winter. One solution for improving the response of firefighters with constant resources is therefore to predict their workload, i.e., their number of interventions per hour, based on explanatory variables conditioning human activity. The present work aims to develop three models, which are compared to determine whether they can predict the firefighters' response load in a reasonable way. The tools chosen are the most representative of their respective categories in Machine Learning: XGBoost, which has a decision tree at its core; a classic method, the Multi-Layer Perceptron; and a more advanced algorithm, Long Short-Term Memory, the latter two being based on neurons. The entire process is detailed, from data collection to obtaining the predictions. The results obtained demonstrate predictions of reasonable quality that can be improved by data science techniques such as feature selection and adjustment of hyperparameters. / Resumo: Muitos fatores ambientais, econômicos e sociais estão levando as brigadas de incêndio a serem cada vez mais solicitadas e, como consequência, enfrentam um número cada vez maior de intervenções, na maioria das vezes com recursos constantes. 
Por outro lado, essas intervenções estão diretamente relacionadas à atividade humana, o que é previsível: os afogamentos em piscina ocorrem no verão, enquanto os acidentes de tráfego, devido a tempestades de gelo, ocorrem no inverno. Uma solução para melhorar a resposta dos bombeiros com recursos constantes é prever sua carga de trabalho, isto é, seu número de intervenções por hora, com base em variáveis explicativas que condicionam a atividade humana. O presente trabalho visa desenvolver três modelos que são comparados para determinar se eles podem prever a carga de respostas dos bombeiros de uma maneira razoável. As ferramentas escolhidas são as mais representativas de suas respectivas categorias em Machine Learning, como o XGBoost que tem como núcleo uma árvore de decisão, um método clássico como o Multi-Layer Perceptron e um algoritmo mais avançado como Long Short-Term Memory ambos com neurônios como base. Todo o processo é detalhado, desde a coleta de dados até a obtenção de previsões. Os resultados obtidos demonstram uma previsão de qualidade razoável que pode ser melhorada por técnicas de ciência de dados, como seleção de características e ajuste de hiperparâmetros. / Mestre Firefighters Previsão. XGBoost Long short-term memory Multi-layer perceptron Mutual information regression Principal component analysis Bombeiros. Previsão.
