Global ETD Search

301	Building Predictive Models for Stock Market Performance : En studie om maskininlärning och deras prestanda Wennmark, Gabriel, Lindgren, Felix January 2023 Today it is important for investors to identify which stocks will yield positive returns so that the right decisions can be made when trading on the stock market. This has been an area of academic interest for decades, and it remains challenging. A large number of studies have been carried out on machine learning and stock trading, and many of them have produced promising results despite these challenges. The aim of this study was to develop and evaluate predictive models for identifying stocks that outperform the Swedish market index OMXSPI. The research used a dataset of historical stock data and applied three machine learning algorithms, Support Vector Machine, Logistic Regression and Decision Trees, to predict whether a stock achieved excess performance. Using ten-fold cross-validation and hyperparameter tuning, the study produced an IT artefact with satisfactory results. The results showed that hyperparameter tuning marginally improved the metrics in focus, namely accuracy and precision. The support vector machine model achieved an accuracy of 58.52% and a precision of 57.51%. The logistic regression model achieved an accuracy of 55.75% and a precision of 54.81%. Finally, the decision tree model, which was the best performer, achieved an accuracy of 64.84% and a precision of 65.00%. machine learning classification stock market OMXSPI support vector machine logistic regression decision tree prediction model maskininlärning klassifikation aktiemarknad OMXSPI support vector machine logistisk regression beslutsträd prediktionsmodell Information Systems
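Note on record 301: the ten-fold cross-validation mentioned in the abstract can be sketched as below. This is a minimal plain-Python illustration, not the authors' code. The function names `k_fold_indices` and `train_test_splits` are hypothetical, and the use of contiguous, unshuffled folds is an assumption made for brevity.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) once per fold."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        # Every fold except the i-th one contributes to the training set.
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield train, test
```

Each sample is used for testing exactly once and for training in the other nine rounds, so the reported accuracy and precision would be averages over the ten held-out folds.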
302	Spam Analysis and Detection for User Generated Content in Online Social Networks Tan, Enhua 23 July 2013 No description available. Computer Engineering Computer Science user generated content online social networks user behavior stretched exponential distribution spam filtering spam detection spam classification decision tree social graph user-link graph Sybil attack community detection BARS UNIK
303	[en] AUTOMATED SYNTHESIS OF OPTIMAL DECISION TREES FOR SMALL COMBINATORIAL OPTIMIZATION PROBLEMS / [pt] SÍNTESE AUTOMATIZADA DE ÁRVORES DE DECISÃO ÓTIMAS PARA PEQUENOS PROBLEMAS DE OTIMIZAÇÃO COMBINATÓRIA CLEBER OLIVEIRA DAMASCENO 24 August 2021 [pt] A análise de complexidade clássica para problemas NP-difíceis é geralmente orientada para cenários de pior caso, considerando apenas o comportamento assintótico. No entanto, existem algoritmos práticos com execução em um tempo razoável para muitos problemas clássicos. Além disso, há evidências que apontam para algoritmos polinomiais no modelo de árvore de decisão linear para resolver esses problemas, embora não muito explorados. Neste trabalho, exploramos esses resultados teóricos anteriores. Mostramos que a solução ótima para problemas combinatórios 0-1 pode ser encontrada reduzindo esses problemas para uma Busca por Vizinho Mais Próximo sobre o conjunto de vértices de Voronoi correspondentes. Utilizamos os hiperplanos que delimitam essas regiões para gerar sistematicamente uma árvore de decisão que repetidamente divide o espaço até que possa separar todas as soluções, garantindo uma resposta ótima. Fazemos experimentos para testar os limites de tamanho para os quais podemos construir essas árvores para os casos do 0-1 knapsack, weighted minimum cut e symmetric traveling salesman. Conseguimos encontrar as árvores desses problemas com tamanhos até 10, 5 e 6, respectivamente. Obtemos também as relações de adjacência completas para os esqueletos dos politopos do knapsack e do traveling salesman até os tamanhos 10 e 7. Nossa abordagem supera consistentemente o método de enumeração e os métodos baseline para o weighted minimum cut e symmetric traveling salesman, fornecendo soluções ótimas em microssegundos. / [en] Classical complexity analysis for NP-hard problems is usually oriented to worst-case scenarios, considering only the asymptotic behavior. 
However, there are practical algorithms running in a reasonable time for many classic problems. Furthermore, there is evidence pointing towards polynomial algorithms in the linear decision tree model to solve these problems, although not explored much. In this work, we explore previous theoretical results. We show that the optimal solution for 0-1 combinatorial problems can be found by reducing these problems into a Nearest Neighbor Search over the set of corresponding Voronoi vertices. We use the hyperplanes delimiting these regions to systematically generate a decision tree that repeatedly splits the space until it can separate all solutions, guaranteeing an optimal answer. We run experiments to test the size limits for which we can build these trees for the cases of the 0-1 knapsack, weighted minimum cut, and symmetric traveling salesman. We manage to find the trees of these problems with sizes up to 10, 5, and 6, respectively. We also obtain the complete adjacency relations for the skeletons of the knapsack and traveling salesman polytopes up to size 10 and 7. Our approach consistently outperforms the enumeration method and the baseline methods for the weighted minimum cut and symmetric traveling salesman, providing optimal solutions within microseconds. [pt] OTIMIZACAO COMBINATORIA [pt] BUSCA POR VIZINHO MAIS PROXIMO [pt] DIAGRAMAS DE VORONOI [pt] POLITOPOS [en] COMBINATORIAL OPTIMIZATION [en] LINEAR DECISION TREE MODEL [en] NEAREST NEIGHBOR SEARCH [en] VORONOI DIAGRAMS [en] POLYTOPES
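Note on record 303: the abstract compares its decision trees against an enumeration method. As a point of reference only, a generic brute-force enumeration for the 0-1 knapsack can be sketched as below. This is not the thesis implementation; the function name `knapsack_by_enumeration` and the example numbers are hypothetical.

```python
from itertools import product


def knapsack_by_enumeration(values, weights, capacity):
    """Try every 0-1 assignment and keep the best feasible one."""
    best_value = 0
    best_choice = tuple(0 for _ in values)  # the empty selection is always feasible
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(w * x for w, x in zip(weights, choice))
        if weight > capacity:
            continue  # infeasible: exceeds the knapsack capacity
        value = sum(v * x for v, x in zip(values, choice))
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice


best_value, best_choice = knapsack_by_enumeration(
    values=[60, 100, 120], weights=[10, 20, 30], capacity=50
)
```

The loop visits all 2^n assignments, which is why enumeration is only practical for small instance sizes such as those reported in the abstract.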
304	A Study of an Iterative User-Specific Human Activity Classification Approach Fürderer, Niklas January 2019 Applications for sensor-based human activity recognition use the latest algorithms for the detection and classification of everyday human activities, both for online and offline use cases. The insights generated by those algorithms can in a next step be used within a wide range of applications such as safety, fitness tracking, localization, personalized health advice and improved child and elderly care. In order for an algorithm to perform well, a significant amount of annotated data from a specific target audience is required. However, a satisfactory data collection process is cost and labor intensive. It may also be infeasible for specific target groups, as aging affects motion patterns and behaviors. One main challenge in this application area lies in the ability to identify relevant changes over time while being able to reuse previously annotated user data. The accurate detection of those user-specific patterns and movement behaviors therefore requires individual and adaptive classification models for human activities. The goal of this degree work is to compare the performance of several supervised classifiers when trained and tested with a new iterative user-specific human activity classification approach, as described in this report. A qualitative and quantitative data collection process was applied. The tree-based classification algorithms Decision Tree, Random Forest and XGBoost were tested on custom datasets divided into three groups. The datasets contained labeled motion data of 21 volunteers from wrist-worn sensors. Computed across all datasets, the average performance measured in recall increased by 5.2% (using a simulated leave-one-subject-out cross evaluation) for algorithms trained via the described approach compared to a random non-iterative approach. 
/ Sensorbaserad aktivitetsigenkänning använder sig av det senaste algoritmerna för detektion och klassificering av mänskliga vardagliga aktiviteter, både i upp- och frånkopplat läge. De insikter som genereras av algoritmerna kan i ett nästa steg användas inom en mängd nya applikationer inom områden så som säkerhet, träningmonitorering, platsangivelser, personifierade hälsoråd samt inom barn- och äldreomsorgen. För att en algoritm skall uppnå hög prestanda krävs en inte obetydlig mängd annoterad data, som med fördel härrör från den avsedda målgruppen. Dock är datainsamlingsprocessen kostnads- och arbetsintensiv. Den kan dessutom även vara orimlig att genomföra för vissa specifika målgrupper, då åldrandet påverkar rörelsemönster och beteenden. En av de största utmaningarna inom detta område är att hitta de relevanta förändringar som sker över tid, samtidigt som man vill återanvända tidigare annoterad data. För att kunna skapa en korrekt bild av det individuella rörelsemönstret behövs därför individuella och adaptiva klassificeringsmodeller. Målet med detta examensarbete är att jämföra flera olika övervakade klassificerares (eng. supervised classifiers) prestanda när dem tränats med hjälp av ett iterativt användarspecifikt aktivitetsklassificeringsmetod, som beskrivs i denna rapport. En kvalitativ och kvantitativ datainsamlingsprocess tillämpades. Trädbaserade klassificeringsalgoritmerna Decision Tree, Random Forest samt XGBoost testades utifrån specifikt skapade dataset baserade på 21 volontärer, som delades in i tre grupper. Data är baserad på rörelsedata från armbandssensorer. Beräknat över samtlig data, ökade den genomsnittliga sensitiviteten med 5.2% (simulerad korsvalidering genom utelämna-en-individ) för algoritmer tränade via beskrivna metoden jämfört med slumpvis icke-iterativ träning. 
human activity recognition classification random forest xgboost decision tree iterative learning approach user-specific aktivitetsigenkänning övervakade klassificerares random forest xgboost beslutsträd iterativt lärometod användarspecifik Computer and Information Sciences Data- och informationsvetenskap
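Note on record 304: the leave-one-subject-out evaluation named in the abstract holds out all samples of one volunteer per round. A minimal plain-Python sketch follows. It is an illustration only, not the thesis code; the function name `leave_one_subject_out` and the toy subject labels are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        # All samples of the held-out subject form the test set ...
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        # ... and the samples of every other subject form the training set.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test


# One subject label per sample (toy data with three subjects).
subjects = ["a", "a", "b", "c", "c", "c"]
splits = list(leave_one_subject_out(subjects))
```

Because no subject appears on both sides of a split, the resulting recall reflects performance on a person the model has not seen during training.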
305	Predicting House Prices on the Countryside using Boosted Decision Trees / Förutseende av huspriser på landsbygden genom boostade beslutsträd Revend, War January 2020 This thesis evaluates the feasibility of supervised learning models for predicting house prices in the countryside of southern Sweden. Mortgage lenders need accurate housing valuation algorithms, and the current model offered by Booli is not accurate enough when valuing residences in the countryside. Different types of boosted decision trees were implemented to address this issue, and their performance was compared to that of traditional machine learning methods. The models were compared on relevant evaluation metrics such as root-mean-squared error (RMSE) and mean absolute percentage error (MAPE). The implemented models were ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost, and LightGBM. All these models were benchmarked against Booli's current housing valuation algorithms, which are based on a k-NN model. The results indicated that the LightGBM model is the best choice, as it had the best overall performance with respect to the chosen evaluation metrics. Compared with the benchmark, the LightGBM model performed better overall, with an RMSE of 0.330 against 0.358 for the Booli model, indicating that boosted decision trees have the potential to improve the predictive accuracy of residence prices in the countryside. / Denna uppsats ämnar utvärdera genomförbarheten hos olika övervakade inlärningsmodeller för att förutse huspriser på landsbygden i Södra Sverige. 
Det är viktigt för bostadslånsgivare att ha noggranna algoritmer när de värderar bostäder, den nuvarande modellen som Booli erbjuder har dålig precision när det gäller värderingar av bostäder på landsbygden. Olika typer av boostade beslutsträd implementerades för att ta itu med denna fråga och deras prestanda jämfördes med traditionella maskininlärningsmetoder. Dessa olika typer av övervakad inlärningsmodeller implementerades för att hitta den bästa modellen med avseende på relevanta prestationsmått som t.ex. root-mean-squared error (RMSE) och mean absolute percentage error (MAPE). De övervakade inlärningsmodellerna var ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost, and LightGBM. Samtliga algoritmers prestanda jämförs med Boolis nuvarande bostadsvärderingsalgoritm, som är baserade på en k-NN modell. Resultatet från denna uppsats visar att LightGBM modellen är den optimala modellen för att värdera husen på landsbygden eftersom den hade den bästa totala prestandan med avseende på de utvalda utvärderingsmetoderna. LightGBM modellen jämfördes med Booli modellen där prestandan av LightGBM modellen var i överlag bättre, där LightGBM modellen hade ett RMSE värde på 0.330 jämfört med Booli modellen som hade ett RMSE värde på 0.358. Vilket indikerar att det finns en potential att använda boostade beslutsträd för att förbättra noggrannheten i förutsägelserna av huspriser på landsbygden. Machine Learning Predicting House Prices Shrinkage Methods Random Forest Decision Tree AdaBoost Gradient Boosting LightGBM CatBoost XGBoost Maskininlärning Förutseende av Huspriser Krympningsmetoder Random Forest Beslutsträd AdaBoost Gradient Boosting LightGBM CatBoost XGBoost Probability Theory and Statistics Sannolikhetsteori och statistik
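Note on record 305: the two evaluation metrics named in the abstract, RMSE and MAPE, follow their standard definitions and can be sketched as below. The function names and the toy prices are hypothetical, and the scale on which the thesis computes its reported RMSE values is not stated in the abstract.

```python
import math


def rmse(actual, predicted):
    """Root-mean-squared error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy example: two sale prices and two model estimates.
actual_prices = [100.0, 200.0]
predicted_prices = [110.0, 180.0]
example_rmse = rmse(actual_prices, predicted_prices)
example_mape = mape(actual_prices, predicted_prices)
```

RMSE penalizes large individual errors more heavily, while MAPE expresses the error relative to each price, which makes the two metrics complementary when cheap and expensive properties are mixed.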
306	Estimating Per-pixel Classification Confidence of Remote Sensing Images Jiang, Shiguo 19 December 2012 No description available. Geographic Information Science Geography Remote Sensing spatial data quality GIS remote sensing image classification classification confidence sample design classification error posterior probability entropy maximum likelihood support vector machine neural network boosted decision tree
307	A Comparative Study of Machine Learning Algorithms Le Fort, Eric January 2018 The selection of machine learning algorithm used to solve a problem is an important choice. This paper outlines research measuring three performance metrics for eight different algorithms on a prediction task involving undergraduate admissions data. The algorithms that were tested are k-nearest neighbours, decision trees, random forests, gradient tree boosting, logistic regression, naive Bayes, support vector machines, and artificial neural networks. These algorithms were compared in terms of accuracy, training time, and execution time. / Thesis / Master of Applied Science (MASc) Machine Learning Comparative Study Data Science University Admissions Software Engineering Computer Science K-Nearest Neighbours Decision Tree Random Forest Gradient Tree Boosting Logistic Regression Naive Bayes Support Vector Machine Neural Network
308	Grön AI : En analys av maskininlärningsalgoritmers prestanda och energiförbrukning Berglin, Caroline, Ellström, Julia January 2024 Trots de framsteg som gjorts inom artificiell intelligens (AI) och maskininlärning (ML), uppkommer utmaningar gällande deras miljöpåverkan. Fokuset på att skapa avancerade och träffsäkra modeller innebär ofta att omfattande beräkningsresurser krävs, vilket leder till en hög energiförbrukning. Syftet med detta arbete är att undersöka ämnet grön AI och sambandet mellan prestanda och energiförbrukning hos två ML-algoritmer. De algoritmer som undersöks är beslutsträd och stödvektormaskin (SVM), med hjälp av två dataset: Bank Marketing och MNIST. Prestandan mäts med utvärderingsmåtten noggrannhet, precision, recall och F1-poäng, medan energiförbrukningen mäts med verktyget Intel VTune Profiler. Arbetets resultat visar att en högre prestanda resulterade i en högre energiförbrukning, där SVM presterade bäst men också förbrukade mest energi i samtliga tester. Vidare visar resultatet att optimering av modellerna resulterade både i en förbättrad prestanda men också i en ökad energiförbrukning. Samma resultat kunde ses när ett större dataset användes. Arbetet anses inte bidra med resultat eller riktlinjer som går att generalisera till andra arbeten. Däremot bidrar arbetet med en förståelse och medvetenhet kring miljöaspekterna gällande AI, vilket kan användas som en grund för att undersöka ämnet vidare. Genom en ökad medvetenhet kan ett gemensamt ansvar tas för att utveckla AI-lösningar som inte bara är kraftfulla och effektiva, utan också hållbara. / Despite the advancements made in artificial intelligence (AI) and machine learning (ML), challenges regarding their environmental impact arise. The focus on creating advanced and accurate models often requires extensive computational resources, leading to a high energy consumption. 
The purpose of this work is to explore the topic of green AI and the relationship between performance and energy consumption of two ML algorithms. The algorithms being evaluated are decision trees and support vector machines (SVM), using two datasets: Bank Marketing and MNIST. Performance is measured using the evaluation metrics accuracy, precision, recall, and F1-score, while energy consumption is measured using the Intel VTune Profiler tool. The results show that higher performance resulted in higher energy consumption, with SVM performing the best but also consuming the most energy in all tests. Furthermore, the results show that optimizing the models resulted in both improved performance and increased energy consumption. The same results were observed when a larger dataset was used. This work is not considered to provide results or guidelines that can be generalized to other studies. However, it contributes to an understanding and awareness of the environmental aspects of AI, which can serve as a foundation for further exploration of the topic. Through increased awareness, shared responsibility can be taken to develop AI solutions that are not only powerful and efficient but also sustainable. Green AI artificial intelligence (AI) machine learning (ML) performance energy consumption decision tree support vector machine (SVM). Grön AI artificiell intelligens (AI) maskininlärning (ML) prestanda energiförbrukning beslutsträd stödvektormaskin (SVM). Software Engineering Programvaruteknik
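Note on record 308: the four performance metrics listed in the abstract (accuracy, precision, recall, F1-score) can be computed from predicted and true labels as sketched below for the binary case. This is a generic illustration with hypothetical names and toy labels, not the authors' evaluation code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 1, 0],
    y_pred=[1, 1, 0, 0, 1, 0, 1, 0],
)
```

For a multi-class dataset such as MNIST, these per-class values would additionally have to be averaged across classes; the abstract does not say which averaging scheme was used.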
309	Digitalt beslutsträd för val av Tobii Dynavox-produkter : Optimering av produktval för förskrivningsprocessen / Digital Decision Tree for Selecting Tobii Dynavox Products : Optimization of Product Selection for the Prescribing Process Yazdi, Anna, Nasrin, Yaguobi January 2024 Denna studie fokuserar på utvecklingen av en interaktiv chatbot för att underlätta valet av kommunikationsenheter för personer med alternativa och kompletterande kommunikationsbehov (AKK). Genom att integrera MIRO för att skapa beslutsträd och Flowchart AI för att konstruera chatbotar möjliggörs en strukturerad process för visualisering och implementering av innehåll på en webbplattform. Beslutsträdet, konstruerat i MIRO, identifierar och strukturerar olika komponenter som är nödvändiga för att skapa en skräddarsydd enhet för varje individ inom AKK-kategorin. Flowchart AI används för att utveckla en interaktiv chatbot som vägleder valet av enhet, mjukvara och tillbehör baserat på användarens specifika behov och preferenser. Studien inkluderar även prototyper av e-postmeddelanden för att förse användarna med en sammanfattning av rekommendationerna. Slutresultatet av detta arbete innebär framgångsrik utveckling av en interaktiv chatbot som guidar användare genom valet av lämpliga Tobii Dynavox-produkter, skräddarsytt efter deras individuella behov. Genom att erbjuda användarna en mer effektiv och riktad rekommendationsprocess sparar arbetet både tid och resurser för såväl användare som professionella, som logopeder. Projektet bidrar till förbättrad tillgång till information och vägledning inom detta område, vilket främjar ökad tillgänglighet och effektivitet inom alternativa och kompletterande kommunikationslösningar. Vidare har projektet främjat kunskapsutvecklingen inom området genom att utveckla ett interaktivt program för medicinteknisk utrustning. 
Ur ett bredare samhällsperspektiv har arbetet också bidragit till ökad tillgänglighet och effektivitet inom alternativ och kompletterande kommunikation, vilket kan förbättra livskvaliteten för personer som behöver dessa lösningar. Arbetet har minskat risken för felaktiga val och ökat användarnas förmåga att kommunicera effektivt och självständigt. / This study focuses on the development of an interactive chatbot to facilitate the selection of communication devices for individuals with alternative and augmentative communication (AAC) needs. By integrating MIRO for creating decision trees and Flowchart AI for constructing the chatbot, a structured process for visualizing and implementing content on a web platform is enabled. The decision tree, constructed in MIRO, identifies and organizes various components necessary to create a customized device for each individual within the AAC category. Flowchart AI is utilized to develop an interactive chatbot that guides users through the selection of device, software, and accessories based on their specific needs and preferences. The study also includes prototypes of email notifications to provide users with a summary of recommendations. The culmination of this work results in the successful development of an interactive chatbot that guides users through the selection of suitable Tobii Dynavox products tailored to their individual needs. By offering users a more efficient and targeted recommendation process, the work saves time and resources for both users and professionals, such as speech therapists. The project contributes to improved access to information and guidance in this area, promoting increased accessibility and efficiency in alternative and augmentative communication solutions. Furthermore, the project has promoted knowledge development in the field by creating an interactive program for medical equipment. 
From a broader societal perspective, the work has also contributed to increased accessibility and efficiency in alternative and augmentative communication, potentially enhancing the quality of life for individuals in need of these solutions. The work has reduced the risk of erroneous choices and enhanced users' ability to communicate effectively and independently. Tobii Dynavox MIRO Flowchart AI interactive chatbot communication solutions prototype efficiency improvement AAC evaluation decision tree analysis. Tobii Dynavox MIRO Flowchart AI interaktiv chatbot kommunikationslösningar prototyp effektivitetsförbättring AKK utvärdering beslutträdsanalys. Other Health Sciences Annan hälsovetenskap
310	Distributed conditional computation Léonard, Nicholas 08 1900 L'objectif de cette thèse est de présenter différentes applications du programme de recherche de calcul conditionnel distribué. On espère que ces applications, ainsi que la théorie présentée ici, mènera à une solution générale du problème d'intelligence artificielle, en particulier en ce qui a trait à la nécessité d'efficience. La vision du calcul conditionnel distribué consiste à accélérer l'évaluation et l'entraînement de modèles profonds, ce qui est très différent de l'objectif usuel d'améliorer sa capacité de généralisation et d'optimisation. Le travail présenté ici a des liens étroits avec les modèles de type mélange d'experts. Dans le chapitre 2, nous présentons un nouvel algorithme d'apprentissage profond qui utilise une forme simple d'apprentissage par renforcement sur un modèle d'arbre de décisions à base de réseau de neurones. Nous démontrons la nécessité d'une contrainte d'équilibre pour maintenir la distribution d'exemples aux experts uniforme et empêcher les monopoles. Pour rendre le calcul efficient, l'entrainement et l'évaluation sont contraints à être éparse en utilisant un routeur échantillonnant des experts d'une distribution multinomiale étant donné un exemple. Dans le chapitre 3, nous présentons un nouveau modèle profond constitué d'une représentation éparse divisée en segments d'experts. Un modèle de langue à base de réseau de neurones est construit à partir des transformations éparses entre ces segments. L'opération éparse par bloc est implémentée pour utilisation sur des cartes graphiques. Sa vitesse est comparée à deux opérations denses du même calibre pour démontrer le gain réel de calcul qui peut être obtenu. Un modèle profond utilisant des opérations éparses contrôlées par un routeur distinct des experts est entraîné sur un ensemble de données d'un milliard de mots. 
Un nouvel algorithme de partitionnement de données est appliqué sur un ensemble de mots pour hiérarchiser la couche de sortie d'un modèle de langage, la rendant ainsi beaucoup plus efficiente. Le travail présenté dans cette thèse est au centre de la vision de calcul conditionnel distribué émis par Yoshua Bengio. Elle tente d'appliquer la recherche dans le domaine des mélanges d'experts aux modèles profonds pour améliorer leur vitesse ainsi que leur capacité d'optimisation. Nous croyons que la théorie et les expériences de cette thèse sont une étape importante sur la voie du calcul conditionnel distribué car elle cadre bien le problème, surtout en ce qui concerne la compétitivité des systèmes d'experts. / The objective of this paper is to present different applications of the distributed conditional computation research program. It is hoped that these applications and the theory presented here will lead to a general solution of the problem of artificial intelligence, especially with regard to the need for efficiency. The vision of distributed conditional computation is to accelerate the evaluation and training of deep models which is very different from the usual objective of improving its generalization and optimization capacity. The work presented here has close ties with mixture of experts models. In Chapter 2, we present a new deep learning algorithm that uses a form of reinforcement learning on a novel neural network decision tree model. We demonstrate the need for a balancing constraint to keep the distribution of examples to experts uniform and to prevent monopolies. To make the calculation efficient, the training and evaluation are constrained to be sparse by using a gater that samples experts from a multinomial distribution given examples. In Chapter 3 we present a new deep model consisting of a sparse representation divided into segments of experts. 
A neural network language model is constructed from blocks of sparse transformations between these expert segments. The block-sparse operation is implemented for use on graphics cards. Its speed is compared with two dense operations of the same caliber to demonstrate and measure the actual efficiency gain that can be obtained. A deep model using these block-sparse operations controlled by a distinct gater is trained on a dataset of one billion words. A new algorithm for data partitioning (clustering) is applied to a set of words to organize the output layer of a language model into a conditional hierarchy, thereby making it much more efficient. The work presented in this thesis is central to the vision of distributed conditional computation as issued by Yoshua Bengio. It attempts to apply research in the area of mixture of experts to deep models to improve their speed and their optimization capacity. We believe that the theory and experiments of this thesis are an important step on the path to distributed conditional computation because it provides a good framework for the problem, especially concerning competitiveness inherent to systems of experts. calcul conditionnel distribué réseau de neurones apprentissage profond apprentissage supervisé apprentissage par renforcement arbres de décisions modèle de langage softmax hierarchique mélange d'experts torch distributed conditional computation neural network deep learning supervised learning reinforcement learning decision tree language model hierarchical softmax mixture of experts torch
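Note on record 310: the abstract describes a gater that samples experts from a multinomial distribution given an example. A toy sketch of that sampling step is shown below. It assumes the gater has already produced one score per expert and that a softmax turns the scores into probabilities; both assumptions, and the names `softmax` and `sample_expert`, are illustrative and not taken from the thesis.

```python
import math
import random


def softmax(scores):
    """Convert raw gater scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_expert(gate_scores, rng):
    """Draw one expert index from the categorical distribution over experts."""
    probs = softmax(gate_scores)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
uniform_probs = softmax([0.0, 0.0, 0.0])
chosen = sample_expert([0.0, 0.0, 0.0], rng)
```

Only the sampled expert would then be evaluated for that example, which is what makes the computation sparse; the balancing constraint mentioned in the abstract would act on these probabilities to keep any single expert from receiving most of the examples.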
