Global ETD Search

1	Learning deep embeddings by learning to rank
He, Kun. 05 February 2019

We study the problem of embedding high-dimensional visual data into low-dimensional vector representations. This is an important component in many computer vision applications involving nearest neighbor retrieval, as embedding techniques not only perform dimensionality reduction, but can also capture task-specific semantic similarities. In this thesis, we use deep neural networks to learn vector embeddings, and develop a gradient-based optimization framework that is capable of optimizing ranking-based retrieval performance metrics, such as the widely used Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG). Our framework is applied in three applications.

First, we study Supervised Hashing, which is concerned with learning compact binary vector embeddings for fast retrieval, and propose two novel solutions. The first solution optimizes Mutual Information as a surrogate ranking objective, while the second directly optimizes AP and NDCG, based on the discovery of their closed-form expressions for discrete Hamming distances. These optimization problems are NP-hard, so we derive continuous relaxations to enable gradient-based optimization with neural networks. Our solutions establish the state of the art on several image retrieval benchmarks.

Next, we train deep neural networks to extract Local Feature Descriptors from image patches. Local features are used universally in low-level computer vision tasks that involve sparse feature matching, such as image registration and 3D reconstruction, and their matching is a nearest neighbor retrieval problem. We leverage our AP optimization technique to learn both binary and real-valued descriptors for local image patches. Compared to competing approaches, our solution eliminates complex heuristics, and performs more accurately in the tasks of patch verification, patch retrieval, and image matching.

Lastly, we tackle Deep Metric Learning, the general problem of learning real-valued vector embeddings using deep neural networks. We propose a learning-to-rank solution that optimizes a novel quantization-based approximation of AP. For downstream tasks such as retrieval and clustering, we demonstrate promising results on standard benchmarks, especially in the few-shot learning scenario, where the number of labeled examples per class is limited.

Keywords: Computer science; Average precision; Computer vision; Deep learning; Learning to rank; Nearest neighbor retrieval; Vector embedding
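For readers unfamiliar with the two ranking metrics named in this abstract, the following is a minimal sketch of how AP and NDCG are commonly computed for a single ranked list. It is not the thesis's optimization method. The function names are hypothetical, AP is normalized by the number of relevant items found in the list, and NDCG uses linear gains with a log2 discount (one of several common conventions).

```python
import math

def average_precision(relevance):
    """AP for a ranked list of binary labels (1 = relevant, 0 = not).

    Assumption: every relevant item appears somewhere in the list,
    so we normalize by the number of relevant items seen.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0

def dcg(gains):
    """Discounted cumulative gain with linear gains and a log2 discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# Illustrative ranking: relevant items retrieved at ranks 1 and 3.
ranking = [1, 0, 1, 0, 0]
ap_value = average_precision(ranking)
ndcg_value = ndcg(ranking)
```

Both metrics depend on the ranking only through sorted positions, which is why they are not directly differentiable and why relaxations such as those described in the abstract are needed for gradient-based training.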
2	Performance comparison of data mining algorithms for imbalanced and high-dimensional data
Rubio Adeva, Daniel. January 2023

Artificial intelligence techniques, such as artificial neural networks, random forests, and support vector machines, have been used to address a variety of problems in numerous industries. However, in many cases, models have to deal with issues such as imbalanced data or high dimensionality. This thesis implements and compares the performance of support vector machines, random forests, and neural networks for detecting fraud in new bank accounts, a use case defined by imbalanced data and high dimensionality. The neural network achieved both the best AUC-ROC (0.889) and the best average precision (0.192). However, the results of the study indicate that the difference between the models' performance is not statistically significant, so the initial hypothesis of equal model performance cannot be rejected.

Keywords: Data science; neural network; random forest; support vector machine; imbalanced data; average precision; ROC; Computer and Information Sciences
3	Recommender System for Gym Customers
Sundaramurthy, Roshni. January 2020

Recommender systems provide new opportunities for retrieving personalized information on the Internet. Due to the availability of big data, the fitness industry is now focusing on building efficient recommender systems for its end users. This thesis investigates the possibilities of building an efficient recommender system for gym users. BRP Systems AB has provided the gym data for evaluation; it consists of approximately 896,000 customer interactions with 8 features. Four matrix factorization methods based on implicit feedback are applied to the data: latent semantic analysis using singular value decomposition, alternating least squares, Bayesian personalized ranking, and logistic matrix factorization. These methods decompose the implicit data matrix of user-gym group activity interactions into the product of two lower-dimensional matrices. They are used to calculate similarities between users and activities, and the top-k recommendations are provided based on the resulting scores. The methods are evaluated with ranking metrics: Precision@k, mean average precision (MAP)@k, area under the curve (AUC) score, and normalized discounted cumulative gain (NDCG)@k. A qualitative analysis is also performed to evaluate the recommendations. For this specific dataset, the best method is found to be alternating least squares, which achieved around 90% AUC for the overall system and managed to give personalized recommendations to the users.

Keywords: Recommender system; collaborative filtering; matrix factorization; sparse matrix; latent semantic analysis; singular value decomposition; alternating least squares; Bayesian personalized ranking; logistic matrix factorization; stochastic gradient descent; AUC metric; mean average precision; normalized discounted cumulative gain; Computer Sciences; Probability Theory and Statistics
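As an illustration of the scoring step this abstract describes, the sketch below ranks items by the dot product of already-learned user and item factor vectors and then computes Precision@k. It does not reproduce the thesis's models or data. The function names, the factor values, the activity names, and the choice to exclude activities the user has already attended are all assumptions made for the example.

```python
def top_k_recommendations(user_factors, item_factors, seen, k):
    """Rank unseen items by dot-product score and return the top k ids.

    user_factors: list of floats (one latent vector for the user)
    item_factors: dict mapping item id -> latent vector of the same length
    seen: set of item ids to exclude (an assumption of this sketch)
    """
    scored = []
    for item_id, item_vec in item_factors.items():
        if item_id in seen:
            continue
        score = sum(u * v for u, v in zip(user_factors, item_vec))
        scored.append((score, item_id))
    # Highest score first; ties broken by item id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:k]]

def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommendations that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

# Hypothetical two-dimensional factors for one user and four activities.
user = [0.9, 0.1]
items = {
    "yoga": [0.2, 0.8],
    "spinning": [0.8, 0.1],
    "boxing": [0.7, 0.3],
    "pilates": [0.1, 0.9],
}
recommended = top_k_recommendations(user, items, seen={"spinning"}, k=2)
p_at_2 = precision_at_k(recommended, relevant={"boxing"}, k=2)
```

In practice the factor vectors would come from one of the factorization methods listed in the abstract; only the ranking and evaluation steps are shown here.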
