1 |
Building a Sporting Goods Recommendation System
Flodman, Mikael. January 2015
This thesis describes an attempt to build a recommender system for sporting goods in an e-commerce setting, using customer purchase history as the input. Two input datasets are considered: purchases per item and customer, and purchase frequency per item category and customer. Both datasets are implicit: customers have not explicitly rated the products or expressed an opinion for or against them, but express a preference implicitly through their purchases. The data is also very sparse, in that an individual customer has typically purchased only a small fraction of the items in the dataset. The report describes a method that addresses both the implicit nature of the data and the sparsity problem. It introduces SVD (Singular Value Decomposition) with matrix factorization as a method for implementing recommendation systems, specifically the implementations in the Apache Mahout machine learning framework.
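As a rough illustration of the idea, the sketch below applies a truncated SVD to a small binary purchase matrix and ranks the items each customer has not yet bought. It uses NumPy rather than Apache Mahout, and the toy matrix, the function name `recommend_svd`, and the choice of rank are assumptions made for this example, not details taken from the thesis.

```python
import numpy as np

def recommend_svd(purchases, rank, top_n):
    """Rank unpurchased items per user from a rank-`rank` SVD reconstruction.

    `purchases` is a users x items array where 1 means "bought" (implicit
    feedback) and 0 means "no observation".
    """
    u, s, vt = np.linalg.svd(purchases, full_matrices=False)
    # Keep only the `rank` largest singular values (low-rank approximation).
    scores = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # Never recommend something the user already bought.
    scores = np.where(purchases > 0, -np.inf, scores)
    # Highest reconstructed score first.
    return np.argsort(-scores, axis=1)[:, :top_n]

# Hypothetical toy data: items 0-1 and items 2-3 form two product groups.
toy_purchases = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

toy_recommendations = recommend_svd(toy_purchases, rank=2, top_n=1)
```

The low-rank reconstruction fills in scores for unobserved cells from the purchase patterns of similar customers, which is what makes the approach usable on sparse implicit data. A production system would typically use a sparse solver and a weighting scheme for unobserved entries instead of a dense SVD.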
|
2 |
Discovering and Using Implicit Data for Information Retrieval
Yi, Xing. 01 September 2011
In real-world information retrieval (IR) tasks, the searched items and/or the users' queries often have implicit information associated with them -- information that describes unspecified aspects of the items or queries. For example, in web search tasks, web pages are often pointed to by hyperlinks (known as anchors) from other pages, and thus have human-generated succinct descriptions of their content (anchor text) associated with them. This indirectly available information has been shown to improve search effectiveness for different retrieval tasks. However, in many real-world IR challenges this information is sparse in the data; i.e., it is incomplete or missing in a large portion of the data. In this work, we explore how to discover and use implicit information in large amounts of data in the context of IR. We present a general perspective for discovering implicit information and demonstrate how to use the discovered data in four specific IR challenges: (1) finding relevant records in semi-structured databases where many records contain incomplete or empty fields; (2) searching web pages that have little or no associated anchor text; (3) using click-through records in web query logs to help search pages that have no or very few clicks; and (4) discovering plausible geographic locations for web queries that contain no explicit geographic information. The intuition behind our approach is that data similar in some aspects are often similar in other aspects. Thus we can (a) use the observed information of queries/documents to find similar queries/documents, and then (b) utilize those similar queries/documents to reconstruct plausible implicit information for the original queries/documents. We develop language modeling based techniques to effectively use content similarity among data for our work. Using the four different search tasks on large-scale noisy datasets, we empirically demonstrate the effectiveness of our approach. 
We further discuss the advantages and weaknesses of two complementary approaches within our general perspective of handling implicit information for retrieval purposes. Taken together, we describe a general perspective that uses contextual similarity among data to discover implicit information for IR challenges. Using this general perspective, we formally present two language modeling based information discovery approaches. We empirically evaluate our approaches using different IR challenges. Our research shows that supporting information discovery tailored to different search tasks can enhance IR systems' search performance and improve users' search experience.
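The two-step intuition described in this abstract, (a) find similar documents from observed content and (b) use them to reconstruct the missing implicit information, can be sketched as follows. This is a simplified stand-in, not the dissertation's models: it uses unsmoothed unigram distributions and cosine similarity, and the function names and example pages are hypothetical.

```python
from collections import Counter
import math

def unigram_lm(text):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse term distributions."""
    dot = sum(weight * q.get(term, 0.0) for term, weight in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def reconstruct_anchor_lm(target_content, neighbours, k=2):
    """Estimate anchor-text terms for a page that has none.

    `neighbours` is a list of (page_content, anchor_text) pairs. The result
    is a similarity-weighted mixture of the anchor-text distributions of the
    k pages whose content is most similar to `target_content`.
    """
    target = unigram_lm(target_content)
    scored = sorted(
        ((cosine(target, unigram_lm(content)), anchor)
         for content, anchor in neighbours),
        reverse=True,
    )[:k]
    total = sum(weight for weight, _ in scored)
    if total == 0:
        return {}
    mixture = Counter()
    for weight, anchor in scored:
        for term, prob in unigram_lm(anchor).items():
            mixture[term] += (weight / total) * prob
    return dict(mixture)

# Hypothetical pages with known anchor text.
example_pages = [
    ("cheap running shoes for trail running", "running shoes"),
    ("stock market news and analysis", "finance news"),
    ("trail running gear and shoes", "trail gear"),
]
example_anchor_lm = reconstruct_anchor_lm(
    "running shoes for muddy trail", example_pages, k=2
)
```

The reconstructed distribution can then be used as an extra field when scoring the page against a query, in the same way real anchor text would be.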
|
3 |
Visualizing Users, User Communities, and Usage Trends in Complex Information Systems Using Implicit Rating Data
Kim, Seonho. 01 May 2008
Research on personalization, including recommender systems, focuses on applications such as online shopping malls and simple information systems. These systems consider user profile and item information obtained from data explicitly entered by users. In such systems it is possible to classify the items involved and to personalize based on a direct mapping from a user or user group to an item or item group. However, complex, dynamic, and professional information systems, such as digital libraries, need additional capabilities to achieve personalization that supports their distinctive features: large numbers of digital objects, dynamic updates, sparse rating data, rating data biased toward specific items, and difficulty in getting explicit rating data from users. For this reason, more research on implicit rating data is recommended, because it is easy to obtain, suffers less from terminology issues, is more informative, and contains more user-centered information. In previous reports on my doctoral work, I discussed collecting, storing, processing, and utilizing implicit rating data of digital libraries for analysis and decision support. This dissertation presents a visualization tool, VUDM (Visual User-model Data Mining tool), that uses implicit rating data to demonstrate the effectiveness of such data in characterizing users, user communities, and usage trends of digital libraries. The results of user studies, performed with both typical end users and library experts to test the usefulness of VUDM, indicate that implicit rating data is useful and can be utilized in digital library analysis software, so that both end users and experts can benefit. / Ph. D.
|
4 |
Rekommendationsmotor: med fokus inom E-lärande / Recommendation engine: focus within E-learning
Jakobsson, Lennart; Nilsson, Thires. January 2018
Research on recommendation engines is gaining importance in an increasingly digital world. As the amount of information grows, it becomes harder for individuals to find what is of interest to them. Some application domains for recommendation engines are better studied than others; domains concerned with sales fall into the better-studied category. Businesses that offer learning over the internet also need recommendation engines but are less well studied. One such business is Nomp, which provides a mathematics learning tool for children and young people. The goal of this study is therefore to implement a recommendation engine in this less explored domain, and to investigate its usefulness for the application's users. The study is based on a design science research framework, which included various kinds of experiments and a survey. The results of these activities formed the empirical basis for the subsequent analysis. The results give some support that it is possible to implement a recommendation engine for this domain. However, they do not clearly show how useful it is to the end user. The objectives of the study were partly met, although the benefit to end users could have been explored in greater depth. The hope is that the study will have practical consequences, in that users can spend less time searching for useful information. What distinguishes this study from earlier similar studies is that the recommendation engine is implemented to fit a real business and is based on data collected directly from that business's users. Some similar artifacts have been implemented, but they are often more general or use data that is not relevant to the domain. It is also more common for similar recommendation engines to use direct user feedback to make recommendations, which this study does not.
|
5 |
Impact of implicit data in a job recommender system
Wakman, Josef. January 2020
Many employment services base their online job recommendations solely on the explicit data in users' profiles. Implicit data, such as which ads users click on, save, or mark as irrelevant, goes unused. Instead of making recommendations based on user behavior, these services directly compare user preferences with job ad attributes. One reason is the concern that including implicit data can produce odd recommendations and cost the service credibility. However, research has shown implicit data to be of great advantage to recommender systems. In this paper I implement a job recommender and test it both with user data that includes interaction history with job ads and with explicit data only. The recommender with implicit data achieved better overall performance, but a negligible gain in the ratio between true and false positives, in other words the ratio between correct and incorrect recommendations.
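To make the last metric concrete, the sketch below counts true and false positives for one user's recommendation list and derives precision from them. The function name and the example job identifiers are hypothetical, and the paper may define or aggregate the ratio differently.

```python
def true_false_positives(recommended, relevant):
    """Count correct and incorrect recommendations for one user.

    `recommended` is the list of recommended job ids; `relevant` is the set
    of job ids the user actually found relevant.
    Returns (true_positives, false_positives, precision).
    """
    recommended_set = set(recommended)
    true_pos = len(recommended_set & set(relevant))
    false_pos = len(recommended_set) - true_pos
    # Precision is the share of recommendations that were correct.
    precision = true_pos / len(recommended_set) if recommended_set else 0.0
    return true_pos, false_pos, precision

# Hypothetical example: four recommendations, two of them relevant.
example_counts = true_false_positives(
    ["job_a", "job_b", "job_c", "job_d"], {"job_a", "job_c", "job_x"}
)
```

A recommender can improve on ranking or coverage metrics while this ratio stays flat, which is the pattern the abstract reports.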
|
6 |
UM AMBIENTE DE CONTEXTO PERSONALIZADO E ORIENTADO A TAREFAS NA ARQUITETURA CLINICSPACE [A personalized, task-oriented context environment in the ClinicSpace architecture]
Rizzetti, Tiago Antônio. 21 August 2009
The ClinicSpace project aims to fill gaps in current clinical systems with respect to pervasiveness and the support that computational tasks give to the clinical activities performed by the user (the physician). The architecture of the ClinicSpace model, built from the perspective of activity theory, is composed of several interconnected modules that together provide the features needed in a pervasive, end-user-oriented clinical system. One of these modules handles the context of clinical tasks. This work discusses the requirements for handling the context of clinical tasks and defines an architecture for associating context with the user's tasks, allowing context customization and automatic data entry. Customization is achieved through Programmable Elements of Context, which are represented by physical or logical actuators that give the system the capacity for automatic execution based on context parameters specified by the user. Automatic data entry obtains data implicitly, by capturing the information used by the applications that the user runs in the course of their tasks. To this end, an architecture was defined that supports customization and semantic specification of the data used in it. To build these features, the pervasive middleware EXEHDA was extended by modifying existing services and adding new ones. The main contribution of this work is the interconnection between the components that make up the architecture, which builds a single view of a task's context from the perspective of the data the task needs and of the user's ability to customize it. This reduces the need for explicit data entry and helps reduce resistance to the adoption of clinical systems in highly dynamic environments such as hospitals.
|
7 |
Recommending digital books to children: A comparative study of different state-of-the-art recommendation system techniques / Att rekommendera digitala böcker till barn: En jämförelsestudie av olika moderna tekniker för rekommendationssystem
Lundqvist, Malvin. January 2023
Collaborative filtering is a popular technique that uses behavior data, in the form of users' interactions with or ratings of items in a system, to provide personalized item recommendations. This study compares three state-of-the-art recommendation system models that implement this technique, Matrix Factorization, Multi-layer Perceptron, and Neural Matrix Factorization, using behavior data from a digital book platform for children. The field of recommendation systems is growing, and many platforms can benefit from personalizing the user experience and simplifying use of the platform. To perform a more complex comparison and introduce a new take on the models, this study proposes a new way to represent the behavior data as input to the models: using the Term Frequency-Inverse Document Frequency (TF-IDF) of occurrences of interactions between users and books, as opposed to the traditional binary representation (positive if there has been any interaction and negative otherwise). Performance is measured by holding out the last book read by each user and evaluating how highly the models would rank that book in recommendations to the user. To assess the value of the models for the children's reading platform, they are also compared with the platform's existing recommendation system. The results indicate that the Matrix Factorization model performs best of the three when using children's reading behavior data. However, because the other two models take longer to train and have more hyperparameters to tune, they may not have reached optimal hyperparameter settings, which affects the comparison among the three models. This limitation is discussed further in the study. All three models perform significantly better than the current system on the digital book platform. The models using the proposed TF-IDF representation show notable promise, performing better than the binary representation on almost all numerical metrics for all models. These results motivate future research into further ways of representing behavior data as input to these types of models.
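The sketch below shows one way to compute a TF-IDF representation of user-book interaction counts and to evaluate a held-out book by its rank. It treats each user as a "document" and each book as a "term"; the exact TF-IDF variant, the function names, and the toy counts are assumptions for this example and may differ from what the study used.

```python
import numpy as np

def tfidf_interactions(counts):
    """TF-IDF weights for a users x books matrix of interaction counts."""
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Term frequency: share of a user's interactions that went to each book.
    tf = np.divide(counts, row_totals,
                   out=np.zeros_like(counts), where=row_totals > 0)
    # Inverse document frequency: books read by few users weigh more.
    users_per_book = (counts > 0).sum(axis=0)
    idf = np.log(counts.shape[0] / np.maximum(users_per_book, 1))
    return tf * idf

def held_out_rank(scores, held_out_item):
    """1-based rank of the held-out item among all scored items."""
    scores = np.asarray(scores, dtype=float)
    return int((scores > scores[held_out_item]).sum()) + 1

# Hypothetical counts: 3 users x 3 books; book 0 was read by every user.
toy_counts = [[3, 1, 0],
              [1, 0, 0],
              [2, 0, 4]]
toy_weights = tfidf_interactions(toy_counts)
toy_rank = held_out_rank([0.1, 0.9, 0.5], held_out_item=2)
```

Unlike the binary representation, which would treat every non-zero cell the same, this weighting gives a book that every user has opened a weight of zero and emphasizes repeated interactions with less common books.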
|