1 |
Tag recommendation using Latent Dirichlet Allocation. Choubey, Rahul January 1900
Master of Science / Department of Computing and Information Sciences / Doina Caragea / The vast amount of data present on the internet calls for ways to label and organize this data according to specific categories, in order to facilitate search and browsing. This can be easily accomplished by making use of folksonomies and user-provided tags. However, it can be difficult for users to provide meaningful tags. Tag recommendation systems can guide users towards informative tags for online resources such as websites and pictures. The aim of this thesis is to build a system for recommending tags for URLs available through a bookmark sharing service called BibSonomy. We assume that the URLs for which we recommend tags do not have any prior tags assigned to them.
Two approaches are proposed to address the tagging problem, both of them based on Latent Dirichlet Allocation (LDA) (Blei et al. [2003]). LDA is a generative, probabilistic topic model which aims to infer the hidden topical structure in a collection of documents. According to LDA, documents can be seen as mixtures of topics, while topics can be seen as mixtures of words (in our case, tags). The first approach that we propose, called the topic words based approach, recommends the top words in the top topics representing a resource as tags for that resource. The second approach, called the topic distance based approach, uses the tags of the most similar training resources (identified using the KL-divergence, Kullback and Leibler [1951]) to recommend tags for an untagged test resource.
The dataset used in this work was made available through the ECML/PKDD Discovery Challenge 2009. We construct the documents that are provided as input to LDA in two ways, thus producing two different datasets. In the first dataset, we use only the description and the tags (when available) corresponding to a URL. In the second dataset, we crawl the URL content and use it to construct the document. Experimental results show that the LDA approach is not very effective at recommending tags for new untagged resources. However, using the resource content gives better results than using the description only. Furthermore, the topic distance based approach is better than the topic words based approach when only the descriptions are used to construct documents, while the topic words based approach works better when the contents are used to construct documents.
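The two approaches described in this abstract can be illustrated with a minimal sketch. It assumes that LDA has already been run, so each resource is represented by a topic distribution and each topic by a list of words ordered by probability. The function names, the toy data, the number of neighbours `k`, and the voting scheme are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def recommend_by_topic_words(test_topics, topic_words, n_topics=1, n_words=2):
    """Topic words idea: take the top words of the most probable topics."""
    top = sorted(range(len(test_topics)), key=lambda t: test_topics[t], reverse=True)[:n_topics]
    return [word for t in top for word in topic_words[t][:n_words]]

def recommend_by_topic_distance(test_topics, training, k=2, n_tags=3):
    """Topic distance idea: let the k training resources with the smallest
    KL-divergence from the test resource vote for their own tags."""
    nearest = sorted(training, key=lambda item: kl_divergence(test_topics, item[0]))[:k]
    votes = Counter(tag for _, tags in nearest for tag in tags)
    return [tag for tag, _ in votes.most_common(n_tags)]

# Toy data (assumed): three topics, words per topic ordered by probability.
topic_words = [["python", "code", "tutorial"], ["web", "html"], ["cooking", "recipes"]]
training = [
    ([0.8, 0.1, 0.1], ["python", "programming"]),
    ([0.7, 0.2, 0.1], ["python", "tutorial"]),
    ([0.1, 0.1, 0.8], ["cooking", "recipes"]),
]
test_topics = [0.75, 0.15, 0.10]

print(recommend_by_topic_words(test_topics, topic_words))
print(recommend_by_topic_distance(test_topics, training))
```

KL-divergence is asymmetric, so the direction used here (test resource against training resource) is itself a choice made for the sketch.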
|
2 |
Hybrid Tag Recommendation in Collaborative Tagging Systems. Lipczak, Marek 15 March 2012
The simplicity and flexibility of tagging allows users to collaboratively create large, loosely structured repositories of Web resources. One of its main drawbacks is the need for manual formulation of tags for each posted resource. This task can be eased by a tag recommendation system, the objective of which is to propose a set of tags for a given (resource, user) pair. Tag recommendation is an interesting and well-defined practical problem. Its main features are constant interaction with users and availability of large amounts of tagged data. Given the opportunities (e.g., rich user feedback) and limitations (e.g., real-time response) of the tag recommendation setting, we defined six requirements for a practically useful tag recommendation system. We present a conceptual design and system architecture of a hybrid tag recommendation system that meets all these requirements. The system utilizes the strengths of various tag sources (e.g., resource content and user profiles) and the relations between concepts captured in tag co-occurrence graphs mined from collaborative actions of users. The architecture of the proposed system is based on a text indexing engine, which allows the system to deal with large datasets in real time, while constantly adapting its models to newly added posts. The effectiveness and efficiency of the system were evaluated on six datasets representing a broad range of collaborative tagging systems. The experiments confirmed the high quality of results and practical usability of the system. In a comparative study the system outperformed a state-of-the-art algorithm based on tensor factorization for the most representative datasets applicable to both methods. The experiments on the characteristics of tagging data and the performance of the system allowed us to find answers to important research questions adapted from the general area of recommender systems.
We confirmed the importance of infrequently used tags in the recommendation process and proposed solutions to overcome the cold start problem in tag recommendation. We demonstrated that a parameter tuning approach makes a hybrid tag recommendation system adaptable to various datasets. We also revealed the importance of the utilization of a feedback loop in the tag recommendation process.
|
3 |
Predição de tags usando linked data: um estudo de caso no banco de dados Arquigrafia / Tag prediction using linked data: a case study in the Arquigrafia database. Souza, Ricardo Augusto Teixeira de 17 December 2013
Given the large amount of content created by users on the Web, one way to support search and organization is to create tagging systems, in which annotations usually take the form of keywords extracted from the content itself or suggested by visitors. This work applies a data mining algorithm to an RDF database whose instances can reference the DBpedia Linked Data repository, in order to recommend tags using taxonomic, relational, and literal similarity measures over RDF descriptions. The database used is Arquigrafia, a Web database system whose goal is to catalog images of architectural projects and which allows visitors to add tags to the images. Experiments were performed to evaluate the quality of the tag recommendations under different models of the Arquigrafia database, including an extended model that references DBpedia. The results show that the quality of the recommendations for some tags can improve when different models (with references to the DBpedia Linked Data repository) are considered in the learning phase.
|
4 |
Predição de tags usando linked data: um estudo de caso no banco de dados Arquigrafia / Tag prediction using linked data: a case study in the Arquigrafia database. Ricardo Augusto Teixeira de Souza 17 December 2013
|
5 |
基於文件相似度的標籤推薦-應用於問答型網站 / Applying Tag Recommendation base on Document Similarity in Question and Answer Website. 葉早彬, Tsao, Pin Yeh Unknown Date
As people's habits change, acquiring new knowledge from the Internet is gradually replacing traditional media, and this has given rise to many new behaviors. Social tagging, popular in recent years, is a way of classifying and interpreting information through user-assigned tags. Unlike a traditional taxonomy, which requires objects to be assigned to predefined categories, social tagging imposes no such requirement and therefore adapts easily to changes in content.
Question-and-answer (Q&A) websites such as Quora, Stack Overflow, and Yahoo Knowledge+ (yahoo 奇摩知識+) are open knowledge-sharing platforms that have emerged in recent years. Users interact with other users by asking and answering questions, and the discussion draws on the experience and expertise of the public to help users find satisfactory answers. The advantage of a pure Q&A system is that users need not spend time looking for answers across separate, category-based forums or sifting through keyword search results.
This study aims to tag the documents of Q&A websites automatically, using tag recommendation to help users find the questions they need more efficiently and to let Q&A platforms group and categorize the large volume of user-generated questions.
We collected 20,638 questions from the Stack Exchange Q&A websites and used the naïve Bayes algorithm together with document similarity calculation to recommend suitable tags for new documents. In our results, the accuracy of the recommended tags was 64.2%, which indicates that relevant tags can be recommended effectively for new questions.
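The abstract does not say how naïve Bayes and document similarity are combined, so the following is only a minimal sketch of the naïve Bayes part. It assumes that each tag is treated as a class, that questions are already tokenized, and that Laplace smoothing is used; the function names and toy questions are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(questions):
    """questions: list of (tokens, tags). Each tag is treated as a class,
    and a question counts once towards every tag it carries."""
    tag_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, tags in questions:
        vocab.update(tokens)
        for tag in tags:
            tag_counts[tag] += 1
            word_counts[tag].update(tokens)
    return tag_counts, word_counts, vocab

def recommend_tags(tokens, model, n_tags=2):
    """Rank tags by log P(tag) + sum of log P(word | tag), Laplace-smoothed."""
    tag_counts, word_counts, vocab = model
    total = sum(tag_counts.values())
    scores = {}
    for tag, count in tag_counts.items():
        denom = sum(word_counts[tag].values()) + len(vocab)
        score = math.log(count / total)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[tag][tok] + 1) / denom)
        scores[tag] = score
    return sorted(scores, key=scores.get, reverse=True)[:n_tags]

# Toy training questions (assumed), already tokenized.
questions = [
    (["how", "sort", "list", "python"], ["python", "sorting"]),
    (["python", "dict", "iterate"], ["python"]),
    (["css", "center", "div"], ["css"]),
    (["css", "flexbox", "align"], ["css"]),
]
model = train_naive_bayes(questions)
print(recommend_tags(["python", "list", "iterate"], model))
```

A document similarity step could then be added by comparing the new question against already tagged questions, but that combination is not specified in the abstract and is left out here.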
|
6 |
Towards Folksonomy-based Personalized Services in Social Media. Rawashdeh, Majdi 30 April 2014
Every single day, lots of users actively participate in social media sites (e.g., Facebook, YouTube, Last.fm, Flickr, etc.) to upload photos and videos, share bookmarks, write blogs, and annotate or comment on content provided by others. With the recent proliferation of social media sites, users are overwhelmed by the huge amount of available content. Therefore, organizing and retrieving appropriate multimedia content is becoming an increasingly important and challenging task. This challenge has led a number of research communities to concentrate on social tagging systems (also known as folksonomies) that allow users to freely annotate their media items (e.g., music, images, or video) with arbitrary words, referred to as tags. Tags help users organize their own content, as well as find relevant content shared by other users. In this thesis, we first analyze how useful a folksonomy is for improving personalized services such as tag recommendation, tag-based search, and item annotation. We then propose two new algorithms, for social media retrieval and tag recommendation respectively. The first algorithm computes the latent preferences of tags for users from other similar tags, as well as latent annotations of tags for items from other similar items. We then map the tags onto items, depending on an individual user’s query, to find the most desirable content relevant to the user’s needs. The second algorithm improves tag recommendation and item annotation by adapting the Katz measure, a path-ensemble based proximity measure, for use in social tagging systems. In this algorithm we model the folksonomy as a weighted, undirected tripartite graph. We then apply the Katz measure to this graph and exploit it to provide personalized tag recommendation for individual users. We evaluate our algorithms on two real-world folksonomies collected from Last.fm and CiteULike. The experimental results demonstrate that the proposed algorithms improve search and recommendation performance, and obtain significant gains in cold start situations where relatively little information is known about a user or an item.
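As a rough illustration of the second algorithm, the sketch below builds a weighted, undirected graph over users, tags, and items from (user, tag, item) posts and ranks tags by a truncated Katz score, that is, the sum of beta**l times the weighted number of walks of length l. How the thesis weights edges and adapts the measure is not stated in the abstract, so the edge weighting, `beta`, `max_len`, and all function names are assumptions.

```python
from collections import Counter

def build_graph(posts):
    """Build a symmetric adjacency matrix over user, tag, and item nodes.
    Assumed weighting: each post adds 1 to its user-tag, tag-item, and user-item edges."""
    weights = Counter()
    for user, tag, item in posts:
        u, t, i = ("u", user), ("t", tag), ("i", item)
        for edge in ((u, t), (t, i), (u, i)):
            weights[edge] += 1
    nodes = sorted({node for edge in weights for node in edge})
    index = {node: k for k, node in enumerate(nodes)}
    adj = [[0.0] * len(nodes) for _ in nodes]
    for (a, b), w in weights.items():
        adj[index[a]][index[b]] = adj[index[b]][index[a]] = float(w)
    return nodes, index, adj

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def truncated_katz(adj, beta=0.1, max_len=4):
    """Sum of beta**l * A**l for l = 1..max_len."""
    n = len(adj)
    katz = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # A**1
    for length in range(1, max_len + 1):
        for i in range(n):
            for j in range(n):
                katz[i][j] += (beta ** length) * power[i][j]
        power = mat_mul(power, adj)
    return katz

def rank_tags(posts, node, beta=0.1, max_len=4):
    """Rank all tags by their truncated Katz proximity to the given node."""
    nodes, index, adj = build_graph(posts)
    row = truncated_katz(adj, beta, max_len)[index[node]]
    tags = [n for n in nodes if n[0] == "t"]
    return [name for _, name in sorted(tags, key=lambda n: row[index[n]], reverse=True)]

# Toy posts (assumed): (user, tag, item).
posts = [
    ("alice", "rock", "song1"),
    ("alice", "indie", "song2"),
    ("bob", "rock", "song2"),
    ("bob", "metal", "song3"),
]
print(rank_tags(posts, ("u", "alice")))
```

The truncation at `max_len` stands in for the infinite Katz sum, which converges only when `beta` is small relative to the largest eigenvalue of the adjacency matrix.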
|
7 |
應用主題探勘與標籤聚合於標籤推薦之研究 / Application of topic mining and tag clustering for tag recommendation. 高挺桂, Kao, Ting Kuei Unknown Date
Social tagging, popular since Web 2.0, is a way for users to interpret and share information. As an alternative to traditional classification methods, its convenience and flexibility let users tag content easily. It also has drawbacks: a considerable amount of content carries no tags, and many tags are vague or imprecise, which reduces the system's ability to organize and classify by tag. To address these two problems, this study proposes an automated tag recommendation method that combines topic mining and tag clustering, with the aim of establishing recommendation rules that need no manual step and that recommend suitable tags to users.
We collected 2,500 popular Chinese-language articles from the Pixnet blog platform, each viewed more than 5,000 times. After preprocessing, 1,939 articles were used to train the model and 400 were used as the test corpus. In the topic mining part, we use the LDA topic model to compute the topic semantics of each article and associate them with existing tags, so that the topics of a new article can be predicted and topic-related tags recommended for it. Perplexity, a measure of model performance, is used to help select the number of LDA topics, which avoids choosing that number subjectively. In the tag clustering part, we use hierarchical clustering to group tags that have appeared together, in order to find tags with similar semantic concepts. The stopping condition is a minimum co-occurrence count of 1, which removes the need to fix the number of clusters in advance and lets the method find a suitable number of clusters automatically.
The topic mining results show that recommending tags according to the topic semantics of an article is feasible to a certain degree, and that the number of topics selected with the help of perplexity gives more consistent results than the other topic numbers tested. The tag clustering results show that tags in the same cluster do share similar conceptual semantics. Finally, combining topic mining with tag clustering raised Top-1 to Top-5 accuracy by 14.1% on average, with Top-1 accuracy reaching 72.25%. This indicates that an approach built on how authors write articles and assign tags does improve tag recommendation accuracy, and that the resulting automated rules can recommend suitable tags, making tagging more convenient and precise for users after they write an article.
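The tag clustering step can be sketched as agglomerative clustering over tag co-occurrence counts that stops once no two clusters co-occur at least once. The abstract does not name the linkage criterion, so single linkage is assumed here; the function names and toy articles are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(tagged_articles):
    """Count how often each unordered pair of tags appears on the same article."""
    counts = Counter()
    for tags in tagged_articles:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts

def cluster_tags(tagged_articles, min_cooccurrence=1):
    """Merge the two most strongly linked clusters until the best link
    falls below min_cooccurrence, so no cluster count is needed up front."""
    counts = cooccurrence_counts(tagged_articles)
    clusters = [{t} for t in sorted({t for tags in tagged_articles for t in tags})]

    def linkage(c1, c2):
        # Single linkage (assumed): strongest co-occurrence between any two members.
        return max(counts.get(tuple(sorted((a, b))), 0) for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        if linkage(clusters[i], clusters[j]) < min_cooccurrence:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

# Toy tag sets (assumed), one list per article.
articles = [
    ["travel", "japan", "tokyo"],
    ["japan", "food"],
    ["recipe", "dessert"],
    ["dessert", "cake"],
    ["movie"],
]
for cluster in cluster_tags(articles):
    print(sorted(cluster))
```

With single linkage and a threshold of 1, the final clusters coincide with the connected components of the co-occurrence graph; a different linkage criterion would generally give finer clusters.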
|
8 |
Towards Folksonomy-based Personalized Services in Social Media. Rawashdeh, Majdi January 2014
|