Ontology engineering and feature construction for predicting friendship links and users' interests in the Live Journal social network / Bahirwani, Vikas (1900)
Master of Science / Department of Computing and Information Sciences / Doina Caragea / William H. Hsu / An ontology can be seen as an explicit description of the concepts and relationships that exist in a domain. In this thesis, we address the problem of building an interests ontology and using it to construct features for predicting both potential friendship relations between users in the social network Live Journal, and users' interests. Previous work has shown that the accuracy of predicting friendship links in this network is very low when only the interests common to two users are used as features and no network graph features are considered. Thus, our goal is to organize users' interests into an ontology (specifically, a concept hierarchy) and to use the semantics captured by this ontology to improve the performance of learning algorithms at predicting whether two users can be friends. To achieve this goal, we have designed and implemented a hybrid clustering algorithm, which combines the hierarchical agglomerative and divisive clustering paradigms and automatically builds the interests ontology. We have explored the use of this ontology to construct interest-based features and shown that the resulting features improve the performance of various classifiers for predicting friendships in the Live Journal social network. We have also shown that, using the interests ontology, one can address the problem of predicting the interests of Live Journal users, a task that is not feasible in the absence of the ontology, given the overwhelming number of distinct interests.
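As a rough illustration of the agglomerative half of such a hybrid scheme (the divisive step and the actual thesis algorithm are not shown), the sketch below clusters a few toy interest vectors into a small concept hierarchy. The interest names and co-occurrence values are invented for the example.

```python
# Hypothetical sketch: grouping user interests into a concept hierarchy
# with hierarchical agglomerative clustering. The toy matrix rows stand
# in for co-occurrence features of each interest term.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

interests = ["rock", "jazz", "soccer", "tennis"]
X = np.array([
    [1.0, 0.9, 0.0, 0.1],   # rock
    [0.9, 1.0, 0.1, 0.0],   # jazz
    [0.0, 0.1, 1.0, 0.8],   # soccer
    [0.1, 0.0, 0.8, 1.0],   # tennis
])

# Average-linkage agglomerative clustering over cosine distances builds
# the dendrogram (i.e. the concept hierarchy) bottom-up.
Z = linkage(X, method="average", metric="cosine")

# Cutting the dendrogram at two clusters should separate the music
# interests from the sport interests.
labels = fcluster(Z, t=2, criterion="maxclust")
clusters = {interests[i]: int(labels[i]) for i in range(len(interests))}
```

Interior nodes of the resulting dendrogram would then serve as the generalized, ontology-based features the abstract describes, in place of raw interest strings.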
Better representation learning for TPMS / Raza, Amir (10/1900)
With the increase in popularity of AI and machine learning, participation numbers have exploded at AI/ML conferences. The large number of submitted papers and the evolving nature of topics pose additional challenges for the peer-review systems that are crucial to our scientific communities. Some conferences have moved towards automating reviewer assignment for submissions, TPMS [1] being one such existing system. Currently, TPMS prepares content-based profiles of researchers and submitted papers in order to model the suitability of reviewer-submission pairs.
In this work, we explore different approaches to self-supervised fine-tuning of BERT transformers on conference paper data. We demonstrate some new approaches to augmentation views for self-supervision in natural language processing, which until now has focused more on problems in computer vision. We then use these individual paper representations to build an expertise model, which learns to combine the representations of a reviewer's different published works and to predict their relevance for reviewing a submitted paper. In the end, we show that better individual paper representations and better expertise modeling lead to better performance on the reviewer-suitability prediction task.
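A minimal sketch of the scoring idea, not the thesis's actual model: given per-paper embeddings (here toy vectors standing in for BERT-derived representations), a reviewer's suitability for a submission can be obtained by pooling the similarities between the submission and each of the reviewer's papers. The pooling choices and all vectors below are illustrative assumptions.

```python
# Illustrative reviewer-submission suitability scoring. Each paper is a
# vector; a reviewer is a bag of paper vectors.
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def suitability(reviewer_papers, submission, pool="max"):
    """Combine per-paper similarities into one expertise score.

    'max' rewards a single closely related paper; 'mean' rewards
    broad topical overlap across the reviewer's whole record.
    """
    sims = [cosine(p, submission) for p in reviewer_papers]
    return max(sims) if pool == "max" else sum(sims) / len(sims)

# Reviewer A has one paper very close to the submission; reviewer B's
# papers are all orthogonal to it.
submission = np.array([1.0, 0.0, 0.0])
reviewer_a = [np.array([0.9, 0.1, 0.0]), np.array([0.0, 1.0, 0.0])]
reviewer_b = [np.array([0.0, 0.7, 0.7]), np.array([0.0, 1.0, 0.0])]

score_a = suitability(reviewer_a, submission)
score_b = suitability(reviewer_b, submission)
```

An expertise model as described in the abstract would replace the fixed max/mean pooling with a learned combination of the reviewer's paper representations.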