Global ETD Search

1	Distributing Social Applications
Leroy, Vincent, 10 December 2010
The so-called Web 2.0 revolution has fundamentally changed the way people interact with the Internet. The Web has turned from a read-only infrastructure into a collaborative platform. By expressing their preferences and sharing private information, users benefit from a personalized Web experience. Yet these systems raise several problems in terms of privacy and scalability. Social platforms use user information for commercial purposes and expose the private data and preferences of their users. Furthermore, centralized personalized systems require costly data centers. As a consequence, existing centralized social platforms do not exploit the full extent of the personalization possibilities. In this thesis, we consider the design of social networks and social information services in the context of peer-to-peer (P2P) networks. P2P networks are decentralized architectures in which users participate in the service and control their own data. This greatly improves the privacy of the users and the scalability of the system. Nevertheless, building social systems in a distributed context also comes with many challenges. The information is distributed among the users, and the system has to be able to locate relevant data efficiently. The contributions of this thesis are as follows. We define the cold start link prediction problem, which consists in predicting the edges of a social network solely from the social information of the users. We propose a method based on a probabilistic graph to solve this problem. We evaluate it on a dataset from Flickr, using group membership as social information. Our results show that the social information indeed enables a prediction of the social network. Thus, the centralization of the information threatens the privacy of the users, hence the need for decentralized systems.
We propose SoCS, a decentralized algorithm for link prediction. Recommending neighbors is a central functionality in social networks, and it is therefore crucial to propose a decentralized approach as a first step towards P2P social networks. SoCS relies on gossip protocols to perform a force-based embedding of the social network. The social coordinates are then used to predict links among vertices. We show that SoCS is well suited to decentralized systems, as it is churn-resilient and consumes little bandwidth. We propose GMIN, a decentralized platform for personalized services based on social information. GMIN provides each user with neighbors that share her interests. The clustering algorithm we propose takes care to cover all the different interests of the user, and not only the main ones. We then propose a personalized query expansion algorithm (GQE) that leverages the GMIN neighbors. For each query, the system computes a tag centrality based on the relations between tags as seen by the user and her neighbors.
Keywords: peer-to-peer, social networks, folksonomy, query expansion, graph embedding, link prediction
2	Adaptation des systèmes de recherche d'information aux contextes : le cas des requêtes difficiles / Adapting information retrieval systems to contexts : the case of query difficulty
Chifu, Adrian-Gabriel, 15 June 2015
The field of information retrieval (IR) studies the mechanisms for finding relevant information in one or more document collections in order to satisfy an information need. For an Information Retrieval System (IRS), the information to find is represented by "documents" and the information need takes the form of a "query" formulated by the user. IRS performance depends on the query. Queries for which the IRS fails (few or no relevant documents retrieved) are called "difficult queries" in the literature. This difficulty may be caused by term ambiguity, unclear query formulation, the lack of context for the information need, the nature and structure of the document collection, etc. This thesis aims at adapting IRS to contexts, particularly in the case of difficult queries. The manuscript is organized into five main chapters, besides the acknowledgements, the general introduction, and the conclusions and perspectives. The first chapter is an introduction to IR. We develop the concept of relevance, the retrieval models from the literature, query expansion, and the evaluation framework used in the experiments that validate our proposals. Each of the following chapters presents one of our contributions. Every chapter states the research problem, reviews the related work, and presents our theoretical proposals and their validation on benchmark collections. In chapter two, we present our research on handling ambiguous queries. Query term ambiguity can indeed lead to poor document retrieval by the search engine. In the related work, the disambiguation methods that yield good performance are supervised; however, such methods are not applicable in a real IR context, as they require information that is normally unavailable. Moreover, the literature reports term disambiguation for IR to be suboptimal.
Keywords: information retrieval, difficult queries, query expansion, classification, machine learning, selective information retrieval, disambiguation
