
Information Integration Using a Linked Data Approach

Munir, Jawad January 2015 (has links)
Enterprise product development, in our case of either an embedded system or a software application, is a complex task that requires model-based approaches and multiple diverse software tools linked together in a tool-chain to support the development process. Each individual tool in the tool-chain maintains an incomplete picture of the development process, so data integration between these tools is necessary in order to have a unified, consistent view of the whole process. Information integration between these tools is challenging because of their heterogeneity. Linked data is a promising approach to tool and data integration, in which tools are integrated at the data level in a tool-chain. Linked data is an architectural style of integration and requires further definitions and specifications to capture relationships between data; in our case, tool data are described and shared using OSLC specifications. While such an approach has been widely researched for tool integration, none of that work covers using such a distributed approach for lifecycle data integration, management and search. In this thesis we investigated the use of a linked data approach for lifecycle data integration. The outcome is a prototype tool-chain architecture for lifecycle data integration that can support data-intensive queries requiring information from various data sources in the tool-chain. The report takes Scania's data integration needs as a case study and presents various insights gained during the prototype implementation, as well as the key benefits of using a linked data approach for data integration in an enterprise environment. Based on encouraging test results for our prototype, the architecture presented in this report can be seen as a probable solution to lifecycle data integration for the OSLC tool-chain.
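The kind of data-intensive, cross-tool query described above can be pictured with a small sketch, assuming tool data is exposed as RDF behind a SPARQL endpoint. The endpoint URL and the OSLC property names are illustrative assumptions, not details from the thesis:

```python
# A minimal sketch, assuming tool data is exposed as RDF behind a SPARQL
# endpoint. The endpoint URL and the OSLC property names are illustrative
# assumptions, not details from the thesis.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/toolchain/sparql")  # hypothetical endpoint
sparql.setQuery("""
    PREFIX oslc_rm: <http://open-services.net/ns/rm#>
    PREFIX dcterms: <http://purl.org/dc/terms/>

    # Join records that live in different tools: each requirement together
    # with the test case that validates it.
    SELECT ?req ?title ?test WHERE {
        ?req a oslc_rm:Requirement ;
             dcterms:title ?title ;
             oslc_rm:validatedBy ?test .
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["title"]["value"], "->", row["test"]["value"])
```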

Semantische Revisionskontrolle für die Evolution von Informations- und Datenmodellen / Semantic revision control for the evolution of information and data models

Hensel, Stephan 13 April 2021 (has links)
The increased distribution of systems in planning and production improves the agility and maintainability of individual components, but at the same time increases their cross-linking. This places new requirements on the semantic description of components and their links, for which information and data models are indispensable. The life cycle of those models is characterized by changes that must be dealt with. However, today's revision control systems, which could provide the traceability required in industry, are not tailored to the specific requirements of information and data models, which reduces the possibilities for a consistent evolution. Within this thesis a revision management system was developed that integrates revision control and evolution mechanisms to support the evolution of information and data models end to end. Its key feature is a technology-independent mathematical and semantic description, which allows the concept to be transferred to different technologies. As an example, the concept was implemented for the Semantic Web as an extension of the open-source project R43ples.
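R43ples adds revision semantics on top of SPARQL. The sketch below shows the general flavour of a revision-aware query; the endpoint URL, graph name and revision number are assumptions, and the exact extension syntax should be checked against the R43ples documentation:

```python
# A sketch of a revision-aware query in the style of R43ples, which extends
# SPARQL so that a named graph can be addressed at a specific revision.
# Endpoint URL, graph name, and revision number are assumptions here, and the
# exact extension syntax should be checked against the R43ples documentation.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8080/r43ples/sparql")  # hypothetical endpoint
sparql.setQuery("""
    SELECT ?s ?p ?o
    FROM <http://example.org/datamodel> REVISION "42"
    WHERE { ?s ?p ?o . }
""")
sparql.setReturnFormat(JSON)
print(sparql.query().convert())
```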

Semi-automated co-reference identification in digital humanities collections

Croft, David January 2014 (has links)
Locating specific information within museum collections represents a significant challenge for collection users. Even when the collections and catalogues exist in a searchable digital format, formatting differences and the imprecise nature of the information to be searched mean that information can be recorded in a large number of different ways. This variation exists not just between different collections, but also within individual ones. As a result, traditional information retrieval techniques are badly suited to the challenge of locating particular information in digital humanities collections, and searching therefore takes an excessive amount of time and resources. This thesis focuses on a particular search problem, that of co-reference identification: the process of identifying when the same real-world item is recorded in multiple digital locations. A real-world example of a co-reference identification problem for digital humanities collections is identified and explored, in particular the time-consuming nature of identifying co-referent records. To address this problem, the thesis presents a novel method for co-reference identification between digitised records in humanities collections. While the specific focus is co-reference identification, elements of the method also have applications in general information retrieval. The new method draws on a broad range of areas, including query expansion, co-reference identification, short-text semantic similarity and fuzzy logic. It was tested against real-world collections information, and the results suggest that, in terms of the quality of the co-referent matches found, the new method is at least as effective as a manual search, while the number of co-referent matches found is higher. The approach is capable of searching collections stored using differing metadata schemas and, more significantly, of identifying potential co-reference matches despite the highly heterogeneous and syntax-independent nature of the Gallery, Library, Archive and Museum (GLAM) search space and the photo-history domain in particular. Its most significant benefit, however, is that it requires comparatively little manual intervention: a co-reference search using it therefore has significantly lower person-hour requirements than a manually conducted search. In addition to the overall co-reference identification method, this thesis also presents:
• A novel and computationally lightweight short-text semantic similarity metric, with significantly higher throughput than the current prominent techniques but a negligible drop in accuracy.
• A novel method for comparing photographic processes in the presence of variable terminology and inaccurate field information, the first computational approach to do so.
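The thesis's own metric is not given in the abstract; the following is a generic sketch in the same spirit, a lightweight short-text similarity built from token-level fuzzy matching with no parsing or external resources. All design choices here are assumptions:

```python
# A generic sketch of a computationally lightweight short-text similarity:
# cheap token-level fuzzy matching instead of heavy linguistic processing.
from difflib import SequenceMatcher

def token_sim(a: str, b: str) -> float:
    """Fuzzy similarity between two tokens, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def short_text_similarity(s1: str, s2: str) -> float:
    """Average, over tokens of s1, of each token's best match in s2."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    if not t1 or not t2:
        return 0.0
    return sum(max(token_sim(a, b) for b in t2) for a in t1) / len(t1)

# Two catalogue entries recorded in different ways for (possibly) the same item:
print(short_text_similarity("albumen print portrait", "portrait, albumen process"))
```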

Interrogation des sources de données hétérogènes : une approche pour l'analyse des requêtes / Querying heterogeneous data sources

Soumana, Ibrahim 07 June 2014 (has links)
The volume of structured data being produced is growing considerably, and several factors contribute to this growth. On the Web, Linked Data has interconnected many available datasets, creating a gigantic data hub. Some applications, such as information extraction, produce data to populate ontologies. Connected sensors and devices (computers, smartphones, tablets) produce more and more data, and enterprise information systems are also affected. Accessing a specific piece of information is becoming increasingly difficult. In enterprises, search tools have been developed to reduce the workload associated with information retrieval, but these tools still return large volumes of results. Natural language interfaces, drawing on Natural Language Processing, can be used to let users express their information needs naturally, without worrying about the technical aspects of querying structured data. Natural language interfaces also make it possible to obtain a concise answer without having to dig further through a list of documents. However, these interfaces are currently not robust enough to be used by the general public or to address the problems of data heterogeneity and volume. We are interested in the robustness of these systems from the perspective of question analysis: understanding the user's question is an important step in finding the answer. We propose three levels of interpretation for analysing a question: the abstract domain, the concrete domain, and the abstract/concrete domain relationship. The abstract domain concerns data that are independent of the nature of the datasets, mainly measurement data; their interpretation relies on the logic specific to these measurements. Most often this logic is well described in other disciplines, but the way it manifests itself in natural language has not been widely investigated for natural language interfaces over structured data. The concrete domain covers the business domain of the application, where the goal is to correctly interpret the business logic; for a database, it corresponds to the application level (as opposed to the data layer). Most natural language interfaces focus mainly on the data layer. The abstract/concrete domain relationship concerns interpretations that span both domains. Given the importance of linguistic analysis, we developed the infrastructure to carry out this analysis. Most natural language interfaces that attempt to address the problems of Linked Data have so far been developed for English and German; our interface first attempts to answer questions in French.
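As a toy illustration of the "abstract domain" level described above, here is a sketch that compiles a measurement phrase into a SPARQL filter. The patterns, the variable name and the pipeline position are assumptions, not the thesis's implementation:

```python
# A toy illustration of the "abstract domain" level: measurement phrases
# follow a dataset-independent logic that can be compiled to SPARQL filters.
import re

COMPARATORS = {"more than": ">", "at least": ">=", "less than": "<"}

def measure_filter(phrase: str, var: str = "?value") -> str:
    """Turn a measurement phrase into a SPARQL FILTER clause."""
    for words, op in COMPARATORS.items():
        m = re.search(rf"{words}\s+([\d.]+)", phrase)
        if m:
            return f"FILTER({var} {op} {m.group(1)})"
    raise ValueError("no measurement pattern found")

print(measure_filter("cities with more than 1000000 inhabitants"))
# FILTER(?value > 1000000)
```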

Um modelo para implementação de aplicações da Argument Web integradas com bases de dados abertos e ligados / A model for implementing Argument Web applications integrated with linked open databases

Niche, Roberto 30 June 2015 (has links)
Communication and collaboration tools are widely used on the Internet to express opinions and describe points of view on the most diverse subjects. However, they were not designed to support the precise identification of the issues discussed, nor to capture the relationships among the elements of the interactions. The result is a large amount of spontaneously generated information in which precisely identifying the key elements, their relationships and their sources remains a challenge. The central proposal of the Argument Web is an infrastructure for precisely annotating the arguments of published messages and relating them to their various sources. When integrated with the linked open data initiative, the Argument Web has the potential to improve the quality of collaborative discussions on the Internet and to support their analysis. However, initiatives to implement applications based on these concepts are still limited, and even in known applications the visualisation and use of linked open databases are little explored. This work describes a model for instantiating this type of application, based on the Argument Interchange Format model and the use of Semantic Web languages. The model's main differentiator is the ease of integrating external sources in linked data formats. A prototype of the model was evaluated in a study using linked open databases from the Brazilian public administration, with good results observed.
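A minimal sketch of what an Argument Interchange Format style annotation could look like as RDF, built with rdflib: an inference (RA) node connects a premise to a claim, and the premise is linked to an open-data source. The namespace and property names are simplified assumptions, not the official AIF vocabulary:

```python
# AIF-style argument annotation as RDF. The namespace and property names are
# simplified assumptions, not the official AIF vocabulary.
from rdflib import Graph, Namespace, URIRef, RDF

AIF = Namespace("http://example.org/aif#")  # placeholder for the AIF vocabulary
g = Graph()

claim = URIRef("http://example.org/arg/claim1")
premise = URIRef("http://example.org/arg/premise1")
ra = URIRef("http://example.org/arg/ra1")

g.add((claim, RDF.type, AIF["I-Node"]))
g.add((premise, RDF.type, AIF["I-Node"]))
g.add((ra, RDF.type, AIF["RA-Node"]))        # inference from premise to claim
g.add((ra, AIF.hasPremise, premise))
g.add((ra, AIF.hasConclusion, claim))
# Back the premise with a linked-data source (hypothetical URI).
g.add((premise, AIF.hasSource, URIRef("http://dados.gov.br/dataset/example")))

print(g.serialize(format="turtle"))
```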

Um modelo para integração de informações de bases de dados abertos, com uso de ontologias / A model for integrating information from open databases using ontologies

Tosin, Thyago de Melo 26 February 2016 (has links)
With Brazil's Access to Information Law (Law 12527/2011) in force, access to data of public interest is expected to be guaranteed and facilitated at the federal, state and municipal levels. Linked open databases facilitate the acquisition of these data, enabling diverse applications and queries. However, there is a lack of resources for relating information originating from distinct open databases. Integrating different datasets makes richer and more relevant applications possible, and a formal representation of the relations between the queried data allows the use of inference and query mechanisms over linked open data. This work presents the development of resources to infer and relate such information in the context of Web applications aimed at integrating linked open databases. A model was developed as the contribution, and a prototype was implemented as a use case. The prototype used government procurement data stored in relational databases; with an ontology developed specifically for this case, the data were mapped and imported into an Apache Jena Fuseki triplestore in RDF format, and a Java EE application built on the Apache Jena framework was developed to visualise the data through SPARQL queries. Three evaluations were applied to the prototype: scenario-based, usability and ontology evaluation. As a result, the implemented model provided the desired integration and helped users to have a better experience when viewing linked data connecting government purchases with the federal budget.
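The query layer described above reduces to running SPARQL against the Fuseki endpoint. Below is a sketch over plain HTTP rather than the thesis's Java EE/Jena stack; the dataset name and the purchases/budget vocabulary are assumptions:

```python
# Querying a Fuseki dataset over the standard SPARQL HTTP protocol. The
# dataset name and vocabulary are assumptions for illustration.
import requests

FUSEKI = "http://localhost:3030/compras/query"  # hypothetical dataset name
query = """
    PREFIX ex: <http://example.org/compras#>
    SELECT ?purchase ?item ?budgetLine WHERE {
        ?purchase a ex:GovernmentPurchase ;
                  ex:item ?item ;
                  ex:budgetLine ?budgetLine .
    } LIMIT 10
"""
resp = requests.get(FUSEKI, params={"query": query},
                    headers={"Accept": "application/sparql-results+json"})
for b in resp.json()["results"]["bindings"]:
    print(b["item"]["value"], "<-", b["budgetLine"]["value"])
```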

Uma infraestrutura semântica para integração de dados científicos sobre biodiversidade / A semantic infrastructure for integrating biodiversity scientific data

Serique, Kleberson Junio do Amaral 21 December 2017 (has links)
Research in the area of biodiversity is, in general, transdisciplinary in nature. It attempts to answer complex problems that require transdisciplinary knowledge and the cooperation of researchers from diverse disciplines. However, it is rare for two or more distinct disciplines to have observations, data and methods in formats that allow immediate collaboration on complex, transdisciplinary hypotheses. Today, the speed at which any discipline achieves scientific advances depends on how well its researchers collaborate with each other and with technologists from the areas of databases, workflow management, visualisation and technologies such as cloud computing. Within this scenario, the Semantic Web arises not only as a new generation of tools for information representation, but also for automation, integration, interoperability and resource reuse. In this work, a semantic infrastructure is proposed for the integration of scientific data on biodiversity. Its architecture applies Semantic Web technologies to build an efficient, robust and scalable infrastructure for the biodiversity domain. The core component of this environment is the BioDSL language, a Domain-Specific Language (DSL) for mapping tabular data to the RDF model following Linked Open Data principles. The integrated environment also provides a Web interface, editors and other facilities for converting and integrating biodiversity datasets. Partner research institutions working on Amazonian biodiversity participated in the development, and the help of the Laboratory of Semantic Interoperability of the National Institute of Amazonian Research (INPA) was fundamental for the specification and testing of the environment. Several use cases were investigated with INPA researchers, and tests were carried out with the system prototype. In these tests, the prototype was able to convert real biodiversity data files to RDF and automatically interlink entities in these data with entities on the Web (the LOD cloud). In an experiment involving 1173 records of endangered species, the environment automatically retrieved LOD entities (URIs) for 967 (82.4%) of these species with a complete match on the species name and 149 (12.7%) with a partial match (only one of the species' names), while 36 (3.1%) had no matches and 21 (1.7%) had no records in the LOD cloud.
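BioDSL itself is not shown in the abstract; the sketch below only illustrates the underlying tabular-to-RDF mapping idea with rdflib and Darwin Core terms. The column names and URI scheme are assumptions:

```python
# Mapping tabular biodiversity records to RDF, in the spirit of the approach
# described above. Column names and URI scheme are illustrative assumptions.
import csv, io
from rdflib import Graph, Namespace, Literal, RDF

DWC = Namespace("http://rs.tdwg.org/dwc/terms/")  # Darwin Core vocabulary
EX = Namespace("http://example.org/species/")

data = io.StringIO("species,family\nPanthera onca,Felidae\n")
g = Graph()
for row in csv.DictReader(data):
    subject = EX[row["species"].replace(" ", "_")]
    g.add((subject, RDF.type, DWC.Taxon))
    g.add((subject, DWC.scientificName, Literal(row["species"])))
    g.add((subject, DWC.family, Literal(row["family"])))

print(g.serialize(format="turtle"))
```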

Data Poisoning Attacks on Linked Data with Graph Regularization

January 2019 (has links)
abstract: Social media has become the norm for everyday communication, and its usage has increased exponentially in the last decade. The myriad social media services such as Facebook, Twitter, Snapchat and Instagram allow people to connect freely with their friends and followers. The number of attackers trying to take advantage of this situation has also grown at an exponential rate. Every social media service has its own recommender systems and user-profiling algorithms, which use users' current information to make recommendations. The data produced by social media services is often linked data, since each item or user is usually linked with other users or items. Because of their ubiquitous and prominent nature, recommender systems are prone to several forms of attack, one of the major ones being poisoning of the training data. As recommender systems use current user/item information as the training set for their recommendations, the attacker tries to modify the training set in such a way that the recommender system benefits the attacker or gives incorrect recommendations, thereby failing in its basic functionality. Most existing training-set attack algorithms work with "flat" attribute-value data, which is typically assumed to be independent and identically distributed (i.i.d.). However, the i.i.d. assumption does not hold for social media data, since it is inherently linked as described above. Using user similarity with a graph regularizer when morphing the training data produces the best results for the attacker. This thesis demonstrates this through experiments on collaborative filtering with multiple datasets. / Dissertation/Thesis / Masters Thesis Computer Science 2019
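The attack itself is not detailed in the abstract; the sketch below only illustrates what a graph regularization term does in this setting: minimizing tr(UᵀLU) pulls the latent factors of linked (similar) users together. The shapes and the similarity matrix are toy assumptions:

```python
# Graph regularization over user latent factors: tr(U^T L U) equals the
# similarity-weighted sum of squared distances between user factor pairs.
import numpy as np

n_users, k = 4, 2
S = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # user-user similarity from linked data
L = np.diag(S.sum(axis=1)) - S             # graph Laplacian
U = np.random.default_rng(0).normal(size=(n_users, k))  # latent user factors

graph_penalty = np.trace(U.T @ L @ U)
# Equivalent pairwise form: 0.5 * sum_ij S_ij * ||U_i - U_j||^2
pairwise = 0.5 * sum(S[i, j] * np.sum((U[i] - U[j]) ** 2)
                     for i in range(n_users) for j in range(n_users))
assert np.isclose(graph_penalty, pairwise)
print(graph_penalty)
```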

Model-driven development of Rich Internet Applications on the Semantic Web

Hermida Carbonell, Jesús María 09 April 2013 (has links)
In the last decade, the Web 2.0 brought technological changes in the way users and applications, and applications among themselves, interact and communicate. Rich Internet Applications (RIA) offer user interfaces with a higher level of interactivity, similar to desktop interfaces, embed multimedia content and minimise the communication between client and server components. Nonetheless, RIAs behave as black boxes that show information in a user-friendly manner, but this information can only be visualised gradually, according to the events users trigger in the Web browser, which limits access by software agents such as Web search engines. In the context of the present Internet, where value has moved from Web applications to the data they manage, the use of open technological solutions is a necessity. The Semantic Web was aimed at solving issues of semantic incompatibility among systems by means of standard techniques and technologies (from knowledge representation and sharing to trust and security), and it can be the key to solving the issues identified in RIA. Although some solutions exist, they do not cover all possible types of RIA, or they depend on the technology chosen for the implementation of the Web application. As a first contribution, this thesis introduces the concept of the Semantic Rich Internet Application (SRIA), which can be defined as a RIA that extensively uses Semantic Web technologies to provide a representation of its contents and to reuse existing knowledge sources on the Web. The proposed solution is adapted to the existing RIA types and technologies. The thesis presents the architecture proposed for this type of application, describing its software modules and components; the solution was evaluated on a collection of case studies. The development of Web applications, especially in the context of the Semantic Web, is a process traditionally performed manually and, given the complexity of SRIA applications, one that is prone to errors. Applying model-driven engineering techniques can reduce the cost of development and maintenance (in terms of time and resources) of the proposed applications, as their use in other types of Web applications has demonstrated, and can facilitate the adoption of the solution by the community. In the light of these issues, as a second contribution, this thesis presents the Sm4RIA (Semantic Models for RIA) methodology for the development of SRIAs, as an extension of the OOH4RIA methodology. The thesis describes the development process, the models (with the corresponding metamodels) and the transformations included in the methodology; the evaluation of the methodology consisted of developing the proposed case studies. Applying this model-driven methodology can speed up the development of these Web applications and simplify the reuse of external knowledge sources. Finally, the thesis describes the Sm4RIA extension for OIDE, an extension of the OIDE CASE tool that implements all the elements of the Sm4RIA methodology.
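One way to picture the "black box" problem and the SRIA remedy: alongside its dynamic UI, the application exposes its current content as structured data that an agent can read without executing UI events. A sketch using JSON-LD with schema.org terms; the fields and URL are assumptions, not the thesis's design:

```python
# Exposing the current RIA view as machine-readable JSON-LD so that software
# agents can consume it without triggering browser events. Fields are assumed.
import json

current_view = {
    "@context": {"schema": "http://schema.org/"},
    "@type": "schema:Article",
    "schema:headline": "Current view rendered by the RIA",
    "schema:text": "Content that would otherwise appear only after UI events.",
    "schema:url": "http://example.org/app#state=article-42",
}
print(json.dumps(current_view, indent=2))  # readable by crawlers and agents
```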

Un cadre de développement sémantique pour la recherche sociale / A semantic development framework for social search

Stan, Johann 09 November 2011 (has links) (PDF)
This thesis presents a system for extracting the interactions shared in social networks and building a dynamic expertise profile for each member of the social network. The main difficulty lies in analysing these interactions, which are often very short and have little grammatical and linguistic structure. Our approach links the important terms of these messages to concepts in a semantic knowledge base of the Linked Data kind. This connection enriches the semantic field of the messages by exploiting the semantic neighbourhood of the concept in the knowledge base. Our first contribution in this context is an algorithm that performs this linking with higher precision than the state of the art, by considering the user's profile, as well as the messages shared in the community of which the user is a member, as an additional source of context. The second stage of the analysis performs the semantic expansion of the concept by exploiting the links in the knowledge base. Our algorithm uses a heuristic based on computing the similarity between concept descriptions, keeping only those most relevant to the user's profile. Together, these two algorithms yield a set of concepts that illustrate the user's areas of expertise. To measure the degree of expertise the user has for each concept in the profile, we apply the standard vector space model and associate with each concept a measure composed of three elements: (i) the tf-idf, (ii) the average sentiment the user expresses about the concept, and (iii) the average entropy of the shared messages containing the concept. Combined, these three measures give a single weight for each concept in the profile. This vector profile model makes it possible to find the top-k profiles most relevant to a query. To propagate these weights over the concepts in the semantic expansion, we applied a constrained spreading activation algorithm specially adapted to the structure of a semantic graph. The application built to prove the effectiveness of our approach and to illustrate the recommendation strategy is an online system named "The Tagging Beak" (http://www.tbeak.com). We developed a Q&A (question-answer) recommendation strategy, where users can ask questions in natural language and the system recommends people to contact, or to connect with, in order to be notified of new messages relevant to the topic of the question.
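The abstract names three per-concept signals (tf-idf, average sentiment, average message entropy) but not the combination function, so the sketch below assumes a simple product purely for illustration:

```python
# Combining the three per-concept signals into one expertise weight. The
# combination function (a product, with sentiment rescaled) is an assumption.
import math
from collections import Counter

def entropy(tokens: list) -> float:
    """Shannon entropy of a message's token distribution."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def concept_weight(tf_idf: float, avg_sentiment: float, messages: list) -> float:
    """Single expertise weight for one concept of a user profile."""
    avg_entropy = sum(entropy(m.split()) for m in messages) / len(messages)
    # Assumed combination: sentiment rescaled from [-1, 1] to [0, 1], then product.
    return tf_idf * (avg_sentiment + 1) / 2 * avg_entropy

print(concept_weight(0.42, 0.3, ["great talk on linked data", "linked data at scale"]))
```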
