61 |
Construction automatique d'outils et de ressources linguistiques à partir de corpus parallèles / Automatic creation of linguistic tools and resources from parallel corpora
Zennaki, Othman 11 March 2019
Cette thèse porte sur la construction automatique d’outils et de ressources pour l’analyse linguistique de textes des langues peu dotées. Nous proposons une approche utilisant des réseaux de neurones récurrents (RNN - Recurrent Neural Networks) et n'ayant besoin que d'un corpus parallèle ou multi-parallèle entre une langue source bien dotée et une ou plusieurs langues cibles moins bien ou peu dotées. Ce corpus parallèle ou multi-parallèle est utilisé pour la construction d'une représentation multilingue des mots des langues source et cible. Nous avons utilisé cette représentation multilingue pour l’apprentissage de nos modèles neuronaux et nous avons exploré deux architectures neuronales : les RNN simples et les RNN bidirectionnels. Nous avons aussi proposé plusieurs variantes des RNN pour la prise en compte d'informations linguistiques de bas niveau (informations morpho-syntaxiques) durant le processus de construction d'annotateurs linguistiques de niveau supérieur (SuperSenses et dépendances syntaxiques). Nous avons démontré la généricité de notre approche sur plusieurs langues ainsi que sur plusieurs tâches d'annotation linguistique. Nous avons construit trois types d'annotateurs linguistiques multilingues : annotateurs morpho-syntaxiques, annotateurs en SuperSenses et annotateurs en dépendances syntaxiques, avec des performances très satisfaisantes. Notre approche a les avantages suivants : (a) elle n'utilise aucune information d'alignement des mots, (b) aucune connaissance concernant les langues cibles traitées n'est requise au préalable (notre seule supposition est que les langues source et cible n'ont pas une grande divergence syntaxique), ce qui rend notre approche applicable pour le traitement d'un très grand éventail de langues peu dotées, (c) elle permet la construction d'annotateurs multilingues authentiques (un annotateur pour N langues).
/ This thesis focuses on the automatic construction of linguistic tools and resources for analyzing texts of low-resource languages. We propose an approach using Recurrent Neural Networks (RNN) and requiring only a parallel or multi-parallel corpus between a well-resourced language and one or more low-resource languages. This parallel or multi-parallel corpus is used to construct a multilingual representation of words of the source and target languages. We used this multilingual representation to train our neural models and we investigated both uni and bidirectional RNN models. We also proposed a method to include external information (for instance, low-level information from Part-Of-Speech tags) in the RNN to train higher level taggers (for instance, SuperSenses taggers and Syntactic dependency parsers). We demonstrated the validity and genericity of our approach on several languages and we conducted experiments on various NLP tasks: Part-Of-Speech tagging, SuperSenses tagging and Dependency parsing. The obtained results are very satisfactory. Our approach has the following characteristics and advantages: (a) it does not use word alignment information, (b) it does not assume any knowledge about target languages (one requirement is that the two languages (source and target) are not too syntactically divergent), which makes it applicable to a wide range of low-resource languages, (c) it provides authentic multilingual taggers (one tagger for N languages).
62 |
Um modelo arquitetural para captura e uso de informações de contexto em sistemas de anotações de vídeo / An architectural model to capture and use context information in video annotation systems
Fagá Júnior, Roberto 11 June 2010
Diversos pesquisadores vêm investigando métodos e técnicas para tornar possível às pessoas anotarem vídeos de modo transparente. A anotação pode ser realizada com a fala, com o uso de tinta digital ou algum outro meio que possa ser capturado enquanto a pessoa assiste ao vídeo. Tais anotações podem ser compartilhadas com outras pessoas, que podem estar assistindo ao mesmo vídeo em um mesmo instante ou em momentos diferentes, sendo interessante ainda que as anotações possam ser realizadas por várias pessoas de modo colaborativo. O paradigma Watch-and-Comment (WaC) propõe a captura transparente de anotações multimodais de usuários enquanto os mesmos assistem e comentam um vídeo. Como resultado desse processo, é gerado um vídeo digital interativo integrando o conteúdo original às anotações realizadas. Esta dissertação tem por objetivo explorar conceitos de computação ubíqua, redes sociais, redes peer-to-peer e TV interativa na proposta de um modelo arquitetural de ciência de informações de contexto para aplicações definidas segundo o paradigma WaC. O modelo explora a integração de um serviço ao paradigma, que auxilie ou forneça alternativas para que aplicações, do momento da captura ao acesso das anotações, utilizem informações de contexto do usuário, do vídeo e das anotações. O modelo também auxilia no estudo de colaboração entre usuários que realizam anotações em vídeos. Outra contribuição da dissertação é a prototipação de aplicações para avaliar e refinar o modelo proposto. São apresentadas extensões para a aplicação WaCTool, considerando o uso de redes sociais e de alternativas para a anotação em vídeos. / Researchers have been investigating methods and techniques that allow people to annotate videos ubiquitously. Annotations can be made using voice, digital ink or some other medium that can be captured while a person watches a video. These annotations can be shared with other people, who may be watching the same video at the same time or at a different time. Also, these annotations can be made by many people collaboratively. The Watch-and-Comment (WaC) paradigm aims at capturing multimodal annotations in a ubiquitous way while users watch and comment on a video. As a result, an interactive digital video is generated that combines the original content with the annotations. The work reported in this thesis explores concepts such as ubiquitous computing, social networks, peer-to-peer networks and interactive digital TV to propose a context-aware architectural model for applications defined by the WaC paradigm. The model proposes integrating a new service into the paradigm that supports applications throughout the annotation process, from capture to access, by offering capture alternatives and using context information about the user, the video and the annotations. The model also supports the study of collaboration among users who annotate videos. Another contribution of this thesis is the set of prototypes built to evaluate and refine the proposed model. The prototypes are extensions of WaCTool that consider the use of social networks and alternative ways of annotating videos.
63 |
Assister les pratiques de lecture savante sur écran à l'aide des outils sémantiques / Assisting expert reading practices on digital devices with semantic tools
Tirole, Delphine 06 July 2016
Au sein de la société occidentale, la lecture s'est imposée comme la pratique culturelle et intellectuelle de référence, notamment sous sa forme savante, qui s'accompagne de l'utilisation d'instruments (e.g. feutres surligneurs, notes Post-it). L'apparition de dispositifs de lecture numérique au cours du XXème siècle a marqué une étape importante dans l'évolution des pratiques de lecture. Attisant de nombreux débats, ces derniers ne sont pas perçus comme une évolution allant de soi de l'acte de lire. Ces dispositifs ont induit un changement important dans l'environnement du lecteur avec leur force de calcul et de programmation, qui donne la possibilité de développer des outils inédits, en particulier à destination du lecteur savant. En vue de proposer une nouvelle forme d'instrumentation, il est nécessaire de comprendre les pratiques actuelles des lecteurs savants et leurs besoins. Ce travail de recherche présente une série d'entretiens compréhensifs, réalisée auprès d'une population d'enseignants-chercheurs, pour identifier les qualités attendues d'un instrument sur écran. Par ailleurs, le développement des technologies numériques est lié à la collecte et à la gestion de données. À ce titre, les technologies du Web sémantique incarnent un moyen d'établir l’interopérabilité et la réutilisation de ces données. Ainsi, dans un deuxième temps, l'objectif de cette recherche est de concevoir un modèle formel sous la forme d'une ontologie donnant lieu à la standardisation d'un gisement précis de données, issues des traces et des marques de lecture, développées avec le langage OWL. / In Western society, reading is considered the intellectual practice par excellence, especially in its expert form, which is commonly accompanied by the use of tools (e.g. highlighters, Post-it notes). The arrival of digital reading devices during the 20th century was a milestone in the development of reading practices. Fuelling many debates, these devices are not perceived as a self-evident evolution of reading. Digital devices have induced important changes in the reader's environment through their computing and programming power, which enables the development of new tools for the expert reader. In order to design a new tool, it is necessary to understand the current practices of expert readers and their needs. This study presents a set of comprehensive interviews, conducted with scholars, that aims to identify the qualities expected of a digital reading tool. In addition, the development of digital technology makes it possible to collect and manage more and more data. In this respect, semantic Web technologies appear as a means to make data interoperable and reusable. Thus, in a second step, this research project aims to design a formal model as an OWL ontology, leading to the standardisation of a specific data set derived from annotation practices.
64 |
Infrastructure logicielle multi-modèles pour l'accès à des services en mobilité / Multi-model software infrastructure for accessing services in mobile settings
Bocquet, Aurelien 01 December 2008
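Les intergiciels sont aujourd'hui incontournables lorsqu'il s'agit de développer des applications réparties. Des simples Web Services aux architectures n-tiers, d'une unique communication client / serveur à un réseau dynamique pair-à-pair, chaque conception requiert des outils adaptés et performants. En complément de chaque utilisation spécifique des intergiciels, leur contexte de déploiement nécessite des mécanismes particuliers afin de s'adapter au mieux à la situation.
Face à ces besoins, les intergiciels proposent des modèles de programmation et de communication différents, fournissant des moyens de communication efficaces dans certaines situations.
La mobilité introduit une problématique supplémentaire pour ces intergiciels. D'une part l'interopérabilité devient inévitable ; le nombre de composants répartis susceptibles d'être utilisés en mobilité est immense, et les composants peuvent être développés avec différents intergiciels. D'autre part le contexte varie, et avec lui les conditions et capacités de communication évoluent.
Nous traitons dans cette thèse des impératifs actuels d'un intergiciel en mobilité. Nous proposons pour cela une approche multi-modèles, basée sur les travaux actuels dans ce domaine, et présentant des concepts novateurs.
Cette approche se compose d'un modèle de programmation générique, proposant différents types de communications synchrones, asynchrones, et basées sur des patrons de conception. Elle se compose également d'une combinaison de modèles de communication, assurant l'interopérabilité avec les intergiciels standards, et offrant des possibilités de communications enrichies, capables de s'adapter aux changements de contextes.
Des politiques d'adaptation définissent les règles de combinaison des modèles en fonction d'observations du contexte, afin de se comporter au mieux face à ses évolutions.
Des mécanismes d'adaptation dynamique permettent à notre approche de proposer une prise en compte en temps réel des changements de contexte, et permettent également de reconfigurer le système pendant son exécution afin de répondre à des besoins de déploiement.
Nous avons validé notre approche au travers d'une application concrète aux problèmes engendrés par l'utilisation d'un proxy Internet à bord des trains : le développement d'un greffon multi-modèles a illustré et justifié notre approche, et l'évaluation de ce greffon a montré les bénéfices de celle-ci face aux changements de contexte.
Pour implémenter entièrement notre approche et proposer ainsi un intergiciel multi-modèles, nous avons conçu et développé notre infrastructure logicielle multi-modèles, proposant tous les concepts de l'approche. Une première version "statique" puis une version finale offrant les mécanismes d'adaptation dynamique ont été implémentées et permettent ainsi de profiter des bénéfices de notre approche multi-modèles. / Middleware is now indispensable for developing distributed applications. From simple Web Services to n-tier architectures, from a single client/server exchange to a dynamic peer-to-peer network, each design requires suitable, efficient tools. Beyond each specific use of middleware, the deployment context calls for particular mechanisms so that the middleware adapts as well as possible to the situation.
To meet these needs, middleware offers different programming and communication models, each providing efficient means of communication in certain situations.
Mobility raises an additional set of problems for middleware. On the one hand, interoperability becomes unavoidable: the number of distributed components that may be used on the move is immense, and those components may be developed with different middleware. On the other hand, the context varies, and communication conditions and capabilities change with it.
In this thesis we address the current requirements of middleware in mobile settings. To that end we propose a multi-model approach that builds on current work in the field and introduces novel concepts.
The approach consists of a generic programming model offering several types of communication: synchronous, asynchronous, and based on design patterns. It also comprises a combination of communication models that ensures interoperability with standard middleware and offers enriched communication capabilities able to adapt to context changes.
Adaptation policies define the rules for combining the models according to observations of the context, so that the system behaves as well as possible as the context evolves.
Dynamic adaptation mechanisms let the approach take context changes into account in real time and also make it possible to reconfigure the system while it is running to meet deployment needs.
We validated our approach through a concrete application to the problems caused by using an Internet proxy on board trains: developing a multi-model plug-in illustrated and justified the approach, and evaluating that plug-in showed its benefits in the face of context changes.
To implement the approach fully and thus provide a multi-model middleware, we designed and developed our multi-model software infrastructure, which provides all the concepts of the approach. A first "static" version and then a final version offering the dynamic adaptation mechanisms were implemented, making it possible to benefit from our multi-model approach.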
65 |
Does it have to be trees? : Data-driven dependency parsing with incomplete and noisy training data
Spreyer, Kathrin January 2011
We present a novel approach to training data-driven dependency parsers on incomplete annotations. Our parsers are simple modifications of two well-known dependency parsers, the transition-based Malt parser and the graph-based MST parser. While previous work on parsing with incomplete data has typically couched the task in frameworks of unsupervised or semi-supervised machine learning, we essentially treat it as a supervised problem. In particular, we propose what we call agnostic parsers which hide all fragmentation in the training data from their supervised components.
We present experimental results with training data that was obtained by means of annotation projection. Annotation projection is a resource-lean technique which allows us to transfer annotations from one language to another within a parallel corpus. However, the output tends to be noisy and incomplete due to cross-lingual non-parallelism and error-prone word alignments. This makes the projected annotations a suitable test bed for our fragment parsers. Our results show that (i) dependency parsers trained on large amounts of projected annotations achieve higher accuracy than the direct projections, and that (ii) our agnostic fragment parsers perform roughly on a par with the original parsers which are trained only on strictly filtered, complete trees. Finally, (iii) when our fragment parsers are trained on artificially fragmented but otherwise gold standard dependencies, the performance loss is moderate even with up to 50% of all edges removed. / Wir präsentieren eine neuartige Herangehensweise an das Trainieren von daten-gesteuerten Dependenzparsern auf unvollständigen Annotationen. Unsere Parser sind einfache Varianten von zwei bekannten Dependenzparsern, nämlich des transitions-basierten Malt-Parsers sowie des graph-basierten MST-Parsers.
Während frühere Arbeiten zum Parsing mit unvollständigen Daten die Aufgabe meist in Frameworks für unüberwachtes oder schwach überwachtes maschinelles Lernen gebettet haben, behandeln wir sie im Wesentlichen mit überwachten Lernverfahren. Insbesondere schlagen wir "agnostische" Parser vor, die jegliche Fragmentierung der Trainingsdaten vor ihren daten-gesteuerten Lernkomponenten verbergen.
Wir stellen Versuchsergebnisse mit Trainingsdaten vor, die mithilfe von Annotationsprojektion gewonnen wurden. Annotationsprojektion ist ein Verfahren, das es uns erlaubt, innerhalb eines Parallelkorpus Annotationen von einer Sprache auf eine andere zu übertragen. Bedingt durch begrenzten crosslingualen Parallelismus und fehleranfällige Wortalinierung ist die Ausgabe des Projektionsschrittes jedoch üblicherweise verrauscht und unvollständig. Gerade dies macht projizierte Annotationen zu einer angemessenen Testumgebung für unsere fragment-fähigen Parser. Unsere Ergebnisse belegen, dass (i) Dependenzparser, die auf großen Mengen von projizierten Annotationen trainiert wurden, größere Genauigkeit erzielen als die zugrundeliegenden direkten Projektionen, und dass (ii) die Genauigkeit unserer agnostischen, fragment-fähigen Parser der Genauigkeit der Originalparser (trainiert auf streng gefilterten, komplett projizierten Bäumen) annähernd gleichgestellt ist. Schließlich zeigen wir mit künstlich fragmentierten Gold-Standard-Daten, dass (iii) der Verlust an Genauigkeit selbst dann bescheiden bleibt, wenn bis zu 50% aller Kanten in den Trainingsdaten fehlen.
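The annotation projection step described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `project_heads`, the one-to-one alignment dictionary and the toy sentence are assumptions made for illustration. The sketch copies a dependency edge to the target sentence only when both ends of the edge are aligned, which is one simple way incomplete (fragmented) structures can arise.

```python
from typing import Dict, List, Optional


def project_heads(
    source_heads: List[Optional[int]],
    alignment: Dict[int, int],
    target_length: int,
) -> List[Optional[int]]:
    """Project dependency heads from source tokens onto target tokens.

    source_heads[i] is the index of token i's head, or None for the root.
    alignment maps a source token index to one target token index
    (a one-to-one simplification of real word alignments; hypothetical).
    Target tokens that receive no edge stay None, so the result may be
    a fragment rather than a complete tree.
    """
    target_heads: List[Optional[int]] = [None] * target_length
    for dependent, head in enumerate(source_heads):
        if head is None:
            continue  # the root has no incoming edge to project
        # Project an edge only if both of its ends are aligned.
        if dependent in alignment and head in alignment:
            target_heads[alignment[dependent]] = alignment[head]
    return target_heads


# Toy source sentence "the dog barks": the -> dog, dog -> barks, barks is root.
source_heads = [1, 2, None]
# Only "dog" and "barks" are aligned; "the" has no target counterpart.
alignment = {1: 0, 2: 1}
print(project_heads(source_heads, alignment, target_length=2))  # [1, None]
```

In this simplified encoding a `None` entry covers both the root and any token left without a head, so a real pipeline would need to distinguish the two cases.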
66 |
Navigating Textual Space in Print and Digital Interfaces: A Study of the Material and Cognitive Dimensions of Reading Systems
Bialkowski, Voytek 01 December 2011
This research examines situated behaviours and perceptions around textual navigation as it is practiced in situ by professionals working in various domains. In its investigation of interactions between human cognition and mediating artifacts, this research relies heavily on the resources of cognitive ethnography, including both observation and in-depth interviews with participants. Relevant contributions from the fields of information studies, book history, digital humanities, and human-computer interaction are presented to further elucidate the findings of this study. The findings reveal several emergent, interrelated navigational strategies, such as the use of annotations as navigational aids, reliance on automated interface actions, and the navigational value of interface metaphors. In further addressing the practice of textual navigation, this research also describes the creation of a prototype interface reflecting the study’s findings. This research proposes new ways of conceptualizing textual navigation and designing interfaces that support emergent textual interaction.
67 |
Konzeption und Umsetzung eines Werkzeugs zur Definition von Navigationsflüssen mittels Dienstannotationen / Design and implementation of a tool for defining navigation flows using service annotations
Martens, Felix 25 October 2010
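Die Diplomarbeit stellt einen innovativen und leichtgewichtigen Modellierungsansatz zur Beschreibung interaktiver, dienstbasierter Anwendungen auf Basis von Dienstannotationen vor. / This diploma thesis presents an innovative, lightweight modeling approach for describing interactive, service-based applications on the basis of service annotations.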
68 |
ExtractCFG : a framework to enable accurate timing back annotation of C language source code
Goswami, Arindam 30 September 2011
The current trend in embedded systems design is to move the initial design and exploration phase to a higher level of abstraction, in order to tackle the rapidly increasing complexity of embedded systems. One approach to abstracting software development from the low-level platform details is host-compiled simulation. Characteristics of the target platform are represented in a host-compiled simulation model by annotating the high-level source code. Compiler optimizations make accurate annotation of the code a challenging task. In this thesis, we describe an approach to enable correct back-annotation of C code at the basic block level, while taking compiler optimizations into account.
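The idea of timing annotation at basic-block granularity can be sketched as follows. This is an illustrative assumption, not ExtractCFG's interface: the block names, the cycle counts and the function `simulated_cycles` are all hypothetical. Each basic block carries an estimated target-platform cost, and a host-side run adds that cost every time the block executes.

```python
from typing import Dict, Iterable


def simulated_cycles(block_cycles: Dict[str, int], trace: Iterable[str]) -> int:
    """Sum the annotated cycle cost of every basic block in an execution trace."""
    total = 0
    for block in trace:
        # Each executed block contributes its back-annotated cost.
        total += block_cycles[block]
    return total


# Hypothetical per-block costs, as a timing analysis of the target binary might supply.
block_cycles = {"entry": 4, "loop_body": 7, "exit": 2}
# Hypothetical trace: the loop body runs three times.
trace = ["entry", "loop_body", "loop_body", "loop_body", "exit"]
print(simulated_cycles(block_cycles, trace))  # 27
```

The hard part the abstract points to lies upstream of this sketch: after compiler optimizations, the basic blocks of the binary no longer map one-to-one onto source-level blocks, so deciding which cost belongs to which piece of source code is the actual challenge.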
70 |
Ambiente integrado de modelagem distribuída para sistemas de informação na internet / Integrated environment for distributed modeling of web information systems
Pompermaier, Leandro Bento January 1999
O objetivo principal desta dissertação é explorar alguns aspectos relacionados ao desenvolvimento colaborativo de sistemas de informação na Internet. É apresentado o Editor Diagramático na Internet (EDI), que suporta a especificação colaborativa de aplicações. Este editor utiliza tecnologia e funcionalidade dos hiperdocumentos, oferecendo características como: compartilhamento de informações, colaboração entre vários autores e várias visões dos dados conceituais armazenados. EDI foi implementado utilizando a linguagem de programação Java e projetado de forma genérica para permitir a criação de editores de diferentes notações diagramáticas. Este trabalho propõe a utilização de anotações em documentos de desenvolvimento de sistemas de informação na Internet. Estas anotações auxiliam no desenvolvimento colaborativo de sistemas, tornando o processo mais colaborativo e com um produto resultante de qualidade superior. As anotações estão baseadas em dois tipos de usuários: o usuário proprietário, responsável pela criação do documento, e o usuário colaborador, que inclui anotações nos documentos. Anotações (identificadas por uma cor específica) podem ser de inclusão, alteração, remoção de conceitos (visões) ou registro de comentários. / The main goal of this work is to explore some issues related to collaborative development of information systems on the Internet. A Diagrammatic Editor on the Internet (EDI) that supports collaborative specification of applications is described. This editor uses hyperdocument technology and functionalities, offering features such as information sharing, collaboration between several authors, multiple views of the stored conceptual data, among others. EDI was implemented using the Java language, and designed with the purpose of being generic to enable the easy creation of specific editors for different diagrammatic notations. The use of annotations for the joint development of information systems on the Internet is proposed.
With these annotations the development process becomes more collaborative and the quality of the final product may increase. Annotations are based on two types of users: the owner, who is an author responsible for the creation of a document, and the collaborator, who makes annotations on those documents. Annotations (identified by one specific colour) can be of inclusion, change and removal of concepts (views), or recording of comments.