Global ETD Search

1	Construction de ressources linguistiques arabes à l’aide du formalisme de grammaires de propriétés en intégrant des mécanismes de contrôle / Building arabic linguistic resources using the property grammar formalism by integrating control mechanisms Bensalem, Raja 14 December 2017 (has links) La construction de ressources linguistiques arabes riches en informations syntaxiques constitue un enjeu important pour le développement de nouveaux outils de traitement automatique. Cette thèse propose une approche pour la création d’un treebank de l’arabe intégrant des informations d’un type nouveau reposant sur le formalisme des Grammaires de Propriétés. Une propriété syntaxique caractérise une relation pouvant exister entre deux unités d’une certaine structure syntaxique. Cette grammaire est induite automatiquement à partir du treebank arabe ATB, ce qui constitue un enrichissement de cette ressource tout en conservant ses qualités. Cet enrichissement a été également appliqué aux résultats d’analyse d’un analyseur état de l’art du domaine, le Stanford Parser, offrant la possibilité d’une évaluation s’appuyant sur un ensemble de mesures obtenues à partir de cette ressource. Les étiquettes des unités de cette grammaire sont structurées selon une hiérarchie de types permettant la variation de leur degré de granularité, et par conséquent du degré de précision des informations. Nous avons pu ainsi construire, à l’aide de cette grammaire, d’autres ressources linguistiques arabes. En effet, sur la base de cette nouvelle ressource, nous avons développé un analyseur syntaxique probabiliste à base de propriétés syntaxiques, le premier appliqué pour l'arabe. Une grammaire de propriétés lexicalisée probabiliste fait partie de son modèle d’apprentissage pour pouvoir affecter positivement le résultat d’analyse et caractériser ses structures syntaxiques avec les propriétés de ce modèle. Nous avons enfin évalué les résultats obtenus en les comparant à celles du Stanford Parser. / The building of syntactically informative Arabic linguistic resources is a major issue for the development of new machine processing tools. We propose in this thesis to create an Arabic treebank that integrates a new type of information, which is based on the Property Grammar formalism. A syntactic property is a relation between two units of a given syntactic structure. This grammar is automatically induced from the Arabic treebank ATB. We enriched this resource with the property representations of this grammar, while retaining its qualities. We also applied this enrichment to the parsing results of a state-of-the-art analyzer, the Stanford Parser. This provides the possibility of an evaluation using a measure set, which is calculated on this resource. We structured the tags of the units in this grammar according to a type hierarchy. This permit to vary the granularity level of these units, and consequently the accuracy level of the information. We have thus been able to construct, using this grammar, other Arabic linguistic resources. Secondly, based on this new resource, we developed a probabilistic syntactic parser based on syntactic properties. This is the first analyzer of this type that we have applied to Arabic. In the learning model, we integrated a probabilistic lexicalized property grammar that may positively affect the parsing result and describe its syntactic structures with its properties. Finally, we evaluated the parsing results of this approach by comparing them to those of the Stanford Parser. Grammaires de propriétés Langue arabe Treebanks Mécanismes de contrôle Analyseur syntaxique probabiliste Property grammars Arabic language Treebank Control mechanisms Probabilistic parser 004
2	Modelling syntactic gradience with loose constraint-based parsing: Modélisation de la gradience syntaxique par analyse relâchée à base de contraintes / Modélisation de la gradience syntaxique par analyse relâchée à base de contraintes Prost, Jean-Philippe January 2008 (has links) Thesis submitted for the joint institutional requirements for the double-badged degree of Doctor of Philosophy and Docteur de l'Université de Provence, Spécialité : Informatique. / Thesis (PhD)--Macquarie University, Division of Information and Communication Sciences, Department of Computing, 2008. / Includes bibliography (p. 229-240) and index. / Introduction -- Background -- A model-theoretic framework for PG -- Loose constraint-based parsing -- A computational model for gradience -- Conclusion. / The grammaticality of a sentence has conventionally been treated in a binary way: either a sentence is grammatical or not. A growing body of work, however, focuses on studying intermediate levels of acceptability, sometimes referred to as gradience. To date, the bulk of this work has concerned itself with the exploration of human assessments of syntactic gradience. This dissertation explores the possibility to build a robust computational model that accords with these human judgements. -- We suggest that the concepts of Intersective Gradience and Subsective Gradience introduced by Aarts for modelling graded judgements be extended to cover deviant language. Under such a new model, the problem then raised by gradience is to classify an utterance as a member of a specific category according to its syntactic characteristics. More specifically, we extend Intersective Gradience (IG) so that it is concerned with choosing the most suitable syntactic structure for an utterance among a set of candidates, while Subsective Gradience (SG) is extended to be concerned with calculating to what extent the chosen syntactic structure is typical from the category at stake. IG is addressed in relying on a criterion of optimality, while SG is addressed in rating an utterance according to its grammatical acceptability. As for the required syntactic characteristics, which serve as features for classifying an utterance, our investigation of different frameworks for representing the syntax of natural language shows that they can easily be represented in Model-Theoretic Syntax; we choose to use Property Grammars (PG), which offers to model the characterisation of an utterance. We present here a fully automated solution for modelling syntactic gradience, which characterises any well formed or ill formed input sentence, generates an optimal parse for it, then rates the utterance according to its grammatical acceptability. -- Through the development of such a new model of gradience, the main contribution of this work is three-fold. -- First, we specify a model-theoretic logical framework for PG, which bridges the gap observed in the existing formalisation regarding the constraint satisfaction and constraint relaxation mechanisms, and how they relate to the projection of a category during the parsing process. This new framework introduces the notion of loose satisfaction, along with a formulation in first-order logic, which enables reasoning about the characterisation of an utterance. -- Second, we present our implementation of Loose Satisfaction Chart Parsing (LSCP), a dynamic programming approach based on the above mechanisms, which is proven to always find the full parse of optimal merit. Although it shows a high theoretical worst time complexity, it performs sufficiently well with the help of heuristics to let us experiment with our model of gradience. -- And third, after postulating that human acceptability judgements can be predicted by factors derivable from LSCP, we present a numeric model for rating an utterance according to its syntactic gradience. We measure a good correlation with grammatical acceptability by human judgements. Moreover, the model turns out to outperform an existing one discussed in the literature, which was experimented with parses generated manually. / Mode of access: World Wide Web. / xxviii, 283 p. ill Computational linguistics Mathematical linguistics Semantics Algorithms gradience acceptability grammaticality optimality configuration Model-Theoretic Syntax Property Grammars characterisation constraint-based chart parsing robustness loose constraint satisfaction

Search results

Construction de ressources linguistiques arabes à l’aide du formalisme de grammaires de propriétés en intégrant des mécanismes de contrôle / Building arabic linguistic resources using the property grammar formalism by integrating control mechanisms

Modelling syntactic gradience with loose constraint-based parsing: Modélisation de la gradience syntaxique par analyse relâchée à base de contraintes / Modélisation de la gradience syntaxique par analyse relâchée à base de contraintes