Global ETD Search

11	Outomatiese Setswana lemma-identifisering
Brits, Jeanetta Hendrina, January 2006
Within the context of natural language processing, a lemmatiser is one of the most important core technology modules that has to be developed for a particular language. A lemmatiser reduces words in a corpus to the corresponding lemmas of the words in the lexicon. A lemma is defined as the meaningful base form from which other more complex forms (i.e. variants) are derived. Before a lemmatiser can be developed for a specific language, the concept "lemma" as it applies to that specific language should first be defined clearly. This study concludes that, in Setswana, only stems (and not roots) can act independently as words; therefore, only stems should be accepted as lemmas in the context of automatic lemmatisation for Setswana. Five of the seven parts of speech in Setswana could be viewed as closed classes, which means that these classes are not extended by means of regular morphological processes. The two other parts of speech (nouns and verbs) require the implementation of alternation rules to determine the lemma. Such alternation rules were formalised in this study for the purpose of developing a Setswana lemmatiser. The existing Setswana grammars were used as the basis for these rules, which made it possible to determine how precisely the formalised versions of these grammars lemmatise Setswana words. The software developed by Van Noord (2002), FSA 6, is one of the best-known applications available for the development of finite state automata and transducers. Regular expressions based on the formalised morphological rules were used in FSA 6 to create finite state transducers. The code subsequently generated by FSA 6 was implemented in the lemmatiser. The metric that applies to the evaluation of the lemmatiser is precision. On a test corpus of 1,000 words, the lemmatiser obtained 70.92%. In another evaluation on 500 complex nouns and 500 complex verbs separately, the lemmatiser obtained 70.96% and 70.52% respectively. On 500 complex and simplex nouns the precision was 78.45%, and on complex and simplex verbs it was 79.59%. These quantitative results give only an indication of the relative precision of the grammars. Nevertheless, they did offer analysed data with which the grammars were evaluated qualitatively. The study concludes with an overview of how these results might be improved in the future.
Thesis (M.A. (African Languages))--North-West University, Potchefstroom Campus, 2006.
Computational linguistics; Setswana grammar; Setswana morphology; Lemmatisation; Stemming; Lemma; Natural language processing; Regular expression; Finite state automata; Finite state transducer; FSA 6
12	Economic networks: communication, cooperation & complexity
Angus, Simon Douglas, Economics, Australian School of Business, UNSW, January 2007
This thesis is concerned with the analysis of economic network formation. There are three novel sections to this thesis (Chapters 5, 6 and 8). In the first, the non-cooperative communication network formation model of Bala and Goyal (2000) (BG) is re-assessed under conditions of no inertia. It is found that the Strict Nash circle (or wheel) structure is still the equilibrium outcome for n = 3 under no inertia. However, a counter-example for n = 4 shows that with no inertia infinite cycles are possible, and hence the system does not converge. In fact, cycles are found to quickly dominate outcomes for n > 4, and further numerical simulations of conditions approximating no inertia (probability of updating > 0.8 to 1) indicate that cycles account for a dramatic slowing of convergence times. These results, together with the experimental evidence of Falk and Kosfeld (2003) (FK), motivate the second contribution of this thesis. A novel artificial agent model is constructed that allows for a vast strategy space (including the Best Response) and permits agents to learn from each other, as was indicated by the FK results. After calibration, this model replicates many of the FK experimental results and finds that an externality-exploiting ratio of benefits and costs (rather than the difference), combined with a simple altruism score, is a good proxy for the human objective function. Furthermore, the inequity aversion results of FK are found to arise as an emergent property of the system. The third novel section of this thesis turns to the nature of network formation in a trust-based context. A modified Iterated Prisoner's Dilemma (IPD) model is developed which enables agents to play an additional and costly network forming action. Initially, canonical analytical results are obtained despite this modification under uniform (non-local) interactions. However, as agent network decisions are 'turned on', persistent cooperation is observed. Furthermore, in contrast to the vast majority of non-local or static network models in the literature, it is found that aperiodic, complex dynamics result for the system in the long run. Subsequent analysis of this regime indicates that the network dynamics have fingerprints of self-organized criticality (SOC). Whilst evidence for SOC is found in many physical systems, such dynamics have seldom, if ever, been reported in the strategic interaction literature.
networks; complexity; cooperation; communication; prisoners dilemma; game theory; graph-theory; agent-based modeling; finite state automata; artificial adaptive agents; computational economics; econophysics; self-organized criticality
14	Zpracování turkických jazyků / Processing of Turkic Languages
Ciddi, Sibel, January 2014
Title: Processing of Turkic Languages
Author: Sibel Ciddi
Department: Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague
Supervisor: RNDr. Daniel Zeman, Ph.D.
Abstract: This thesis presents several methods for the morphological processing of Turkic languages, such as Turkish, which pose a specific set of challenges for natural language processing. In order to alleviate the problems with lack of large language resources, it makes the data sets used for morphological processing and expansion of lexicons publicly available for further use by researchers. Data sparsity, caused by highly productive and agglutinative morphology in Turkish, imposes difficulties in processing of Turkish text, especially for methods using purely statistical natural language processing. Therefore, we evaluated a publicly available rule-based morphological analyzer, TRmorph, based on finite state methods and technologies. In order to enhance the efficiency of this analyzer, we worked on expansion of lexicons, by employing heuristics-based methods for the extraction of named entities and multi-word expressions. Furthermore, as a preprocessing step, we introduced a dictionary-based recognition method for tokenization of multi-word expressions. This method complements...
15	Analyse de programmes malveillants par abstraction de comportements / Malware Analysis by Behavior Abstraction
Beaucamps, Philippe, 14 November 2011
Traditional behavior analysis usually operates at the implementation level of malicious behaviors. Yet it is mostly concerned with the identification of given functionalities and is therefore more naturally defined at a functional level. In this thesis, we define a form of program behavior analysis which operates on the function realized by a program rather than on its elementary interactions with the system. This function is extracted from program traces, a process we call abstraction. We define in a simple, intuitive and formal way the basic functionalities to abstract and the behaviors to detect; we then propose an abstraction mechanism applicable to both static and dynamic analysis settings, with practical algorithms of reasonable complexity; finally, we describe a behavior analysis technique integrating this abstraction mechanism. Our method is particularly suited to the analysis of programs written in high-level languages or with known source code, for which static analysis is facilitated: mobile applications for .NET or Java, scripts, browser add-ons, off-the-shelf components.
The formalism we propose for behavior analysis by abstraction relies on the theory of string and term rewriting, regular word and tree languages, and model checking. It allows an efficient identification of functionalities in traces and thus the construction of a representation of traces at a functional level; it defines functionalities and behaviors in a natural way, using temporal logic formulas, which ensures their simplicity and flexibility and enables the use of model checking techniques for behavior detection; it operates on an unrestricted set of execution traces; it handles the data flow in execution traces; and it allows uncertainty in the identification of functionalities to be taken into account, with no complexity overhead. Experiments have been conducted in both dynamic and static analysis settings.
Malware detection; Behavioral analysis; Behavior abstraction; Model checking; String rewriting; Term rewriting; Static analysis; Finite state automata
005.8
16	Quelques développements combinatoires autour des groupes de Coxeter et des partitions d'entiers / Some combinatorial developpements about Coxeter Groups and integer partitions
Pétréolle, Mathias, 25 November 2015
This thesis focuses on enumerative combinatorics, particularly on integer partitions and Coxeter groups. In the first part, following Han and Nekrasov-Okounkov, we study combinatorial expansions of powers of the Dedekind eta function in terms of hook lengths of integer partitions. Our approach, which is bijective, uses the Macdonald identities in affine types (in particular type C), generalizing Han's approach in type A. We then extend these expansions with new parameters, through new properties of the Littlewood decomposition with respect to the partitions and statistics considered. This enables us to deduce symplectic hook length formulas and a connection with representation theory. In the second part, we study the cyclically fully commutative (CFC) elements in Coxeter groups, introduced by Boothby et al., which form a subfamily of the fully commutative elements. We start by introducing a new construction, the cylindrical closure, which gives a theoretical framework for the CFC elements analogous to Viennot's heaps for fully commutative elements. We give a characterization of CFC elements in terms of cylindrical closures in any Coxeter system. This allows us to characterize these elements in terms of reduced decompositions in all finite and affine Coxeter groups, and to deduce their enumeration in those groups. By using the theory of finite state automata, we show that the generating function of these elements is always rational, in all Coxeter groups.
Macdonald identities; Dedekind's eta function; Integer partitions; Coxeter group; Heap; Cyclically fully commutative elements; Finite state automata
511.6
