91 |
Query Segmentation For E-Commerce Sites
Gong, Xiaojing. 12 July 2013
Indiana University-Purdue University Indianapolis (IUPUI) / Query segmentation is an integral part of natural language processing: it analyzes a user's query and divides it into separate phrases. Published work on query segmentation focuses on web search, using the Google n-gram frequency corpus, or on text retrieval from relational databases. However, segmentation is also useful for product search in E-Commerce. In this thesis, we discuss query segmentation in the E-Commerce context. We propose a hybrid unsupervised segmentation method that uses a prefix tree, mutual information, and relative frequency counts to compute the score of query pairs, and that draws on Wikipedia to recognize new words. Furthermore, we use two E-Commerce-specific evaluation methods to quantify the accuracy of our query segmentation method.
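As a rough illustration of the mutual-information component only (not the thesis's hybrid method, which also uses a prefix tree, relative frequency counts, and Wikipedia), the following sketch scores adjacent word pairs by pointwise mutual information estimated from a query log and splits a query wherever the score falls below a threshold. The function names, the toy query log, and the threshold value are assumptions made for this example.

```python
import math
from collections import Counter

def train_counts(queries):
    """Count single words and adjacent word pairs in a list of query strings."""
    unigrams, bigrams = Counter(), Counter()
    for q in queries:
        words = q.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def pmi(w1, w2, unigrams, bigrams):
    """Pointwise mutual information of an adjacent pair; -inf if never seen."""
    pair = bigrams[(w1, w2)]
    if pair == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_pair = pair / n_bi
    p1 = unigrams[w1] / n_uni
    p2 = unigrams[w2] / n_uni
    return math.log(p_pair / (p1 * p2))

def segment(query, unigrams, bigrams, threshold=1.5):
    """Keep adjacent words together when their PMI exceeds the threshold."""
    words = query.lower().split()
    if not words:
        return []
    segments = [[words[0]]]
    for prev, cur in zip(words, words[1:]):
        if pmi(prev, cur, unigrams, bigrams) > threshold:
            segments[-1].append(cur)   # strong association: same phrase
        else:
            segments.append([cur])     # weak association: start a new phrase
    return [" ".join(s) for s in segments]

# Hypothetical query log, for illustration only.
LOG = ["hello kitty backpack", "hello kitty", "red backpack",
       "red shoes", "hello kitty shoes", "blue backpack"]
UNIGRAMS, BIGRAMS = train_counts(LOG)
```

On a log this small the PMI estimates are inflated, so the threshold would need tuning against real query data.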
|
92 |
[pt] POLISSEMIA DO PREFIXO DES- EM SUBSTANTIVOS DE AÇÃO NO PORTUGUÊS BRASILEIRO: UMA ANÁLISE DA LÍNGUA EM USO / [en] POLYSEMY OF THE PREFIX DES- IN ACTION NOUNS IN BRAZILIAN PORTUGUESE: AN ANALYSIS OF THE LANGUAGE IN USE
CARLOS GUSTAVO CAMILLO PEREIRA. 22 May 2020
[pt] Este trabalho investiga as acepções do prefixo des em substantivos de ação no Português Brasileiro por meio de uma abordagem da língua em uso e focaliza a importância do contexto para o reconhecimento do sentido que está sendo ativado pelo afixo. Inicialmente, são apresentadas as principais contribuições da Tradição Gramatical Normativa e da Teoria da Linguística Gerativa para a análise do referido prefixo. Posteriormente, são detalhadas as bases teóricas que compõem este trabalho, que se fundamentam nos pressupostos da Linguística Cognitiva, paradigma de investigação dos estudos da linguagem que enfatiza a importância do sentido em uma perspectiva não-objetivista e da investigação da língua em uso. O corpus utilizado nesta pesquisa foi constituído a partir do mega-corpus eletrônico NILC da Universidade de São Paulo do campus de São Carlos. Os resultados da análise de dados revelam que a acepção do afixo des- na língua em uso é altamente influenciada pelo contexto, de maneira que é possível uma mesma palavra possuir sentidos diferentes, conforme esteja em diferentes situações de uso. / [en] This dissertation investigates the meanings of the prefix des in action nouns in Brazilian Portuguese through an approach to the language in use and focuses on the importance of context for the recognition of the meaning being activated by the affix. Initially, the main contributions of the Normative
Grammatical Tradition and the Theory of Generative Linguistics to the analysis of this prefix are presented. Subsequently, the theoretical bases of this work are detailed; they rest on the assumptions of Cognitive Linguistics, a research paradigm in language studies that emphasizes the importance of meaning from a non-objectivist perspective and the investigation of language in use. The corpus used in this research was built from the NILC electronic mega-corpus of the University of São Paulo, São Carlos campus. The results of the data analysis reveal that the meaning of the affix des- in language in use is highly influenced by context, so that the same word can have different meanings in different situations of use.
|
93 |
Ανάπτυξη συστημάτων δημοσιεύσεων/συνδρομών σε δομημένα δίκτυα ομοτίμων εταίρων / Content-based publish/subscribe systems over DHT-based Peer-to-Peer Networks
Αικατερινίδης, Ιωάννης. 18 April 2008
Τα τελευταία χρόνια οι εφαρμογές συνεχούς μετάδοσης ροών πληροφορίας στο διαδίκτυο έχουν γίνει ιδιαίτερα δημοφιλείς. Με τον συνεχώς αυξανόμενο ρυθμό εισόδου νέων αντικειμένων πληροφορίας, γίνεται ολοένα και πιο επιτακτική η ανάγκη για την ανάπτυξη πληροφορικών συστημάτων που να μπορούν να προσφέρουν στους χρήστες τους μόνο εκείνες τις πληροφορίες που τους ενδιαφέρουν, φιλτράροντας τεράστιους όγκους από άσχετες για τον κάθε χρήστη, πληροφορίες. Ένα μοντέλο διάδοσης πληροφορίας ικανό να ενσωματώσει τέτοιου είδους ιδιότητες, είναι το μοντέλο δημοσιεύσεων/συνδρομών βασισμένο στο περιεχόμενο ( content-based publish/subscribe)
Βασική συνεισφορά μας στο χώρο είναι η εφαρμογή του μοντέλου δημοσιεύσεων/συνδρομών βασισμένου στο περιεχόμενο (content-based publish/subscribe) πάνω στα δίκτυα ομοτίμων ώστε να μπορέσουμε να προσφέρουμε στους χρήστες υψηλή εκφραστικότητα κατά την δήλωση των ενδιαφερόντων τους, λειτουργώντας σε ένα πλήρως κατανεμημένο και κλιμακώσιμο περιβάλλον. Ο κορμός των προτεινόμενων λύσεων σε αυτή τη διατριβή είναι: (α) η ανάπτυξη αλγορίθμων για την αποθήκευση των κλειδιών των δημοσιεύσεων σε κατάλληλους κόμβους του δικτύου με βάση τις συνθήκες στο περιεχόμενο που έχουν δηλωθεί και (β) αλγορίθμων δρομολόγησης δημοσιεύσεων στο διαδίκτυο έτσι ώστε να «συναντούν» αυτούς τους κόμβους οι οποίοι περιέχουν συνδρομές που ικανοποιούνται από την πληροφορία της δημοσίευσης.
Οι προτεινόμενοι αλγόριθμοι υλοποιήθηκαν και εξετάσθηκαν ενδελεχώς με προσομοίωση μελετώντας την απόδοσή τους με βάση μετρικές όπως: η δίκαιη κατανομή του φόρτου στους κόμβους του δικτύου από τη διακίνηση μηνυμάτων κατά την επεξεργασία των συνδρομών/δημοσιεύσεων, ο συνολικός αριθμός μηνυμάτων που διακινούνται, ο συνολικός όγκος επιπλέον πληροφορίας που απαιτούν οι αλγόριθμοι να εισέλθει στο δίκτυο (network bandwidth), και ο χρόνος που απαιτείται για την ανεύρεση των συνδρομών που συζευγνύουν με κάθε δημοσίευση. / In the past few years, applications built on continuous data streams have become particularly popular. As the rate at which new information arrives keeps increasing, there is a pressing need for infrastructures that deliver only the information users are interested in, filtering out large volumes of information that is irrelevant to each user. The content-based publish/subscribe model is capable of handling large volumes of data traffic in a distributed, fully decentralized manner.
Our basic contribution in this research area is coupling the content-based publish/subscribe model with structured (DHT-based) peer-to-peer networks, offering users high expressiveness in stating their interests. The proposed infrastructure operates in a distributed and scalable environment. The solutions proposed in this thesis concern the development and testing of (a) a number of algorithms for subscription processing in the network and (b) a number of algorithms for processing publication events.
The proposed algorithms were implemented and thoroughly tested through detailed simulation-based experiments. The performance metrics are: the fairness of the load distribution across network nodes arising from message traffic while processing subscriptions and publication events, the total number of messages generated, the total volume of additional information the algorithms require to operate, and the time required to match publication events to subscriptions.
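To make the general idea concrete, here is a minimal sketch of content-based publish/subscribe on top of a toy consistent-hash ring that stands in for a DHT. It is not the thesis's algorithm: the rendezvous rule (hash the attribute name to pick the storing node), the class name `Ring`, the node names, and the single-attribute subscriptions are all assumptions chosen to keep the example short.

```python
import hashlib
from bisect import bisect_left

class Ring:
    """Toy consistent-hash ring standing in for a DHT overlay."""

    def __init__(self, node_names):
        # Each node owns the arc of the hash space that ends at its own hash.
        self.points = sorted((self._h(n), n) for n in node_names)
        self.store = {n: [] for n in node_names}  # node -> stored subscriptions

    @staticmethod
    def _h(key):
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        """Return the node responsible for key (its successor on the ring)."""
        hashes = [h for h, _ in self.points]
        i = bisect_left(hashes, self._h(key)) % len(self.points)
        return self.points[i][1]

    def subscribe(self, subscriber, attribute, predicate):
        """Store a one-attribute subscription at the attribute's rendezvous node."""
        node = self.lookup(attribute)
        self.store[node].append((subscriber, attribute, predicate))

    def publish(self, event):
        """Visit the rendezvous node of each event attribute; return matched subscribers."""
        matched = set()
        for attribute, value in event.items():
            node = self.lookup(attribute)
            for subscriber, attr, predicate in self.store[node]:
                if attr == attribute and predicate(value):
                    matched.add(subscriber)
        return matched

ring = Ring(["n1", "n2", "n3", "n4"])
ring.subscribe("alice", "price", lambda v: v < 100)
ring.subscribe("bob", "symbol", lambda v: v == "IBM")
```

Hashing only the attribute name concentrates load on the nodes that own popular attributes, which is exactly the kind of imbalance the load-fairness metric above would expose.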
|
94 |
Average case analysis of algorithms for the maximum subarray problem
Bashar, Mohammad Ehsanul. January 2007
The Maximum Subarray Problem (MSP) is to find the contiguous portion of an array that maximizes the sum of its elements. The goal is to locate the most useful and informative array segment that associates two parameters involved in data in a 2D array. It is an efficient data mining method that gives an accurate pattern or trend of the data with respect to some associated parameters. Distance Matrix Multiplication (DMM) is at the core of MSP, and DMM and MSP have worst-case complexity of the same order, so improving the algorithm for DMM also improves MSP. The complexity of conventional DMM is O(n³). In the average case, the All Pairs Shortest Path (APSP) problem can be adapted as a fast engine for DMM and solved in O(n² log n) expected time. Using this result, MSP can be solved in O(n² log² n) expected time. MSP can be extended to K-MSP. To incorporate DMM into K-MSP, DMM needs to be extended to K-DMM as well. In this research we show how DMM can be extended to K-DMM using the K-Tuple Approach to solve K-MSP in O(Kn² log² n log K) time when K ≤ n/log n. We also present the Tournament Approach, which solves K-MSP in O(n² log² n + Kn²) time and outperforms the K-Tuple
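For orientation, the sketch below shows the classical baselines that the abstract builds on, not the thesis's average-case algorithms: Kadane's linear scan for the 1D problem, the standard O(n³) reduction of the 2D problem to repeated 1D scans, and the conventional O(n³) distance (min-plus) matrix product. Function names and the sample grid are choices made for this example.

```python
def max_subarray_1d(values):
    """Kadane's algorithm: return (best_sum, start, end), end inclusive. O(n)."""
    best_sum, best_start, best_end = values[0], 0, 0
    cur_sum, cur_start = values[0], 0
    for i in range(1, len(values)):
        if cur_sum < 0:
            cur_sum, cur_start = values[i], i   # a negative running sum never helps
        else:
            cur_sum += values[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end

def max_subarray_2d(grid):
    """Fix a pair of rows, collapse the columns between them, run Kadane.

    Returns (best_sum, top, left, bottom, right); O(n^3) for an n x n grid.
    """
    rows, cols = len(grid), len(grid[0])
    best = None
    for top in range(rows):
        col_sums = [0] * cols
        for bottom in range(top, rows):
            for c in range(cols):
                col_sums[c] += grid[bottom][c]
            s, left, right = max_subarray_1d(col_sums)
            if best is None or s > best[0]:
                best = (s, top, left, bottom, right)
    return best

def distance_matrix_product(a, b):
    """Conventional DMM: c[i][j] = min over k of a[i][k] + b[k][j]. O(n^3)."""
    n, m, p = len(a), len(b), len(b[0])
    return [[min(a[i][k] + b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

GRID = [[1, -2, 0],
        [-3, 4, 5],
        [2, -1, 3]]
```

Both cubic routines are the reference points against which the expected-time bounds quoted in the abstract should be read.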
|
95 |
Les affixes/créments dans le lexique de l'arabe : exploration du niveau submorphémique de l'arabe / Affixes / crements in the lexicon of Arabic : Exploration of the submorphemic level in Arabic
Khchoum, Salem. 04 December 2014
Cette thèse s’inscrit dans le cadre des travaux de révision de la structure de la racine sémitique en général, et arabe en particulier. Dans l’introduction, nous avons entamé une relecture critique des efforts des grammairiens arabes en quête du seuil minimal du sens, et qui ont abouti à poser la racine trilitère comme étant ce seuil ultime et inanalysable. Ce choix a été fait malgré tous les signes d’instabilité que comporte ce concept - tels que son incapacité à expliquer la réversibilité de l’ordre des consonnes et leur variation phonétique indépendamment du sens. C’est un choix synchronique anhistorique qui exclut la notion de temps, et émane d’une conception révélationniste (tawqîf) du langage. Au XIXe siècle, l'évolutionnisme darwiniste, étendu à la philologie, met à mal cette conception figée de la racine. En intégrant la notion de temps, nombre d’Orientalistes, suivis par quelques philologues arabes de l’époque, ont montré à travers une approche comparative que la racine trilitère est une forme évoluée d’une base primitive bilitère (ou monosyllabique). Depuis, plusieurs explications ont été proposées pour la formation de la racine trilitère à partir d’une base bilitère (croisement, incrémentation, affixation). Certains linguistes comme Hurwitz ont essayé d’identifier et de systématiser par déduction les éléments ternaires et leurs valeurs sémantiques. Dans notre travail, qui s'inscrit dans le cadre de la théorie des matrices et des étymons, nous démontrons à travers l’analyse synchronique de près de 1000 items que la racine trilitère est analysable en termes d’étymons bilitères et de créments, ou affixes. Les éléments affixables ou incrémentables à la base sont phonétiquement les mêmes (gutturales, sonantes, labiales, nasales) et corrélés souvent aux mêmes valeurs grammaticales (factitif, statif, moyen) ou sémantiques (l’intensif). La troisième position de la racine est la position privilégiée dans ce processus d’affixation/incrémentation. 
La racine trilitère n’est donc pas le seuil minimal du sens, et s’avère réductible à une base biconsonantique rendue trilitère grâce à un segment crément ou affixe. Ceci peut avoir un effet sur notre conception du lexique arabe, désormais réorganisable autour des bases bilitères soit abstraites (les traits phonétiques), soit concrètes (les étymons primitifs), qui sont à leur tour transformables en radicaux trilitères grâce à une liste préalable de créments et affixes spécifiant la signification primordiale véhiculée par la base bilitère. / This thesis is part of the ongoing work to revise the structure of the Semitic root in general, and the Arabic root in particular. In the introduction, we present a critical review of the efforts of Arab grammarians in their quest for the minimal threshold of meaning, a quest that resulted in establishing the triliteral root as the ultimate, unanalysable unit. This choice was made despite the many obvious shortcomings of the concept, such as its inability to explain the reversal of the order of consonants or their phonetic variation independently of meaning. It is an ahistorical, synchronic choice that excludes the notion of time and is rooted in a revelationist (tawqîf) conception of language. At the end of the nineteenth century, Darwinian evolutionism, extended to philology, undermined this static conception of the root. By integrating the notion of time, a number of Orientalists, followed by some Arab philologists of that period, showed through a comparative methodology that the triliteral root evolved from a primitive biliteral (or monosyllabic) base. Since then, several explanations have been proposed for the formation of the triliteral root from a biliteral base (crossing of biliteral roots, affixation of formative increments or determinatives). Some linguists, such as Hurwitz, tried to identify and systematize by deduction the ternary elements and their semantic values.
In this work, which is carried out within the framework of the Matrix and Etymons Theory, we demonstrate through the synchronic analysis of nearly 1,000 triliteral items that the triliteral root is analyzable in terms of biliteral etymons and of separable increments or affixes added at the beginning (prefixation), in the middle (infixation), or at the end (suffixation) of biliteral bases. The elements that can be affixed or incremented to the base are phonetically similar (gutturals, sonorants, labials, and some dentals) and often correlate with the same grammatical (factitive, stative, reflexive or middle voice) or semantic (intensive) values. The third position of the root is the favored position in this process of affixation/incrementation. Thus, the triliteral root is not the minimal threshold of meaning and can be reduced to a biconsonantal base made triliteral by an incremental or affixal segment. These findings may affect our conception of the Arabic lexicon, which can now be reorganized around biliteral bases, either abstract (i.e. phonetic features) or concrete (i.e. historical primitive etymons), that are in turn convertible into triliteral radicals through a preliminary list of increments and affixes that specify the primary meaning conveyed by the biliteral base.
|
96 |
La contribution de la traduction à l'expansion lexicale du sesotho / The contribution of translation to the lexical expansion of Sesotho
Sebotsa, Mosisili. 22 November 2016
Si la traduction est simplement définie comme un processus de communication bilingue dont le but général est de reproduire en langue cible un texte qui soit fonctionnellement équivalent au texte de départ (Reiss 2004 : 168-169), l’approche empruntée dans la présente thèse est celle d’une opération interculturelle et systématique qui vise à capturer le message issu d’une langue étrangère, à le décrypter en tenant en compte des nuances culturelles ou inhérentes à la discipline, et à le rendre le plus clairement possible en se servant d'éléments linguistiques et extralinguistiques compréhensibles dans la langue du locuteur cible. L'objectif est de déterminer la contribution de la traduction à l’expansion lexicale du sesotho, domaine qui demeure peu exploré par les spécialistes de cette langue. La problématique de ce travail repose sur la constatation que les néologismes en sesotho ne sont pas documentés de manière satisfaisante, si bien qu’il est difficile d'évaluer la contribution de la traduction à l’expansion lexicale. Les études antérieures sur la morphologie, la dérivation, la composition, l’emprunt et la dénomination s’appuient sur la mesure de la productivité, soulevant la question de savoir si la traduction en soi contribue à l’enrichissement terminologique du sesotho. Le point de départ de la thèse est l'hypothèse selon laquelle l’interaction avec le monde européen a nécessité de traduire de nombreux concepts qui n’existaient pas dans les systèmes traditionnels du Lesotho, ce qui a entraîné un nouveau dynamisme qui a permis de combler des lacunes terminologiques évidentes et de s’ouvrir et de s’adapter aux nouvelles réalités. Pour mettre cette hypothèse à l’épreuve et arriver à des conclusions éclairées et fiables, je cherche à répondre à trois questions : 1) Quelle est la structure des mots sesothos par rapport à celle de l’anglais en tant que langue source de traduction en sesotho, et du français en tant que langue de rédaction de la thèse ? 
2) Etant donné que le sesotho est utilisé concomitamment avec l’anglais sans pour autant être la langue d'une culture inventrice en matière technologique, quel est le rôle que joue l’emprunt dans son expansion lexicale ? 3) D’un point de vue lexicologique, comment le sesotho répond-il aux besoins terminologiques dans les domaines de spécialité techno-scientifiques ? Pour y répondre, je m'appuie sur Doke (1954) et Matšela et al. (1981) pour situer le sesotho parmi les langues bantoues, préciser les fonctions du préfixe classificateur et établir la différence entre les composés sesothos d'une part et les composés anglais et français d'autre part. J’utilise ensuite la théorie avancée par Lederer (1990) pour démontrer l’influence syntaxique, sémantique et morphologique que l’anglais a sur le sesotho et pour présenter les différents procédés d’emprunt du sesotho. Diki-Kidiri (2008), Dispaldro et al. (2010) et Baboya (2008) démontrent la nécessité de faire appel aux informateurs-spécialistes pour confirmer l’hypothèse de départ. Les résultats obtenus mettent en évidence qu’en effet, la traduction a contribué à l’expansion du sesotho moderne, bien que cela n’ait pas été documenté, d’où la recommandation d'un travail collaboratif entre lexicologues au Lesotho, au Botswana, en Namibie et en Afrique du Sud, pour ouvrir de nouvelles perspectives d’études linguistiques sur le sesotho et pouvoir suivre et mesurer l’évolution de la langue. 
/ Whilst translation is simply defined as a communication process whose main objective is to reproduce in the target language a text that is functionally equivalent to the source text (Reiss 2004: 168-169), the approach taken in this study views translation as an intercultural and systematic operation whose objective is to capture the message from the foreign language, to decrypt it taking into account cultural nuances or those inherent in the field at hand, and to render it as clearly as possible using linguistic and extralinguistic elements that are comprehensible to the speaker of the target language. This study aims to determine the contribution of translation to the lexical expansion of Sesotho, an area that has been little explored by specialists of the language. The core issue is the observation that Sesotho neologisms are not well documented, so that it is hard to measure the contribution of translation to the lexical expansion of Sesotho. Analyses of morphology, derivation, compounding, borrowing and denomination focus mainly on productivity in order to determine whether translation as a discipline contributes to the creation of new words in the language. The study begins by positing the hypothesis that interaction with the Western world necessitated the translation of numerous concepts that were absent from the then existing Sesotho systems. This interaction contributed a new dynamism that helped the language to bridge terminological gaps and to open up and adapt to new realities. In order to put this hypothesis to the test and arrive at well-researched and reliable conclusions, I probe three issues. Firstly, what is the structure of the Sesotho language compared to that of English, the source language of most translations into Sesotho, and to that of French, the language in which this study is presented?
Secondly, considering that Sesotho is used alongside English even though it is not the language of a technologically inventive culture, what role does borrowing play in the lexical expansion of Sesotho? Thirdly, from the word-formation point of view, how does Sesotho respond to terminological needs in various fields of specialisation? To address these issues, Doke (1954) and Matšela et al. (1981) serve as references to situate Sesotho among the Bantu languages, to highlight the functions of the class prefix, and to establish the difference between Sesotho compounding on the one hand and English and French compounding on the other. Secondly, the theory advanced by Lederer (1990) serves as a springboard to analyse the syntactic, semantic and morphological influences that English has on Sesotho and to present the different borrowing processes. The third issue is addressed on the basis of the theories presented by Diki-Kidiri (2008), while the theories proposed by Dispaldro et al. (2010) and Baboya (2008) led to the decision to call upon specialist informants to confirm the original hypothesis. The results provide evidence that translation has, in fact, contributed to the lexical expansion of modern Sesotho, even though this has not been well documented. The study recommends collaborative work between linguists in Lesotho, Botswana, Namibia and South Africa in order to open new avenues of linguistic study on Sesotho, with the aim of measuring and monitoring the evolution of the language.
|
97 |
Generická analýza toků v počítačových sítích / Generic Flow Analysis in Computer Networks
Jančová, Markéta. January 2020
This thesis deals with describing network traffic by means of an automatically created communication model. The main focus is communication in control systems, which use special protocols such as IEC 60870-5-104. We present a method for characterizing network traffic in terms of both the content of the communication and its behavior over time. To build the description, the method uses deterministic finite automata, prefix trees, and repeatability analysis. The second part of this master's thesis focuses on the implementation of a program that can verify network traffic in real time against such a communication model.
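A minimal sketch of the prefix-tree part of such a model, assuming conversations have already been reduced to sequences of message-type tokens: reference traffic is inserted into a trie, and a new conversation is accepted only while it follows a path seen before. The class name, the tokenization, and the token strings (loosely modeled on IEC 60870-5-104 frame names) are assumptions for this example; the timing and repeatability analysis mentioned above is not covered.

```python
class PrefixTreeModel:
    """Prefix tree (trie) over message-type sequences seen in reference traffic."""

    def __init__(self):
        self.root = {}  # nested dicts: token -> subtree

    def learn(self, conversation):
        """Add one observed conversation (a sequence of tokens) to the model."""
        node = self.root
        for message in conversation:
            node = node.setdefault(message, {})

    def verify(self, conversation):
        """Return the index of the first unexpected message, or None if all fit."""
        node = self.root
        for i, message in enumerate(conversation):
            if message not in node:
                return i
            node = node[message]
        return None

# Illustrative reference conversations; the token names are assumptions.
model = PrefixTreeModel()
model.learn(["STARTDT_act", "STARTDT_con", "I", "S"])
model.learn(["STARTDT_act", "STARTDT_con", "TESTFR_act", "TESTFR_con"])
```

Because `verify` consumes one message at a time, the same walk can run incrementally on live traffic; the trie is itself a deterministic automaton whose states are the prefixes seen so far.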
|
98 |
Automatický tarifikační systém telefonních hovorů / Automated accounting system of phone calls
Kondler, Radim. January 2016
The thesis deals with numbering plans, PBXs, tariffs, the design of a tariff system, and its possible extensions. For numbering plans, it covers the formats of national and international numbers, describes the different types of these numbers, and gives the structure and definition of their fields, including field lengths. Prefixes are given for individual countries and for the regions of the Czech Republic, for both fixed and mobile networks. Special number formats whose length differs from the standard types are also mentioned. All of these formats are listed in accordance with ITU-T. For exchanges, the thesis covers the generations of switches, the use of PBXs within the telephone network, and a description of selected functions. It also describes the kinds of tariffs and the options for pricing a call. The thesis then describes the automated accounting system: the principle on which the program works and the individual steps and functions that together process each call, convert numbers to international form, assign a tariff to the call, and price the call according to that tariff. Finally, possible extensions of this automated accounting system are described.
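The three processing steps named above (convert the number to international form, assign a tariff, price the call) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's program: the tariff table, its prices, the per-minute billing rule, and all function names are hypothetical; only the country code 420 for the Czech Republic is taken as given.

```python
from decimal import Decimal

HOME_COUNTRY_CODE = "420"  # Czech Republic

# Hypothetical tariff table: international prefix -> price per started minute.
TARIFFS = {
    "420": Decimal("1.00"),   # assumed national rate
    "4206": Decimal("2.50"),  # assumed rate for one national number range
    "421": Decimal("4.00"),   # assumed rate for one foreign country code
    "": Decimal("9.00"),      # fallback for every other destination
}

def to_international(number):
    """Normalize a dialed number to digits in international form (no '+' or '00')."""
    digits = "".join(ch for ch in number if ch.isdigit() or ch == "+")
    if digits.startswith("+"):
        return digits[1:]
    if digits.startswith("00"):
        return digits[2:]
    return HOME_COUNTRY_CODE + digits  # national number: prepend home country code

def find_tariff(international_number):
    """Longest-prefix match against TARIFFS; the empty prefix always matches."""
    for length in range(len(international_number), -1, -1):
        prefix = international_number[:length]
        if prefix in TARIFFS:
            return prefix, TARIFFS[prefix]

def rate_call(number, seconds):
    """Price a call, billing every started minute (an assumed billing rule)."""
    _, per_minute = find_tariff(to_international(number))
    minutes = -(-seconds // 60)  # ceiling division
    return per_minute * minutes
```

Longest-prefix matching lets a specific range override the rate of the country code it sits under without any special-casing.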
|