Global ETD Search

1	Ontology for cultural variations in interpersonal communication: building on theoretical models and crowdsourced knowledge Thakker, Dhaval, Karanasios, S, Blanchard, E., Lau, L., Dimitrova, V. 05 May 2017 (has links) The domain of cultural variations in interpersonal communication is becoming increasingly important in various areas, including human-human interaction (e.g. business settings) and human-computer interaction (e.g. during simulations, or with social robots). User generated content (UGC) in social media can provide an invaluable source of culturally diverse viewpoints for supporting the understanding of cultural variations. However, discovering and organizing UGC is notoriously challenging and laborious for humans, especially in ill-defined domains such as culture. This calls for computational approaches to automate the UGC sensemaking process by using tagging, linking and exploring. Semantic technologies allow automated structuring and qualitative analysis of UGC, but are dependent on the availability of an ontology representing the main concepts in a specific domain. For the domain of cultural variations in interpersonal communication, no ontological model exists. This paper presents the first such ontological model, called AMOn+, which defines cultural variations and enables tagging culture-related mentions in textual content. AMOn+ is designed based on a novel interdisciplinary approach that combines theoretical models of culture with crowdsourced knowledge (DBpedia). An evaluation of AMOn+ demonstrated its fitness-for-purpose regarding domain coverage for annotating culture-related concepts mentioned in text corpora. This ontology can underpin computational models for making sense of UGC. Ontology Knowledge engineering Culture Crowdsourced knowledge Semantic tagging
2	DEXTER: Generating Documents by means of computational registers Oldham, Joseph D. 01 January 2000 (has links) Software is often capable of efficiently storing and managing data on computers. However, even software systems that store and manage data efficiently often do an inadequate job of presenting data to users. A prototypical example is the display of raw data in the tabular results of SQL queries. Users may need a presentation that is sensitive to data values and sensitive to domain conventions. One way to enhance presentation is to generate documents that correctly convey the data to users, taking into account the needs of the user and the values in the data. I have designed and implemented a software approach to generating human-readable documents in a variety of domains. The software to generate a document is called a computational register, or "register" for short. A register system is a software package for authoring and managing individual registers. Registers generating documents in various domains may be managed by one register system. In this thesis I describe computational registers at an architectural level and discuss registers as implemented in DEXTER, my register system. Input to DEXTER registers is a set of SQL query results. DEXTER registers use a rule-based approach to create a document outline from the input. A register creates the output document by using flexible templates to express the document outline. The register approach is unique in several ways. Content determination and structural planning are carried out sequentially rather than simultaneously. Content planning itself is broken down into data re-representation followed by content selection. No advanced linguistic knowledge is required to understand the approach. Register authoring follows a course very similar to writing a single document.
The internal data representation and content planning steps allow registers to use flexible templates, rather than more abstract grammar-based approaches, to render the final document. Computational registers are applicable in a variety of domains. What registers can be written is restricted not by domain, but by the original data representation. Finally, DEXTER shows that a single software suite can assist in authoring and management of a variety of registers.
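The stages named in the abstract above (data re-representation, content selection, then rendering through templates) can be illustrated with a minimal Python sketch. This is not DEXTER's implementation: the row fields, the threshold, the two rules and the template strings are all hypothetical, chosen only to show the stages running one after another on something shaped like an SQL query result.

```python
from string import Template

# Hypothetical SQL query result (field names and values are assumptions).
rows = [
    {"name": "A. Smith", "systolic": 152, "diastolic": 95},
    {"name": "B. Jones", "systolic": 118, "diastolic": 76},
]


def rerepresent(row):
    """Data re-representation: map raw values onto a domain-level category."""
    high = row["systolic"] >= 140 or row["diastolic"] >= 90  # assumed threshold
    return {**row, "status": "elevated" if high else "normal"}


def select_content(facts):
    """Content selection: simple rules decide which outline item each fact becomes."""
    outline = []
    for fact in facts:
        kind = "alert" if fact["status"] == "elevated" else "summary"
        outline.append((kind, fact))
    return outline


# Flexible templates, one per outline item kind (hypothetical wording).
TEMPLATES = {
    "alert": Template("$name has elevated blood pressure ($systolic/$diastolic)."),
    "summary": Template("$name's blood pressure is normal."),
}


def render(outline):
    """Express the outline as text by filling the template for each item."""
    return " ".join(TEMPLATES[kind].substitute(fact) for kind, fact in outline)


def generate_document(query_rows):
    """Run the three stages in sequence, as the abstract describes."""
    facts = [rerepresent(row) for row in query_rows]
    return render(select_content(facts))


print(generate_document(rows))
```

Keeping the stages as separate functions mirrors the abstract's point that content determination and structural planning happen sequentially, so a new document type would mainly need new rules and templates.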
3	Next Page Prediction With Popularity Based Page Rank, Duration Based Page Rank And Semantic Tagging Approach Yanik, Banu Deniz 01 February 2012 (has links) (PDF) Page rank and semantic information are frequently used in next page prediction systems. In our work, we extend the use of the Page Rank algorithm for next page prediction with several navigational attributes: the size of the page, the duration of the page visit, the duration of the transition (two sequential page visits), and the frequency of pages and transitions. In our model, we define the popularity of transitions and pages by using duration information in relation to page size and visit frequency. By using the popularity value of pages, we bias the conventional Page Rank algorithm and model a next page prediction system that produces page recommendations for a given top-n value. Moreover, we extract semantic terms from web URLs in order to tag pages semantically. The extracted terms are mapped onto web URLs at different levels of detail in order to find semantically similar pages for next page recommendations. With this tagging, we model another next page prediction method, which uses Semantic Tagging (ST) similarity and exploits Popularity Based Page Rank (PPR) values as a supportive method. Moreover, we model a Hybrid Page Rank (HPR) algorithm that uses both the Semantic Tagging based approach and the PPR values of pages together in order to investigate the effect of PPR and ST with equal weights. In addition, we investigate the effect of local (a synopsis of the directed web graph) and global (the whole directed web graph) modeling on next page prediction accuracy. QA Computer Software 76.75-76.765
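One way to picture a popularity-biased Page Rank is as a personalized Page Rank whose teleport distribution is proportional to a per-page popularity score. The Python sketch below works under that assumption and is not the thesis's formulation: the abstract does not state the popularity formula, so the `popularity` function (duration divided by size, times frequency), the sample graph and the sample measurements are all hypothetical.

```python
def popularity(duration, size, frequency):
    """Assumed popularity score: time spent per unit of page size, weighted by visits."""
    return (duration / size) * frequency


def biased_pagerank(links, bias, damping=0.85, iterations=100):
    """Power iteration where teleportation follows the normalized bias scores."""
    pages = list(links)
    total = sum(bias[p] for p in pages)
    teleport = {p: bias[p] / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank according to the teleport bias.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


def recommend(links, rank, current, top_n):
    """Return the top-n successors of the current page, ordered by rank."""
    return sorted(links[current], key=lambda p: rank[p], reverse=True)[:top_n]


# Hypothetical site graph and navigation measurements.
links = {"home": ["news", "shop"], "news": ["home"], "shop": ["home", "news"]}
bias = {
    "home": popularity(duration=30, size=10, frequency=50),
    "news": popularity(duration=120, size=40, frequency=20),
    "shop": popularity(duration=90, size=15, frequency=10),
}
rank = biased_pagerank(links, bias)
print(recommend(links, rank, "home", top_n=1))
```

The only difference from conventional Page Rank in this sketch is the non-uniform `teleport` vector; with equal bias values it reduces to the standard algorithm.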
4	Developing UCAF, an administrative functionality for the U-Call IVR reporting system Rostami, Asreen January 2014 (has links) Mobile phones and Interactive Voice Response (IVR) applications are increasingly used in developing countries to collect voice-based reports from citizens about bad governance or poor public service delivery. Such systems (e.g. Avaaj Otalo, Foroba Blon, etc.) can give rural users in developing countries an opportunity to easily influence and participate in public affairs. Despite the ongoing efforts to use such solutions, the lack of an efficient system of administration can delay the broadcasting of the collected reports to the relevant authorities. This thesis presents the results of a real-world deployment of an administrative functionality for an IVR system called U-Call, used in the Northern districts of Uganda. U-Call Administrative Functionality (UCAF) interacts with the U-Call administrators through mobile phones and gives the moderator access to the registered users. It allows administrators to easily publish and tag audio reports over the Web using their mobile phones. It also uses a semantic tagging module to increase findability and information categorization on the U-Call website. After an initial validation and successful evaluation of UCAF in the field, during a trip to Uganda, additional features were incorporated, such as a multiple authentication process and dynamic tagging. UCAF and its additional features were successfully delivered to the end user, as part of the U-Call reporting system. / People’s Voices: Developing Cross Media Services to Promote Citizens Participation in Local Governance Activities ICT4D IVR Semantic Tagging VoIP Drupal UCAF U-Call Computer Sciences Datavetenskap (datalogi)
5	Diccionari electrònic bilingüe català>anglés de locucions referencials idiomàtiques de somatismes Escolano Marín, Xènia 22 July 2021 (has links) The ultimate aim of this research is to design a Catalan>English bilingual electronic dictionary of somatic idioms. To do so, we undertake a semasiological characterisation of somatic idioms to determine their semantic value and morphosyntactic combinatorial possibilities going from language to abstraction, identifying the differences in conceptualisation between Catalan and English and their equivalence relationship on the basis of their occurrences in corpora. The methodology used for the description of these somatisms is based on lexicogrammar but in a bottom-up manner. The theory of lexicogrammar (M. Gross 1975, 1981) holds that every elementary phrase is constituted by at least one first order predicate that introduces its arguments, represented by nouns or phrases. For example, in the phrase 'Luc admire le courage de Léa', we can find the verbal predicate 'admirer' with its arguments 'Luc' and 'Léa', that are morphologically equivalent to 'Luc est admiratif (pour + devant) le courage de Léa', with an adjective predicate, and 'Luc a de l'admiration pour le courage de Léa', with a nominal predicate, since there is no change in the structure of arguments (Le Pesant & Mathieu-Colas 1998: 8). This entails a systematic description of the syntactic and semantic properties of verbs, predicate nouns and adjectives in the form of tables that represent a class of lexical elements –characterised in terms of syntactic-semantic features (such as human, animal, plant, concrete, abstract, locative and temporal)–which correspond to a certain syntactic category with a series of common distributional and transformational properties. However, this description is insufficient for the complete automatic treatment of language; this is why in 1992 G. 
Gross adds to lexicogrammar the notion of object classes: semantic groups defined by the syntactic relations they maintain with one or more classes of verbs, called appropriate predicates (Gross 2012: 101). This exhaustive description of the language has made it possible to consider certain linguistic phenomena like fixation for the automatic treatment of language. Considering that computerised corpora play a very prominent role in usage-based linguistics, a trend that considers grammar and use to be closely interrelated, lexicogrammar can represent a theoretical model particularly suitable for the detection and analysis of phraseological units (PhUs) in corpora. In particular, we focus on the description of the 50 most frequent idioms in Catalan containing one of the five most common anthropomorphic somatic lexemes in all languages: hand, head, heart, eye and ear (Mellado 2004). To identify these 50 most frequent idioms (10 per lexeme), we retrieve all the occurrences of each of the five lexemes from the Corpus Textual Informatitzat de la Llengua Catalana (CTILC) (~ 52M words with texts from 1833-1988). Then we carry out a semi-automatic extraction of the different combinations of the most frequent bigrams (candidates for idioms of the dictionary) for each somatic lexeme using the software Metaconcor. For example, for mà (hand), we obtain: a mà (at hand), a mà (by hand), entre mans (on [one’s] hands), de mà en mà (from hand to hand), mà d’obra (labour force), mà dura (firm hand), picar de mans (to clap [one’s hands]), lligar de peus i mans (to tie [sb’s] hands and feet), a mans plenes (liberally) and rentar-se’n les mans (to wash [one’s] hands [of]). We also consider the most frequent verbal and nominal co-occurrences of each bigram. Once we have translated the Catalan idioms into English, we apply this step to the English equivalents based on their occurrences in the British National Corpus (BNC) (~ 100M words with texts from 1985-1994).
In order to design the dictionary, we undertake a syntactic-semantic description of the Catalan idioms and their equivalents in English –indicating their argument structures and semantic values according to the object class to which we ascribe them (the one referring to the values of <somatisms>)–, based on the occurrences of these units offered by the CTILC for Catalan and the BNC for English. This analysis is expressed through a tagging recognised by automatic language processing systems, based on the one used by the Laboratoire de Linguistique Informatique (LLI) –now LDI (Lexiques, Dictionnaires, Informatique) from the Paris 13 University–, whose dictionaries follow the system of the Laboratoire d’Automatique Documentaire et Linguistique (LADL) (Paris 7) –founded by M. Gross in 1968– and incorporate the notion of G. Gross’ object classes (1992). Having identified the most frequent idioms and their English equivalents, as well as their distributional (syntagmatic relations) and transformational possibilities (paradigmatic relations), we account for their semantic relations (for both languages): conceptual variations (polysemy), intersynonymic variants (relations of synonymy), their possible relations of antonymy and hyperonymy and their paradigmatic variants (according to their occurrences in the corpora used). All this information is registered in the dictionary in the form of four coordinated files with fields of different nature (cf. 2): two files containing the arguments of the idioms (one file for Catalan and another file for English) and two files containing the predicates (i.e. the idioms as entries) (one file for Catalan and another one for English).
Considering the relevance object classes may have in fixed expressions, our dictionary mostly contains idioms –predicates– that combine invariable (argument human nouns) and variable (anthropomorphic somatic lexemes) elements –arguments–: for example, to wash <one’s> hands (of)/C:<so-ma:elre>/G:v/N0:<hum>/N1:<so-ma:elre>/N3:<ina>/Ca:rentar-se les mans (d’) <alguna cosa>, where N0 refers to the subject (a human), and N1 to the first complement (direct object) in the argument structure of this idiom with the somatic lexeme (so) hand, which in this case has a semantic value of refusal of responsibility (elre) (in other cases this lexeme will be in idioms which may evoke, among others, proximity or facility [at hand], manual labour [by hand], activity [on (one’s) hands], severity [firm hand], itinerancy [from hand to hand], human resources [labour force], approval, enthusiasm or attention [to clap (one’s hands)], immobility or repression [to tie (sb’s) hands and feet] and abundance [liberally], which will be reflected in the argument file). N3 is the third complement, which here corresponds to a concrete inanimate object (<ina>)(e.g. “I’m entirely opposed to the er (pause) the idea that they should wash their hands of their (pause) er, obligations er, the nineteen sixty eight act (pause) as” [BNC]). In this case there is no N2 complement, which usually refers to an indirect object. Ca indicates the Catalan equivalent of the idiom. Opting for an electronic phraseological dictionary with a semasiological approach implies, on the one hand, finding the PhUs contained in it more easily, since instead of departing from specific concepts, it departs from the semantic values of the units (idioms) grouped in a linguistically well-defined object class. 
On the other hand, it offers the advantage of being suitable for Natural Language Processing, as it has a coding recognised by automatic language processing systems, incorporating numerous fields with information of morphological, syntactic-semantic (the most common distributional and transformational properties) and diasystematic nature (if applicable). It also contains a specific field for translations into other languages (bilingual or multilingual). Therefore, these repertoires have a dual function: decoding and encoding information. In fact, one of the main advantages offered by semasiological electronic dictionaries over more traditional ones is that they present in different entries each of the argument structures of a predicate, which allows it to be monosemised. Thus, each usage of a predicate is conceived as a lexical unit to which a description is assigned, a sine qua non for machine translation. This is particularly relevant to hybrid systems, which combine statistics with syntax and semantics (e.g. Systran and Sadaw). The eventual product derived from this thesis, as well as the linguistic data (methodology, examples, tagging and files), can be considered unprecedented. Until now, lexicogrammar had been implemented based on linguistic intuition, which could make the creation of object classes somewhat biased by the linguist’s idiolect. Using a syntactic-semantic tagging based on the above-mentioned methodology enables precision and objectivity, starting from real language instances (occurrences of these idioms in corpora). All in all, the usability and replicability of the linguistic data provided in the research may offer a wide range of possibilities, since this tagging could be implemented into linguistic analysis tools (e.g. in the form of tags in the “part of speech” label if idioms are used) and the model could be replicated for expressions from other lexico-phraseological fields, different types of PhUs and other languages.
/ This doctoral thesis was funded by the predoctoral research contract Ayudas para la Formación de Profesorado Universitario (Ref. FPU17/0032), awarded by the Spanish Ministerio de Educación, Cultura y Deporte (now the Ministerio de Ciencia, Innovación y Universidades). The thesis project was carried out within the Institut Superior d’Investigació Cooperativa IVITRA [ISIC-IVITRA] (Programa per a la Constitució i Acreditació d’Instituts Superiors d’Investigació Cooperativa d’Excel∙lència de la Generalitat Valenciana, Ref. ISIC/012/042) and was developed within the framework of the following projects, networks and research groups: «Variación y cambio lingüístico en catalán. Una aproximación diacrónica según la Lingüística de Corpus» (MICINUN, Ref. PGC2018-099399-B-100371); (IEC, Ref. PRO2018-S04-MARTINES); the research group VIGROB-125 of the UA; the research network on innovation in university teaching «Lingüística de Corpus i Mediterrània intercultural: investigació educativa per a l’aplicació de la Lingüística de Corpus en entorns multilingües diacrònics. Aplicacions del Metacorpus CIMTAC» (Institut de Ciències de l’Educació de la UA, Ref. 4581-2018); and the Grup d’Investigació en Tecnologia Educativa en Història de la Cultura, Diacronia lingüística i Traducció (Universitat d’Alacant, Ref. GITE-09009-UA). Electronic Dictionary Phraseology Idioms Somatisms Lexicogrammar Syntactic-Semantic Tagging, Corpus-Based Methodology Interlinguistic Equivalents. Filología Catalana
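The sample dictionary entry quoted in the abstract above is a slash-separated record: a headword followed by `key:value` fields. A minimal Python sketch of reading such a record follows. It assumes that `/` never occurs inside a field value and that only the first `:` separates key from value, which holds for the quoted example but is not confirmed for the dictionary as a whole. The abstract explains N0, N1, N3 and Ca; it does not define C and G, so the sketch treats them as opaque keys. The function names are hypothetical.

```python
def parse_entry(line):
    """Split one slash-separated entry into a headword plus key/value fields."""
    headword, *fields = line.split("/")
    entry = {"headword": headword.strip()}
    for field in fields:
        # Split on the first colon only: values such as <so-ma:elre> contain colons.
        key, _, value = field.partition(":")
        entry[key.strip()] = value.strip()
    return entry


def argument_slots(entry):
    """Return the argument positions (N0, N1, ...) present in the entry, in order."""
    slots = [k for k in entry if k.startswith("N") and k[1:].isdigit()]
    return sorted(slots, key=lambda k: int(k[1:]))


# The example entry as quoted in the abstract.
example = (
    "to wash <one’s> hands (of)/C:<so-ma:elre>/G:v/N0:<hum>"
    "/N1:<so-ma:elre>/N3:<ina>/Ca:rentar-se les mans (d’) <alguna cosa>"
)

entry = parse_entry(example)
print(entry["headword"])
print(argument_slots(entry))
print(entry["Ca"])
```

For the quoted entry, the slot list has no N2, matching the abstract's remark that this idiom has no N2 complement.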
6	Zkoumání úlohy univerzálního sémantického značkování pomocí neuronových sítí, řešením jiných úloh a vícejazyčným učením Abdou, Mostafa January 2018 (has links) July 19, 2018 In this thesis we present an investigation of multi-task and transfer learning using the recently introduced task of semantic tagging. First we employ a number of natural language processing tasks as auxiliaries for semantic tagging. Secondly, going in the other direction, we employ semantic tagging as an auxiliary task for three different NLP tasks: Part-of-Speech Tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the learning what to share setting where negative transfer between tasks is less likely. Finally, we investigate multi-lingual learning framed as a special case of multi-task learning. Our findings show considerable improvements for most experiments, demonstrating a variety of cases where multi-task and transfer learning methods are beneficial.
7	Adaptive Semantic Annotation of Entity and Concept Mentions in Text Mendes, Pablo N. 05 June 2014 (has links) No description available. Computer Science semantic annotation semantic tagging named entity recognition name resolution entity disambiguation entity linking keyphrase extraction word sense disambiguation entity classification entity extraction adaptive flexible
8	Neural Methods Towards Concept Discovery from Text via Knowledge Transfer Das, Manirupa January 2019 (has links) No description available. Computer Engineering Computer Science Information Science Library Science Linguistics

Search results