Global ETD Search

21	The text encoding software of the Thesaurus Linguae Aegyptiae Schweitzer, Simon January 2016 The Thesaurus Linguae Aegyptiae (TLA; http://aaew.bbaw.de/tla) is the publication platform of the project "Structure and Transformation in the Vocabulary of the Egyptian Language: Texts and Knowledge in the Culture of Ancient Egypt" (formerly known as "Altägyptisches Wörterbuch"), located in Berlin and Leipzig. It contains the largest corpus of Egyptian texts (ca. 1.4 million text words) and is a very important tool for linguistic, philological, lexicographical, and cultural research. My paper introduces the software behind the TLA. I will show how easy it is to add a new text to the corpus with transcription, translation, hieroglyphic codes, and metadata, and how easily you can add annotations of different types, such as rubra, citations from other texts, comments, and direct speech. The software itself is freely available and platform independent. You are welcome to use our software to edit your texts and to cooperate with us! ddc:930
22	Resolving Quasi-Synonym Relationships in Automatic Thesaurus Construction using Fuzzy Rough Sets and an Inverse Term Frequency Similarity Function Davault, Julius Mack, III 01 January 2009 One of the problems in automatic thesaurus construction is determining the semantic relationship between word pairs. Quasi-synonyms provide a type of equivalence relationship: the words are treated as similar only for the purposes of information retrieval. Such relationships are hard to determine automatically. The term vector space model and an inverse term frequency similarity function can provide a way to automatically determine the similarity between words in a thesaurus. A thesaurus constructed using this method can also improve precision and recall in information retrieval when it is constructed in conjunction with fuzzy rough set algorithms and used with tight upper approximation query expansion. This dissertation presents a method that combines fuzzy rough sets with a word weighting and inverse term frequency similarity function as a technique for automatic thesaurus construction. automatic thesaurus construction fuzzy rough sets inverse term frequency synonyms Computer Sciences
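As a rough illustration of the term vector space idea mentioned in this abstract, the sketch below represents each term by the documents it occurs in and compares terms by cosine similarity. The abstract does not define the inverse term frequency similarity function or the fuzzy rough set step, so plain cosine similarity is substituted here; the function names, the sample documents, and the threshold are all assumptions made for illustration, not the dissertation's method.

```python
import math
from collections import Counter


def term_vectors(docs):
    """Map each term to a sparse vector {document index: count}."""
    vectors = {}
    for i, doc in enumerate(docs):
        for term, count in Counter(doc.lower().split()).items():
            vectors.setdefault(term, {})[i] = count
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_terms(vectors, term, threshold=0.9):
    """Return the terms whose document profile is close to that of `term`."""
    target = vectors[term]
    return sorted(
        other
        for other, vec in vectors.items()
        if other != term and cosine(target, vec) >= threshold
    )


# Hypothetical toy corpus: "car" and "automobile" occur in the same documents.
docs = ["car automobile engine", "car automobile wheel", "cat food"]
vectors = term_vectors(docs)
candidates = similar_terms(vectors, "car")
```

Terms that pass the threshold are only candidates for a quasi-synonym relationship; a real system would still need a weighting scheme and a way to handle terms that never co-occur.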
23	[en] INSTANCE-BASED SCHEMA MATCHING / [pt] ALINHAMENTO DE ESQUEMAS BASEADO EM INSTÂNCIAS DANIELA FRANCISCO BRAUNER 10 December 2008 [pt] Um mediador é um componente de software que auxilia o acesso a fontes de dados. Com o advento da Web, a construção de mediadores impõe desafios importantes, tais como a capacidade de fornecer acesso integrado a fontes de dados independentes e dinâmicas e a habilidade de resolver a heterogeneidade semântica entre os esquemas destas fontes. Para lidar com esses desafios, o alinhamento de esquemas é uma questão fundamental. Nesta tese são propostas abordagens de alinhamento de esquemas de classificação (tesauros) e esquemas conceituais, utilizando instâncias como evidências para os mapeamentos. As abordagens propostas são classificadas em dois tipos: adaptativa e a priori, referindo-se, respectivamente, à descoberta dos mapeamentos de forma incremental ou à definição dos mapeamentos antes da implantação do mediador. Por fim, são apresentados experimentos para validação e teste das abordagens propostas. / [en] A mediator is a software component that helps access data sources. With the advent of the Web, the design of mediators poses important challenges, such as the ability to provide integrated access to independent and dynamic data sources and the ability to resolve the semantic heterogeneity between the schemas of these sources. To deal with these challenges, schema matching is a fundamental issue. In this thesis, matching approaches for classification schemas (thesauri) and conceptual schemas are proposed, using instances as evidence for the mappings. The proposed approaches are classified as adaptive and a priori, referring, respectively, to the discovery of the mappings in an incremental way or the definition of the mappings before the deployment of the mediator. Finally, experiments to validate and test the proposed approaches are presented. 
[pt] BANCO DE DADOS [en] DATABASE [pt] TESAURO [en] THESAURUS [pt] ESQUEMA CONCEITUAL [en] CONCEPTUAL SCHEMA
24	A construção de tesauros com a integração de procedimentos terminográficos / Cervantes, Brígida Maria Nogueira. January 2009 Orientador: Mariângela Spotti Lopes Fujita / Banca: Maria de Fátima Gonçalves Moreira Tálamo / Banca: João Batista Ernesto de Moraes / Banca: Marta Lígia Pomim Valentim / Banca: Vera Regina Casari Boccato / Resumo: Investiga a integração da Terminografia para a construção de tesauros na busca de procedimentos terminográficos que podem ser aplicados em conjunto com procedimentos metodológicos existentes de análise de assunto, para o aprimoramento da representação de conceitos na construção de tesauros. Realiza um estudo teórico-metodológico da construção de tesauro, com enfoque na identificação de conceitos em áreas de especialidade para a organização e recuperação temática da informação. Apresenta como objetivo geral enunciar um modelo metodológico para a construção de tesauro com a integração de procedimentos terminográficos. Como objetivos específicos: analisar e sintetizar referenciais teórico-metodológicos sobre construção de tesauros; identificar os principais aspectos teórico-metodológicos da Terminologia/Terminografia contribuintes para a construção de tesauros; e apresentar proposta de um modelo metodológico terminográfico para a construção de tesauros. A metodologia da pesquisa qualifica-se por sua natureza bibliográfica, descritiva e exploratória, concentrando-se na abordagem temática do vocabulário de áreas de especialidade. Enfatiza como um resultado do trabalho aplicado o "Tesauro Terminográfico Preliminar em Gestão da Informação", disponível na web. Conclui que o aprimoramento de etapas da construção de tesauro, aliado a contribuições de procedimentos terminográficos, produz uma representação de conceitos, por meio de termos, tendo em vista a obtenção de um vocabulário consistente, que compõe a base para a organização e recuperação temática da informação, e compatível com a demanda de áreas de especialidade. 
/ Abstract: This thesis investigates the integration of terminography into thesaurus construction, seeking terminographic procedures that can be applied together with existing methodological procedures of subject analysis to improve the representation of concepts in thesaurus construction. It is a theoretical-methodological study of thesaurus construction, focusing on the identification of concepts in specialized areas for the thematic organization and retrieval of information. The general purpose of this research is to set out a methodological model for thesaurus construction that integrates terminographic procedures. The specific purposes are to analyze and synthesize theoretical-methodological frameworks on thesaurus construction, to identify the main theoretical-methodological aspects of Terminology/Terminography that contribute to thesaurus construction, and to propose a terminographic methodological model for thesaurus construction. The research methodology is bibliographical, descriptive and exploratory, focusing on the thematic approach to the vocabulary of specialized areas. As a result of the applied work, it highlights the "Preliminary Terminographic Thesaurus in Information Management", available on the web. It concludes that improving the stages of thesaurus construction, together with contributions from terminographic procedures, produces a representation of concepts by means of terms, aimed at obtaining a consistent vocabulary that forms the basis for the thematic organization and retrieval of information and is compatible with the demands of specialized areas. / Doutor Tesauros. Linguagem documentária. Thesaurus construction. Terminography. Terminographic thesauri.
25	Wie sehr können maschinelle Indexierung und modernes Information Retrieval Bibliotheksrecherchen verbessern? Hauer, Manfred 30 November 2004 Automated methods can dramatically improve the quality of subject indexing. intelligentCAPTURE has been in production use in libraries and documentation centers since 2002. Its methods include modules for document acquisition, in particular scanning and OCR, correct text extraction from PDF files and websites, and speech recognition for "textless" objects. Additional information-extraction methods can optionally follow. Content recognized as relevant is analyzed automatically by the CAI engine (Computer Aided Indexing), which combines computational-linguistic methods (language-dependent morphology, syntactic analysis, statistics) with semantic structures (classifications, classification schemes, thesauri, topic maps, RDF, semantic networks). The processed content and the finished, human-editable index entries are finally passed, via freely definable export formats, to the respective library systems and usually also to intelligentSEARCH. intelligentSEARCH is a central union database for exchange among all production partners worldwide, from both the public and the private sector. For copyright reasons, the exchange is limited to exchangeable media, so far tables of contents. At the same time, this database is "Open Content" for the academic public, with particularly powerful retrieval functions, in particular semantic search options and the visualization of semantic structures (http://www.agi-imc.de/intelligentSEARCH.nsf). Different semantic structures can be used for both indexing and searching, depending on the research interest, worldview, or language. 
OCR RDF computer aided indexing topic map ddc:020 ddc:004 classification scanning classification scheme thesaurus
26	Automatic construction of domain-specific concept structures Chen, Libo. Unknown Date Techn. University, Diss., 2006--Darmstadt.
27	Sistema de indexação automática de ofícios do Departamento de Computação da UFVJM Costa, Aline Pereira da 30 September 2016 Este projeto surgiu de uma deficiência do Departamento de Computação da UFVJM (Universidade Federal dos Vales do Jequitinhonha e Mucuri), em armazenar e recuperar seus ofícios. Tal Departamento possui dificuldades na organização e armazenamento eficiente destes ofícios o que inviabiliza o acesso aos documentos e dispende muito tempo na localização e recuperação da informação. Diante disso, foi desenvolvido um Sistema de Indexação Automática utilizando-se técnicas da biblioteconomia e técnicas computacionais que visa automatizar o processo de indexação de novos ofícios, otimizar a recuperação e democratizar o acesso a informação. O banco de dados do Sistema foi construído baseado em um vocabulário controlado: o tesauro. O tesauro é um tipo de vocabulário controlado, mais complexo, que trabalha com ambiguidade dos termos, sinonímia, relações hierárquicas e associativas e foi elaborado a partir de conceitos selecionados e seus termos relacionados presentes nos ofícios. 
O escopo inicial do projeto permeia o espaço de 2011 a 2014, totalizando 239 ofícios. Para a alimentação do Sistema com os novos documentos que surgirão propõe-se a auto alimentação do tesauro que fará a análise de relevância de novos termos nos novos ofícios através de um algoritmo em construção. O sistema estará em ambiente virtual, para que o acesso seja democratizado e o processo de tratamento de novos ofícios seja automatizado. Sendo positiva a implementação do projeto, sugere-se que os demais departamentos da UFVJM utilizem o mesmo sistema para organização dos seus documentos, ganhando agilidade nos processos e satisfação do usuário final na localização do que procura. / Dissertação (Mestrado Profissional) - Programa de Pós-Graduação em Educação, Universidade Federal dos Vales do Jequitinhonha e Mucuri, 2016. / This project arose from a deficiency of UFVJM's Department of Computer Science (Universidade Federal dos Vales do Jequitinhonha e Mucuri) in storing and retrieving its official letters (ofícios). The department has difficulty organizing and storing these documents efficiently, which hinders access to them and costs staff considerable time in locating and retrieving information. Therefore, we developed an automatic indexing system, using techniques from library science and computing, that aims to automate the indexing of new official letters, optimize their retrieval, and democratize access to information. The system's database was built on a controlled vocabulary: the thesaurus. A thesaurus is a more complex type of controlled vocabulary that handles term ambiguity, synonymy, and hierarchical and associative relationships. It was prepared from selected concepts and their related terms found in the official letters. 
The initial scope of the project covers the period from 2011 to 2014, totaling 239 official letters. To feed the system with new documents as they arise, we propose a self-feeding mechanism for the thesaurus, which will analyze the relevance of new terms in the new documents through an algorithm currently under construction. The system will run in a virtual environment, so that access is democratized and the handling of new documents is automated. If the implementation of the project proves positive, we suggest that other UFVJM departments use the same system to organize their documents, speeding up their processes and improving end users' satisfaction in finding what they are looking for. Vocabulário controlado Tesauro Sistema Indexação Controlled vocabulary Thesaurus System Indexing
28	O estudo e desenvolvimento do protótipo de uma ferramenta de apoio a formulação de consultas a bases de dados na área da saúde / The study and development of the prototype of a tool for supporting query formulation to databases in the health area Webber, Carine Geltrudes January 1997 O objetivo deste trabalho é, através do estudo de diversas tecnologias, desenvolver o protótipo de uma ferramenta capaz de oferecer suporte ao usuário na formulação de uma consulta a MEDLINE (Medical Literature Analysis and Retrieval System On Line). A MEDLINE é um sistema de recuperação de informações bibliográficas, na área da biomedicina, desenvolvida pela National Library of Medicine. Ela é uma ferramenta cuja utilização tem sido ampliada nesta área em decorrência do aumento da utilização de literatura, disponível eletronicamente, por profissionais da área da saúde. As pessoas, em geral, buscam informação e esperam encontrá-la exatamente de acordo com as suas expectativas, de forma ágil e utilizando todas as fontes de recursos disponíveis. Foi com este propósito que surgiram os primeiros Sistemas de Recuperação de Informação (SRI) onde, de forma simplificada, um usuário constrói uma consulta, a qual expressa sua necessidade de informação, em seguida o sistema a processa e os resultados obtidos através dela retornam ao usuário. Grande parte dos usuários encontram dificuldades em representar a sua necessidade de informação de forma a obter resultados satisfatórios em um SRI. Os termos que o usuário escolhe para compor a consulta nem sempre são os mesmos que o sistema reconhece. A fim de que um usuário seja bem sucedido na definição dos termos que compõem a sua consulta é aconselhável que ele conheça a terminologia que foi empregada na indexação dos itens que ele deseja recuperar ou que possa contar com um intermediário que possua esse conhecimento. 
Em situações em que nenhuma dessas possibilidades seja verdadeira recursos que viabilizem uma consulta bem sucedida se fazem necessários. Este trabalho, inicialmente, apresenta um estudo geral sobre os Sistemas de Recuperação de Informações (SRI), enfocando todos os processos envolvidos e relacionados ao armazenamento, organização e a própria recuperação. Posteriormente, são destacados aspectos relacionados aos vocabulários e classificações médicas em uso, os quais serão úteis para uma maior compreensão das dificuldades encontradas pelos usuários durante a interação com um sistema com esta finalidade. E, finalmente, é apresentado o protótipo do Sistema para Formulação de Consultas a MEDLINE, bem como seus componentes e funcionalidades. O Sistema para Formulação de Consultas a MEDLINE foi desenvolvido com o intuito de permitir que o usuário utilize qualquer termo na formulação de uma consulta destinada a MEDLINE. Ele possibilita a integração de diferentes terminologias médicas, originárias de vocabulários e classificações disponíveis em língua portuguesa e atualmente em uso. Esta abordagem permite a criação de uma terminologia biomédica mais completa, sendo que cada termo mantém relacionamentos, os quais descrevem a sua semântica, com outros. / The goal of this work is, through the study of several technologies, to develop the prototype of a tool that supports the user in formulating a query to MEDLINE (Medical Literature Analysis and Retrieval System On Line). MEDLINE is a bibliographic information retrieval system in the area of biomedicine, developed by the National Library of Medicine. Its use in this area has grown as health care professionals make increasing use of electronically available literature. People, in general, look for information and expect to find it exactly in line with their expectations, quickly and using every available source. 
It was for this purpose that the first Information Retrieval Systems (IRS) emerged, in which, put simply, a user builds a query that expresses an information need, and the system then processes it and returns the results to the user. Most users find it difficult to express their information need in a way that yields satisfactory results from an IRS. The terms that the user selects to compose the query are not always the same ones that the system recognizes. To succeed in defining the terms of a query, it is advisable that the user know the terminology that was employed in indexing the items to be retrieved, or be able to rely on an intermediary who has that knowledge. In situations where neither of these possibilities holds, resources that make a successful query possible are needed. This work first presents a general study of IRS, focusing on all the processes involved in and related to storage, organization and retrieval. It then highlights aspects of the medical vocabularies and classifications in use, which help to explain the difficulties users encounter when interacting with a system of this kind. Finally, the prototype of the Query Formulation System for MEDLINE is presented, along with its components and functionalities. The Query Formulation System for MEDLINE was developed to allow the user to use any term in formulating a query to MEDLINE. It allows the integration of different medical terminologies drawn from vocabularies and classifications that are available in Portuguese and currently in use. This approach permits the creation of a more complete biomedical terminology in which each term maintains relationships with other terms that describe its semantics. 
Armazenamento : Dados Recuperação : Informação Formulação : Consulta Tesauro Informática médica Information retrieval Query formulation Medical terminology Thesaurus
29	[en] A SOFTWARE INFRASTRUCTURE FOR CATALOG MATCHING / [pt] UMA INFRA-ESTRUTURA DE SOFTWARE PARA ALINHAMENTO DE CATÁLOGOS HETEROGÊNEOS ALEXANDRE GAZOLA 29 May 2008 [pt] A maior parte dos bancos de dados existentes é projetada de maneira independente e, portanto, é geralmente implementada utilizando diferentes esquemas conceituais, criando um contexto de heterogeneidade em níveis sintático, estrutural e semântico. Não obstante, quando um conjunto de bancos de dados se refere a um mesmo domínio, eventualmente, surge a necessidade de integrá-los em um mesmo banco, ou de intermediar o acesso ao conjunto de bancos de forma transparente. Para tratar o problema da heterogeneidade, torna-se necessário o alinhamento dos esquemas de cada um dos bancos de dados envolvidos. Esse processo geralmente é feito por especialistas de domínio, mas tende a ser um trabalho muito tedioso e propenso a erros. Esta dissertação apresenta o CatalogMatcher, uma infra-estrutura de software para alinhamento de catálogos heterogêneos. Um catálogo armazena dados sobre um conjunto de objetos de um determinado domínio, tipicamente classificados por algum tipo de taxonomia ou tesauro. O CatalogMatcher contém componentes que implementam estratégias de alinhamento de catálogos heterogêneos utilizando abordagens baseadas em instâncias. / [en] Most databases are independently designed and, therefore, are usually implemented using different conceptual schemas, which creates a context of syntactic, structural and semantic-level heterogeneity. Nevertheless, when a set of databases refers to a common domain, it may become necessary to integrate them into a single database, or to mediate access to the databases in a transparent way. To deal with the heterogeneity problem, it becomes necessary to align the conceptual schemas of the databases involved. This process is usually carried out by domain specialists, but tends to be tedious and error-prone. 
This dissertation presents the CatalogMatcher, a software infrastructure for matching heterogeneous catalogs. A catalog stores data about a set of objects from a specific domain, typically classified by some sort of taxonomy or thesaurus. The CatalogMatcher contains components that implement instance-based alignment strategies. [pt] ALINHAMENTO [en] ALIGNMENT [pt] CATALOGO DE OBJETOS [en] OBJECTS CATALOG [pt] ESQUEMA [en] SCHEMA [pt] TESAURO [en] THESAURUS
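To make the idea of instance-based alignment concrete, the sketch below matches categories from two catalogs by the overlap of the instances classified under them. The abstract does not describe the CatalogMatcher's actual strategies, so the Jaccard measure, the function names, the sample catalogs, and the threshold are assumptions chosen only for illustration.

```python
def jaccard(a, b):
    """Share of instances common to both sets, relative to their union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_categories(catalog_a, catalog_b, threshold=0.5):
    """Return (category_a, category_b, score) for pairs that share enough instances."""
    matches = []
    for name_a, instances_a in catalog_a.items():
        for name_b, instances_b in catalog_b.items():
            score = jaccard(instances_a, instances_b)
            if score >= threshold:
                matches.append((name_a, name_b, score))
    # Strongest evidence first.
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical catalogs: category name -> set of instance identifiers.
catalog_a = {"Rivers": {"Amazon", "Nile", "Danube"}, "Cities": {"Lima", "Cairo"}}
catalog_b = {"Watercourses": {"Amazon", "Nile", "Rhine"}, "Towns": {"Lima", "Cairo"}}
matches = match_categories(catalog_a, catalog_b)
```

Here the category names differ between the two catalogs, yet the shared instances still suggest which categories correspond. In practice the instances would first have to be recognized as the same objects across catalogs, which this sketch takes for granted.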
30	A construção de tesauros com a integração de procedimentos terminográficos Cervantes, Brígida Maria Nogueira [UNESP] 25 September 2009 Uel / Investiga a integração da Terminografia para a construção de tesauros na busca de procedimentos terminográficos que podem ser aplicados em conjunto com procedimentos metodológicos existentes de análise de assunto, para o aprimoramento da representação de conceitos na construção de tesauros. Realiza um estudo teórico-metodológico da construção de tesauro, com enfoque na identificação de conceitos em áreas de especialidade para a organização e recuperação temática da informação. Apresenta como objetivo geral enunciar um modelo metodológico para a construção de tesauro com a integração de procedimentos terminográficos. Como objetivos específicos: analisar e sintetizar referenciais teórico-metodológicos sobre construção de tesauros; identificar os principais aspectos teórico-metodológicos da Terminologia/Terminografia contribuintes para a construção de tesauros; e apresentar proposta de um modelo metodológico terminográfico para a construção de tesauros. A metodologia da pesquisa qualifica-se por sua natureza bibliográfica, descritiva e exploratória, concentrando-se na abordagem temática do vocabulário de áreas de especialidade. Enfatiza como um resultado do trabalho aplicado o “Tesauro Terminográfico Preliminar em Gestão da Informação”, disponível na web. 
Conclui que o aprimoramento de etapas da construção de tesauro, aliado a contribuições de procedimentos terminográficos, produz uma representação de conceitos, por meio de termos, tendo em vista a obtenção de um vocabulário consistente, que compõe a base para a organização e recuperação temática da informação, e compatível com a demanda de áreas de especialidade. / This thesis investigates the integration of terminography into thesaurus construction, seeking terminographic procedures that can be applied together with existing methodological procedures of subject analysis to improve the representation of concepts in thesaurus construction. It is a theoretical-methodological study of thesaurus construction, focusing on the identification of concepts in specialized areas for the thematic organization and retrieval of information. The general purpose of this research is to set out a methodological model for thesaurus construction that integrates terminographic procedures. The specific purposes are to analyze and synthesize theoretical-methodological frameworks on thesaurus construction, to identify the main theoretical-methodological aspects of Terminology/Terminography that contribute to thesaurus construction, and to propose a terminographic methodological model for thesaurus construction. The research methodology is bibliographical, descriptive and exploratory, focusing on the thematic approach to the vocabulary of specialized areas. As a result of the applied work, it highlights the “Preliminary Terminographic Thesaurus in Information Management”, available on the web. It concludes that improving the stages of thesaurus construction, together with contributions from terminographic procedures, produces a representation of concepts by means of terms, aimed at obtaining a consistent vocabulary that forms the basis for the thematic organization and retrieval of information and is compatible with the demands of specialized areas. 
Tesauros Linguagem documentária Terminografia Linguagens documentárias alfabéticas Thesaurus construction Terminography Terminographic thesauri
