Global ETD Search

51	Topic and link detection from multilingual news. January 2003 Huang Ruizhang. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaves 110-114). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 1.1 --- The Definition of Topic and Event / Chapter 1.2 --- Event and Topic Discovery / Chapter 1.2.1 --- Problem Definition / Chapter 1.2.2 --- Characteristics of the Discovery Problems / Chapter 1.2.3 --- Our Contributions / Chapter 1.3 --- Story Link Detection / Chapter 1.3.1 --- Problem Definition / Chapter 1.3.2 --- Our Contributions / Chapter 1.4 --- Thesis Organization / Chapter 2 --- Literature Review / Chapter 2.1 --- University of Massachusetts (UMass) / Chapter 2.1.1 --- Topic Detection Approach / Chapter 2.1.2 --- Story Link Detection Approach / Chapter 2.2 --- BBN Technologies / Chapter 2.3 --- IBM Research Center / Chapter 2.4 --- Carnegie Mellon University (CMU) / Chapter 2.4.1 --- Topic Detection Approach / Chapter 2.4.2 --- Story Link Detection Approach / Chapter 2.5 --- National Taiwan University (NTU) / Chapter 2.5.1 --- Topic Detection Approach / Chapter 2.5.2 --- Story Link Detection Approach / Chapter 3 --- System Overview / Chapter 3.1 --- News Sources / Chapter 3.2 --- Story Preprocessing / Chapter 3.3 --- Information Extraction / Chapter 3.4 --- Gloss Translation / Chapter 3.5 --- Term Weight Calculation / Chapter 3.6 --- Event And Topic Discovery / Chapter 3.7 --- Story Link Detection / Chapter 4 --- Event And Topic Discovery / Chapter 4.1 --- Overview of Event and Topic discovery / Chapter 4.2 --- Event Discovery Component / Chapter 4.2.1 --- Overview of Event Discovery Algorithm / Chapter 4.2.2 --- Similarity Calculation / Chapter 4.2.3 --- Story and Event Combination / Chapter 4.2.4 --- Event Discovery Output / Chapter 4.3 --- Topic Discovery Component / Chapter 4.3.1 --- Overview of Topic Discovery Algorithm / Chapter 4.3.2 --- Relevance Model / Chapter 4.3.3 --- Event and Topic Combination / Chapter 4.3.4 --- Topic Discovery Output / Chapter 5 --- Event And Topic Discovery Experimental Results / Chapter 5.1 --- Testing Corpus / Chapter 5.2 --- Evaluation Methodology / Chapter 5.3 --- Experimental Results on Event Discovery / Chapter 5.3.1 --- Parameter Tuning / Chapter 5.3.2 --- Event Discovery Result / Chapter 5.4 --- Experimental Results on Topic Discovery / Chapter 5.4.1 --- Parameter Tuning / Chapter 5.4.2 --- Topic Discovery Results / Chapter 6 --- Story Link Detection / Chapter 6.1 --- Topic Types / Chapter 6.2 --- Overview of Link Detection Component / Chapter 6.3 --- Automatic Topic Type Categorization / Chapter 6.3.1 --- Training Data Preparation / Chapter 6.3.2 --- Feature Selection / Chapter 6.3.3 --- Training and Tuning Categorization Model / Chapter 6.4 --- Link Detection Algorithm / Chapter 6.4.1 --- Story Component Weight / Chapter 6.4.2 --- Story Link Similarity Calculation / Chapter 6.5 --- Story Link Detection Output / Chapter 7 --- Link Detection Experimental Results / Chapter 7.1 --- Testing Corpus / Chapter 7.2 --- Topic Type Categorization Result / Chapter 7.3 --- Link Detection Evaluation Methodology / Chapter 7.4 --- Experimental Results on Link Detection / Chapter 7.4.1 --- Language Normalization Factor Tuning / Chapter 7.4.2 --- Link Detection Performance / Chapter 7.4.3 --- Link Detection Performance Breakdown / Chapter 8 --- Conclusions and Future Work / Chapter 8.1 --- Conclusions / Chapter 8.2 --- Future Work / Chapter A --- List of Topic Title Annotated for TDT3 corpus by LDC / Chapter B --- List of Manually Annotated Events for TDT3 Corpus / Bibliography
Journalism--Data processing Broadcast journalism--Data processing Information retrieval Cross-language information retrieval Computational linguistics English language--Data processing Chinese language--Data processing
52	Portable language technology: a resource-light approach to morpho-syntactic tagging / Feldman, Anna. January 2006 Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 258-273).
53	Entwurf und Implementierung eines Frameworks zur Analyse und Evaluation von Verfahren im Information Retrieval Wilhelm, Thomas 13 August 2008 This diploma thesis briefly introduces the field of information retrieval, with a focus on evaluation and evaluation campaigns. Starting from the drawbacks of an existing retrieval system, it then designs and implements a new retrieval framework for the experimental evaluation of information retrieval approaches. The framework's components are kept abstract enough that various existing retrieval systems, such as Apache Lucene or Terrier, can be integrated. A reference implementation for the ImageCLEF Photographic Retrieval Task of the ImageCLEF track of the Cross Language Evaluation Forum is used to test and confirm that the framework works. Content-Based Image Retrieval (CBIR) ddc:004 ddc:020 ddc:000 Evaluation Framework <Informatik> Information-Retrieval-System
54	Exploring the health experiences of Korean immigrant women in retirement Choi, Jaeyoung Unknown Date No description available. Korea immigrant women health retirement employment focused ethnography translation translator qualitative methods research cross-culture cross-language research acculturation Berry’s Model Canada
55	The processing of German Sign Language sentences / Three event-related potential studies on phonological, morpho-syntactic, and semantic aspects Hosemann, Jana Alexandra 10 April 2015 No description available. 400 800 German Sign Language DGS sign language processing ERP N400 transition phase priming effect cross-language co-activation agreement violation Philologien (PPN621711713)
56	Peer to peer English/Chinese cross-language information retrieval Lu, Chengye January 2008 Peer-to-peer systems are widely used on the internet. However, most peer-to-peer information systems still lack important features, such as cross-language information retrieval (IR) and collection selection and fusion. Cross-language IR is a state-of-the-art research area in the IR community, but it has not yet been used in any real-world IR system. It lets a user issue a query in one language and receive documents in other languages. In a typical peer-to-peer environment, users come from multiple countries and their collections are in multiple languages, so cross-language IR can help them find documents more easily. For example, many Chinese researchers search for research papers in both Chinese and English; with cross-language IR, they can issue one query in Chinese and get documents in both languages. The out-of-vocabulary (OOV) problem is one of the key research areas in cross-language information retrieval. In recent years, web mining has been shown to be an effective approach to this problem. However, how to extract multiword lexical units (MLUs) from web content and how to select the correct translations from the extracted candidate MLUs remain two difficult problems in web-mining-based automated translation. Discovering resource descriptions and merging results obtained from remote search engines are two key issues in distributed information retrieval. In uncooperative environments, query-based sampling and normalized-score-based merging are well-known approaches to these problems. However, such approaches consider only the content of the remote database, not the retrieval performance of the remote search engine.
This thesis presents research on building a peer-to-peer IR system with cross-language IR and an advanced collection profiling technique for fusion. In particular, it first presents a new Chinese term measurement and a new Chinese MLU extraction process that works well on small corpora, along with an approach for selecting MLUs more accurately. It then proposes a collection profiling strategy that discovers not only collection content but also the retrieval performance of the remote search engine. Based on collection profiling, a web-based query classification method and two collection fusion approaches are developed. Our experiments show that the proposed strategies are effective in merging results in uncooperative peer-to-peer environments. Here, an uncooperative environment is one in which each peer is autonomous: peers are willing to share documents but do not share collection statistics. This is a typical peer-to-peer IR environment. Finally, all of these approaches are combined to build a secure peer-to-peer multilingual IR system that cooperates through X.509 and the email system.
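The abstract names normalized-score merging as a well-known baseline for combining results from remote search engines. The sketch below illustrates that general idea only, not the thesis's own fusion approaches: each peer's raw scores are min-max normalized into [0, 1] and the lists are then merged into one ranking. The function names, the (document id, score) tuple layout, and the choice of min-max normalization are assumptions made for illustration.

```python
def min_max_normalize(results):
    """Rescale one peer's raw scores into [0, 1] (hypothetical helper)."""
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in results]


def merge_results(result_lists):
    """Merge per-peer result lists into a single ranking.

    Ties are broken by document id so the output is deterministic.
    """
    merged = []
    for results in result_lists:
        merged.extend(min_max_normalize(results))
    return sorted(merged, key=lambda pair: (-pair[1], pair[0]))


# Two peers whose engines score on incompatible scales.
peer_a = [("a1", 12.0), ("a2", 6.0), ("a3", 3.0)]
peer_b = [("b1", 0.9), ("b2", 0.5)]
ranking = [doc for doc, _ in merge_results([peer_a, peer_b])]
```

As the abstract points out, a scheme of this kind looks only at the returned scores and ignores how well each remote engine actually retrieves, which is the gap that collection profiling is meant to address.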
57	Word embeddings for monolingual and cross-language domain-specific information retrieval / Ordinbäddningar för enspråkig och tvärspråklig domänspecifik informationssökning Wigder, Chaya January 2018 Various studies have shown the usefulness of word embedding models for a wide variety of natural language processing tasks. This thesis examines how word embeddings can be incorporated into domain-specific search engines for both monolingual and cross-language search. This is done by testing various embedding model hyperparameters, as well as methods for weighting the relative importance of words to a document or query. In addition, methods for generating domain-specific bilingual embeddings are examined and tested. The system was compared to a baseline that used cosine similarity without word embeddings, and for both the monolingual and bilingual search engines the use of monolingual embedding models improved performance above the baseline. However, bilingual embeddings, especially for domain-specific terms, tended to be of too poor quality to be used directly in the search engines.
information retrieval domain-specific information retrieval cross-language information retrieval word embeddings bilingual embeddings informationssökning domänspecifik informationssökning tvärspråklig informationssökning ordinbäddningar tvåspråkiga inbäddningar Computer Sciences Datavetenskap (datalogi)
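The abstract above compares against a baseline that uses cosine similarity without word embeddings. A minimal sketch of such a baseline follows, assuming raw term counts as vector weights and whitespace tokenization; the thesis does not state its weighting or tokenization here, so both choices and all function names are illustrative assumptions.

```python
import math
from collections import Counter


def cosine_similarity(query_tokens, doc_tokens):
    """Cosine similarity between two bags of words using raw term counts."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    dot = sum(q[t] * d[t] for t in q.keys() & d.keys())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an empty query or document matches nothing
    return dot / (norm_q * norm_d)


def rank(query, docs):
    """Return document indices ordered from most to least similar."""
    q = query.lower().split()  # assumed tokenization: lowercase + whitespace
    scored = [(cosine_similarity(q, doc.lower().split()), i)
              for i, doc in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = ["sign language processing", "word embeddings for retrieval"]
order = rank("word embeddings", docs)
```

A baseline like this can only match terms that appear literally in both query and document, which is the limitation that embedding-based methods aim to relax.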
58	Portable language technology: a resource-light approach to morpho-syntactic tagging Feldman, Anna 19 September 2006 No description available. portable technology resource-light morpho-syntactic tagging resource-poor cross-language induction Russian Czech Spanish Portuguese Catalan cognate words cognate identification cognate transfer morphological analysis corpora annotation
59	Automatic construction of English/Chinese parallel corpus. January 2001 Li Kar Wing. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves 88-96). / Abstracts in English and Chinese. / ABSTRACT / ACKNOWLEDGEMENTS / LIST OF TABLES / LIST OF FIGURES / CHAPTERS / Chapter 1. --- INTRODUCTION / Chapter 1.1 --- Application of corpus-based techniques / Chapter 1.1.1 --- Machine Translation (MT) / Chapter 1.1.1.1 --- Linguistic / Chapter 1.1.1.2 --- Statistical / Chapter 1.1.1.3 --- Lexicon construction / Chapter 1.1.2 --- Cross-lingual Information Retrieval (CLIR) / Chapter 1.1.2.1 --- Controlled vocabulary / Chapter 1.1.2.2 --- Free text / Chapter 1.1.2.3 --- Application corpus-based approach in CLIR / Chapter 1.2 --- Overview of linguistic resources / Chapter 1.3 --- Written language corpora / Chapter 1.3.1 --- Types of corpora / Chapter 1.3.2 --- Limitation of comparable corpora / Chapter 1.4 --- Outline of the dissertation / Chapter 2. --- LITERATURE REVIEW / Chapter 2.1 --- Research in automatic corpus construction / Chapter 2.2 --- Research in translation alignment / Chapter 2.2.1 --- Sentence alignment / Chapter 2.2.2 --- Word alignment / Chapter 2.3 --- Research in alignment of sequences / Chapter 3. --- ALIGNMENT AT WORD LEVEL AND CHARACTER LEVEL / Chapter 3.1 --- Title alignment / Chapter 3.1.1 --- Lexical features / Chapter 3.1.2 --- Grammatical features / Chapter 3.1.3 --- The English/Chinese alignment model / Chapter 3.2 --- Alignment at word level and character level / Chapter 3.2.1 --- Alignment at word level / Chapter 3.2.2 --- Alignment at character level: Longest matching / Chapter 3.2.3 --- Longest common subsequence (LCS) / Chapter 3.2.4 --- Applying LCS in the English/Chinese alignment model / Chapter 3.3 --- Reduce overlapping ambiguity / Chapter 3.3.1 --- Edit distance / Chapter 3.3.2 --- Overlapping in the algorithm model / Chapter 4. --- ALIGNMENT AT TITLE LEVEL / Chapter 4.1 --- Review of score functions / Chapter 4.2 --- The Score function / Chapter 4.2.1 --- (C matches E) and (E matches C) / Chapter 4.2.2 --- Length similarity / Chapter 5. --- EXPERIMENTAL RESULTS / Chapter 5.1 --- Hong Kong government press release articles / Chapter 5.2 --- Hang Seng Bank economic monthly reports / Chapter 5.3 --- Hang Seng Bank press release articles / Chapter 5.4 --- Hang Seng Bank speech articles / Chapter 5.5 --- Quality of the collections and future work / Chapter 6. --- CONCLUSION / Bibliography
Machine translating Machine translating--China--Hong Kong English language--Machine translating Chinese language--Machine translating Cross-language information retrieval
60	Auxílio na prevenção de doenças crônicas por meio de mapeamento e relacionamento conceitual de informações em biomedicina / Support in the Prevention of Chronic Diseases by means of Mapping and Conceptual Relationship of Biomedical Information Pollettini, Juliana Tarossi 28 November 2011 Genomic medicine suggests that exposure to risk factors from conception through the end of adolescence may influence gene expression and consequently induce the development of chronic diseases in adulthood. Scientific papers reporting these discoveries indicate that epigenetics must be exploited to prevent highly prevalent diseases such as cardiovascular diseases, diabetes and obesity. The large volume of articles published daily burdens health care professionals who want to stay up to date, since searches for accurate information become complex and time-consuming. Some computational techniques can support the management of large biomedical information repositories and the discovery of knowledge. This study investigates the automatic discovery of scientific papers that relate chronic diseases to risk factors detected in a patient's clinical record, and presents a software framework to support surveillance systems that alert health professionals to human development problems. As a contribution, healthcare professionals will be able to create a routine with the family, setting up the best growing conditions. According to Butte (2008), the effective transformation of biomedical research results into knowledge that actually improves public health is considered an important domain of informatics, called Translational Bioinformatics. Since chronic diseases are a serious health problem worldwide and lead the causes of mortality, accounting for 60% of all deaths, this work may enable research results to directly benefit public health and can be considered a work of Translational Bioinformatics.
Bioinformática Translacional Chronic Diseases Doenças Crônicas Epigenetic Factors Fatores Epigenéticos Informática Biomédica Information Retrieval Medical Informatics Mineração de Textos Natural Language Processing Processamento de Linguagem Natural Recuperação de Informação Text Mining Translational Bioinformatics

Search results