Global ETD Search

1	Kontrola pravopisu v českých textech / Spelling check in the czech texts Bureš, Stanislav January 2011 This Master's thesis deals with spell checking in Czech texts. It gives an overview of the most widely used phonetic algorithms and their properties, and focuses on metric methods used to compare two words. The second part covers the implementation of selected algorithms in spell-checker software and demonstrates its spell-checking function on Czech texts. The last part describes the construction of a context-sensitive algorithm that performs text correction.
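The abstract does not name the metric methods it compares, so the following is only a minimal sketch of one common word metric, Levenshtein distance, and not the thesis's implementation. The function names `levenshtein` and `suggest` and the tiny word list are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word, dictionary, max_distance=2):
    """Return dictionary words within `max_distance` edits, closest first.
    Hypothetical helper: a real checker would not scan the whole dictionary."""
    scored = sorted((levenshtein(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance]


words = ["pravopis", "pravidlo", "kontrola"]  # illustrative word list
print(suggest("pravopys", words))
```

A misspelling such as "pravopys" is one substitution away from "pravopis", so it is the only suggestion returned at the default threshold.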
2	Implementation and evaluation of a text extraction tool for adverse drug reaction information Dahlberg, Gunnar January 2010 Within the World Health Organization's (WHO) international drug monitoring programme, health care professionals and patients report suspected adverse drug reactions as spontaneous reports, which are sent via national authorities to the Uppsala Monitoring Centre (UMC). At UMC the reports are stored in VigiBase, the WHO adverse drug reaction database. The reports in VigiBase are analysed with statistical methods to find potential associations between drugs and adverse reactions. Associations found are evaluated in several steps; an early step is to study the medical literature to see whether the association is already known (previously known associations are filtered out of further analysis). Searching manually for associations between a particular drug and a particular adverse reaction is time-consuming. In this study we developed a tool that automatically searches for medical adverse reaction terms in medical literature and stores the associations found in a structured format. In the tool we implemented and integrated functionality for searching for adverse reactions in different ways (using synonyms, removing word endings, removing words that carry no meaning, arbitrary word order and spelling errors). The tool's performance was evaluated on manually extracted medical terms from SPC texts (texts from drug package leaflets) and on adverse effects texts from Martindale (medical reference literature for information about drugs and substances), with the WHO-ART and MedDRA terminologies used as sources of adverse reaction terms. The study shows that sophisticated text extraction can considerably improve the identification of adverse reaction terms in adverse effects texts compared with verbatim extraction.
/ Background: Initial review of potential safety issues related to the use of medicines involves reading and searching existing medical literature sources for known associations between drugs and adverse drug reactions (ADRs), so that they can be excluded from further analysis. The task is labor-intensive and time-consuming. Objective: To develop a text extraction tool that automatically identifies ADR information in medical adverse effects texts, to evaluate the performance of the tool's underlying text extraction algorithm, and to identify which parts of the algorithm contributed to the performance. Method: A text extraction tool was implemented on the .NET platform with functionality for preprocessing text (removal of stop words, Porter stemming and use of synonyms) and for matching medical terms using permutations of words and spelling variations (Soundex, Levenshtein distance and longest common subsequence distance). Its performance was evaluated on both manually extracted medical terms (semi-structured texts) from summary of product characteristics (SPC) texts and unstructured adverse effects texts from Martindale (i.e. a medical reference for information about drugs and medicines) using the WHO-ART and MedDRA medical term dictionaries. Results: For the SPC data set, a verbatim match identified 72% of the SPC terms. The text extraction tool correctly matched 87% of the SPC terms while producing one false positive match using removal of stop words, Porter stemming, synonyms and permutations. The use of the full MedDRA hierarchy contributed the most to performance. The sophisticated text algorithms each contributed roughly equally to the performance. Phonetic codes (i.e. Soundex) are evidently inferior to string distance measures (i.e. Levenshtein distance and longest common subsequence distance) for fuzzy matching in our implementation. The string distance measures increased the number of matched SPC terms, but at the expense of generating false positive matches.
Results from Martindale show that 90% of the identified medical terms were correct. The majority of false positive matches were caused by extracting medical terms that do not describe ADRs. Conclusion: Sophisticated text extraction can considerably improve the identification of ADR information in adverse effects texts compared with a verbatim extraction. text extraction adverse drug reactions permutation soundex porter stemming levenshtein distance longest common subsequence distance
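To make the contrast between phonetic codes and string distance measures concrete, here is a minimal Python sketch of American Soundex and of longest common subsequence (LCS) distance. It is an illustration only: the thesis tool was written for .NET, and the function names, the 4-character code length and the example terms are assumptions, not details taken from the abstract.

```python
# Letter groups of the classic American Soundex scheme.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word: str) -> str:
    """4-character Soundex code: first letter plus three digits."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate equal codes
            prev = digit
    return (code + "000")[:4]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcs_distance(a: str, b: str) -> int:
    """Edit distance that allows insertions and deletions only."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


print(soundex("nausea"), soundex("nausia"))  # identical codes
print(lcs_distance("nausea", "nausia"))      # graded distance
```

Soundex gives only a match or no match, whereas a string distance yields a graded score that can be thresholded, which is one plausible reason the two behave differently in fuzzy matching.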
3	Untersuchungen zur Verbesserung der Resultatqualität bei Suchverfahren über Web-Archive / Investigations into improving result quality in search methods over web archives Hofmann, Frank 10 February 2003 An overview of advanced search techniques (TF, IDF, stemming, indexing, the sound of words) and text correction, together with descriptor-based description of documents and abstracts. These techniques are evaluated on selected XML metadata from MONARCH. The work closes with an analysis of the current state of MONARCH with respect to the quality of the metadata used and its usability for advanced search. abstract descriptors web archives IDF indexing MONARCH Metaphone Soundex stemming ddc:004 search TF
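As a rough illustration of the TF and IDF weighting named in this record, the sketch below scores terms in a handful of short strings. The whitespace tokenizer, the unsmoothed formula tf * log(N / df) and the sample strings are assumptions; the abstract does not state which variant the thesis evaluated.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["web archive search", "web archive metadata", "phonetic search"]
for doc, scores in zip(docs, tf_idf(docs)):
    print(doc, "->", max(scores, key=scores.get))
```

Terms that occur in every document receive a weight of zero under this formula, so rarer terms such as "metadata" or "phonetic" dominate the ranking.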
4	Spell checker for a Java Application / Stavningskontroll till en Java-applikation Viktorsson, Arvid, Kyrychenko, Illya January 2020 Many text-editor users depend on spellcheckers to correct their typographical errors, and the absence of one can create a negative experience for the user. In today's advanced technological environment, spellchecking is an expected feature. 2Consiliate Business Solutions owns a Java application with a text editor that does not have a spellchecker. This project aims to investigate and implement available techniques and algorithms for spellcheckers and automated word correction. During implementation, the techniques were tested for their performance and the best solutions were chosen for this project. All the techniques were gathered from existing literature on the topic and implemented in Java using the standard Java libraries. Analysis of the results shows that it is possible to create a complete spellchecker by combining available techniques, and that the quality of a spellchecker largely depends on a well-defined dictionary. Spellchecker Java Trie edit distance Soundex damerau levenshtein Computer Sciences Datavetenskap (datalogi)
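The keywords of this record mention a trie and Damerau-Levenshtein distance. The sketch below combines the two in Python rather than Java, purely for brevity; it is not the thesis code. It uses the restricted "optimal string alignment" variant of Damerau-Levenshtein, and the names `Trie`, `osa_distance` and `check` are assumptions.

```python
END = ""  # end-of-word marker; the empty string can never be a character key


class Trie:
    """Dictionary lookup structure built from nested dicts."""

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False
        return END in node


def osa_distance(a, b):
    """Levenshtein distance extended with adjacent transpositions
    (optimal string alignment, a restricted Damerau-Levenshtein)."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]


def check(word, trie, words, max_distance=1):
    """Return [] for a known word, else nearby dictionary words."""
    if word in trie:
        return []
    return sorted(w for w in words if osa_distance(word, w) <= max_distance)


words = ["the", "then", "java"]  # illustrative dictionary
trie = Trie()
for w in words:
    trie.insert(w)
print(check("teh", trie, words))
```

The trie answers the "is this word known?" question in time proportional to the word length, while the distance function is only consulted for words that fail that lookup.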
