31

Texture analysis of corpora lutea in ultrasonographic ovarian images using genetic programming and rotation invariant local binary patterns

Dong, Meng 16 August 2011 (has links)
Ultrasonography is widely used in medical diagnosis with the advantages of being low cost, non-invasive and capable of real-time imaging. When interpreting ultrasonographic images of mammalian ovaries, the structures of interest are follicles, corpora lutea (CL) and stroma. This thesis presents an approach to CL texture analysis, including detection and segmentation, based on classifiers trained by genetic programming (GP). The objective of CL detection is to determine whether a CL is present in an ovarian image, while the goal of segmentation is to localize the CL within the image. GP offers a solution through the evolution of computer programs by methods inspired by the mechanisms of natural selection. Herein, we use rotationally invariant local binary patterns (LBP) to encode local texture features. These are used by the programs manipulated by GP to obtain highly fit CL classifiers. Grayscale standardization was performed on all images in our data set based on the reference grayscale in each image. CL classification programs were evolved by GP and tested on ultrasonographic images of ovaries. On the bovine dataset, our CL detection algorithm is reliable and robust. The detection algorithm correctly determined the presence or absence of a CL in 93.3% of 60 test images. The segmentation algorithm achieved a mean (± standard deviation) sensitivity and specificity of 0.87 (0.14) and 0.91 (0.05), respectively, over the 30 CL images. Our CL segmentation algorithm improves on the only previously published algorithm, since our method is fully automatic and does not require the placement of an initial contour. The success of these algorithms demonstrates that similar algorithms designed for analysis of in vivo human ovaries are likely viable.
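As a rough illustration of the rotation-invariant LBP features described above, the sketch below computes 8-neighbour LBP codes, rotates each bit pattern to its minimal value so that rotated versions of the same texture receive the same code, and returns the normalised code histogram as a texture descriptor. This is a minimal sketch of the standard technique, not the thesis's implementation: the radius is fixed at one pixel, uniform-pattern grouping is omitted, and the GP-evolved classification stage is not shown.

```python
import numpy as np

def lbp_ri_histogram(image):
    """Rotation-invariant 8-neighbour LBP histogram (illustrative sketch)."""
    # Offsets of the 8 neighbours in clockwise order around the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = image.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            center = image[i, j]
            bits = [int(image[i + di, j + dj] >= center) for di, dj in offsets]
            # Minimum over all circular rotations makes the code rotation invariant.
            code = min(int("".join(map(str, bits[k:] + bits[:k])), 2)
                       for k in range(8))
            codes[i - 1, j - 1] = code
    # The normalised code histogram serves as the region's texture descriptor.
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```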
33

Spam Filter Improvement Through Measurement

Lynam, Thomas Richard January 2009 (has links)
This work supports the thesis that sound quantitative evaluation for spam filters leads to substantial improvement in the classification of email. To this end, new laboratory testing methods and datasets are introduced, and evidence is presented that their adoption at the Text REtrieval Conference (TREC) and elsewhere has led to an improvement in state-of-the-art spam filtering. While many of these improvements have been discovered by others, the best-performing method known at this time -- spam filter fusion -- was demonstrated by the author. This work describes four principal dimensions of spam filter evaluation methodology and spam filter improvement. An initial study investigates the application of twelve open-source filter configurations in a laboratory environment, using a stream of 50,000 messages captured from a single recipient over eight months. The study measures the impact of user feedback and on-line learning on filter performance using methodology and measures which were released to the research community as the TREC Spam Filter Evaluation Toolkit. The toolkit was used as the basis of the TREC Spam Track, which the author co-founded with Cormack. The Spam Track, in addition to evaluating a new application (email spam), addressed the issue of testing systems on both private and public data. While streams of private messages are most realistic, they are not easy to come by and cannot be shared with the research community as archival benchmarks. Using the toolkit, participant filters were evaluated on both, and the differences were found not to substantially confound evaluation; as a result, public corpora were validated as research tools. Over the course of TREC and similar evaluation efforts, a dozen or more archival benchmarks -- some private and some public -- have become available. The toolkit and methodology have spawned improvements in the state of the art every year since their deployment in 2005. In 2005, 2006, and 2007, the Spam Track yielded new best-performing systems based on sequential compression models, orthogonal sparse bigram features, logistic regression and support vector machines. Using the TREC participant filters, we develop and demonstrate methods for on-line filter fusion that outperform all other reported on-line personal spam filters.
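The abstract names on-line filter fusion as the best-performing method. A common way to realise such fusion (an assumption here, not necessarily the thesis's exact combiner) is to average the base filters' calibrated log-odds scores, as in this sketch:

```python
import math

def fuse_scores(probabilities):
    """Fuse per-filter spam probabilities by averaging their log-odds.

    `probabilities` holds each base filter's estimate that a message is
    spam.  Estimates are clipped away from 0 and 1 for numerical safety,
    converted to log-odds, averaged, and mapped back to a probability.
    """
    eps = 1e-6
    logits = []
    for p in probabilities:
        p = min(1 - eps, max(eps, p))
        logits.append(math.log(p / (1 - p)))
    mean_logit = sum(logits) / len(logits)
    return 1.0 / (1.0 + math.exp(-mean_logit))

# Two confident filters outvote one doubter; the fused score stays high.
print(fuse_scores([0.95, 0.90, 0.40]))  # ~0.83
```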
35

The problem of polysemy in the first thousand words of the general service list : a corpus study of secondary chemistry texts : a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Education in education in the Department of Educational Studies in the College of Education at the University of Central Florida, Orlando Florida /

Clemmons, Karina. January 1900 (has links)
Thesis (Ed.D.)--University of Central Florida, 2008. / Reproduction of unpublished thesis typescript. Advisers: Stephen Sivo, Keith Folse. Also available as a PDF on the World Wide Web. Includes bibliographical references (p. 172-182).
36

Generating Topic-Based Chatbot Responses

Krantz, Amandus, Lindblom, Petrus January 2017 (has links)
With the rising popularity of chatbots, not just in entertainment but in e-commerce and online chat support, it has become increasingly important to be able to quickly set up chatbots that can respond to simple questions. This study examines which of two algorithms for the automatic generation of chatbot knowledge bases, First Word Search or Most Significant Word Search, generates the responses most relevant to the topic of a question. It also examines how text corpora might be used as a source from which to generate chatbot knowledge bases. Two chatbots were developed for this project, one for each of the algorithms under examination. The chatbots were evaluated through a survey in which participants were asked to choose which of the algorithms picked the response most relevant to a question. Based on the survey, we conclude that Most Significant Word Search picks the more relevant responses: it has a significantly higher chance of generating a response that is relevant to the topic. However, how well a text corpus works as a source for knowledge bases depends entirely on the quality and nature of the corpus; a corpus consisting of written dialogue is likely more suitable for conversion into a knowledge base.
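The abstract does not spell out the algorithm, so the following is a hypothetical reconstruction of Most Significant Word Search as described: index each corpus question under its least frequent word (taken as the topic carrier), and answer a query with the response stored under the query's least frequent known word. The function names and fallback response are illustrative assumptions.

```python
from collections import Counter

def build_knowledge_base(corpus_pairs):
    """Index (question, response) pairs under each question's rarest word.

    The rarest word in a question is treated as its "most significant"
    word, on the assumption that infrequent words carry the topic.
    """
    word_freq = Counter(w for q, _ in corpus_pairs for w in q.lower().split())
    knowledge_base = {}
    for question, response in corpus_pairs:
        key = min(question.lower().split(), key=lambda w: word_freq[w])
        knowledge_base.setdefault(key, response)
    return word_freq, knowledge_base

def respond(query, word_freq, knowledge_base, fallback="Tell me more."):
    """Answer with the response stored under the query's rarest known word."""
    known = [w for w in query.lower().split() if w in word_freq]
    if not known:
        return fallback
    key = min(known, key=lambda w: word_freq[w])
    return knowledge_base.get(key, fallback)
```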
37

Automatic compilation of bilingual terminologies from comparable corpora

Kontonatsios, Georgios Nikolaos January 2015 (has links)
Bilingual terminological resources play a pivotal role in human and machine translation of technical text. Owing to the immense volume of newly produced terminology in the biomedical domain, existing resources suffer from low coverage and are only available for a limited number of languages. The need thus emerges for term alignment methods that accurately identify translations of terms. In this work, we focus on bilingual terminology induction from freely available comparable corpora, i.e. thematically related documents in two or more languages. We investigate different sources of information that determine translation equivalence, including: (a) the internal structure of terms (compositional clue), (b) the surrounding lexical context (contextual clue) and (c) the topic distribution of terms (topical clue). We present four novel compositional alignment methods and introduce several extensions over existing compositional, context-based and topic-based approaches. Furthermore, we combine the three translation clues in a single term alignment model and show substantial improvements over the individual translation signals considered in isolation. We examine the performance of the proposed term alignment methods on closely related (English-French, English-Spanish) language pairs, on a more distant, low-resource language pair (English-Greek) and on an unrelated (English-Japanese) language pair. As an application, we integrate automatically compiled bilingual terminologies with Statistical Machine Translation (SMT) systems to more accurately translate unknown terms. Results show that an up-to-date bilingual dictionary of terms improves the translation performance of SMT.
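Of the three clues, the contextual one has a standard realisation: build a co-occurrence vector for each term, project the source term's vector into the target vocabulary through a seed bilingual dictionary, and rank candidates by cosine similarity. The sketch below shows that projection step under assumed array encodings; the thesis's actual models, and its compositional and topical components, are not reproduced here.

```python
import numpy as np

def rank_by_context(src_ctx, seed_map, tgt_ctx, tgt_terms):
    """Rank target-term candidates for one source term by the contextual clue.

    `src_ctx` is the source term's co-occurrence vector over source context
    words; `seed_map[i]` is the target-vocabulary index of source context
    word i under a seed dictionary, or -1 when it has no translation;
    `tgt_ctx[j]` is candidate term j's vector over target context words.
    """
    # Project the source context vector into the target vocabulary space.
    projected = np.zeros(tgt_ctx.shape[1])
    for i, j in enumerate(seed_map):
        if j >= 0:
            projected[j] += src_ctx[i]
    # Cosine similarity between the projected vector and each candidate.
    denom = np.linalg.norm(tgt_ctx, axis=1) * np.linalg.norm(projected) + 1e-12
    scores = (tgt_ctx @ projected) / denom
    order = np.argsort(-scores)
    return [(tgt_terms[k], float(scores[k])) for k in order]
```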
38

A Corpus-Based Study of the Gender Assignment of Nominal Anglicisms in Brazilian Portuguese

Skahill, Taryn Marie 17 June 2020 (has links)
The purpose of this study is to analyze the variability of gender assignment to nominal anglicisms in Brazilian Portuguese and to identify how the orthography of English loanwords and their establishment in the language influence such variation. This study also seeks to identify the most important factors that govern such gender assignment. The data were gathered from two Portuguese corpora, one consisting of more formal, edited language (Corpus do Português, News on the Web) and the other of less formal, unaltered language such as blog posts (Corpus do Português, Web/Dialects). Forty anglicisms were analyzed in order to study the variation in gender assignment based on the anglicisms’ orthography and establishment in the language, as well as to help determine whether the gender of the loanwords’ cognates or calques influences the gender assignment of words borrowed from English into Portuguese. The results of this study indicate that the gender of the anglicisms’ cognates or most frequent calques and the gender found in Portuguese dictionaries equally influence the gender assignment to anglicisms. This research also shows that variability in gender assignment is not significantly affected by an English loanword’s attestation in Portuguese dictionaries or by its adaptation to Portuguese orthography, though there are trends that indicate a negative correlation between the variability of anglicisms’ genders and their attestation in Portuguese dictionaries.
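One rough way to operationalise the corpus side of such a study (purely an illustrative assumption, not the author's procedure) is to scan concordance lines and tally the gender-marked Portuguese determiner immediately preceding each anglicism:

```python
import re
from collections import defaultdict

# Small, non-exhaustive sets of gender-marked determiners and contractions.
MASC = {"o", "um", "do", "no", "pelo"}
FEM = {"a", "uma", "da", "na", "pela"}

def gender_counts(lines, anglicisms):
    """Tally masculine vs feminine determiners preceding each anglicism.

    `lines` are concordance lines; `anglicisms` is a set of lowercase
    loanwords.  Real corpus work would need tagging and disambiguation.
    """
    counts = defaultdict(lambda: {"m": 0, "f": 0})
    for line in lines:
        tokens = re.findall(r"\w+", line.lower())
        for prev, word in zip(tokens, tokens[1:]):
            if word in anglicisms:
                if prev in MASC:
                    counts[word]["m"] += 1
                elif prev in FEM:
                    counts[word]["f"] += 1
    return counts
```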
39

500 Essential English Words for ESL Missionaries

Thompson, Carrie A. 06 July 2005 (has links) (PDF)
In order to help ESL missionaries teach the gospel from their hearts using their own words, I have developed a 500-word list of core gospel vocabulary in English. To enhance the 500-word list, I included a lexicon with simple definitions, some grammatical information, and examples of the words in context. The resulting product complies with the standards for master's projects established by the Department of Linguistics and English Language. Published literature shows that the development of specialized corpora can be beneficial for students learning another language. Additionally, specialized corpora act as a catalyst for in-depth vocabulary analysis and the development of other materials associated with the field of language acquisition. Using the 5,013 lexical items from the Preach My Gospel manual and related materials, I developed a specialized 500-word vocabulary list. To achieve this, I used a number of strategies to reduce the larger compilation of words to the most useful and essential core vocabulary: a pre-rating selection that resulted in 2,419 words, a non-native ESL-instructor rating that resulted in the selection of 994 words, a post-rater researcher analysis that resulted in 425 words, a range-and-frequency analysis that resulted in 634 words, and a think-out-loud analysis that resulted in the final 500 words. After creating the 500-word list, I implemented and tested the materials with ESL missionaries at the Missionary Training Center (MTC) in Provo, Utah. I gathered feedback from ESL teachers and missionaries through interviews and a questionnaire. Based on their responses, I determined that the 500-word list is useful in helping missionaries learn essential vocabulary and teach gospel topics in English. Furthermore, the materials have drawn attention from administrators and developers at the MTC, creating a springboard for future projects.
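Among the reduction steps, the range-and-frequency analysis is a standard corpus technique: count each word's total frequency and the number of documents it appears in, then favour words that are both frequent and widespread. A minimal sketch, with the exact ranking rule as an assumption:

```python
from collections import Counter

def range_and_frequency(documents):
    """Rank words by document range, breaking ties by total frequency.

    `documents` is a list of token lists.  Frequency is a word's total
    count across the corpus; range is the number of documents it appears
    in.  Words high on both are candidates for a core vocabulary list.
    """
    freq = Counter()
    doc_range = Counter()
    for doc in documents:
        freq.update(doc)
        doc_range.update(set(doc))  # count each word once per document
    return sorted(freq, key=lambda w: (doc_range[w], freq[w]), reverse=True)
```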
40

A Corpus-Based Evaluation of the Common European Framework Vocabulary for French Teaching and Learning

Kusseling, Francoise S. 13 December 2012 (has links) (PDF)
The CEFR French profiles have been widely used to guide language teaching and assessment over the past decade. The profiles are vocabulary specifications that had gone largely untested from a corpus-based, empirical perspective. The purpose of this dissertation was to evaluate the CEFR profiles by comparing their content with two sizable contemporary corpora. This study quantified and described the vocabulary overlap and uniqueness across all three of these resources. Four areas of overlap and three areas of uniqueness were identified and analyzed. Slightly over 40% of the lexical content was common to the three resources studied, and a further 16.3% was unique to the CEFR; the remaining CEFR content overlapped with one or the other of the two corpora used for the evaluation. The findings led to the general recommendation of keeping about 60% of the current CEFR content and adding a little over 19,000 vocabulary items to the overhauled CEFR profiles.
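The overlap analysis amounts to set arithmetic over word lists. A minimal sketch, assuming simple token lists rather than the study's actual lexical units:

```python
def overlap_report(cefr, corpus_a, corpus_b):
    """Partition CEFR vocabulary by where it is attested (a sketch).

    Returns the share of CEFR items found in both corpora, in exactly
    one, or in neither, mirroring the overlap/uniqueness analysis.
    """
    cefr, a, b = set(cefr), set(corpus_a), set(corpus_b)
    both = cefr & a & b            # attested in both corpora
    only_one = cefr & (a ^ b)      # attested in exactly one corpus
    neither = cefr - a - b         # unique to the CEFR profiles
    n = len(cefr)
    return {"in both corpora": len(both) / n,
            "in one corpus": len(only_one) / n,
            "unique to CEFR": len(neither) / n}
```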
