  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
561

Machine Learning Algorithms for the Analysis of Social Media and Detection of Malicious User Generated Content

Unknown Date (has links)
One of the defining characteristics of the modern Internet is its massive connectedness, with information and human connection simply a few clicks away. Social media and online retailers have revolutionized how we communicate and purchase goods or services. User generated content on the web, through social media, plays a large role in modern society; Twitter has been at the forefront of political discourse, with politicians choosing it as their platform for disseminating information, while websites like Amazon and Yelp allow users to share their opinions on products via online reviews. The information available through these platforms can provide insight into a host of relevant topics through the process of machine learning. Specifically, this process involves text mining for sentiment analysis, which is an application domain of machine learning involving the extraction of emotion from text. Unfortunately, there are still those with malicious intent, and as how we communicate and conduct business changes, so do their malicious practices. Social bots and fake reviews plague the web, providing incorrect information and swaying the opinion of unaware readers. The detection of these false users or posts from reading the text is difficult, if not impossible, for humans. Fortunately, text mining provides us with methods for the detection of harmful user generated content. This dissertation expands the current research in sentiment analysis, fake online review detection and election prediction. We examine cross-domain sentiment analysis using tweets and reviews. Novel techniques combining ensemble and feature selection methods are proposed for the domain of online spam review detection. We investigate the ability of the Twitter platform to predict the United States 2016 presidential election. In addition, we determine how social bots influence this prediction. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018.
/ FAU Electronic Theses and Dissertations Collection
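The sentiment-analysis task this abstract describes, extracting emotion from text, can be illustrated with a minimal lexicon-based sketch. The word list and polarity scores below are invented for the example; the dissertation itself uses machine-learned models, not a fixed lexicon:

```python
# Toy polarity lexicon (hypothetical values, for illustration only).
POLARITY = {"great": 1, "love": 1, "excellent": 1,
            "bad": -1, "awful": -1, "fake": -1}

def sentiment_score(text: str) -> int:
    """Sum the polarity of each known word; unknown words score 0."""
    return sum(POLARITY.get(w, 0) for w in text.lower().split())

print(sentiment_score("I love this excellent product"))  # 2
print(sentiment_score("awful fake reviews"))             # -2
```

Real sentiment classifiers, including those studied in the dissertation, learn such associations from labeled data rather than relying on a hand-made list.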
562

Enhancement of Deep Neural Networks and Their Application to Text Mining

Unknown Date (has links)
Many current application domains of machine learning and artificial intelligence involve knowledge discovery from text, such as sentiment analysis, document ontology, and spam detection. Humans have years of experience and training with language, enabling them to understand complicated, nuanced text passages with relative ease. A text classifier attempts to emulate or replicate this knowledge so that computers can discriminate between concepts encountered in text; however, learning high-level concepts from text, such as those found in many applications of text classification, is a difficult task due to the many challenges associated with text mining and classification. Recently, classifiers trained using artificial neural networks have been shown to be effective for a variety of text mining tasks. Convolutional neural networks have been trained to classify text from character-level input, automatically learning high-level abstract representations and avoiding the need for human engineered features. This dissertation proposes two new techniques for character-level learning, log(m) character embedding and convolutional window classification. Log(m) embedding is a new character-vector representation for text data that is more compact and memory efficient than previous embedding vectors. Convolutional window classification is a technique for classifying long documents, i.e. documents with lengths exceeding the input dimension of the neural network. Additionally, we investigate the performance of convolutional neural networks combined with long short-term memory networks, explore how document length impacts classification performance and compare the performance of neural networks against non-neural network-based learners in text classification tasks. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
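The abstract does not spell out how the log(m) character embedding works; a plausible reading, which the sketch below assumes, is that each of m characters is assigned a ceil(log2(m))-dimensional binary vector rather than an m-dimensional one-hot vector, which is what makes it compact:

```python
import math

def logm_embedding(alphabet):
    """Map each of m characters to a ceil(log2(m))-bit binary vector
    (index encoded in binary), instead of a one-hot vector of length m.
    This is an assumed reconstruction, not the dissertation's exact scheme."""
    m = len(alphabet)
    dim = max(1, math.ceil(math.log2(m)))
    return {c: [(i >> b) & 1 for b in range(dim)] for i, c in enumerate(alphabet)}

emb = logm_embedding("abcdefgh")      # m = 8 characters
print(len(next(iter(emb.values()))))  # 3 dimensions instead of 8
```

The memory saving grows with alphabet size: 128 ASCII characters need only 7 dimensions per character under this encoding.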
563

A sinonímia na terminologia do Direito do Trabalho e do Processo Trabalhista: uma análise no texto sentença judicial / The synonymy in terminology of Labor Law and Labor Suit Domains: an analysis in the text of judicial decisions

Thiago Carvalho Gaudêncio 02 March 2012 (has links)
Esta pesquisa reconhece a existência de sinonímia e quase-sinonímia na terminologia jurídica, mais especificamente nos ramos do Direito do Trabalho e do Processo Trabalhista, analisa morfológica e semântico-discursivamente as variações denominativas utilizadas pelos magistrados, bem como evidencia que essas variações, no discurso jurídico, tendem a dificultar a comunicação entre os especialistas da área e o usuário comum. O tema se justifica, na medida em que alguns autores, no domínio da Semântica e da Lexicologia, julgam desejável, nos discursos especializados, a eliminação de várias denominações para uma mesma noção, porém outros estudiosos discordam dessa postura e apontam que a sinonímia é presença incontestável em linguagens de especialidade. Discutem-se, assim, dentre outras, as teorias de Barbosa, Alves, Araújo, Faulstich, Ullmann, Bloomfield, Lopes, Geckeler, Lyons, Wüster, Boulanger, Auger, Cabré, Dubuc, Duque-Picard acerca do fenômeno linguístico em análise, como também conceitos e critérios de base para seu reconhecimento. Discutem-se, ainda, tipologias da sinonímia que podem ser aplicadas à Terminologia, ainda que formuladas no âmbito da Linguística, pois concebe-se que esta ciência não é desvinculada daquela. Demonstram-se termos da área do Direito que estão ou não em relação sinonímica, extraídos de decisões judiciais, as quais possuem estruturação, características linguísticas e objetivos específicos, mas que, devido, sobretudo, ao uso de uma terminologia jurídica específica, não possuem linguagem simples e objetiva. Percebe-se a necessidade de se analisar as unidades terminológicas em seu habitat natural, isto é, in vivo, dentro da comunicação especializada, no locus material dos discursos, por meio da análise de textos produzidos de maneira real, e não in vitro, fora do contexto de uso habitual. 
Observa-se que o trato da sinonímia em Terminologia deve ser bastante criterioso, não só quando se visa à elaboração da macroestrutura, da microestrutura e dos processos de remissivas em um trabalho terminográfico, mas também para se evitar ambiguidade nos textos de especialidade. Nessa direção, estabelecem-se, com esta pesquisa, paradigmas para uma futura elaboração de um glossário no domínio do Direito do Trabalho e do Processo Trabalhista. / This research recognizes the existence of synonymy and near-synonymy in legal terminology, more specifically in the Labor Law and Labor Suit fields; it analyses morphologically and semantic-discursively the denominative variations used by the magistrates, and also makes evident that those variations, in legal discourse, tend to hinder communication between specialists in the field and the common user. The theme is justified in the sense that some authors, in the Semantics and Lexicology domains, consider it preferable, in specialized discourse, to eliminate the various denominations for the same notion. On the other hand, other scholars disagree with that position and point out that synonymy is an incontestable presence in specialized languages. The theories of Barbosa, Alves, Araújo, Faulstich, Ullmann, Bloomfield, Lopes, Geckeler, Lyons, Wüster, Boulanger, Auger, Cabré, Dubuc and Duque-Picard about the linguistic phenomenon under analysis are then discussed, as well as concepts and basic criteria for its recognition. Typologies of synonymy that can be applied to Terminology are also discussed, even though they were formulated in the field of Linguistics, as we consider that the former science is not detached from the latter. Terms from the legal field which are or are not in a synonymic relation are presented, extracted from judicial decisions, which have specific structures, linguistic characteristics and objectives but which, due mainly to the use of a specific legal terminology, do not have simple and objective language. 
The necessity of analyzing terminological units in their natural habitat is perceived, that is, in vivo, within specialized communication, in the material locus of discourse, through the analysis of texts produced in real situations, and not in vitro, outside the context of their habitual use. It is observed that the treatment of synonymy in Terminology must be very judicious, not only in the elaboration of the macrostructure, the microstructure and the cross-reference processes of a terminographic work, but also to avoid ambiguity in specialized texts. In this direction, with this research we establish paradigms for the future elaboration of a glossary in the Labor Law and Labor Suit domains.
564

O valor diagnóstico do acting out e da passagem ao ato no tratamento psicanalítico.

Meirelles, Cecilia Carvalho 17 September 2008 (has links)
O presente estudo investiga o valor diagnóstico entre neurose e psicose dos atos realizados por um paciente em análise, em particular se se tratavam de acting out e de passagem ao ato. Após algum tempo do início do tratamento psicanalítico, o paciente apresentou atos que sugeriam a importância deles para sua estruturação psíquica. Inicialmente, apresentou-se a dúvida quanto a se tratarem de um ou de outro. Pensamos que, ao se esclarecer esta dúvida, o diagnóstico seria possível. Além disso, para desenvolver esta pesquisa, foi preciso estudar mais a fundo o diagnóstico próprio da psicanálise, diferente do diagnóstico médico. A pesquisa teórica constitui-se principalmente no estudo dos textos de Freud e de Lacan acerca dos temas: diagnóstico, transferência, ato, neurose e psicose. Foi apresentado um relato do atendimento psicanalítico com o paciente a partir de anotações realizadas pela analista ao longo do tempo. Salientamos que o relato não retratou exatamente o que foi o tratamento, por isso ser inviável na psicanálise. O acontecimento clínico é único e irreprodutível. A investigação transcorreu até o ponto de afirmarmos que os atos do paciente tratavam-se de acting outs. Mesmo com isso posto, não nos foi possível determinar o seu diagnóstico. Constatamos que tanto o acting out quanto a passagem ao ato podem ser realizados por pacientes neuróticos e por pacientes psicóticos. Foi somente através da relação existente entre a análise da transferência e a análise dos acting outs que foi possível afirmar que se tratava de um paciente psicótico pré-surto. Os acting outs foram determinantes para percebermos que eles tinham uma função defensiva importante e declararmos a presença da foraclusão. Concluímos que, isoladamente, esses atos não são determinantes na realização do diagnóstico, mas, se associados à análise da transferência, podem ser de grande valia. / Text not informed by the author
565

Autoria e aprendizagem da escrita / Authorship and learning of writing

Fortunato, Marcia Vescovi 13 May 2009 (has links)
Todo ato de escrita é um movimento singular de representação simbólica, é um ato de autoria de um escritor em atividade social de comunicação. A produção de textos exige do escritor uma série de decisões e de ações de linguagem que representam um trabalho intenso, resultado de operações cognitivas complexas. A aprendizagem da escrita compreende o domínio desses procedimentos e de sua gestão durante o processo de produção de textos. Por isso, ensinar a escrever textos é ensinar procedimentos de autoria. Para demonstrar essa tese, buscamos conhecer os contextos histórico e teórico que envolvem as concepções de autoria, de texto e de ensino da escrita para situar nosso objeto. A seguir estudamos as concepções de autoria de Bakhtin e Foucault, que consideramos basilares para este trabalho. Procuramos compreender ainda o processo de composição de textos escritos e interpretá-lo à luz da concepção de autoria adotada. A partir desse estudo, foi possível analisar e descrever procedimentos de autoria e compreendê-los como atos de linguagem que desempenham funções específicas na composição de um texto. Finalmente, a análise de uma amostragem da produção escrita de estudantes do Ensino Fundamental descreveu os procedimentos utilizados pelos aprendizes e a adequação desse uso para sua aprendizagem da escrita. Concluímos que a aprendizagem da escrita requer uma prática de composição de textos contínua para seu desenvolvimento e que o ensino não pode focar um ou outro procedimento, mas o conjunto deles em toda a extensão da escolaridade. / Every act of writing is a unique movement towards symbolic representation; it is an act of authorship performed by a writer engaged in an activity of social communication. Text production requires from the writer a series of decisions and language activities, which represent an intensive work stemming from complex cognitive operations. 
The learning of writing implies the mastery of these procedures as well as their management during the process of text production. Therefore, to teach to write texts is to teach authorship procedures. To demonstrate this thesis, first we had to be cognizant of the historical and theoretical contexts related to notions of authorship, text and writing education in order to situate our object. Then, we studied Bakhtin's and Foucault's notions of authorship, which we considered to be pivotal to this work. We also sought to understand the process of composing written texts and to interpret it in the light of the authorship notion we endorsed. This research made it possible to analyze and describe authorship procedures so that we could understand them as acts of language that perform specific functions in the composition of a text. Finally, sample analysis of the written production of students from elementary school helped us describe the procedures used by these novices and the adequacy of this use to their learning of writing. We arrived at the conclusion that the development of writing education presupposes a continuous practice of text composition and that good teaching cannot simply focus on one procedure or another, but on their entire set for the whole duration of schooling.
566

Exploração de informações contextuais para enriquecimento semântico em representações de textos / Exploration of contextual information for semantic enrichment in text representations

Ribeiro, João Vítor Antunes 14 November 2018 (has links)
Em decorrência da crescente quantidade de documentos disponíveis em formato digital, a importância da análise computacional de grandes volumes de dados torna-se ainda mais evidente na atualidade. Embora grande parte desses documentos esteja disponível em formato de língua natural, a análise por meio de processos como a Mineração de Textos ainda é um desafio a ser superado. Normalmente, abordagens tradicionais de representação de textos como a Bag of Words desconsideram aspectos semânticos e contextuais das coleções de textos analisadas, ignorando informações que podem potencializar o desempenho das tarefas realizadas. Os principais problemas associados a essas abordagens são a alta esparsidade e dimensionalidade que prejudicam consideravelmente o desempenho das tarefas realizadas. Como o enriquecimento de representações de textos é uma das possibilidades efetivas para atenuar esses tipos de problemas, nesta dissertação foi investigada a aplicação conjunta de enriquecimentos semânticos e contextuais. Para isso foi proposta uma nova técnica de representação de textos, cuja principal novidade é a abordagem utilizada para calcular a frequência dos atributos (contextos) baseando-se em suas similaridades. Os atributos extraídos por meio dessa técnica proposta são considerados dependentes já que são formados por conjuntos de termos correlacionados que podem compartilhar informações semelhantes. A efetividade da técnica foi avaliada na tarefa de classificação automática de textos, na qual foram explorados diferentes procedimentos de enriquecimento textual e versões de modelos de linguagem baseados em word embeddings. De acordo com os resultados obtidos, há evidências favoráveis a respeito da efetividade e da aplicabilidade da técnica de representação de textos proposta. 
Segundo os testes de significância estatística realizados, a aplicação de enriquecimentos textuais baseados em Reconhecimento de Entidades Nomeadas e em Desambiguação Lexical de Sentido pode contribuir efetivamente para o aumento do desempenho da tarefa de classificação automática de textos, principalmente nas abordagens em que também são considerados textos de fontes externas de conhecimento como a Wikipédia. Constatou-se empiricamente que a efetividade dessa técnica proposta pode ser superior às abordagens tradicionais em cenários de aplicação baseados em informações semânticas das coleções de textos, caracterizando-a como uma alternativa promissora para a geração de representações de textos com alta densidade de informações semânticas e contextuais que se destacam pela interpretabilidade. / Due to the increasing number of documents available in digital format, the importance of computational analysis of large volumes of data has become even more evident. Although most of these documents are available in natural language format, analysis through processes such as text mining is still a challenge to be overcome. Traditional text representation approaches such as the bag of words typically disregard semantic and contextual aspects of the analyzed text collections, ignoring information that can enhance the performance of the tasks performed. The main problems associated with these approaches are the high sparsity and dimensionality that considerably impair task performance. As the enrichment of text representations is one of the effective possibilities to attenuate these types of problems, in this dissertation the joint application of semantic and contextual enrichment was investigated. To this end, a new text representation technique was proposed, whose main novelty is the approach used to calculate the frequency of attributes (contexts) based on their similarities. 
The attributes extracted by this proposed technique are considered dependent because they are formed by sets of correlated terms that can share similar information. The effectiveness of the technique was evaluated in the automatic text classification task, in which different textual enrichment procedures and versions of language models based on word embeddings were explored. According to the results, there is favorable evidence regarding the effectiveness and applicability of the proposed text representation technique. According to the statistical significance tests, the application of textual enrichment based on named entity recognition and word sense disambiguation can effectively contribute to increasing the performance of the automatic text classification task, especially in approaches that also consider texts from external knowledge sources such as Wikipedia. It has been empirically verified that the effectiveness of this proposed technique can be superior to traditional approaches in application scenarios based on semantic information of the text collections, characterizing it as a promising alternative for the generation of text representations with a high density of semantic and contextual information that stand out for their interpretability.
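The sparsity and dimensionality problems of the bag-of-words representation criticized in this abstract are easy to see in a toy example. The sketch below is illustrative only and is not the dissertation's proposed technique:

```python
from collections import Counter

def bag_of_words(docs):
    """Build a term-frequency matrix; the vocabulary is the union of
    all tokens, so most cells are zero for any single document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    counts = [Counter(d.lower().split()) for d in docs]
    return vocab, [[c[w] for w in vocab] for c in counts]

docs = ["the cat sat", "the dog barked at the cat"]
vocab, matrix = bag_of_words(docs)
zeros = sum(row.count(0) for row in matrix)
print(len(vocab), zeros)  # even two tiny documents yield many zero cells
```

With a realistic corpus the vocabulary reaches tens of thousands of terms while each document uses only a few hundred, which is exactly the sparsity the enrichment techniques above aim to mitigate.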
567

Imagem e texto em tradução: uma análise do processo tradutório nas histórias em quadrinhos / Image and text in translation: an analysis of the translation process in comics

Aragão, Sabrina Moura 25 October 2012 (has links)
O presente trabalho tem como objetivo analisar, do ponto de vista da tradução, as histórias em quadrinhos de língua francesa Astérix, dos quadrinistas Uderzo e Goscinny, e Les aventures de Tintin, do autor belga Hergé, considerando as obras originais em francês e as suas traduções em português. Para a realização da pesquisa, foram considerados aspectos que envolvem a tradução dos elementos textuais inseridos nos balões em conjunção com as imagens, atentando para a relação entre imagem e texto e sua interdependência, bem como as implicações que essa relação traz para o processo tradutório. Vale notar que, ao estabelecer uma relação interdependente entre imagem e texto nos quadrinhos, os autores de ambas as séries buscavam, frequentemente, provocar efeitos de humor, o que se mostrou um dos principais desafios para as traduções em língua portuguesa. A partir da análise da relação entre os códigos verbal e imagético presente nos quadrinhos, foi possível observar que o processo de tradução dessa forma de linguagem apresenta especificidades que indicam dessemelhanças ou similaridades com outras formas de linguagem, como o cinema, e a literatura ilustrada, graças à sua forma de expressão que, como salientamos, é formada por dois sistemas de signos distintos. Nesse sentido, a tradução de histórias em quadrinhos deve tomar como ponto de partida não só o material escrito em língua estrangeira, mas também a imagem, haja vista a função desta última na construção de efeitos interpretativos nos contextos originais e a inteligibilidade da mensagem nos contextos de tradução. Em nosso trabalho, buscamos definir os principais processos em que a relação entre imagem e texto se manifesta nos quadrinhos originais para, assim, propor estratégias de tradução que reproduzissem, de forma análoga, os efeitos desses processos tanto nas traduções publicadas quanto naquelas sugeridas por nós no decorrer das análises. 
/ This thesis reviews tha transalations of the French language comic books Astérix, by Uderzo and Goscinny, and Les aventures de Tintin, by Hergé into Brazilian Portuguese. In the course of our investigation, we have taken into account aspects which involve the translation of the textual elements inserted in the balloons, and particularly the situations in which the interdependence of text and image is evident, with the aim at determining the implications of such interdependence on the translational process. The text/image relationship is often established for the sake of humour, which proved one of the major challenges for the translations into Brazilian Portuguese. Our analysis of the relationship between the verbal and the image codes enabled us to determine certain specificities in the translation of this kind of language, in part similar and in part dissimilar to the translation of other forms of language particularly cinema and illustrated literature. In this sense, the translation of comic stories must take as a starting point not only the written language material, but also the image, essential to the construction of the interpretative effects in te original contexs and to the intelligibility of the message in the translational contexts. The thesis proposes a definition of the major procedures whereby the image/text relationship becomes manifest in the original comics, and of the translation strategies which will generate analogous effects both, as found both in the published translations and in the alternative translations we have suggested in the course of our analyses.
568

Text compression for Chinese documents.

January 1995 (has links)
by Chi-kwun Kan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 133-137). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Importance of Text Compression --- p.1 / Chapter 1.2 --- Historical Background of Data Compression --- p.2 / Chapter 1.3 --- The Essences of Data Compression --- p.4 / Chapter 1.4 --- Motivation and Objectives of the Project --- p.5 / Chapter 1.5 --- Definition of Important Terms --- p.6 / Chapter 1.5.1 --- Data Models --- p.6 / Chapter 1.5.2 --- Entropy --- p.10 / Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12 / Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12 / Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13 / Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15 / Chapter 1.7 --- Sources of Testing Data --- p.16 / Chapter 1.8 --- Outline of the Thesis --- p.16 / Chapter 2 --- Literature Survey --- p.18 / Chapter 2.1 --- Data Compression Algorithms --- p.18 / Chapter 2.1.1 --- Statistical Compression Methods --- p.18 / Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23 / Chapter 2.2 --- Cascading of Algorithms --- p.33 / Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34 / Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37 / Chapter 3 --- Chinese-related Issues --- p.38 / Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38 / Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38 / Chapter 3.1.2 --- Lack of Word Segmentation --- p.40 / Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40 / Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41 / Chapter 3.2 --- Definition of Different Coding Schemes --- p.41 / Chapter 3.2.1 --- Big5 Code --- p.42 / Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43 / Chapter 3.2.3 --- Unicode --- p.44 / Chapter 3.2.4 --- HZ (Hanzi) Code --- p.45 / Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45 / Chapter 4 --- Huffman Coding on Chinese Text --- p.49 / Chapter 4.1 --- The Use of the Chinese Character Identification Routine --- p.50 / Chapter 4.2 --- Result --- p.51 / Chapter 4.3 --- Justification of the Result --- p.53 / Chapter 4.4 --- Time and Memory Resources Analysis --- p.58 / Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61 / Chapter 4.5.1 --- The Algorithm --- p.62 / Chapter 4.5.2 --- Result --- p.63 / Chapter 4.5.3 --- Justification of the Result --- p.64 / Chapter 4.6 --- Chapter Conclusion --- p.66 / Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67 / Chapter 5.1 --- The Chinese LZSS Compression --- p.68 / Chapter 5.1.1 --- The Algorithm --- p.69 / Chapter 5.1.2 --- Result --- p.73 / Chapter 5.1.3 --- Justification of the Result --- p.74 / Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80 / Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81 / Chapter 5.2 --- The Chinese LZW Compression --- p.92 / Chapter 5.2.1 --- The Algorithm --- p.92 / Chapter 5.2.2 --- Result --- p.94 / Chapter 5.2.3 --- Justification of the Result --- p.95 / Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97 / Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98 / Chapter 5.3 --- A Comparison of the Performance of the LZSS and the LZW --- p.100 / Chapter 5.4 --- Chapter Conclusion --- p.101 / Chapter 6 --- Chinese Dictionary-based Huffman Coding --- p.103 / Chapter 6.1 --- The Algorithm --- p.104 / Chapter 6.2 --- Result --- p.107 / Chapter 6.3 --- Justification of the Result --- p.108 / Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111 / Chapter 6.5 --- Chapter Conclusion --- p.114 / Chapter 7 --- Cascading of Huffman Coding and LZW Compression --- p.116 / Chapter 7.1 --- Static Cascading Model --- p.117 / Chapter 7.1.1 --- The Algorithm --- p.117 / Chapter 7.1.2 --- Result --- p.120 / Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121 / Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125 / Chapter 7.2.1 --- The Algorithm --- p.125 / Chapter 7.2.2 --- Result --- p.126 / Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127 / Chapter 7.3 --- Chapter Conclusion --- p.128 / Chapter 8 --- Concluding Remarks --- p.129 / Chapter 8.1 --- Conclusion --- p.129 / Chapter 8.2 --- Future Work Direction --- p.130 / Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130 / Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131 / Chapter 8.2.3 --- Use of Grammar Model --- p.131 / Chapter 8.2.4 --- Lossy Compression --- p.131 / Chapter 8.3 --- Epilogue --- p.132 / Bibliography --- p.133
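Chapter 4 of this thesis applies Huffman coding at the Chinese-character level. A compact sketch of the standard Huffman construction over a character alphabet (the textbook algorithm, not the thesis's own implementation) looks like this:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix-free Huffman code over the symbols of `text`
    (each Chinese character is treated as one symbol)."""
    freq = Counter(text)
    # Heap entries: (frequency, tiebreak index, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least-frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

code = huffman_code("的的的是是不")  # frequencies 3, 2, 1
print(code)
```

More frequent characters receive shorter codewords, which is why the thesis's character-level scheme can beat byte-oriented compressors on the large, skewed Chinese character set.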
569

On-line learning for adaptive text filtering.

January 1999 (has links)
Yu Kwok Leung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 91-96). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- The Problem --- p.1 / Chapter 1.2 --- Information Filtering --- p.2 / Chapter 1.3 --- Contributions --- p.7 / Chapter 1.4 --- Organization Of The Thesis --- p.10 / Chapter 2 --- Related Work --- p.12 / Chapter 3 --- Adaptive Text Filtering --- p.22 / Chapter 3.1 --- Representation --- p.22 / Chapter 3.1.1 --- Textual Document --- p.23 / Chapter 3.1.2 --- Filtering Profile --- p.28 / Chapter 3.2 --- On-line Learning Algorithms For Adaptive Text Filtering --- p.29 / Chapter 3.2.1 --- The Sleeping Experts Algorithm --- p.29 / Chapter 3.2.2 --- The EG-based Algorithms --- p.32 / Chapter 4 --- The REPGER Algorithm --- p.37 / Chapter 4.1 --- A New Approach --- p.37 / Chapter 4.2 --- Relevance Prediction By RElevant feature Pool --- p.42 / Chapter 4.3 --- Retrieving Good Training Examples --- p.45 / Chapter 4.4 --- Learning Dissemination Threshold Dynamically --- p.49 / Chapter 5 --- The Threshold Learning Algorithm --- p.50 / Chapter 5.1 --- Learning Dissemination Threshold Dynamically --- p.50 / Chapter 5.2 --- Existing Threshold Learning Techniques --- p.51 / Chapter 5.3 --- A New Threshold Learning Algorithm --- p.53 / Chapter 6 --- Empirical Evaluations --- p.55 / Chapter 6.1 --- Experimental Methodology --- p.55 / Chapter 6.2 --- Experimental Settings --- p.59 / Chapter 6.3 --- Experimental Results --- p.62 / Chapter 7 --- Integrating With Feature Clustering --- p.76 / Chapter 7.1 --- Distributional Clustering Algorithm --- p.79 / Chapter 7.2 --- Integrating With Our REPGER Algorithm --- p.82 / Chapter 7.3 --- Empirical Evaluation --- p.84 / Chapter 8 --- Conclusions --- p.87 / Chapter 8.1 --- Summary --- p.87 / Chapter 8.2 --- Future Work --- p.88 / Bibliography --- p.91 / Chapter A --- Experimental Results On The AP Corpus --- p.97 / Chapter A.1 --- The EG Algorithm --- p.97 / Chapter A.2 --- The EG-C Algorithm --- p.98 / Chapter A.3 --- The REPGER Algorithm --- p.100 / Chapter B --- Experimental Results On The FBIS Corpus --- p.102 / Chapter B.1 --- The EG Algorithm --- p.102 / Chapter B.2 --- The EG-C Algorithm --- p.103 / Chapter B.3 --- The REPGER Algorithm --- p.105 / Chapter C --- Experimental Results On The WSJ Corpus --- p.107 / Chapter C.1 --- The EG Algorithm --- p.107 / Chapter C.2 --- The EG-C Algorithm --- p.108 / Chapter C.3 --- The REPGER Algorithm --- p.110
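The EG-based algorithms listed in Chapter 3.2.2 follow the exponentiated-gradient scheme of Kivinen and Warmuth: a multiplicative weight update followed by renormalisation. The sketch below shows one such update; the learning rate and feature encoding are illustrative, not the thesis's settings:

```python
import math

def eg_update(w, x, y, eta=0.5):
    """One exponentiated-gradient step: weights of features present in a
    relevant document grow multiplicatively, then the profile is
    renormalised to sum to one."""
    y_hat = sum(wi * xi for wi, xi in zip(w, x))   # current prediction
    w = [wi * math.exp(-2 * eta * (y_hat - y) * xi) for wi, xi in zip(w, x)]
    z = sum(w)
    return [wi / z for wi in w]

w = [0.25, 0.25, 0.25, 0.25]           # uniform profile over 4 features
w = eg_update(w, x=[1, 0, 0, 1], y=1)  # relevant document has features 1 and 4
print(w)  # mass shifts toward the features present in the relevant document
```

Because the update is multiplicative rather than additive, EG adapts quickly when only a few of many features are actually relevant, which is why it suits adaptive text filtering.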
570

A probabilistic approach for automatic text filtering.

January 1998 (has links)
Low Kon Fan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 165-168). / Abstract also in Chinese. / Abstract --- p.i / Acknowledgment --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview of Information Filtering --- p.1 / Chapter 1.2 --- Contributions --- p.4 / Chapter 1.3 --- Organization of this thesis --- p.6 / Chapter 2 --- Existing Approaches --- p.7 / Chapter 2.1 --- Representational issues --- p.7 / Chapter 2.1.1 --- Document Representation --- p.7 / Chapter 2.1.2 --- Feature Selection --- p.11 / Chapter 2.2 --- Traditional Approaches --- p.15 / Chapter 2.2.1 --- NewsWeeder --- p.15 / Chapter 2.2.2 --- NewT --- p.17 / Chapter 2.2.3 --- SIFT --- p.19 / Chapter 2.2.4 --- InRoute --- p.20 / Chapter 2.2.5 --- Motivation of Our Approach --- p.21 / Chapter 2.3 --- Probabilistic Approaches --- p.23 / Chapter 2.3.1 --- The Naive Bayesian Approach --- p.25 / Chapter 2.3.2 --- The Bayesian Independence Classifier Approach --- p.28 / Chapter 2.4 --- Comparison --- p.31 / Chapter 3 --- Our Bayesian Network Approach --- p.33 / Chapter 3.1 --- Backgrounds of Bayesian Networks --- p.34 / Chapter 3.2 --- Bayesian Network Induction Approach --- p.36 / Chapter 3.3 --- Automatic Construction of Bayesian Networks --- p.38 / Chapter 4 --- Automatic Feature Discretization --- p.50 / Chapter 4.1 --- Predefined Level Discretization --- p.52 / Chapter 4.2 --- Lloyd's Algorithm --- p.53 / Chapter 4.3 --- Class Dependence Discretization --- p.55 / Chapter 5 --- Experiments and Results --- p.59 / Chapter 5.1 --- Document Collections --- p.60 / Chapter 5.2 --- Batch Filtering Experiments --- p.63 / Chapter 5.3 --- Batch Filtering Results --- p.65 / Chapter 5.4 --- Incremental Session Filtering Experiments --- p.87 / Chapter 5.5 --- Incremental Session Filtering Results --- p.88 / Chapter 6 --- Conclusions and Future Work --- p.105 / Appendix A --- p.107 / Appendix B --- p.116 / Appendix C --- p.126 / Appendix D --- p.131 / Appendix E --- p.145
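Chapter 2.3.1 surveys the naive Bayesian approach to text filtering. A minimal version with Laplace smoothing can be sketched as follows (illustrative only; the thesis's own contribution is the Bayesian network model of Chapter 3, and the training documents here are invented):

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a naive Bayes text filter: log class priors plus
    Laplace-smoothed log word likelihoods per class."""
    vocab = {w for d in docs for w in d.split()}
    model = {}
    for c in set(labels):
        words = Counter(w for d, l in zip(docs, labels) if l == c
                        for w in d.split())
        total = sum(words.values())
        model[c] = (math.log(labels.count(c) / len(labels)),
                    {w: math.log((words[w] + 1) / (total + len(vocab)))
                     for w in vocab})
    return model

def classify(model, doc):
    """Pick the class with the highest posterior log-score."""
    return max(model, key=lambda c: model[c][0] +
               sum(model[c][1].get(w, 0) for w in doc.split()))

model = train_nb(["cheap pills now", "meeting agenda attached"],
                 ["junk", "keep"])
print(classify(model, "cheap pills"))  # junk
```

The conditional-independence assumption this classifier makes between words is exactly what the thesis relaxes by inducing a Bayesian network over the features.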
