Global ETD Search

531	Does the Medicare principal inpatient diagnostic cost group model adequately adjust for selection bias? Kan, Hongjun. January 2002 Thesis (Ph.D.)--RAND Graduate School, 2002. / Includes bibliographical references (p. 96-101).
532	Role of topic and comment in linguistic theory Gundel, Jeanette K. January 1977 Originally published as author's thesis, University of Texas at Austin, 1974. / Bibliography: p. 206-211.
533	Improving search results with machine learning : Classifying multi-source data with supervised machine learning to improve search results Stakovska, Meri January 2018 Sony’s Support Application team wanted an experiment to determine whether machine learning was suitable for improving the quantity and quality of the results returned by the in-application search tool. By improving the quantity and quality of the results, the team wanted to improve the customer’s journey. A supervised machine learning model was created to classify articles into four categories: Wi-Fi & Connectivity, Apps & Settings, System & Performance, and Battery Power & Charging. The same model was used to create a service that assigned each search term to one of the four categories. The classified articles and the classified search terms were used to complement the existing search tool. The baseline for the experiment was the result of the search tool without classification. The results of the experiment show that the number of articles returned did increase, but the search results were of low quality, mainly because the categories were too broad. Searcher Frustration Information Retrieval Search Results Topic Classification Machine Learning Supervised Classification Naive Bayes Computer Sciences Datavetenskap (datalogi)
534	Finding early signals of emerging trends in text through topic modeling and anomaly detection Redyuk, Sergey January 2018 Trend prediction has become an extremely popular practice in many industrial sectors and academia. It is beneficial for strategic planning and decision making, and facilitates exploring new research directions that have not yet matured. To anticipate future trends in an academic environment, a researcher needs to analyze an extensive amount of literature and scientific publications, and gain expertise in the particular research domain. This approach is time-consuming and extremely complicated due to the abundance and diversity of the data. Modern machine learning tools, on the other hand, are capable of processing tremendous volumes of data, reaching real-time human-level performance for various applications. Achieving high performance in unsupervised prediction of emerging trends in text can indicate promising directions for future research and potentially lead to breakthrough discoveries in any field of science. This thesis addresses the problem of emerging trend prediction in text in two main steps: it utilizes an HDP topic model to represent the latent topic space of a given temporal collection of documents and the DBSCAN clustering algorithm to detect groups with high-density regions in the document space potentially leading to emerging trends, and applies KL divergence in order to capture deviating text which might indicate the birth of a new, not-yet-seen phenomenon. In order to empirically evaluate the effectiveness of the proposed framework and estimate its predictive capability, both synthetically generated corpora and real-world text collections from arXiv.org, an open-access electronic archive of scientific publications (category: Computer Science), and NIPS publications are used. For synthetic data, a text generator is designed which provides ground truth to evaluate the performance of anomaly detection algorithms. 
This work contributes to the body of knowledge in the area of emerging trend prediction in several ways. First of all, the method of incorporating topic modeling and anomaly detection algorithms for emerging trend prediction is a novel approach and highlights new perspectives in the subject area. Secondly, the three-level word-document-topic topology of anomalies is formalized in order to detect anomalies in temporal text collections which might lead to emerging trends. Finally, a framework for unsupervised detection of early signals of emerging trends in text is designed. The framework captures new vocabulary, documents with deviating word/topic distribution, and drifts in latent topic space as three main indicators of a novel phenomenon to occur, in accordance with the three-level topology of anomalies. The framework is not limited by particular sources of data and can be applied to any temporal text collections in combination with any online methods for soft clustering. Machine learning text mining topic modeling emerging trend prediction novelty detection group anomaly detection Computer Sciences Datavetenskap (datalogi)
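The deviation-detection step mentioned in the abstract above can be illustrated with a small sketch. It assumes each document is already represented as a topic distribution (for example, the output of a topic model) and flags documents whose distribution has a large KL divergence from a reference distribution. The function names, the toy distributions, and the threshold are illustrative assumptions, not details taken from the thesis.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions of equal length.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q has zero-probability entries.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with p_i = 0 contribute nothing
    )


def flag_deviating(doc_topics, reference, threshold):
    """Return indices of documents whose topic mix diverges from the reference."""
    return [
        i
        for i, p in enumerate(doc_topics)
        if kl_divergence(p, reference) > threshold
    ]


# Toy data: a reference topic mix and three documents (hypothetical values).
reference = [0.5, 0.3, 0.2]
docs = [
    [0.5, 0.3, 0.2],    # identical to the reference
    [0.45, 0.35, 0.2],  # slightly different
    [0.05, 0.05, 0.9],  # dominated by a topic that is rare in the reference
]
flagged = flag_deviating(docs, reference, threshold=0.5)
print(flagged)
```

In practice the reference distribution might be the average topic mix of an earlier time window, so that a high divergence marks text that departs from what was seen before; that choice is also an assumption here.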
535	Computational Analyses of Scientific Publications Using Raw and Manually Curated Data with Applications to Text Visualization Shokat, Imran January 2018 Text visualization is a field dedicated to the visual representation of textual data by using computer technology. A large number of visualization techniques are available, and it is becoming harder for researchers and practitioners to choose an optimal technique for a particular task. To overcome this problem, the ISOVIS Group developed an interactive survey browser for text visualization techniques. ISOVIS researchers gathered papers which describe text visualization techniques or tools and categorized them according to a taxonomy. Several categories were manually assigned to each visualization technique. In this thesis, we aim to analyze the dataset of this browser. We carried out several analyses to find temporal trends and correlations of the categories present in the browser dataset. In addition, these categories were compared with the results of a computational approach. Our results show that some categories have become more popular whereas others have declined in popularity. Cases of positive and negative correlation between various categories were found and analyzed. Comparisons between the manually labeled dataset and the results of computational text analyses were presented to the experts, giving them an opportunity to refine the dataset. The data analyzed in this thesis project is specific to the text visualization field; however, the methods used in the analyses can be generalized to other datasets of scientific literature surveys or, more generally, to other manually curated collections of textual documents. Scientific literature analysis meta-analysis trends correlation NLP text mining topic modeling LDA HDP text visualization Software Engineering Programvaruteknik
536	Deep generative models for natural language processing Miao, Yishu January 2017 Deep generative models are essential to Natural Language Processing (NLP) due to their outstanding ability to use unlabelled data, to incorporate abundant linguistic features, and to learn interpretable dependencies among data. As the structure becomes deeper and more complex, having an effective and efficient inference method becomes increasingly important. In this thesis, neural variational inference is applied to carry out inference for deep generative models. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. The powerful neural networks are able to approximate complicated non-linear distributions and grant the possibilities for more interesting and complicated generative models. Therefore, we develop the potential of neural variational inference and apply it to a variety of models for NLP with continuous or discrete latent variables. This thesis is divided into three parts. Part I introduces a generic variational inference framework for generative and conditional models of text. For continuous or discrete latent variables, we apply a continuous reparameterisation trick or the REINFORCE algorithm to build low-variance gradient estimators. To further explore Bayesian non-parametrics in deep neural networks, we propose a family of neural networks that parameterise categorical distributions with continuous latent variables. Using the stick-breaking construction, an unbounded categorical distribution is incorporated into our deep generative models which can be optimised by stochastic gradient back-propagation with a continuous reparameterisation. Part II explores continuous latent variable models for NLP. 
Chapter 3 discusses the Neural Variational Document Model (NVDM): an unsupervised generative model of text which aims to extract a continuous semantic latent variable for each document. In Chapter 4, the neural topic models modify the neural document models by parameterising categorical distributions with continuous latent variables, where the topics are explicitly modelled by discrete latent variables. The models are further extended to neural unbounded topic models with the help of stick-breaking construction, and a truncation-free variational inference method is proposed based on a Recurrent Stick-breaking construction (RSB). Chapter 5 describes the Neural Answer Selection Model (NASM) for learning a latent stochastic attention mechanism to model the semantics of question-answer pairs and predict their relatedness. Part III discusses <b>discrete latent variable models</b>. Chapter 6 introduces latent sentence compression models. The Auto-encoding Sentence Compression Model (ASC), as a discrete variational auto-encoder, generates a sentence by a sequence of discrete latent variables representing explicit words. The Forced Attention Sentence Compression Model (FSC) incorporates a combined pointer network biased towards the usage of words from source sentence, which significantly improves the performance when jointly trained with the ASC model in a semi-supervised learning fashion. Chapter 7 describes the Latent Intention Dialogue Models (LIDM) that employ a discrete latent variable to learn underlying dialogue intentions. Additionally, the latent intentions can be interpreted as actions guiding the generation of machine responses, which could be further refined autonomously by reinforcement learning. Finally, Chapter 8 summarizes our findings and directions for future work.
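The stick-breaking construction referred to in the abstract above can be sketched in a few lines. This is the standard textbook form, in which each break fraction takes a share of whatever stick length remains; it is not the thesis's neural parameterisation. The function name and the fixed fractions are assumptions for illustration; in a neural model the fractions would presumably come from a network rather than being given by hand.

```python
def stick_breaking(fractions):
    """Turn break fractions v_1..v_K (each in [0, 1]) into categorical weights.

    Weight k is v_k times the stick length left after the first k-1 breaks:
        pi_k = v_k * prod_{j<k} (1 - v_j)
    Returns the weights and the leftover stick length, which is the mass
    available for further (not yet generated) categories.
    """
    weights = []
    remaining = 1.0
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("break fractions must lie in [0, 1]")
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


# Hypothetical fractions: each break takes half of what is left.
weights, leftover = stick_breaking([0.5, 0.5, 0.5])
print(weights, leftover)
```

Because the weights and the leftover always sum to one, the number of categories does not have to be fixed in advance: more breaks can be added as long as some stick remains.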
537	Beethoven’s Op. 28 piano sonata: the pastoral and the enlightenment Anderson, Dustin 29 August 2018 This thesis examines Beethoven’s Op. 28 Pastoral Sonata as a musical work that is dominated by the pastoral topic, and, through its use of this topic, refers to certain ideals of the Enlightenment. The first chapter presents an overview of the sonata and its relative neglect by modern musicologists, followed by a brief history of the pastoral topic in music and literature. The second chapter examines, and provides examples of, the pastoral signifiers that occur in the Op. 28 sonata: drone bass, compound meter, subdominant emphasis, simple harmonies, lyrical melodies and the weathered storm. The third chapter summarizes aspects of the Enlightenment that influenced Beethoven, and his use of the pastoral topic to communicate these ideals. The primary arguments put forward are: the Op. 28 Sonata demonstrates aspects of reconciliation between the urban and the rural as a metaphor for the reconciliation between man and God; Beethoven uses dance as a symbol of both the pastoral and fraternity in the sonata; and the Enlightenment concept of interconnectedness between all things is reflected in the musical motives and structure of the composition. The thesis concludes by suggesting that the sonata’s message may have been obscured over time because of changes in Beethoven reception history, the gendering of his repertoire, and the shifting perception of what nature signifies as the Romantic Era developed. / Graduate Beethoven Pastoral Op. 28 Enlightenment Piano Sonata Troping Topic Signifier Dance Aufklärung Nature God Rural Urban Rustic
538	As construções de tópico do português nos séculos XVIII e XIX Araújo, Edivalda Alves January 2006 293 leaves. / The object of study of this work is the analysis of left-dislocated topic constructions in European Portuguese of the 18th and 19th centuries and in Brazilian Portuguese of the 19th century, from a syntactic-discursive perspective, within the framework of generative grammar and information structure. The aim is to identify the syntactic and discursive differences and similarities between these two varieties of Portuguese, based on the syntactic position occupied by the topic and on its relation to the other constituents of the clause, such as adverbs, the subject, interrogative elements, and clitics. For this analysis, the corpora consist of personal letters and plays, taken from the Corpus of the Tycho Brahe Project (available on the USP website) in the case of European Portuguese, and collected at the Biblioteca Pública do Estado da Bahia in the case of Brazilian Portuguese. 
We hypothesized that some topic constructions of Brazilian Portuguese could be explained in the light of diachronic data, especially from the period selected for study. The analysis of the data, however, revealed that in this period the topic constructions of European and Brazilian Portuguese did not yet show syntactic differences that could identify the former as a subject-prominent language and the latter as a topic-prominent language. Moreover, we did not find sufficient data to indicate that some topic constructions of present-day Brazilian Portuguese were already attested in European Portuguese. In general terms, however, comparing the European and Brazilian Portuguese data, we found that: (i) in both, the three topic positions in the C system (TopP1, TopP2, and TopP3) can be activated, following the proposal of Rizzi (1997, 2002). These positions, however, are not activated simultaneously. Both European and Brazilian Portuguese opt for the sequence TopP2 + FocP, perhaps because of the operator-variable relation between the focus and the clause. 
The simultaneous occurrence of TopP1 and TopP2 is nevertheless possible; (ii) according to the data, there is no evidence that different types of topic occupy different positions in the left periphery; (iii) identifying the position of the topic in the left periphery depends on the activation of other functional projections in that periphery; (iv) the differences observed between European and Brazilian Portuguese are not restricted specifically to topic constructions but involve other factors, such as: movement of the auxiliary verb above the low adverb, found in European Portuguese but fluctuating in Brazilian Portuguese; a tendency toward overt subjects, incipient in Brazilian Portuguese; and fluctuation in clitic placement with topic constructions, in both European and Brazilian Portuguese; (v) the position of the topic underwent reanalysis in both European and Brazilian Portuguese. The former seems to have developed topic constructions oriented more toward discourse, hence the impossibility of placing IP in second position, while the latter came to orient the topic more toward syntax, hence the possibility that some topics agree with the verb. / Salvador Topic Information structure Syntax Left periphery Clitics Constituent order
539	Essa bolsa, é as minhas coisas do carro : reflexões acerca do tópico marcado em português / This bag, it's my things of the car : reflections on the marked topic in Portuguese Silva, Jair Barbosa da 09 September 2011 Taking functionalism as its theoretical support, especially the proposals of Lambrecht (1994) and Li and Thompson (1976), this thesis discusses the status of marked topic constructions in Brazilian Portuguese (BP). The data analyzed came either from the authors whose work is drawn on to discuss these constructions or from informal conversations. First, based on several studies (Tarallo et al. (2002a, 2002b), Callou et al. (2002), Pontes (1987), Berlink, Duarte and Oliveira (2009), Perini (1996, 2006, 2008 and 2010), Brito, Duarte and Matos (2003), Li and Thompson (1976)), we looked into how different authors conceive of topic constructions and how much they disagree, with much of the disagreement resulting from their theoretical affiliation. In the second chapter, where we argue for topic as a linguistic category, we describe the pragmatic, semantic, and syntactic aspects involved in the codification of topics. The third and last chapter contains our analysis of topic constructions in BP, in which we found a variety of structures that reflects communicative purposes. Ultimately, we drew the following conclusions: a) there are different nomenclatures and perspectives when dealing with topic constructions in BP. 
Thus, the topic may appear in linguistic systems in a variety of ways, and the pragmatic context determines how such structures are codified; b) linguistic studies on topics have moved beyond the notion that these constructions are merely stylistic, although some restrictions do remain, such as the claim that topic is a solely pragmatic and not a syntactic category, or the claim that topic constructions are ungrammatical; c) we consider the notion of IS (Information Structure), as advanced by Lambrecht (1994), to be one of the most thorough approaches to the description of marked topic constructions, since pragmatics, semantics, and syntax are equally considered; d) we acknowledge that BP at its current stage codifies both subject-predicate and topic-comment structures; and e) we can attest to the high frequency of marked topic constructions in BP, and these constructions in some way fit the properties of topic-prominent languages proposed by Li and Thompson (1976), which allows us to assert that Tp structures are just as basic in BP as Sp structures. Marked topic Linguistic category Topicalization in BP
540	Indexação da pesquisa científica: uma proposta para o uso adequado dos termos finalizadores dos resumos / Indexing of scientific research: a proposal for the proper use of finalizers from summaries Rocha, Lidianne Mércia Barbosa Malta 09 February 2017 This Academic Course Conclusion Work (TACC), consisting of a scientific article and an intervention product, discusses the words that represent content (keywords and descriptors) in the abstracts of the academic works defended in 2013 and 2014 in the Professional Master's in Health Education (Mestrado Profissional em Ensino na Saúde, MPES), identifying them as the closing and indexing terms of the research. The method adopted was documentary, exploratory, and descriptive, with a quantitative perspective: 37 studies were investigated through a semi-structured electronic questionnaire with a total of 17 questions, of which the first five draw the general profile of each work and the following 12 identify each of the terms listed in the respective abstracts. The questionnaire was developed by the researcher within the master's program, on the Google Drive platform, to support the documentary analysis, and was validated through an Electronic Validation Panel during the course Technology Applied to Teaching and Research in Health (TAEPS) at the same institution. The variables analyzed were: (a) number of keywords, (b) the label that closes the abstract: keywords or descriptors, (c) characterization of the keywords: free and structured terms, (d) frequency of the keywords, (e) entry terminologies of the keywords, (f) occurrence of keywords in the titles, and (g) punctuation used between the keywords. The number of keywords used showed that the TACC abstracts did not follow an internal MPES standard; instead, they had to conform to the journals chosen for submission after the defense, following the requirements for possible publication. 
Several terms used as keywords in the abstracts were not found in the main terminology banks (MeSH, DeCS, and Thesaurus), yet they represented the academic work with as much indexing strength as the terms present in those databases, which makes it possible to suggest that new terms be included in the information retrieval portals. Finally, the authors of the TACCs used coded descriptors. However, they did not cite in their methodologies the record number or the portal from which the descriptors were obtained, which suggests a lack of practice in accessing the available databases, where the distinction between free and structured terms could better support the correct choice of the closing words of their abstracts. It is also noted that several of the terms they used were highly relevant to the cohesion and coherence of the abstracts in which they appeared, showing strong potential for indexing in the DeCS portal. This highlights the need for an even more dynamic and constant supply of new terms, which would give more support to research by enriching the available databases while taking the existing records into account. To guide students, teachers, and researchers toward a more careful handling of the terms that close abstracts and represent the content of academic works from undergraduate programs and from academic and professional graduate programs, an educational blog was created as the intervention product, titled ‘Descritores na Produção Científica do MPES’ (Descriptors in the Scientific Production of MPES), which guides the proper use of the words that represent abstracts (keywords and descriptors). Indexing as topic Subject headings Nomenclature Documentary language Information retrieval Cataloging CNPQ::CIENCIAS DA SAUDE
