1

Water policy informatics : a topic and time series analysis of the Texas state water plans

Wehner, Jenifer Elizabeth, 15 July 2011
The disciplines of informatics and information visualization have developed in response to societal needs to find new insight in complex datasets and have been enabled by technological advancements. Joint application of these fields can reveal themes and connections that are otherwise not apparent. Methodological approaches, such as directed network analysis, can be applied to policy documents to determine whether action or policy recommendations match the goals or objectives stated within the same documents. Informatics and information visualization can also be used to analyze changes in themes found within the documents over time. This paper seeks to leverage informatics and information visualization methodologies as a novel approach to policy analysis. In particular, directed network and time burst techniques are used to analyze water management policy documents for the State of Texas. The congruency between the stated goals or objectives and the recommendations sections is evaluated at a topical level within each planning document, and possible changes in important water policy concepts over time are highlighted by comparing across multiple planning documents. Although there are limitations to the process at the time of publication due to the newness of the software utilized, this paper demonstrates that the products still lead to unique and insightful conclusions.
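As a toy illustration of the congruency evaluation described above, a goals section and a recommendations section can be reduced to topic-term sets and compared. The term extractor, the sample texts, and the Jaccard measure below are illustrative stand-ins, not the thesis's actual pipeline:

```python
# Hypothetical sketch: compare topic terms from a plan's "goals" section
# against its "recommendations" section. Frequency-based term extraction
# is a crude stand-in for a real topic model.
import re
from collections import Counter

STOPWORDS = {"the", "of", "and", "to", "a", "in", "for", "is", "on"}

def topic_terms(text, k=5):
    """Crude topic proxy: the k most frequent non-stopword tokens."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    return {term for term, _ in Counter(tokens).most_common(k)}

def congruency(goals_text, recs_text, k=5):
    """Jaccard overlap between goal terms and recommendation terms."""
    g, r = topic_terms(goals_text, k), topic_terms(recs_text, k)
    return len(g & r) / len(g | r) if g | r else 0.0

# Invented sample sections, only to exercise the measure.
goals = "Ensure reliable water supply through conservation and reuse."
recs = "Fund conservation programs and expand water reuse infrastructure."
print(round(congruency(goals, recs), 2))
```

A score of 1.0 would mean the two sections surface identical term sets; 0.0 means no overlap at all.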
2

Hur sjutton har vi kommit in påre här? : En studie om samtalsämnen och ämnesbyten i ett samtal mellan personer med demens / How the heck did we get into this? : A study of Topics and Topic Shifts in a Conversation between People with Dementia

Holmén, Clara; Johansson, Veronica, January 2014
In Sweden, approximately 130,000 people live with moderate to severe dementia. Dementia is an umbrella diagnosis for a collection of diseases in which cognitive impairments are distinctive and communicative abilities are affected. Earlier studies have investigated how people with dementia communicate with an interlocutor without dementia, but how people with dementia communicate with each other is still relatively unexplored. The purpose of the present study was to investigate and describe how people with a dementia diagnosis converse with each other and how they handle topics and topic shifts. The study was conducted at a day center for people with dementia diagnoses, where a total of three conversations were recorded. One of the conversations, comprising 40 minutes, was selected for transcription and analyzed according to conversation analytic (CA) principles. From the selected conversation, sequences exemplifying different types of topic shifts were chosen. For an expanded view of the conversation's topical flow, a topical analysis was also made, in which the conversation was divided into a total of 14 episodes. The conversation involved six people with a dementia diagnosis and the two authors of the present study. Examples of topic shifts found in the analyzed conversation were coherent topic shifts in the form of pre-announcements, coherent reintroduction of a topic, and topic shading. Non-coherent topic shifts occurred in the conversation in the form of non-coherent reintroduction of a topic. The conversation also contained examples of digressions and insertions. To an outside observer, it seemed at times as if the participants lacked common ground for the conversation, which was seemingly not noticed by the participants themselves. One conclusion drawn from this is that, for the participants, the conversational activity is more important than the actual content of the conversation. In the current conversation, the participants returned to the same story and the same topic several times. The different topics occurring during the conversation usually had an overall theme concerning what it was like in the old days, when the participants were young or children.
3

Scalable Sparse Bayesian Nonparametric and Matrix Tri-factorization Models for Text Mining Applications

Ranganath, B N, January 2017
Hierarchical Bayesian models and matrix factorization methods provide an unsupervised way to learn latent components of data from grouped or sequence data. For example, in document data, a latent component corresponds to a topic, with each topic a distribution over a vocabulary of words. For many applications, there exist sparse relationships between the domain entities and the latent components of the data. Traditional approaches to topic modelling do not take these sparsity considerations into account. Modelling these sparse relationships helps in extracting relevant information, leading to improvements in topic accuracy and scalable solutions. In our thesis, we explore these sparsity relationships for different applications such as text segmentation, topical analysis and entity resolution in dyadic data through Bayesian and matrix tri-factorization approaches, proposing scalable solutions. In our first work, we address the problem of segmentation of a collection of sequence data such as documents using probabilistic models. Existing state-of-the-art hierarchical Bayesian models are connected to the notion of Complete Exchangeability or Markov Exchangeability. Bayesian nonparametric models based on the notion of Markov Exchangeability, such as HDP-HMM and Sticky HDP-HMM, allow very restricted permutations of latent variables in grouped data (topics in documents), which in turn lead to computational challenges for inference. At the other extreme, models based on Complete Exchangeability such as HDP allow arbitrary permutations within each group or document, and inference is significantly more tractable as a result, but segmentation is not meaningful using such models.
To overcome these problems, we explored a new notion of exchangeability called Block Exchangeability that lies between Markov Exchangeability and Complete Exchangeability, for which segmentation is meaningful but inference is computationally less expensive than under both Markov and Complete Exchangeability. Parametrically, Block Exchangeability requires a sparser set of transition parameters: linear in the number of states, compared to the quadratic order for Markov Exchangeability, and still fewer than for Complete Exchangeability, whose parameters are on the order of the number of documents. For this, we propose a nonparametric Block Exchangeable Model (BEM) based on the new notion of Block Exchangeability, which we have shown to be a superclass of Complete Exchangeability and a subclass of Markov Exchangeability. We propose a scalable inference algorithm for BEM to infer the topics for words and the segment boundaries associated with topics for a document using a collapsed Gibbs sampling procedure. Empirical results show that BEM outperforms state-of-the-art nonparametric models in terms of scalability and generalization ability, and shows nearly the same segmentation quality on a news dataset, a product review dataset and a synthetic dataset. Interestingly, we can tune the scalability by varying the block size through a parameter in our model, for a small trade-off with segmentation quality. In addition to exploring the association between documents and words, we also explore sparse relationships for dyadic data, where associations between one pair of domain entities, such as (documents, words), and associations between another pair, such as (documents, users), are completely observed. We motivate the analysis of such dyadic data by introducing an additional discrete dimension, which we call topics, and explore sparse relationships between the domain entities and the topics, such as user-topic and document-topic associations respectively.
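The parameter-count comparison above can be made concrete with some illustrative arithmetic. The exact count for the block-exchangeable model depends on the block size, so the linear formula below is an assumption for illustration, not the thesis's derivation:

```python
# Illustrative transition-parameter counts for the exchangeability notions
# compared above. `blocks_per_state` is an assumed constant standing in for
# the block-size-dependent count in the actual BEM.
def markov_transition_params(num_states):
    # Markov Exchangeability: a full transition matrix, quadratic in states.
    return num_states ** 2

def block_transition_params(num_states, blocks_per_state=2):
    # Block Exchangeability (assumed form): linear in the number of states.
    return num_states * blocks_per_state

for K in (10, 100, 1000):
    print(K, markov_transition_params(K), block_transition_params(K))
```

Even at a thousand states, the linear count stays three orders of magnitude below the quadratic one, which is the scalability gap the abstract points to.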
In our second work, for this problem of sparse topical analysis of dyadic data, we propose a formulation using sparse matrix tri-factorization. This formulation requires sparsity constraints not only on the individual factor matrices, but also on the product of two of the factors. To the best of our knowledge, this problem of sparse matrix tri-factorization has not been studied before. We propose a solution that introduces a surrogate for the product of factors and enforces sparsity on this surrogate as well as on the individual factors through L1-regularization. The resulting optimization problem is efficiently solvable in an alternating minimization framework over sub-problems involving individual factors using the well-known FISTA algorithm. For the sub-problems that are constrained, we use a projected variant of the FISTA algorithm. We also show that our formulation leads to independent sub-problems for solving a factor matrix, thereby supporting a parallel implementation and hence a scalable solution. We perform experiments over bibliographic and product review data to show that the proposed framework based on the sparse tri-factorization formulation results in better generalization ability and factorization accuracy compared to baselines that use sparse bi-factorization. Even though the second work performs sparse topical analysis for dyadic data, finding sparse topical associations for the users, user references with different names could belong to the same entity and those with the same name could belong to different entities. The problem of entity resolution is widely studied in the research community, where the goal is to identify the real users associated with the user references in the documents.
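A minimal sketch of the sparse tri-factorization idea: factor X into three matrices with L1 shrinkage on each factor. Plain ISTA-style soft-threshold gradient steps stand in for the (projected) FISTA scheme the thesis actually uses, and the surrogate-variable trick is omitted; shapes, ranks, and step size are simplifying assumptions:

```python
# Sketch of sparse tri-factorization X ~ U @ S @ V with elementwise
# soft-thresholding (the proximal operator of the L1 norm) after each
# gradient step on the squared-error loss.
import numpy as np

def soft_threshold(A, lam):
    """Proximal operator of the L1 norm: shrink entries toward zero."""
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

def tri_factorize(X, k1, k2, lam=0.01, step=1e-3, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.standard_normal((m, k1)) * 0.1
    S = rng.standard_normal((k1, k2)) * 0.1
    V = rng.standard_normal((k2, n)) * 0.1
    for _ in range(iters):
        R = U @ S @ V - X  # residual of the current reconstruction
        # Gradient step on each factor, then L1 shrinkage.
        U = soft_threshold(U - step * R @ (S @ V).T, lam * step)
        S = soft_threshold(S - step * U.T @ R @ V.T, lam * step)
        V = soft_threshold(V - step * (U @ S).T @ R, lam * step)
    return U, S, V

X = np.outer(np.arange(1, 5.0), np.arange(1, 4.0))  # small rank-1 test matrix
U, S, V = tri_factorize(X, k1=2, k2=2)
print(np.linalg.norm(U @ S @ V - X))
```

In the real formulation the three sub-problems decouple, which is what makes the parallel, scalable implementation described above possible.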
Finally, we focus on the problem of entity resolution in dyadic data, where associations between one pair of domain entities such as documents-words and associations between another pair such as documents-users are observed, an example of which is bibliographic data. In our final work, for this problem of entity resolution in bibliographic data, we propose a Bayesian nonparametric Sparse Entity Resolution Model (SERM) exploring the sparse relationships between the grouped data, involving grouping of the documents, and the topics/author entities in the groups. Further, we also exploit the sparseness between an author entity and the associated author aliases. Grouping of the documents is achieved with the stick-breaking prior for the Dirichlet process (DP). To achieve sparseness, we propose a solution that introduces separate Indian Buffet Process (IBP) priors over the topics and the author entities for the groups, and a k-NN mechanism for selecting author aliases for the author entities. We propose a scalable inference for SERM by appropriately combining a partially collapsed Gibbs sampling scheme as in the Focused Topic Model (FTM), the inference scheme used for the parametric IBP prior, and the k-NN mechanism. We perform experiments over the bibliographic datasets CiteSeer and Rexa to show that the proposed SERM model improves the accuracy of entity resolution by finding relevant author entities through modelling sparse relationships, and is scalable when compared to the state-of-the-art baseline.
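The k-NN alias-selection idea above can be illustrated with a toy string-similarity matcher. difflib's similarity ratio and the sample names are stand-ins; the distance SERM actually uses is not specified here:

```python
# Toy alias selection: rank candidate author aliases by string similarity
# to an entity name and keep the k nearest.
import difflib

def knn_aliases(entity, aliases, k=2):
    """Return the k aliases most similar to `entity` by difflib ratio."""
    ranked = sorted(
        aliases,
        key=lambda a: difflib.SequenceMatcher(None, entity, a).ratio(),
        reverse=True,
    )
    return ranked[:k]

# Invented aliases, only to exercise the matcher.
aliases = ["J. Smith", "John Smith", "J. A. Smith", "Jane Doe"]
print(knn_aliases("John A. Smith", aliases))
```

The surname variants rank far above the unrelated name, which is the behaviour a k-NN alias filter relies on before the probabilistic model resolves the remaining ambiguity.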
