Global ETD Search

11	Enhancing Recommendations for Conference Participants with Community and Topic Modeling Pasham, Bharath January 2013 (has links) For a researcher, it is always important to increase his/her social capital and excel in their research area. For this, conferences act as a perfect medium where researchers meet and present their work. However, due to the structure of conferences, finding similar authors or interesting talks is not obvious for researchers. One of the most important observations made from conferences is that researchers tend to form communities around certain research topics as a series of conferences progresses. These communities and their research topics could be used to help researchers find potential collaborators and attend interesting talks. In this research we present the design and implementation of a recommender system which is built to provide recommendations of authors and talks at conferences. Various concepts like Social Network Analysis (SNA), context awareness, community analysis, and topic modeling are used to build the system. This system can be considered an extension of the previous system, CAMRS (Context Aware Mobile Recommender System). CAMRS is a mobile application which serves the same purpose as the current system. However, CAMRS uses only SNA and context to provide recommendations. The current system, CAMRS-2, is also an Android application, built using a REST-based architecture. The system has been successfully deployed, and as part of the thesis the system is evaluated. The evaluation results proved that CAMRS-2 provides better recommendations than its predecessor. Recommender System Community Detection Topic Modeling Conference Participants
12	Approaches to Natural Language Processing Smith, Sydney 01 January 2018 (has links) This paper explores topic modeling through the example text of Alice in Wonderland. It explores both singular value decomposition (SVD) and non-negative matrix factorization (NMF) as methods for feature extraction. The paper goes on to explore methods for partially supervised implementation of topic modeling through introducing themes. A large portion of the paper also focuses on implementation of these techniques in Python, as well as visualizations of the results which use a combination of Python, HTML, and JavaScript along with the D3 framework. The paper concludes by presenting a mixture of SVD, NMF, and partially supervised NMF as a possible way to improve topic modeling. Topic Modeling Data Mining Machine Learning NMF Other Applied Mathematics
13	Probabilistic Models of Topics and Social Events Wei, Wei 01 December 2016 (has links) Structured probabilistic inference has been shown to be useful in modeling complex latent structures of data. One successful way in which this technique has been applied is in the discovery of latent topical structures of text data, which is usually referred to as topic modeling. With the recent popularity of mobile devices and social networking, we can now easily acquire text data attached to meta information, such as geo-spatial coordinates and time stamps. This metadata can provide rich and accurate information that is helpful in answering many research questions related to spatial and temporal reasoning. However, such data must be treated differently from text data. For example, spatial data is usually organized in terms of a two-dimensional region, while temporal information can exhibit periodicities. While some existing work in the topic modeling community utilizes some of the meta information, these models largely focus on incorporating metadata into text analysis, rather than providing models that make full use of the joint distribution of meta information and text. In this thesis, I propose the event detection problem, which is a multidimensional latent clustering problem on spatial, temporal and topical data. I start with a simple parametric model to discover independent events using geo-tagged Twitter data. The model is then improved in two directions. First, I augmented the model using the Recurrent Chinese Restaurant Process (RCRP) to discover events that are dynamic in nature. Second, I studied a model that can detect events using data from multiple media sources. I studied the characteristics of different media in terms of reported event times and linguistic patterns. The approaches studied in this thesis are largely based on Bayesian nonparametric methods to deal with streaming data and an unpredictable number of clusters.
The research will not only serve the event detection problem itself but also shed light on a more general structured clustering problem in spatial, temporal and textual data. Machine Learning Topic Modeling Graphical Models Non-parametric Bayesian Text Mining
14	Interpretable and Scalable Bayesian Models for Advertising and Text Bischof, Jonathan Michael 04 June 2016 (has links) In the era of "big data", scalable statistical inference is necessary to learn from new and growing sources of quantitative information. However, many commercial and scientific applications also require models to be interpretable to end users in order to generate actionable insights about quantities of interest. We present three case studies of Bayesian hierarchical models that improve the interpretability of existing models while also maintaining or improving the efficiency of inference. The first paper is an application to online advertising that presents an augmented regression model interpretable in terms of the amount of revenue a customer is expected to generate over his or her entire relationship with the company, even if complete histories are never observed. The resulting Poisson Process Regression employs a marginal inference strategy that avoids specifying the customer-level latent variables used in previous work, which complicate inference and interpretability. The second and third papers are applications to the analysis of text data that propose improved summaries of topic components discovered by these mixture models. While the current practice is to summarize topics in terms of their most frequent words, we show significantly greater interpretability in online experiments with human evaluators by using words that are also relatively exclusive to the topic of interest. In the process we develop a new class of topic models that directly regularize the differential usage of words across topics in order to produce stable estimates of the combined frequency-exclusivity metric, and we propose efficient and parallelizable MCMC inference strategies. / Statistics Statistics Advertising Bayesian statistics Big data Topic modeling
15	Collaborative Communication Interruption Management System (C-CIMS): Modeling Interruption Timings via Prosodic and Topic Modelling for Human-Machine Teams Peters, Nia S. 01 December 2017 (has links) Human-machine teaming aims to meld human cognitive strengths and the unique capabilities of smart machines to create intelligent teams adaptive to rapidly changing circumstances. One major contributor to the problem of human-machine teaming is a lack of communication skills on the part of the machine. This research focuses primarily on a machine’s interruption timings, that is, when a machine should share and communicate information with human teammates within human-machine teaming interactions. Previous work addresses interruption timings from the perspective of single human, multitasking and multiple human, single task interactions. The primary aim of this dissertation is to augment this area by approaching the same problem from the perspective of a multiple human, multitasking interaction. The proposed machine is the Collaborative Communication Interruption Management System (C-CIMS), which is tasked with leveraging speech information from a human-human task and making inferences on when to interrupt with information related to an orthogonal human-machine task. This study and previous literature both suggest monitoring task boundaries and engagement as candidate moments of interruptibility within multiple human, multitasking interactions. The goal then becomes designing an intermediate step between human teammate communication and points of interruptibility within these interactions. The proposed intermediate step is the mapping of low-level speech information such as prosodic and lexical information onto higher constructs indicative of interruptibility. C-CIMS is composed of a Task Boundary Prosody Model, a Task Boundary Topic Model, and finally a Task Engagement Topic Model.
Each of these components is evaluated separately in terms of how it performs within two different simulated human-machine teaming scenarios, its speed vs. accuracy tradeoffs, and its other limitations. Overall, the Task Boundary Prosody Model is tractable within a real-time system because of the low latency in processing prosodic information, but is less accurate at predicting task boundaries even within human-machine interactions with simple dialogue. Conversely, the Task Boundary and Task Engagement Topic Models do well at inferring task boundaries and engagement respectively, but are intractable in a real-time system because of the bottleneck in producing automatic speech recognition transcriptions to make interruption decisions. The overall contribution of this work is a novel approach to predicting interruptibility within human-machine teams by modeling higher constructs indicative of interruptibility using low-level speech information. human machine teaming interruption management system prosody modeling topic modeling
16	Examination of Gender Bias in News Articles Damin Zhang (11814182) 19 December 2021 (has links) Reading news articles from online sources has become a major way of obtaining information for many people. Authors who write news articles can introduce their own biases, either unintentionally or intentionally, by choosing different words to describe otherwise neutral and factual information. Such intentional word choices can create conflicts among different social groups, showing explicit and implicit biases. Any type of bias within the text can affect the reader’s view of the information. One type of bias in natural language is gender bias, which has been discovered in many Natural Language Processing (NLP) models and is largely attributed to implicit biases in the training text corpora. Analyzing gender bias or stereotypes in such large corpora is a hard task. Previous methods of bias detection were applied to short texts like tweets and to manually built datasets, but little work has been done on long texts like news articles in large corpora. Simply detecting bias in annotated text does not help to understand how it was generated and reproduced. Instead, we used structural topic modeling on a large unlabelled corpus of news articles and combined qualitative results with quantitative analysis to examine how gender bias was generated and reproduced. This research extends prior knowledge of bias detection and proposes a method for understanding gender bias in real-world settings. We found that the correlation between author gender and topic-gender prevalence, together with the skewed media-gender distribution, assists in understanding gender bias within news articles. Natural Language Processing Natural Language Processing Topic Modeling Gender Bias
17	Three Essays on Shared Micromobility Rahim-Taleqani, Ali January 2020 (has links) Shared micromobility is defined as the shared use of light and low-speed vehicles, such as bikes and scooters, to which users have short-term access on an as-needed basis. As shared micromobility, one of the most viable and sustainable modes of transportation, has emerged in the U.S. over the last decade, understanding different aspects of these modes of transportation helps decision-makers and stakeholders gain better insights into the problems related to these transportation options. Designing efficient and effective shared micromobility programs improves overall system performance, enhances accessibility, and is essential to increase ridership and benefit commuters. This dissertation aims to address three vital aspects of emerging shared micromobility transportation options with three essays that each contribute to the practice and literature of sustainable transportation. Chapter one of this dissertation investigates public opinion towards dockless bike sharing using a mix of statistical and natural language processing methods. This study finds the underlying topics and the corresponding polarity in public discussion by analyzing tweets to give better insight into the emerging phenomenon across the U.S. Chapter two of this dissertation proposes a new framework for the micromobility network to improve accessibility and reduce operator costs. The framework focuses on highly centralized clubs (known as k-clubs) as virtual docking hubs. The study suggests an integer programming model and a heuristic approach, as well as a cost-benefit analysis of the proposed model. Chapter three of this dissertation addresses the risk perception of bicycle and scooter riders’ risky behaviors. This study investigates twenty dangerous maneuvers and their corresponding frequency and severity from U.S. residents’ perspective.
The resultant risk matrix and regression model provide a clear picture of the public risk perception associated with these two micromobility options. Overall, the research outcomes will provide decision-makers and stakeholders with scientific information, practical implications, and necessary tools that will enable them to offer better and sustainable micromobility services to their residents. graph theory micro mobility sentiment analysis topic modeling
18	Comparing PSO-Based Clustering Over Contextual Vector Embeddings to Modern Topic Modeling Miles, Samuel 05 1900 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Efficient topic modeling is needed to support applications that aim at identifying main themes from a collection of documents. In this thesis, a reduced vector embedding representation and particle swarm optimization (PSO) are combined to develop a topic modeling strategy that is able to identify representative themes from a large collection of documents. Documents are encoded using a reduced, contextual vector embedding from a general-purpose pre-trained language model (sBERT). A modified PSO algorithm (pPSO) that tracks particle fitness on a dimension-by-dimension basis is then applied to these embeddings to create clusters of related documents. The proposed methodology is demonstrated on three datasets across different domains. The first dataset consists of posts from the online health forum r/Cancer. The second dataset is a collection of NY Times abstracts and is used to compare Particle Swarm Optimization Topic Modeling Vector Embedding Natural Language Processing
19	Fine-Grained Topic Models Using Anchor Words Lund, Jeffrey A. 20 December 2018 (has links) Topic modeling is an effective tool for analyzing the thematic content of large collections of text. However, traditional probabilistic topic modeling is limited to a small number of topics (typically no more than hundreds). We introduce fine-grained topic models, which have large numbers of nuanced and specific topics. We demonstrate that fine-grained topic models enable use cases not possible with current topic modeling techniques, including an automatic cross-referencing task in which short passages of text are linked to other topically related passages. We do so by leveraging anchor methods, a recent class of topic model based on non-negative matrix factorization in which each topic is anchored by a single word. We explore extensions of the anchor algorithm, including tandem anchors, which relax the restriction that anchors be formed of single words. By doing so, we are able to produce anchor-based topic models with thousands of fine-grained topics. We also develop metrics for evaluating token-level topic assignments and use those metrics to improve the accuracy of fine-grained topic models. Topic Modeling Anchor Words Cross-reference Generation Computer Sciences
20	Three Essays on Social Cognition in the Field of Jazz Music: Innis, Benjamin D. January 2022 (has links) Thesis advisor: Mary Ann Glynn / Categories are persistent features of cultural fields and markets, used to delineate boundaries between different kinds of cultural products and cultural producers. Categories are dynamic social constructions, evolving over time as their constitutive practices and meanings change, through a variety of processes that scholars are still describing and unpacking. This dissertation explores, in three papers, the processes through which categories change over time in the context of the field of jazz music, describing mechanisms of category change and theorizing processes of category evolution and decline. The first paper (chapter two) examines the emergence of a novel subcategory of jazz, called bebop, in the mid-1940s, and the changes to jazz consumption practices and category meanings that bebop’s emergence wrought. It contributes to the categorization literature by highlighting the role of consumption practices in shaping category meanings. The second paper (chapter three) examines the emergence of another subcategory, called jazz fusion, in the 1960s and 1970s, and unpacks gatekeeper responses to its emergence in the form of critical discourse, revealing how category gatekeepers codify category change by reordering their standards of value, quality, and category membership through their discourse. It contributes to the literature by showing how gatekeepers discursively modify categories as they make sense of new practices. The third paper (chapter four) explores the processes through which subcategories are absorbed into broader umbrella categories, falling out of use even as their constitutive practices and meanings live on. This paper contributes to the literature by expanding our understanding of category decline.
Overall, this dissertation contributes to literature on category dynamics and the practice turn in organization theory. / Thesis (PhD) — Boston College, 2022. / Submitted to: Boston College. Carroll School of Management. / Discipline: Management and Organization. categorization cultural production grounded theory music practice theory topic modeling
