Global ETD Search

481	Evaluating Hierarchical LDA Topic Models for Article Categorization Lindgren, Jennifer January 2020 (has links) With the vast amount of information available on the Internet today, helping users find relevant content has become a priority in many software products that recommend news articles. One such product is Opera for Android, which has a news feed containing articles the user may be interested in. To make it easier to determine which articles to recommend, the articles can be categorized by the topics they contain. One approach to categorizing articles is to use Machine Learning and Natural Language Processing (NLP). A commonly used model is Latent Dirichlet Allocation (LDA), which finds latent topics within large datasets of, for example, text articles. An extension of LDA is hierarchical Latent Dirichlet Allocation (hLDA), a hierarchical variant of LDA. In hLDA, the latent topics found among a set of articles are structured hierarchically in a tree. Each node represents a topic, and the levels represent different levels of abstraction in the topics. A further extension of hLDA is constrained hLDA, where a set of predefined, constrained topics is added to the tree. The constrained topics are extracted from the dataset by grouping highly correlated words. The idea of constrained hLDA is to improve the topic structure derived by an hLDA model by making the process semi-supervised. The aim of this thesis is to create an hLDA and a constrained hLDA model from a dataset of articles provided by Opera. The models should then be evaluated using the novel metric word frequency similarity, which is a measure of the similarity between the words representing the parent and child topics in a hierarchical topic model. The results show that word frequency similarity can be used to evaluate whether the topics in a parent-child topic pair are too similar, so that the child does not specify a subtopic of the parent. 
It can also be used to evaluate whether the topics are too dissimilar, so that the topics seem unrelated and perhaps should not be connected in the hierarchy. The results also show that the two topic models created had comparable word frequency similarity scores. Neither model seemed to significantly outperform the other with regard to the metric. topic modeling topic models lda latent dirichlet allocation hlda hierarchical latent dirichlet allocation constrained lda constrained latent dirichlet allocation news articles categorization machine learning natural language processing nlp news recommendations
482	基於意見探勘與主題模型之部落格食記剖析研究 / A Study of Opinion Mining and Topic Model Analysis on Food Diaries 賴柏帆, Lai, Po Fan Unknown Date (has links) With the rise of Web 2.0, social media platforms play a crucial role in transmitting and receiving information. In the food domain, more and more people read reviews before visiting a restaurant, and blog posts, with their rich text and photographs, are a frequent point of reference. Although such food diaries are more complete than short reviews, the actual evaluations are scattered among purely descriptive passages and usually come with no rating, so readers must spend considerable effort to form an overall impression of a restaurant. This study proposes a method for analyzing food diaries based on opinion mining and topic models. The amount of sentiment in blog posts about a restaurant is used to reflect positive and negative evaluations, the comments are grouped into three rating aspects (food, service, and environment), and an overall recommendation score is derived for the restaurant so that readers can consult it quickly. The experimental corpus consists of 200 posts from the food category of PIXNET covering four restaurants: 添好運台灣－台北站前店, 京星港式飲茶PART2, 金泰日式料理－內湖店, and 喀佈貍（一店）大眾和風串燒居酒洋食堂. 
An LDA (Latent Dirichlet Allocation) topic model clusters the sentences of the food diaries by topic, so that sentences with similar topics fall into the same group, and each group is then assigned to one of the three aspects. For example, the corpus for 喀佈貍（一店） splits into 10 topic groups: 6 on food, 2 on service, and 2 on environment. To better distinguish positive from negative sentiment, the study uses the semantic orientation method (SO-PMI) to compute the polarity of opinion words that commonly appear in food diaries, and builds an opinion lexicon for this domain. For validation, the results are compared against iPeen, an online restaurant review site: the average sentiment levels are close, which suggests that the method tends to agree with public perception and evaluation. Compared with general review sites, the method works at a finer-grained aspect level and uses the sentiment level to reflect the blog authors' actual evaluation of a restaurant. Finally, directions for future investigation and improvement are proposed for subsequent research. Opinion Mining LDA Topic Model Restaurant Rating
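As an illustration of the SO-PMI step mentioned in entry 482, the following is a minimal sketch rather than the thesis's implementation. It assumes the common formulation in which a word's polarity is its summed pointwise mutual information with positive seed words minus its summed PMI with negative seed words, counted over sentence-level co-occurrence. The function name `so_pmi`, the seed words, the toy sentences, and the additive `smoothing` constant are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(sentences, positive_seeds, negative_seeds, smoothing=0.01):
    """Score each non-seed word as PMI with positive seeds minus PMI with negative seeds.

    sentences: list of token lists. Co-occurrence means "appears in the same sentence".
    smoothing: assumed additive constant that keeps log2 defined for unseen pairs.
    """
    n = len(sentences)
    word_count = Counter()  # sentences containing each word
    pair_count = Counter()  # sentences containing both words of a pair
    for tokens in sentences:
        unique = set(tokens)
        word_count.update(unique)
        pair_count.update(frozenset(p) for p in combinations(sorted(unique), 2))

    def pmi(a, b):
        joint = pair_count[frozenset((a, b))] + smoothing
        return math.log2(joint * n / (word_count[a] * word_count[b]))

    seeds = set(positive_seeds) | set(negative_seeds)
    scores = {}
    for word in word_count:
        if word in seeds:
            continue
        # Seeds that never occur in the corpus are skipped to avoid dividing by zero.
        pos = sum(pmi(word, s) for s in positive_seeds if s in word_count)
        neg = sum(pmi(word, s) for s in negative_seeds if s in word_count)
        scores[word] = pos - neg
    return scores


# Hypothetical toy corpus of tokenized review sentences.
toy_sentences = [
    ["noodles", "tasty", "good"],
    ["soup", "tasty", "good"],
    ["service", "slow", "bad"],
    ["waiter", "rude", "bad"],
    ["dessert", "good"],
]
toy_scores = so_pmi(toy_sentences, {"good"}, {"bad"})
```

In this toy corpus, words that co-occur with the positive seed receive a positive score and words that co-occur with the negative seed receive a negative one; a domain lexicon would then keep the words whose scores are far enough from zero.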
483	AppReco: 基於行為識別的行動應用服務推薦系統 / AppReco: Behavior-aware Recommendation for iOS Mobile Applications 方子睿, Fang, Zih Ruei Unknown Date (has links) Mobile applications are now widely accepted and used and have become a dominant form of software. However, existing app recommendation systems rely mostly on users' actual usage and feedback; when malicious software steals user data behind the user interface, such systems can hardly detect the behavior. We present AppReco, a systematic recommendation system for iOS mobile applications that evaluates apps without requiring users to run them. AppReco evaluates apps that have similar interests with static binary analysis, revealing their behaviors according to the functions embedded in the executable. The analysis consists of three stages: (1) unsupervised learning on app descriptions, with Latent Dirichlet Allocation (LDA) for topic discovery and Growing Hierarchical Self-Organizing Maps (GHSOM) for hierarchical clustering; (2) static binary analysis of executables to discover embedded system calls; and (3) ranking of common-topic applications by their matched behavior patterns. To find apps that have similar interests, AppReco discovers topics in the official descriptions without supervision and clusters apps that share topics as similar-interest apps. To evaluate apps, AppReco applies static binary analysis to their executables to count invoked system calls and reveal embedded functions. 
To recommend apps, AppReco analyzes similar-interest apps together with the behaviors of their executables and recommends to users the apps that exhibit fewer sensitive behaviors, such as commercial advertisements, access to private information, and internet connections. We report our analysis of thousands of iOS apps from the Apple App Store, including most of the listed top 200 applications in each category. Recommender System Mobile Application Topic Model
484	Price, Perceived Value and Customer Satisfaction: A Text-Based Econometric Analysis of Yelp! Reviews Dwyer, Eleanor A 01 January 2015 (has links) We examine the antecedents of customer satisfaction in the restaurant sector, paying particular attention to perceived value and price level. Using Latent Dirichlet Allocation, we extract latent topics from the text of Yelp! reviews, then analyze the relationship between these topics and satisfaction, measured as the difference between review rating and user average review rating. LDA price perception perceived value latent topic yelp! text analysis review customer satisfaction Behavioral Economics Econometrics Mathematics
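The satisfaction measure described in entry 484, a review's rating minus the reviewing user's average rating, can be sketched as follows. This is an illustrative sketch only: the function name `relative_satisfaction` and the `(user_id, rating)` input format are hypothetical, and it assumes each user's average is computed from the same set of reviews, which the abstract does not specify.

```python
from collections import defaultdict


def relative_satisfaction(reviews):
    """Return, for each (user_id, rating) pair, the rating minus that user's mean rating."""
    totals = defaultdict(lambda: [0.0, 0])  # user_id -> [sum of ratings, number of ratings]
    for user, rating in reviews:
        totals[user][0] += rating
        totals[user][1] += 1
    means = {user: total / count for user, (total, count) in totals.items()}
    return [rating - means[user] for user, rating in reviews]


# Hypothetical example: user "a" averages 4 stars, user "b" averages 2 stars.
example = relative_satisfaction([("a", 5), ("a", 3), ("b", 2)])
```

Centering on the user's own average separates how much a reviewer liked a particular restaurant from how generous that reviewer is in general.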
485	Empirical testing of a conceptual model to evaluate psychoeducational interventions. Sidani, Souraya. January 1994 (has links) Psychoeducational interventions are designed to assist clients to learn about their condition, to enhance their self-care practices, to promote well-being and prevent complications and to ultimately maintain or improve their life quality. Although results of individual and of meta-analytic studies supported the beneficial effects of psychoeducational interventions on multiple health-related outcomes for various client populations, investigators expressed concerns regarding the quality of single-study reports. The most important criticism is the lack of explicit reference to a theoretical model guiding the design of the study, the selection of expected outcomes of the interventions, and lack of explicitly stated causal linkages between interventions and outcomes. In this research project, a comprehensive framework was developed and empirically tested as a model for evaluating the effectiveness of psychoeducational interventions, namely self-help classes, uncertainty management, and a combined intervention. Direct and moderating effects of extraneous variables (personal characteristics, severity of illness and resources), intervening variable (state anxiety) and intervention variables (components of psychoeducation and strength of intervention) on outcome variables (cognitive, behavioral, psychological and quality of life) were hypothesized. An experimental repeated measures design was used to test the hypothesized effects. Fifty-six women with breast cancer receiving adjuvant therapy were randomly assigned to one of the experimental groups. Data were collected at six points in time. A hierarchical linear modeling approach was used to analyze the data. 
Results indicated that although the interventions were effective in producing desired changes in selected outcomes, their effects were moderated by various extraneous and intervening variables. Education, sense of mastery, symptom extension, work status, size and use of social support strengthened the effects of the interventions, while trait anxiety, marital status, and number of symptoms experienced weakened the effects of the interventions on cognitive, behavioral, and psychological outcomes. Based on these findings, clinicians are encouraged to attend to the mode of delivery, intensity, and timing for implementation of the intervention, and to the characteristics of the intervener and clients, when planning, implementing, and evaluating psychoeducational interventions. Psychology. Dissertations, Academic. Patient Education as Topic. Attitude to Health. Breast Neoplasms -- psychology. Treatment Outcome. Quality of Life.
486	Discovering Hidden Networks Using Topic Modeling Cooper, Wyatt 01 January 2017 (has links) This paper explores topic modeling via unsupervised non-negative matrix factorization. This technique is used on a variety of sources in order to extract salient topics. From these topics, hidden entity networks are discovered and visualized in a graph representation. In addition, other visualization techniques such as examining the time series of a topic and examining the top words of a topic are used for evaluation and analysis. There is a large software component to this project, and so this paper will also focus on the design decisions that were made in order to make the program developed as versatile and extensible as possible. Topic Modeling Computer Science Natural Language Processing Non-negative Matrix Factorization Artificial Intelligence and Robotics Other Computer Sciences Software Engineering
487	Violence gratuite et adolescents-bourreaux : Réception, traduction et enjeux de deux romans suédois pour adolescents, en France, au début des années 2000 / "Unprovoked violence" and "nasty adolescents" : Reception, translation and challenges of two Swedish novels for adolescents in France in the early 2000s Alfvén, Valérie January 2016 (has links) The purpose of this thesis is to contribute to a better understanding of the role of Swedish literature for adolescents in the French literary scene in the early 2000s. The sociology of literature constitutes the main theoretical framework of this thesis. Drawing from examples that broach the sensitive topic of "unprovoked violence" as it is treated in two Swedish novels for teenagers, Spelar död [Play Death] by Stefan Casta and När tågen går förbi (Train Wreck) by Malin Lindroth, this thesis shows how these novels are innovative in Even-Zohar’s sense of the term, as addressed in his Polysystem Theory (1990). By introducing "unprovoked violence" and violent teenagers via a realistic genre, such works filled a vacuum in the French system and injected a new dynamic into it. This dynamic makes it possible for new literary models to be introduced in the system and to change the standards of that system. The analyses of the French and Swedish receptions of the two novels mentioned above show that they gave rise to a moral panic in France, which is not an unusual thing to happen in periods of ongoing change. This also clarifies the differences in norms between the two systems. The French system tends to reject dark topics, while the Swedish wishes to discuss them. The investigations of the translations of unprovoked violence show that adherence to Swedish norms determines the translation’s adequacy (Toury), which may be part of the reason for the stormy reception the two works received in France, and their undergoing censure. 
The position of translators and publishers in the literary system also plays a major role for a translated text not being censured during the transfer from one system to another. Even if the Swedish titles translated into French are few, this thesis shows that the impact of Swedish literature on adolescents in France is certain. By introducing new and sensitive topics, such novels could be early markers of an evolution of the French field of literature for adolescents. Adolescent literature censorship France and Sweden norms Polysystem Theory reception role of the translator sensitive topic taboo translation unprovoked violence Young adult literature
488	Mathematical Modeling of Public Opinion using Traditional and Social Media Cody, Emily 01 January 2016 (has links) With the growth of the internet, data from text sources has become increasingly available to researchers in the form of online newspapers, journals, and blogs. This data presents a unique opportunity to analyze human opinions and behaviors without soliciting the public explicitly. In this research, I utilize newspaper articles and the social media service Twitter to infer self-reported public opinions and awareness of climate change. Climate change is one of the most important and heavily debated issues of our time, and analyzing large-scale text surrounding this issue reveals insights surrounding self-reported public opinion. First, I inquire about public discourse on both climate change and energy system vulnerability following two large hurricanes. I apply topic modeling techniques to a corpus of articles about each hurricane in order to determine how these topics were reported on in the post event news media. Next, I perform sentiment analysis on a large collection of data from Twitter using a previously developed tool called the "hedonometer". I use this sentiment scoring technique to investigate how the Twitter community reports feeling about climate change. Finally, I generalize the sentiment analysis technique to many other topics of global importance, and compare to more traditional public opinion polling methods. I determine that since traditional public opinion polls have limited reach and high associated costs, text data from Twitter may be the future of public opinion polling. environmental communications human behavior opinion polling sentiment analysis social media topic modeling Applied Mathematics Climate Social and Behavioral Sciences
489	Modeling Mortality Rates In The WikiLeaks Afghanistan War Logs Rusch, Thomas, Hofmarcher, Paul, Hatzinger, Reinhold, Hornik, Kurt 09 1900 (has links) (PDF) The WikiLeaks Afghanistan war logs contain more than 76 000 reports about fatalities and their circumstances in the US led Afghanistan war, covering the period from January 2004 to December 2009. In this paper we use those reports to build statistical models to help us understand the mortality rates associated with specific circumstances. We choose an approach that combines Latent Dirichlet Allocation (LDA) with negative binomial based recursive partitioning. LDA is used to process the natural language information contained in each report summary. We estimate latent topics and assign each report to one of them. These topics - in addition to other variables in the data set - subsequently serve as explanatory variables for modeling the number of fatalities of the civilian population, ISAF Forces, Anti-Coalition Forces and the Afghan National Police or military as well as the combined number of fatalities. Modeling is carried out with manifest mixtures of negative binomial distributions estimated with model-based recursive partitioning. For each group of fatalities, we identify segments with different mortality rates that correspond to a small number of topics and other explanatory variables as well as their interactions. Furthermore, we carve out the similarities between segments and connect them to stories that have been covered in the media. This provides an unprecedented description of the war in Afghanistan covered by the war logs. Additionally, our approach can serve as an example as to how modern statistical methods may lead to extra insight if applied to problems of data journalism. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
490	Genre, gender and nation : ideological and intertextual representation in contemporary Arthurian fiction for children Cook, Adele M. January 2014 (has links) Within late twentieth and early twenty-first century children’s literature there is a significant interest amongst authors and readers for material which recreates the Arthurian myth. Many of these draw on medieval texts, and the canonical texts of the English tradition have been particularly influential. Yet within this intertextual discourse the influence of the Victorian works is noticeable. This thesis explores the relationship between contemporary children’s Arthuriana and the gendered and national ideologies of these earlier works. Using feminist critical discourse analysis, it discusses the evolution of Arthuriana for the child reader, with a particular focus on four contemporary texts: Michael Morpurgo’s (1994) Arthur, High King of Britain, Mary Hoffman’s (2000) Women of Camelot: Queens and Enchantresses at the Court of King Arthur, Diana Wynne Jones’ (1993) Hexwood and the BBC series Merlin (2008-2012). Exploring the historicist and fantasy genres opens up a discourse surrounding the psychology of myth which within the context of Arthurian literature creates a sense of a universal ‘truth’. This work reveals that authorial intent, in both historicist and fantasy narratives, is often undercut by implicit ideologies which reveal unconscious cultural assumptions. The cultural context at the time of textual production and consumption affects the representations of both the ideologies of gender and nation and yet the authority of myth and history combine to create a regressive depiction more in keeping with literature from the Victorian and post-World War II eras. This is explored through a review of the literature for children available since the Age of Reason, and the didactic model which has been prevalent throughout the Arthurian genre. 
This thesis explores why a regressive representation is appealing within a twenty-first century discourse through an engagement with theories of feminism(s) and postfeminism. This thesis ascertains why the psychology of myth affects the reimagining of Arthuriana, and explores the retrospective nature of intertextuality in order to reflect on the trend for regressive representations in children’s Arthurian literature. 823
