481 |
A Confirmatory Analysis for Automating the Evaluation of Motivation Letters to Emulate Human Judgment
Mercado Salazar, Jorge Anibal; Rana, S M Masud. January 2021
Manually reading, evaluating, and scoring motivation letters as part of the admissions process is a time-consuming and tedious task for Dalarna University's program managers. An automated scoring system would relieve them of this work and let them make much faster decisions when selecting applicants for admission. The aim of this thesis was to analyse current human judgment and attempt to emulate it using machine learning techniques. We used topic modelling methods, such as Latent Dirichlet Allocation and Non-Negative Matrix Factorization, to find the most interpretable topics, build a bridge between topics and human-defined factors, and finally evaluate model performance by predicting scores and measuring accuracy with logistic regression, discriminant analysis, and other classification algorithms. Although we were able to recover the meaning of almost all human factors on our own, the topic models' accuracy in predicting the overall score was unexpectedly low. Setting a threshold on the overall score to select applicants for admission yielded good overall accuracy, but did not yield consistently good precision or recall. In investigating the possible causes of these unexpected results, we found that not only are the limitations of topic modelling to blame, but human bias also plays a role.
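The gap between good accuracy and weak precision or recall after thresholding can be illustrated with a small sketch. The scores, the threshold of 7, and the function name `threshold_metrics` are all made up for illustration; they are assumptions and are not taken from the thesis data.

```python
def threshold_metrics(true_scores, predicted_scores, threshold):
    """Binarise both score lists at `threshold` and compare the admit decisions."""
    actual = [s >= threshold for s in true_scores]
    predicted = [s >= threshold for s in predicted_scores]
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)          # admitted by both
    fp = sum((not a) and p for a, p in pairs)    # admitted only by the model
    fn = sum(a and (not p) for a, p in pairs)    # admitted only by the human
    tn = sum((not a) and (not p) for a, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical scores for ten applicants; only two are truly above the threshold.
human = [9, 8, 3, 4, 5, 2, 6, 4, 3, 5]
model = [6, 8, 4, 5, 5, 3, 7, 4, 2, 5]
print(threshold_metrics(human, model, threshold=7))  # (0.8, 0.5, 0.5)
```

Because most applicants fall below the threshold, the many correct rejections dominate accuracy, while a single missed or spurious admission halves precision and recall.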
|
482 |
Modélisation des stratégies verbales d'engagement dans les interactions humain-agent / Modelling verbal engagement strategies in human-agent interaction
Glas, Nadine. 13 September 2016
In human-agent interaction, the user's engagement is essential to achieving the goal of the interaction. In this thesis we study how the agent's behaviour can foster the user's engagement. We focus on the agent's verbal behaviour, considering strategies that concern, respectively, the form, timing, and content of its utterances. We present empirical studies of aspects of the agent's politeness behaviour, its interruption behaviour, and the topics that the agent addresses in the interaction. Based on the outcomes of the last of these studies, we propose an engagement-driven Topic Manager (a computational model) that personalises the topics of an interaction in conversations where the agent gives information to a human user. The Topic Selection component of the Topic Manager decides what the agent should talk about and when. To do so, it takes into account the agent's dynamically updated perception of the user as well as the agent's own mental state and preferences. The Topic Transition component of the Topic Manager, based upon an empirical study, computes how the agent should introduce the topics in the ongoing interaction without losing the coherence of the interaction. We implemented and evaluated the Topic Manager in a conversational virtual agent that plays the role of a visitor in a museum.
|
483 |
Evaluating Hierarchical LDA Topic Models for Article Categorization
Lindgren, Jennifer. January 2020
With the vast amount of information available on the Internet today, helping users find relevant content has become a priority in many software products that recommend news articles. One such product is Opera for Android, which has a news feed containing articles the user may be interested in. To determine easily which articles to recommend, they can be categorized by the topics they contain. One approach to categorizing articles uses Machine Learning and Natural Language Processing (NLP). A commonly used model is Latent Dirichlet Allocation (LDA), which finds latent topics within large datasets of, for example, text articles. Hierarchical Latent Dirichlet Allocation (hLDA) is an extension of LDA in which the latent topics found among a set of articles are structured hierarchically in a tree. Each node represents a topic, and the levels represent different levels of abstraction in the topics. A further extension of hLDA is constrained hLDA, where a set of predefined, constrained topics is added to the tree. The constrained topics are extracted from the dataset by grouping highly correlated words. The idea of constrained hLDA is to improve the topic structure derived by an hLDA model by making the process semi-supervised. The aim of this thesis is to create an hLDA and a constrained hLDA model from a dataset of articles provided by Opera. The models are then evaluated using the novel metric word frequency similarity, which measures the similarity between the words representing the parent and child topics in a hierarchical topic model. The results show that word frequency similarity can be used to evaluate whether the topics in a parent-child topic pair are too similar, so that the child does not specify a subtopic of the parent. It can also be used to evaluate whether the topics are too dissimilar, so that they seem unrelated and perhaps should not be connected in the hierarchy. The results also show that the two topic models had comparable word frequency similarity scores. Neither model seemed to significantly outperform the other with regard to the metric.
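The abstract does not define word frequency similarity, so the following is only one plausible reading, stated as an assumption: each topic is represented by a word-to-frequency mapping, and a parent-child pair is compared with cosine similarity. The thresholds `low` and `high` and the function names are hypothetical and do not come from the thesis.

```python
import math


def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two word -> frequency dictionaries."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def judge_pair(parent, child, low=0.1, high=0.9):
    """Classify a parent-child topic pair using assumed cut-offs."""
    similarity = cosine_similarity(parent, child)
    if similarity > high:
        return "too similar"      # child adds little beyond the parent
    if similarity < low:
        return "too dissimilar"   # child looks unrelated to the parent
    return "plausible subtopic"


# Hypothetical topics, represented by the frequencies of their top words.
parent = {"sport": 5, "team": 3, "game": 2}
child = {"football": 4, "team": 3, "game": 1}
print(judge_pair(parent, child))
```

A pair scoring near 1 would suggest the child merely repeats its parent, while a pair scoring near 0 would suggest the two should not be linked in the tree.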
|
484 |
基於意見探勘與主題模型之部落格食記剖析研究 / A Study of Opinion Mining and Topic Model Analysis on Food Diaries
賴柏帆 (Lai, Po Fan). Unknown Date
With the rise of Web 2.0, social media platforms have come to play a major role in how information is shared and obtained. In the food domain, more and more people read food diaries and reviews before visiting a restaurant, and blog posts, with their rich text and photographs, are among the sources consumers consult most often. Although such posts are more complete than short reviews, the actual opinions are scattered through descriptive text and most posts carry no rating, so readers must spend considerable effort before they can form an overall impression of a restaurant.
This study proposes a method for analysing food diaries based on opinion mining and topic models. The amount of sentiment expressed in blog posts about a restaurant is used to reflect positive and negative evaluations; the comments are grouped into three rating aspects (food, service, and environment), and the restaurant is then given an overall recommendation score that readers can consult quickly. The experimental corpus consists of 200 food posts from PIXNET covering four restaurants: 添好運台灣-台北站前店, 京星港式飲茶PART2, 金泰日式料理-內湖店, and 喀佈貍(一店)大眾和風串燒居酒洋食堂.
An LDA (Latent Dirichlet Allocation) topic model clusters the sentences of the food diaries by topic, so that sentences with similar themes fall into the same group, and each group is then assigned to one of the three aspects. For example, the corpus for 喀佈貍(一店) divides into 10 topic groups: 6 on food, 2 on service, and 2 on environment. To better distinguish positive from negative sentiment, the study uses the semantic orientation method SO-PMI to compute the polarity of opinion words that commonly appear in food diaries, and builds an opinion lexicon specific to this domain.
For validation, the results were compared against iPeen (愛評網), an online restaurant review site. The average sentiment obtained for each restaurant is close to that of iPeen, which suggests agreement with public perception and ratings. Compared with general review sites, the proposed method also examines finer-grained aspects and uses the amount of sentiment to reflect the bloggers' actual evaluation of a restaurant. The thesis closes with issues to explore and improve in future research.
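The SO-PMI polarity step mentioned in this abstract can be sketched as follows. The sketch uses the usual definition, where a word's semantic orientation is the sum of its pointwise mutual information with positive seed words minus the sum with negative seed words. The sentence-level co-occurrence window, the smoothing constant, the seed words, and the English toy corpus are assumptions made for illustration and are not taken from the thesis.

```python
import math


def so_pmi(word, sentences, positive_seeds, negative_seeds, smoothing=0.01):
    """Semantic orientation of `word` from sentence-level co-occurrence counts."""
    n = len(sentences)
    sentence_sets = [set(tokens) for tokens in sentences]

    def probability(*words):
        # Smoothed share of sentences containing all the given words.
        hits = sum(all(w in s for w in words) for s in sentence_sets)
        return (hits + smoothing) / n

    def pmi(a, b):
        return math.log2(probability(a, b) / (probability(a) * probability(b)))

    positive = sum(pmi(word, seed) for seed in positive_seeds)
    negative = sum(pmi(word, seed) for seed in negative_seeds)
    return positive - negative


# Hypothetical, already tokenised review sentences.
corpus = [
    ["noodles", "tasty", "fresh"],
    ["service", "slow", "rude"],
    ["noodles", "tasty"],
    ["soup", "bland", "slow"],
]
print(so_pmi("noodles", corpus, ["tasty"], ["slow"]) > 0)  # True
print(so_pmi("service", corpus, ["tasty"], ["slow"]) < 0)  # True
```

Words with a clearly positive or negative score would be candidates for the domain opinion lexicon; scores near zero indicate no clear polarity.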
|
485 |
AppReco: 基於行為識別的行動應用服務推薦系統 / AppReco: Behavior-aware Recommendation for iOS Mobile Applications
方子睿 (Fang, Zih Ruei). Unknown Date
Mobile applications are now widely accepted and used, and have become a dominant class of software. However, the app recommendation systems currently available rely mostly on users' actual usage and feedback; if a malicious app steals user data behind its user interface, such systems can hardly detect it. We present AppReco, a systematic recommendation system for iOS mobile applications that evaluates apps without requiring users to run them.
The analysis consists of three stages: (1) unsupervised learning on app descriptions, using Latent Dirichlet Allocation (LDA) for topic discovery and Growing Hierarchical Self-Organizing Maps (GHSOM) for hierarchical clustering; (2) static binary analysis of the executables to discover embedded system calls and the behaviors they reveal; and (3) ranking apps that share topics with a scoring formula based on their matched behavior patterns.
To find apps with similar interests, AppReco discovers topics in the apps' official descriptions and clusters apps with common topics together. To evaluate apps, AppReco applies static binary analysis to their executables, counting invoked system calls and revealing embedded functions. To recommend apps, AppReco analyzes similar-interest apps together with the behaviors of their executables and recommends to users the apps with fewer sensitive behaviors, such as commercial advertising, access to private information, internet connections, and use of social SDKs.
We report our analysis of thousands of iOS apps from the Apple App Store, including most of the top 200 apps listed in each category.
|
486 |
Price, Perceived Value and Customer Satisfaction: A Text-Based Econometric Analysis of Yelp! Reviews
Dwyer, Eleanor A. 01 January 2015
We examine the antecedents of customer satisfaction in the restaurant sector, paying particular attention to perceived value and price level. Using Latent Dirichlet Allocation, we extract latent topics from the text of Yelp! reviews, then analyze the relationship between these topics and satisfaction, measured as the difference between review rating and user average review rating.
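The satisfaction measure described here, a review's rating minus the reviewing user's average rating, can be sketched in a few lines. The sketch assumes the user average is computed from the reviews in the sample; the abstract does not say whether the thesis does this or uses an average reported by Yelp. The function name and the data are hypothetical.

```python
from collections import defaultdict


def satisfaction_scores(reviews):
    """For (user_id, rating) pairs, return each rating minus that user's mean rating."""
    totals = defaultdict(lambda: [0.0, 0])  # user -> [sum of ratings, review count]
    for user, rating in reviews:
        totals[user][0] += rating
        totals[user][1] += 1
    return [rating - totals[user][0] / totals[user][1] for user, rating in reviews]


# Hypothetical reviews: user u1 averages 4 stars, user u2 averages 3 stars.
reviews = [("u1", 5), ("u1", 3), ("u2", 2), ("u2", 2), ("u2", 5)]
print(satisfaction_scores(reviews))  # [1.0, -1.0, -1.0, -1.0, 2.0]
```

Centering on the user's own average removes differences in how generously individual reviewers rate, so a 5-star review from a habitually harsh reviewer counts for more than one from a reviewer who always gives 5 stars.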
|
487 |
Empirical testing of a conceptual model to evaluate psychoeducational interventions
Sidani, Souraya. January 1994
Psychoeducational interventions are designed to assist clients to learn about their condition, to enhance their self-care practices, to promote well-being and prevent complications, and ultimately to maintain or improve their quality of life. Although results of individual and meta-analytic studies supported the beneficial effects of psychoeducational interventions on multiple health-related outcomes for various client populations, investigators expressed concerns regarding the quality of single-study reports. The most important criticisms are the lack of explicit reference to a theoretical model guiding the design of the study and the selection of expected outcomes of the interventions, and the lack of explicitly stated causal linkages between interventions and outcomes. In this research project, a comprehensive framework was developed and empirically tested as a model for evaluating the effectiveness of psychoeducational interventions, namely self-help classes, uncertainty management, and a combined intervention. Direct and moderating effects of extraneous variables (personal characteristics, severity of illness and resources), an intervening variable (state anxiety) and intervention variables (components of psychoeducation and strength of intervention) on outcome variables (cognitive, behavioral, psychological and quality of life) were hypothesized. An experimental repeated measures design was used to test the hypothesized effects. Fifty-six women with breast cancer receiving adjuvant therapy were randomly assigned to one of the experimental groups. Data were collected at six points in time. A hierarchical linear modeling approach was used to analyze the data. Results indicated that although the interventions were effective in producing desired changes in selected outcomes, their effects were moderated by various extraneous and intervening variables. 
Education, sense of mastery, symptom extension, work status, size and use of social support strengthened the effects of the interventions, while trait anxiety, marital status, and number of symptoms experienced weakened the effects of the interventions on cognitive, behavioral, and psychological outcomes. Based on these findings, clinicians are encouraged to attend to the mode of delivery, intensity, and timing for implementation of the intervention, and to the characteristics of the intervener and clients, when planning, implementing, and evaluating psychoeducational interventions.
|
488 |
Discovering Hidden Networks Using Topic Modeling
Cooper, Wyatt. 01 January 2017
This paper explores topic modeling via unsupervised non-negative matrix factorization. This technique is used on a variety of sources in order to extract salient topics. From these topics, hidden entity networks are discovered and visualized in a graph representation. In addition, other visualization techniques such as examining the time series of a topic and examining the top words of a topic are used for evaluation and analysis. There is a large software component to this project, and so this paper will also focus on the design decisions that were made in order to make the program developed as versatile and extensible as possible.
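As a rough illustration of topic modeling by non-negative matrix factorization, the sketch below factors a small document-term matrix V into W (documents by topics) and H (topics by terms) using the standard multiplicative update rules for the squared-error objective. It is a minimal pure-Python sketch and not the software described in the paper; the vocabulary, the matrix, and all function names are made up for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative v (docs x terms) into w (docs x rank) and h (rank x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_docs)]
    return w, h


def top_words(h, vocabulary, k=2):
    """Return the k highest-weighted vocabulary words for each topic row of h."""
    return [[vocabulary[j] for j in sorted(range(len(row)), key=lambda j: -row[j])[:k]]
            for row in h]


# Hypothetical corpus: two documents about courts, two about football.
vocabulary = ["court", "judge", "goal", "match"]
v = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
w, h = nmf(v, rank=2)
print(top_words(h, vocabulary))
print(round(frobenius_error(v, w, h), 3))
```

Each row of H can be read as a topic through its top words, and each row of W gives a document's topic weights; documents or entities that load on the same topic are natural candidates for edges in a network graph.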
|
489 |
Violence gratuite et adolescents-bourreaux : Réception, traduction et enjeux de deux romans suédois pour adolescents, en France, au début des années 2000 / "Unprovoked violence" and "nasty adolescents" : Reception, translation and challenges of two Swedish novels for adolescents in France in the early 2000s
Alfvén, Valérie. January 2016
The purpose of this thesis is to contribute to a better understanding of the role of Swedish literature for adolescents in the French literary scene in the early 2000s. The sociology of literature constitutes the main theoretical framework of this thesis. Drawing on examples that broach the sensitive topic of "unprovoked violence" as it is treated in two Swedish novels for teenagers, Spelar död [Play Death] by Stefan Casta and När tågen går förbi (Train Wreck) by Malin Lindroth, this thesis shows how these novels are innovative in Even-Zohar’s sense of the term, as addressed in his Polysystem Theory (1990). By introducing "unprovoked violence" and violent teenagers via a realistic genre, such works filled a vacuum in the French system and injected a new dynamic into it. This dynamic makes it possible for new literary models to be introduced into the system and to change its standards. The analyses of the French and Swedish receptions of the two novels show that they gave rise to a moral panic in France, which is not unusual in periods of ongoing change. This also clarifies the differences in norms between the two systems. The French system tends to reject dark topics, while the Swedish system wishes to discuss them. The investigations of the translations of unprovoked violence show that adherence to Swedish norms determines the translation’s adequacy (Toury), which may be part of the reason for the stormy reception the two works received in France, and for the censorship they underwent. The position of translators and publishers in the literary system also plays a major role in whether a translated text is censored during the transfer from one system to another. Even if the Swedish titles translated into French are few, this thesis shows that the impact of Swedish literature for adolescents in France is certain. 
By introducing new and sensitive topics, such novels could be early markers of an evolution of the French field of literature for adolescents.
|
490 |
Mathematical Modeling of Public Opinion using Traditional and Social Media
Cody, Emily. 01 January 2016
With the growth of the internet, data from text sources has become increasingly available to researchers in the form of online newspapers, journals, and blogs. This data presents a unique opportunity to analyze human opinions and behaviors without soliciting the public explicitly. In this research, I utilize newspaper articles and the social media service Twitter to infer self-reported public opinions and awareness of climate change. Climate change is one of the most important and heavily debated issues of our time, and analyzing large-scale text surrounding this issue reveals insights into self-reported public opinion. First, I examine public discourse on both climate change and energy system vulnerability following two large hurricanes. I apply topic modeling techniques to a corpus of articles about each hurricane in order to determine how these topics were reported on in the post-event news media. Next, I perform sentiment analysis on a large collection of data from Twitter using a previously developed tool called the "hedonometer". I use this sentiment scoring technique to investigate how the Twitter community reports feeling about climate change. Finally, I generalize the sentiment analysis technique to many other topics of global importance, and compare the results to more traditional public opinion polling methods. I determine that since traditional public opinion polls have limited reach and high associated costs, text data from Twitter may be the future of public opinion polling.
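The kind of sentiment score the hedonometer produces can be sketched as a frequency-weighted average of per-word happiness ratings. This is a minimal sketch under stated assumptions: the word scores below are invented, the 1-to-9 scale and the exclusion of a neutral band of scores are assumptions about the tool rather than details given in the abstract, and the function name is hypothetical.

```python
from collections import Counter


def average_happiness(tokens, word_scores, neutral_band=(4.0, 6.0)):
    """Frequency-weighted mean happiness of the scored, non-neutral words in `tokens`."""
    counts = Counter(tokens)
    low, high = neutral_band
    total = weight = 0.0
    for word, frequency in counts.items():
        score = word_scores.get(word)
        # Skip unscored words and words inside the assumed neutral band.
        if score is None or low < score < high:
            continue
        total += score * frequency
        weight += frequency
    return total / weight if weight else None


# Invented happiness scores on an assumed 1 (sad) to 9 (happy) scale.
scores = {"love": 8.0, "hope": 7.0, "disaster": 2.0, "the": 5.0}
tokens = "the love the hope the disaster love".split()
print(average_happiness(tokens, scores))  # 6.25
```

Applied to all tweets mentioning a topic in a given time window, a score like this yields a time series of how positively or negatively the topic is discussed.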
|