111

Neural probabilistic topic modeling of short and messy text / Neuronprobabilistisk ämnesmodellering av kort och stökig text

Harrysson, Mattias January 2016
Exploring massive amounts of user-generated data through topics offers a new way to find useful information. The topics are assumed to be “hidden” and must be “uncovered” by statistical methods such as topic modeling. However, user-generated data is typically short and messy, e.g. informal chat conversations with heavy use of slang and “noise” such as URLs or other forms of pseudo-text. This type of data is difficult to process for most natural language processing methods, including topic modeling. This thesis attempts, in a comparative study, to find the approach that objectively gives the better topics from short and messy text. The compared approaches are latent Dirichlet allocation (LDA), Re-organized LDA (RO-LDA), Gaussian Mixture Model (GMM) with distributed representation of words, and a new approach based on previous work named Neural Probabilistic Topic Modeling (NPTM). It could only be concluded that NPTM has a tendency to achieve better topics on short and messy text than LDA and RO-LDA. GMM, on the other hand, could not produce any meaningful results at all. The results are less conclusive because NPTM suffers from long running times, which prevented enough samples from being obtained for a statistical test.
112

The impact of sentiment and misinformation cycling through the social media platform, Twitter, during the initial phase of the COVID-19 vaccine rollout

Burwell, Emily Grace 01 June 2022
No description available.
113

Exploring Hybrid Topic Based Sentiment Analysis as Author Identification Method on Swedish Documents

Jakob, Bremer January 2021
The Swedish national bank has had shifting policies on publicity and confidentiality regarding the publication of texts written within the bank. For some time, texts written by commissioners within the bank were published anonymously. The bank later revoked the confidentiality policy and began publishing all documents publicly again. This led to interest in whether the commissioners' attitudes toward the topics they discuss differ when they write anonymously versus publicly. In response to a request based on this interest, analyses are being conducted with the help of language technology, in which topics are extracted from the anonymous and the public documents respectively. The aim is to find topics related to individual commissioners in order to identify, as accurately as possible, which of the anonymous documents was written by whom. To discover unique relations between the commissioners and the generated topics, this thesis proposes hybrid topic-based sentiment analysis as an author identification method, so that the sentiments of topics can be used as identifying features of commissioners. The results showed promise for the proposed approach. However, substantial further research is needed, including comparisons with other established author identification methods, to confirm some level of efficacy, especially on documents whose topics are closely similar.
114

Sentiment Analysis of MOOC learner reviews : What motivates learners to complete a course?

Knöös, Johanna, Rääf, Siri Amanda January 2021
In the last decade, development of Information and Communication Technology (ICT) that supports online learning has increased the demand for e-learning and Massive Open Online Courses (MOOCs). Despite their increased popularity, MOOCs are struggling with high dropout rates and only a small percentage of learners complete the courses they enrolled in. The purpose of this thesis is to gain knowledge about MOOC learner behaviour. The aim of the study is to identify the motivations of learners and how these differ between learners who completed a course and those who dropped out. Research on MOOC learners has mostly been carried out using a quantitative approach. While quantitative methodologies are effective in handling the large amount of data produced by MOOCs, qualitative methods can give deeper insights into online learners’ motivations. Therefore, this thesis employs an explanatory sequential mixed methods research design, in which sentiment analysis and topic modeling of learner reviews from the platform Coursera are further explained by qualitative interviews with MOOC learners. In the study, 28,000 reviews scraped from five courses within the field of data science were analyzed and ten interviews were held with learners who had completed a MOOC, dropped out of one, or both. In the quantitative analysis, nine course factors were found that learners wrote about: content, delivery, assessment, learning experience, tools, video material, teaching style, instructor skills and course provider. In addition, eighteen themes emerged from the interviews: self-discipline, just for fun, certificates, personal development, knowledge, career, time, equipment, practical exercise, interaction, instructor, reality, structure, external material, cost, community, degree of difficulty and other. In the discussion, the empirical findings are reflected upon using the theoretical framework of the research and the literature review. The result does not reveal any differences in motivations between learners who completed a course and those who dropped out; however, it does identify factors that caused learners to drop out and the topics that most negative learner reviews were about. This research contributes to the body of knowledge in the field of research on MOOC learner retention and motivations. The topic is relevant for research in education informatics and for continued improvements in delivery of MOOCs.
115

Text Mining for Social Harm and Criminal Justice Applications

Pandey, Ritika 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Increasing rates of social harm events and a plethora of text data call for text mining techniques, not only to better understand the causes of these events but also to develop optimal prevention strategies. In this work, we study three social harm issues: crime topic models, transitions into drug addiction, and homicide investigation chronologies. Topic modeling for the categorization and analysis of crime report text allows for more nuanced categories of crime than the official UCR categorizations. This study has important implications for hotspot policing. We investigate the extent to which topic models that improve coherence lead to higher levels of crime concentration. We further explore transitions into drug addiction using Reddit data. We propose a prediction model to classify users’ transitions from a casual drug discussion forum to a recovery drug discussion forum and to estimate the likelihood of such transitions. Through this study we offer insights into modern drug culture and provide tools with potential applications in combating opioid crises. Lastly, we present a knowledge-graph-based framework for homicide investigation chronologies that may aid investigators in analyzing homicide case data and also allow for post hoc analysis of the key features that determine whether a homicide is ultimately solved. For this purpose we perform named entity recognition to identify witnesses, detectives and suspects in the chronology, use keyword expansion to identify various evidence types, and finally link these entities and evidence to construct a homicide investigation knowledge graph. We compare the performance of several choices of methodology for these sub-tasks and analyze the association between the network statistics of the knowledge graph and homicide solvability.
116

How much do you care about education? Exploring fluctuations of public interest in education issues among top national priorities in the U.S.

Nehoran, Dana 01 January 2020
It is well known that a strong education system produces citizens who are more engaged in civil and social duties, with obvious benefits to society and to individuals. Policymakers who have the power to help improve the education system frequently rely on the news or the polls to better understand the issues involved, but these tools are often unable to answer customized questions on the public view with a large enough coverage. Monitoring the American public interest in education over the years is not new. In fact, a number of national polling agencies have tracked education as part of their larger polls asking people to name the most burning issues facing the US. While these polls provide a fair indication of the changes in importance of education in the eyes of the public, they do not identify the factors which have historically been associated with the major fluctuations of such importance. Most importantly, these traditional national polls do not track public concern about specific subtopics within education. This mixed methods study includes the creation of a software instrument with the objective of exploring the salience of education as a national priority over time and analyzing the possible factors associated with these fluctuations of interest. In addition to discovering the most prominent latent subtopics affecting education (such as academic achievement, sexual assault and freedom of speech), this study also seeks national-level issues that may have recently been associated with the largest declines. The only source of data utilized is the text of tens of thousands of published news articles. Terms extracted from the text using natural language processing serve as the basis for automated qualitative analysis. As topics emerge from the data, the frequencies of the terms are utilized to associate the articles with the most relevant ones. The analysis shows that public interest in education has declined the most during election times. 
It is also found that the areas that contributed the most during the largest surges of public interest in education from 2015 to 2020 were school budget, academic achievement gaps and mental health.
117

Twitter and the Affordance of Public Agenda-Setting: A Case Study of #MarchForOurLives

Chong, Mi Young 08 1900
In traditional agenda-setting theory, the agenda setters are the news media and the public has a minimal role in the agenda-setting process, which makes the public a passive receiver located at the bottom of the top-down agenda-setting dynamics. This study claims that with the development of information and communication technologies, primarily social media, the networked public may be able to set their own agendas through connective actions, outside the influence of the news media agenda. There is little empirical research focused on the development and dynamics of public agenda-setting through social media platforms. Understanding the development and dynamics of public agenda-setting may be key to accounting for and overcoming conflicting findings in previous reverse agenda-setting research. This study examined public agenda-setting dynamics through the case of a gun violence prevention activism network on Twitter, the #MarchForOurLives Twitter network. This study determined that the agenda setters of the #MarchForOurLives Twitter network are the key Never Again MSD student leaders and the March For Our Lives. The weekly topics reflected important events and issues, and the identified topics were highly correlated with the themes examined in the tweets created by the agenda setters. The amplifiers comprised the vast majority of the tweets. The advocates and the supporters accounted for 0.44% and 4.43%, respectively. The tweets made by the agenda setters accounted for 0.03%. The young activists and the like-minded, participatory public could continue to make changes by taking advantage of technologies, and they could be the hope of current and future society.
118

Parallel Algorithms for Machine Learning

Moon, Gordon Euhyun 02 October 2019
No description available.
119

Unearthing the social-ecological cascades of the fall armyworm invasion: A computer-assisted text analysis of digital news articles

Bjorklund, Kathryn January 2023
Understanding the complex nature of social-ecological cascades, or chain reactions of events that lead to widespread change in a system, is crucial for navigating the challenges they present. Emerging pests and pathogens, such as the fall armyworm, provide an opportunity to study these cascades in greater detail. I use topic modeling of digital news articles to investigate the potential social-ecological cascades associated with the ongoing fall armyworm invasion of multiple geographic regions. My findings reveal regional thematic shifts in the popular news media discourse surrounding the fall armyworm invasion. Notably, in the discourse surrounding Oceania, I observed a pronounced focus on invasion preparation, a theme significantly more emphasized compared to regions like Africa and Asia. These regional variations shed light on some of the localized priorities in addressing this invasive species. By highlighting the significance of employing comprehensive case studies of emerging pests and pathogens, this research underscores the need for more in-depth analyses of social-ecological cascades to better manage and mitigate their impacts.
120

Query Search VS ChatAI: The nature of users’ discourse of two search paradigms

Rahman, Mansur January 2023
Internet search has been marked by the dominant use of query search, specifically Google, since the mid-1990s. The public release of the AI-based search tool ChatGPT, powered by a recent innovation in deep learning known as large language models (LLMs), marks a paradigm shift in internet search technology. While the essence of both search technologies, namely the retrieval of information from the internet, remains the same, there appears to be a marked difference in how they are used and perceived, both by the general public and in the media. Prior studies have highlighted the importance of assessing users’ perceptions of new technology. Examining the impact of this recently introduced form of search compared to the original query search can provide valuable insights into users’ perception of search technologies as well as identify underlying attitudes towards AI. This study investigates the distinct discursive patterns characterising user perceptions of these two search paradigms. It uses a collected text corpus of media articles and forum data as its research material, and employs Latent Dirichlet Allocation (LDA) topic modelling to generate a quantitative set of topics. These are then examined qualitatively through the lenses of technological frames and discourse analysis to uncover user perceptions. Findings indicate that user discourse patterns diverge, anticipatory themes differ and there is variation in user concerns as well as media coverage. This research contributes insights into evolving technological perceptions, societal consequences, and the media’s role in shaping user discourse. It also highlights that further investigations into the anthropomorphic aspects of digitalisation and the evolving information landscape may offer promising avenues for future research.
