  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Miljöpartiet and the never-ending nuclear energy debate : A computational rhetorical analysis of Swedish climate policy

Dickerson, Claire January 2022
The domain of rhetoric has changed dramatically since its inception as the art of persuasion. It has adapted to encompass many forms of digital media, including, for example, data visualization and coding as a form of literature, but the approach has frequently been that of an outsider looking in. The use of comprehensive computational tools as a part of rhetorical analysis has largely been lacking. In this report, we attempt to address this lack by means of three case studies in natural language processing tasks, all of which can be used as part of a computational approach to rhetoric. At the same time, it is becoming ever more important to transition to renewable energy in order to keep global warming under 1.5 degrees Celsius and ensure that countries meet the conditions of the Paris Agreement. Thus, we make use of speech data on climate policy from the Swedish parliament to ground these three analyses in semantic textual similarity, topic modeling, and political party attribution. We find that speeches are, to a certain extent, consistent within parties, given that a slight majority of the most semantically similar speeches come from the same party. We also find that some of the most common topics discussed in these speeches are nuclear energy and the Swedish Green party, purported environmental risks due to renewable energy sources, and the job market. Finally, we find that though pairs of speeches are semantically similar, party rhetoric on the whole is generally not unique enough for speeches to be distinguishable by party. These results open the door for a broader exploration of computational rhetoric for Swedish political science in the future.
132

Reclaiming the “C” in ICT4D: A Critical Examination of the Discursive (Un)Freedoms in Digital State Policy and News Media of Bangladesh and Norway

Ala-Uddin, Mohammad 11 May 2022
No description available.
133

Neural Methods Towards Concept Discovery from Text via Knowledge Transfer

Das, Manirupa January 2019
No description available.
134

[en] EXTRACTING RELIABLE INFORMATION FROM LARGE COLLECTIONS OF LEGAL DECISIONS / [pt] EXTRAINDO INFORMAÇÕES CONFIÁVEIS DE GRANDES COLEÇÕES DE DECISÕES JUDICIAIS

FERNANDO ALBERTO CORREIA DOS SANTOS JUNIOR 09 June 2022
[en] As a natural consequence of the digitization of the Brazilian judicial system, a large and increasing number of legal documents have become available on the Internet, especially judicial decisions. As an illustration, in 2020 the Brazilian Judiciary produced 25 million decisions. In the same year, the Brazilian Supreme Court (STF), the highest judicial body in Brazil, alone produced 99.5 thousand decisions. In line with those numbers, we face a growing demand for studies focused on extracting and exploring the legal knowledge hidden in those large collections of legal documents. However, unlike typical textual content (e.g., books, news, and blog posts), legal text constitutes a particular case of highly conventionalized language. Little attention is paid to information extraction in specialized domains such as legal texts. From a temporal perspective, the Judiciary itself is a constantly evolving institution, which molds itself to cope with the demands of society. Therefore, our goal is to propose a reliable process for legal information extraction from large collections of legal documents, based on the STF scenario and the monocratic decisions published by it between 2000 and 2018. To do so, we intend to explore the combination of different Natural Language Processing (NLP) and Information Extraction (IE) techniques in the legal domain. From NLP, we explore automated named entity recognition strategies in the legal domain.
From IE, we explore dynamic topic modeling with tensor decomposition as a tool to investigate the legal reasoning changes embedded in those decisions over time through textual evolution and the presence of the legal named entities. For reliability, we explore the interpretability of the methods employed. Also, we add visual resources to facilitate interpretation by a domain specialist. As a final result, we expect to propose a reliable and cost-effective process to support further studies in the legal domain and, also, to propose new strategies for information extraction on a large collection of documents.
135

Extending the explanatory power of factor pricing models using topic modeling / Högre förklaringsgrad hos faktorprismodeller genom topic modeling

Everling, Nils January 2017
Factor models attribute stock returns to a linear combination of factors. A model with great explanatory power (R²) can be used to estimate the systematic risk of an investment. One of the most important factors is the industry in which the company behind the stock operates. In commercial risk models this factor is often determined with a manually constructed stock classification scheme such as GICS. We present Natural Language Industry Scheme (NLIS), an automatic and multivalued classification scheme based on topic modeling. The topic modeling is performed on transcripts of company earnings calls and identifies a number of topics analogous to industries. We use non-negative matrix factorization (NMF) on a term-document matrix of the transcripts to perform the topic modeling. When set to explain returns of the MSCI USA index we find that NLIS consistently outperforms GICS, often by several hundred basis points. We attribute this to NLIS’ ability to assign a stock to multiple industries. We also suggest that the proportions of industry assignments for a given stock could correspond to expected future revenue sources rather than current revenue sources. This property could explain some of NLIS’ success since it closely relates to theoretical stock pricing.
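
To make the NMF step in the abstract above concrete, here is a minimal sketch in pure Python. It is not the thesis's implementation: the multiplicative update rule (Lee and Seung), the toy term-document counts, the term labels, and the helper names `nmf` and `topic_proportions` are all assumptions chosen for illustration. It factors a small term-document matrix V into non-negative factors W (terms by topics) and H (topics by documents), then normalises each column of H so that a document can belong to several topics at once, in the spirit of a multivalued industry assignment.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor V (terms x docs) into W (terms x k) and H (k x docs), both >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def topic_proportions(h):
    """Normalise each document's topic weights so they sum to 1."""
    result = []
    for col in zip(*h):
        total = sum(col)
        result.append([x / total if total else 0.0 for x in col])
    return result


# Hypothetical counts: rows are the terms oil, drilling, software, cloud;
# columns are three made-up transcripts (energy, software, mixed).
v = [
    [5, 0, 2],
    [4, 0, 2],
    [0, 6, 3],
    [0, 5, 2],
]
w, h = nmf(v, k=2)
for doc, props in enumerate(topic_proportions(h)):
    print(doc, [round(p, 2) for p in props])
```

The third toy document mixes vocabulary from the other two, so its topic proportions are expected to be split rather than concentrated in one topic. A production setup would more likely rely on a tested library implementation and TF-IDF weighting, neither of which is confirmed by the abstract.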
136

Evaluating Hierarchical LDA Topic Models for Article Categorization

Lindgren, Jennifer January 2020
With the vast amount of information available on the Internet today, helping users find relevant content has become a prioritized task in many software products that recommend news articles. One such product is Opera for Android, which has a news feed containing articles the user may be interested in. In order to easily determine what articles to recommend, they can be categorized by the topics they contain. One approach to categorizing articles is to use Machine Learning and Natural Language Processing (NLP). A commonly used model is Latent Dirichlet Allocation (LDA), which finds latent topics within large datasets of, for example, text articles. An extension of LDA is hierarchical Latent Dirichlet Allocation (hLDA), a hierarchical variant of LDA. In hLDA, the latent topics found among a set of articles are structured hierarchically in a tree. Each node represents a topic, and the levels represent different levels of abstraction in the topics. A further extension of hLDA is constrained hLDA, where a set of predefined, constrained topics is added to the tree. The constrained topics are extracted from the dataset by grouping highly correlated words. The idea of constrained hLDA is to improve the topic structure derived by an hLDA model by making the process semi-supervised. The aim of this thesis is to create an hLDA and a constrained hLDA model from a dataset of articles provided by Opera. The models should then be evaluated using the novel metric word frequency similarity, which is a measure of the similarity between the words representing the parent and child topics in a hierarchical topic model. The results show that word frequency similarity can be used to evaluate whether the topics in a parent-child topic pair are too similar, so that the child does not specify a subtopic of the parent. It can also be used to evaluate whether the topics are too dissimilar, so that the topics seem unrelated and perhaps should not be connected in the hierarchy. 
The results also show that the two topic models created had comparable word frequency similarity scores. Neither model seemed to significantly outperform the other with regard to the metric.
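
The abstract above does not give a formula for word frequency similarity, so the following is only an illustrative stand-in: it compares the word-frequency profiles of a parent topic and a child topic with cosine similarity. The function names, the toy topics, and the two thresholds are all hypothetical and are not taken from the thesis.

```python
import math


def cosine_similarity(parent_freq, child_freq):
    """Cosine similarity between two word -> frequency mappings."""
    shared = parent_freq.keys() & child_freq.keys()
    dot = sum(parent_freq[word] * child_freq[word] for word in shared)
    norm_parent = math.sqrt(sum(f * f for f in parent_freq.values()))
    norm_child = math.sqrt(sum(f * f for f in child_freq.values()))
    if norm_parent == 0 or norm_child == 0:
        return 0.0
    return dot / (norm_parent * norm_child)


def judge_pair(parent_freq, child_freq, too_similar=0.9, too_dissimilar=0.1):
    """Label a parent-child topic pair; both thresholds are made-up defaults."""
    score = cosine_similarity(parent_freq, child_freq)
    if score >= too_similar:
        return "too similar: child adds little beyond the parent"
    if score <= too_dissimilar:
        return "too dissimilar: pair may not belong together"
    return "plausible subtopic"


# Hypothetical topics represented by their most frequent words.
parent = {"sport": 10, "team": 6, "game": 5}
child = {"football": 8, "team": 5, "goal": 4}
print(judge_pair(parent, child))
```

A score close to 1 corresponds to the "too similar" case described in the abstract, and a score close to 0 to the "too dissimilar" case; where to draw the lines would have to be calibrated on real topic trees.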
137

Hierarchical Text Topic Modeling with Applications in Social Media-Enabled Cyber Maintenance Decision Analysis and Quality Hypothesis Generation

SUI, ZHENHUAN 27 October 2017
No description available.
138

Sentiment Analysis of COVID-19 Vaccine Discourse on Twitter

Andersson, Patrik January 2024
The rapid development and distribution of COVID-19 vaccines have sparked diverse public reactions globally, often reflected through social media platforms like Twitter. This study aims to analyze the sentiment and public discourse surrounding COVID-19 vaccines on Twitter, utilizing advanced text classification techniques to navigate the vast, unstructured nature of social media data. By implementing sentiment analysis, the research categorizes tweets into positive, negative, and neutral sentiments to gauge public opinion more effectively. In-depth analysis through topic modeling techniques helped identify seven key topics influencing public sentiment, including aspects related to efficacy, logistical challenges, safety concerns, and personal experiences, each varying in prominence depending on the country, as well as the specific timeline of vaccine deployment. Additionally, this study explores geographical variations in sentiment, noting significant differences in public opinion across different countries. These variations could be tied to local cultural, social, and political contexts. Results from this study show a polarized response towards vaccination, with significant discourse clusters showing either strong support for or resistance against the COVID-19 vaccination efforts. This polarization is further pronounced by the logistical challenges and trust issues related to vaccine science, particularly emphasized in tweets from countries with lower vaccine acceptance rates. This sentiment analysis on Twitter offers valuable insights into the public's perception and acceptance of COVID-19 vaccines, providing a useful tool for policymakers and public health officials to understand and address public concerns effectively. By identifying and understanding the key factors influencing vaccine sentiment, targeted communication strategies can be developed to enhance public engagement and vaccine uptake.
139

Traitement automatique du langage naturel pour les textes juridiques : prédiction de verdict et exploitation de connaissances du domaine

Salaün, Olivier 12 1900
At the intersection of natural language processing and law, legal judgment prediction is a task that can represent the problem of predictive justice, or in other words, the capacity of an automated system to predict the verdict decided by a judge in a court ruling. The thesis presents from end to end the implementation of such a task formalized as a multilabel classification, along with different strategies attempting to improve classifiers' performance. The whole work is based on a corpus of decisions from the Administrative housing tribunal of Québec (disputes between landlords and tenants).
First of all, a preliminary preprocessing and an in-depth analysis of the corpus highlight its most prominent domain aspects. This crucial step ensures that the verdict prediction task is sound, and also emphasizes biases that must be taken into consideration for future tasks. Indeed, a first testbed comparing different models on this task reveals that they tend to exacerbate biases pre-existing within the corpus (e.g., their verdicts are even less favourable to tenants compared with a human judge). In light of this, the next experiments aim at improving classification performance and at mitigating these biases, by focusing on CamemBERT. In order to do so, knowledge from the target domain (housing law) is exploited. A first approach consists in employing articles of law as input features, which are used under different representations, but such a method is far from being a panacea. Another approach relying on topic modeling focuses on topics that can be extracted from the text describing the disputed facts. An automatic and manual evaluation of the topics obtained shows evidence of their informativeness about the reasons leading litigants to go to court. On this basis, the last part of our work revisits the verdict prediction task by relying on both information retrieval (IR) systems and topics assigned to decisions. The models designed here have the particularity of relying on jurisprudence (relevant past cases) retrieved with different search criteria (e.g., similarity at the text or topic level). Models using IR criteria based on bags-of-words (Lucene) and topics obtain significant gains in terms of Macro F1 scores. However, the aforementioned amplified biases issue, though mitigated, still remains. Overall, the exploitation of domain-related knowledge can improve the performance of verdict predictors, but the persistence of biases in the predictions hinders the deployment of such models on a large scale in the real world. 
On the other hand, results obtained from topic modeling suggest better prospects for anything that can improve the accessibility and readability of legal documents by human users.
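
Since the abstract above reports gains in Macro F1 on a multilabel task, a short sketch of how that score is commonly computed may help: F1 is calculated separately for each label and then averaged without weighting, so rare labels count as much as frequent ones. The label names and the toy predictions below are invented for illustration and do not come from the thesis; returning 0.0 for a label with no true or predicted instances is one convention among several.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one label over parallel lists of label sets."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
    denominator = 2 * tp + fp + fn
    # Convention assumed here: a label that never occurs scores 0.0.
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical verdict labels; each decision may carry several of them.
labels = ["lease_terminated", "rent_ordered", "damages_awarded"]
y_true = [{"lease_terminated", "rent_ordered"}, {"rent_ordered"}, {"damages_awarded"}, set()]
y_pred = [{"lease_terminated", "rent_ordered"}, {"rent_ordered", "lease_terminated"}, set(), set()]

print(round(macro_f1(y_true, y_pred, labels), 4))  # prints 0.5556
```

In this toy case the per-label scores are 2/3, 1 and 0, so the macro average is 5/9. The missed `damages_awarded` label pulls the average down sharply, which is why Macro F1 is sensitive to performance on infrequent outcomes.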
140

Student Scientometrics – What do German Students of the Humanities Cite in their Term Papers?

Henning, Tim, Gutiérrez De la Torre, Silvia E., Burghardt, Manuel 11 July 2024
No description available.
