931

The Influence of Artificial Intelligence on Songwriting: Navigating Attribution Challenges and Copyright Protection

Norberg, Karin, Norell, Othilia January 2023 (has links)
This report explores the evolving landscape of songwriting and copyright protection, with a focus on the influence of Artificial Intelligence (AI). It highlights the need for objective measures of attribution in music co-creation, including collaborations involving AI. The study explores the potential of employing Natural Language Processing (NLP) methods in song lyric generation to assign attribution more accurately and transparently. The report also discusses the perspectives of various stakeholders in the music industry, highlighting the importance of attribution and addressing concerns related to AI-generated works. The research combines quantitative and qualitative methodologies, including surveys, interviews, and literature reviews, to provide comprehensive insights into the complexities of attribution in songwriting and the implications of AI's involvement. The survey compared original song choruses to modified versions, gathering insights on the significance of text modifications. Statistical analysis and NLP techniques (Levenshtein distance, plagiarism detection, sentiment analysis, and cosine similarity) were used to assess textual changes. The results indicated that primarily sentiment analysis, but also cosine similarity, aligned more closely with the survey responses. Interviews provided valuable perspectives on challenges in attribution and copyright, as well as thoughts regarding AI in songwriting and ethical considerations. Current attribution methods often lead to unequal royalty distribution in co-created works. Objective metrics, including NLP techniques, could potentially offer a complement for tracking attribution in a more quantitative way. Stakeholder analysis reveals the interests and power dynamics of songwriters, artists, labels, consumers, and lawyers. AI's involvement raises questions about data sources, developer roles, and quantifying creativity, posing challenges in determining attribution, royalty distribution, and copyright protection. The report also underscores the importance of quantifying creativity, preserving creative integrity, and meeting the diverse needs of stakeholders within an AI-driven musical landscape.
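As an illustration of two of the text metrics named above, the sketch below computes Levenshtein distance and bag-of-words cosine similarity for a pair of choruses. The lyrics are invented placeholders, not material from the study, and the study's plagiarism-detection and sentiment-analysis steps are omitted.

```python
# Minimal sketch: Levenshtein distance and cosine similarity between two
# chorus texts. The lyrics below are invented examples.
from collections import Counter
import math

def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming over characters."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

original = "we sing all night under the city lights"
modified = "we dance all night under the neon lights"
print(levenshtein(original, modified))                   # small edit distance
print(round(cosine_similarity(original, modified), 3))   # high overlap
```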
932

Revisiting Item Semantics in Measurement: A New Perspective Using Modern Natural Language Processing Embedding Techniques

Guo, Feng 11 August 2023 (has links)
No description available.
933

Comparing Different Transformer Models’ Performance for Identifying Toxic Language Online

Sundelin, Carl January 2023 (has links)
Internet use continues to grow, and with it the use of toxic language that can be harmful to those it targets. The usefulness of artificial intelligence has exploded in recent years with the development of natural language processing, especially with the use of transformers. One of the first was BERT, which has spawned many variations, including some that aim to be more lightweight than the original. The goal of this project was to train three different kinds of transformer models, RoBERTa, ALBERT, and DistilBERT, and find out which one was best at identifying toxic language online. The models were trained on a handful of existing datasets with data labelled as abusive, hateful, harassing, and other kinds of toxic language. These datasets were combined into a single dataset used to train and test all of the models. When tested on data from the combined dataset, there was very little difference in the overall performance of the models. The biggest difference was training time: ALBERT took approximately 2 hours, RoBERTa around 1 hour, and DistilBERT just over half an hour. To understand how well the models worked in a real-world scenario, they were evaluated by labelling text as toxic or non-toxic on three different subreddits. Here, a larger difference in performance showed up: DistilBERT labelled significantly fewer instances as toxic than the other models. A sample of the classified data was manually annotated, and it showed that the RoBERTa and DistilBERT models still performed similarly to each other. A second evaluation was done on the Reddit data, requiring a threshold of 80% certainty for a classification to be considered toxic. This led to an average of 28% of instances being classified as toxic by RoBERTa, whereas ALBERT and DistilBERT classified an average of 14% and 11% as toxic, respectively. When the results from the RoBERTa and DistilBERT models were manually annotated, a significant improvement could be seen in the models' performance. This led to the conclusion that DistilBERT was the most suitable of the lightweight models tested in this work for training on and classifying toxic language.
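The thresholded second evaluation described above could look roughly like the sketch below. The checkpoint name is a hypothetical placeholder (the thesis trained its own RoBERTa, ALBERT, and DistilBERT models), and the label string depends on how the fine-tuned model's config names its classes.

```python
from transformers import pipeline

# "distilbert-toxicity-checkpoint" is a hypothetical path standing in for a
# fine-tuned model; the label names come from that model's own config.
classifier = pipeline("text-classification", model="distilbert-toxicity-checkpoint")

THRESHOLD = 0.80  # the 80% certainty cut-off from the second evaluation

def label_comment(text: str) -> str:
    result = classifier(text)[0]  # e.g. {"label": "toxic", "score": 0.93}
    # Only flag a comment when the model is at least 80% certain.
    if result["label"] == "toxic" and result["score"] >= THRESHOLD:
        return "toxic"
    return "non-toxic"

print(label_comment("You are a wonderful person."))
```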
934

Development of robust language models for speech recognition of under-resourced language

Sindana, Daniel January 2020 (has links)
Thesis (M.Sc. (Computer Science)) -- University of Limpopo, 2020 / Language modelling (LM) work for under-resourced languages that does not consider most of the linguistic information inherent in a language produces language models that inadequately represent the language, thereby hindering the development of natural language processing tools and systems such as speech recognition systems. This study investigated the influence that the orthography (i.e., writing system) of a language has on the quality and/or robustness of the language models created for the text of that language. The unique conjunctive and disjunctive writing systems of isiNdebele (Ndebele) and Sepedi (Pedi) were studied. Text data from the LWAZI and NCHLT speech corpora were used to develop language models. The LM techniques implemented included word-based n-gram LM, LM smoothing, LM linear interpolation, and higher-order n-gram LM. The toolkits used for development were the HTK LM, SRILM, and CMU-Cam SLM toolkits. From the findings of the study on text preparation, data pooling and sizing, higher-order n-gram models, and interpolation of models, it is concluded that the orthography of the selected languages does have an effect on the quality of the language models created for their text. The following recommendations are made as part of LM development for the languages concerned. 1) Specially prepare and normalise the text data before LM development, paying attention to within-sentence text markers and annotation tags that may incorrectly form part of sentences, word sequences, and n-gram contexts. 2) Enable interpolation during training. 3) Develop pentagram and hexagram language models for Pedi texts, and trigram and quadrigram models for Ndebele texts. 4) Investigate efficient smoothing methods for the different languages, especially for different text sizes and text domains. / National Research Foundation (NRF), Telkom, University of Limpopo
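One of the listed techniques, LM linear interpolation, can be illustrated with a small hand-rolled trigram model. The toy corpus and lambda weights below are assumptions for illustration; the study itself used the HTK LM, SRILM, and CMU-Cam SLM toolkits on LWAZI and NCHLT text.

```python
# Minimal sketch of linear interpolation across n-gram orders: the trigram,
# bigram, and unigram maximum-likelihood estimates are mixed with fixed
# weights. Corpus and weights are toy values.
from collections import Counter

corpus = "re a leboga re a leboga kudu".split()  # toy tokens, not NCHLT data

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
total = sum(unigrams.values())

def interpolated_prob(w3, w2, w1, lambdas=(0.5, 0.3, 0.2)):
    """P(w3 | w1 w2) as a weighted mix of trigram, bigram, unigram MLEs."""
    l3, l2, l1 = lambdas
    p3 = trigrams[(w1, w2, w3)] / bigrams[(w1, w2)] if bigrams[(w1, w2)] else 0.0
    p2 = bigrams[(w2, w3)] / unigrams[w2] if unigrams[w2] else 0.0
    p1 = unigrams[w3] / total
    return l3 * p3 + l2 * p2 + l1 * p1

print(interpolated_prob("leboga", "a", "re"))  # ~0.857 on this toy corpus
```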
935

Natural Language Document and Event Association Using Stochastic Petri Net Modeling

Mills, Michael Thomas 29 May 2013 (has links)
No description available.
936

Extracting Causal Relations between News Topics from Distributed Sources

Miranda Ackerman, Eduardo Jacobo 08 November 2013 (has links)
The overwhelming amount of online news presents a challenge called news information overload. To mitigate this challenge we propose a system that generates a causal network of news topics. To extract this information from distributed news sources, a system called Forest was developed. Forest retrieves documents that potentially contain causal information regarding a news topic. The documents are processed at the sentence level to extract causal relations and news topic references, the phrases used to refer to a news topic. Forest uses a machine learning approach to classify causal sentences and then extracts the potential cause and effect of each sentence. The potential cause and effect are then classified as news topic references, such as "The World Cup" or "The Financial Meltdown". Both classifiers use an algorithm developed within our working group, which performs better than several well-known classification algorithms on the aforementioned tasks. In our evaluations we found that participants consider causal information useful for understanding the news, and that while we cannot extract causal information for all news topics, it is highly likely that we can extract causal relations for the most popular ones. To evaluate the accuracy of Forest's extractions, we conducted a user survey and found that providing the top-ranked results yields high accuracy in extracting causal relations between news topics.
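Forest's causal-sentence classifier is an in-house algorithm that is not described here, so the sketch below substitutes a simple lexical cue-pattern extractor to show the shape of the cause/effect extraction step; the cue list is an illustrative assumption, not the system's method.

```python
# Illustrative stand-in for causal extraction: split a sentence into a
# (cause, effect) pair when a common causal cue phrase is present.
import re

CAUSAL_PATTERNS = [
    re.compile(r"(?P<effect>.+)\s+(?:was|were)\s+caused\s+by\s+(?P<cause>.+)", re.I),
    re.compile(r"(?P<cause>.+)\s+led\s+to\s+(?P<effect>.+)", re.I),
    re.compile(r"(?P<cause>.+)\s+resulted\s+in\s+(?P<effect>.+)", re.I),
]

def extract_causal(sentence: str):
    """Return (cause, effect) if a cue pattern matches, else None."""
    for pattern in CAUSAL_PATTERNS:
        match = pattern.search(sentence)
        if match:
            return match["cause"].strip(" ."), match["effect"].strip(" .")
    return None

print(extract_causal("The financial meltdown led to widespread protests."))
# ('The financial meltdown', 'widespread protests')
```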
937

MUTUAL LEARNING ALGORITHMS IN MACHINE LEARNING

Sabrina Tarin Chowdhury (14846524) 18 May 2023 (has links)
A mutual learning algorithm is a machine learning approach in which multiple learners train on different sources and then share their knowledge among themselves, so that all agents improve their classification and prediction accuracies simultaneously. Mutual learning can be an efficient mechanism for improving machine learning and neural network efficiency in a multi-agent system. In typical knowledge distillation algorithms, a big network plays the role of a static teacher and passes its knowledge to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is shown that two small networks can dynamically and interchangeably play the alternating roles of teacher and student to share their knowledge, so that the efficiency of both networks improves simultaneously. This type of dynamic learning mechanism can be very useful in mobile environments where resource constraints prevent training with big datasets. Data exchange in multi-agent, teacher-student network systems can lead to efficient learning.
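A minimal PyTorch sketch of this dynamic teacher/student exchange, in the spirit of deep mutual learning: two small networks alternate roles, each combining its supervised loss with a KL term toward the other's detached predictions. The architectures, loss weighting, and synthetic data are illustrative assumptions, not the thesis's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net():
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

net_a, net_b = make_net(), make_net()
opt_a = torch.optim.Adam(net_a.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(net_b.parameters(), lr=1e-3)

def mutual_step(x, y, student, teacher, optimizer, kl_weight=1.0):
    """One update for `student`, nudged toward `teacher`'s soft labels."""
    logits_s = student(x)
    with torch.no_grad():               # the peer acts as a fixed target here
        logits_t = teacher(x)
    ce = F.cross_entropy(logits_s, y)   # supervised loss on hard labels
    kl = F.kl_div(F.log_softmax(logits_s, dim=1),
                  F.softmax(logits_t, dim=1), reduction="batchmean")
    loss = ce + kl_weight * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

x, y = torch.randn(32, 20), torch.randint(0, 3, (32,))  # synthetic batch
for _ in range(5):                      # the two nets swap roles each step
    mutual_step(x, y, net_a, net_b, opt_a)
    mutual_step(x, y, net_b, net_a, opt_b)
```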
938

THE ROLE OF INFORMATION SYSTEMS IN HEALTHCARE

Jianing Ding (15340786) 26 April 2023 (has links)
Fundamental changes have been happening in healthcare organization and delivery in recent decades, including more accessible physician information, low-cost collection and sharing of clinical records, and decision support systems, among others. Emerging information systems and technologies play a significant role in these transformations. To extend the understanding and implications of information systems in healthcare, my dissertation investigates their influence on enhancing healthcare operations. The findings reveal the practical value of digitalization in indicating healthcare providers' cognitive behaviors, responding to healthcare crises, and improving medical performance.

The first essay investigates the unrevealed value of a special type of user-generated content in healthcare operations. In today's social media world, individuals are willing to express themselves on various online platforms. This user-generated content helps readers gain easy access to individuals' features, including but not limited to personality traits. To study the impact of physicians' personality traits on their medical behaviors and performance, we take a supplier-side perspective, using physician statements made available on medical review websites. We find that patients treated by physicians with higher openness scores have lower mortality rates, reduced lab test costs, and shorter hospital stays. Furthermore, when these personality traits are incorporated into an optimization problem for ED scheduling, counterfactual analysis estimates average reductions of 11.4%, 18.4%, and 17.8% in in-hospital mortality rates, lab test expenditures, and lengths of stay, respectively. In future healthcare operations, physicians' personalities should be taken into account when healthcare resources are insufficient, as in pandemics like COVID-19, since our study indicates that a health service provider's personality has a real influence on clinical quality.

In the second essay, we focus on the influence of the most severe healthcare pandemic in decades, COVID-19, on digital goods consumption and examine whether digital goods consumption is resilient to the physical restrictions the pandemic imposed on individuals. Leveraging the enforced quarantine policy during the COVID-19 pandemic as a quasi-experiment, we identify the influence of the quarantine policy on mobile app consumption in every Apple App Store category, in both the short and long terms. With a view to better responses in the post-pandemic era, the quantitative findings provide managerial implications to the app industry and the stock market for accurately understanding the long-term impact of a significant intervention such as quarantine. Moreover, by using the conditionally exogenous quarantine policy to instrument app users' daily movement patterns, we further investigate the digital resilience of physical mobility across app categories and quantify the impact of an individual's physical mobility on app usage behavior. We find that a 10% reduction in one's physical mobility (measured by the radius of gyration) leads to a 2.68% increase in general app usage and a 5.44% rise in the dispersion of app usage time, suggesting that practitioners should consider users' physical mobility in future mobile app design, pricing, and marketing.

In the third essay, we investigate the role of an emerging AI-based clinical treatment method, robot-assisted surgery (RAS), in transforming healthcare delivery. As an advanced technique for diminishing human physical and intellectual limitations in surgery, RAS is expected to improve clinical performance, but this has not been empirically proven. In this work, we first investigate the effect of RAS on clinical outcomes, controlling for physicians' self-selection in choosing whether or not to use RAS treatment methods. In particular, we focus on the accessibility of RAS and explore how physician and patient heterogeneity affect the adoption of RAS, covering both learning and using it. Investigating the decision-making process around RAS implementation in both the learning and using stages, we show the synergy of RAS implementation in alleviating racial disparity in healthcare. Finally, a mechanism analysis is conducted to reveal the underlying mechanism behind the enhancement of surgical outcomes. For instance, the estimations tend to reveal that, beyond improving clinical performance, RAS increases standardization in the time and steps taken when applying treatment procedures.
939

SEMANTIC JOB VACANCY SEGMENTATION: COMPARATIVE STUDY OF CLASSICAL MACHINE LEARNING ALGORITHMS

DAVID EVANDRO AMORIM MARTINS 18 August 2020 (has links)
This dissertation demonstrates how web mining, natural language processing, and machine learning can be combined to improve the understanding of job openings by semantically segmenting the texts of their descriptions. To achieve this purpose, textual data were collected from three major job sites: Catho, LinkedIn, and VAGAS.com.br. Based on the literature, this work proposes a simplified semantic structure in which each sentence of a job description belongs to one of these classes: Responsibilities, Requirements, Benefits, and Others. With this idea, the semantic segmentation task can be rethought as sentence segmentation followed by classification. Using Python as the tool, several ways of constructing features from text, both lexical and semantic, are tried out, along with four classic machine learning algorithms: Naive Bayes, Logistic Regression, Support Vector Machine, and Random Forest. As a result, this work presents a classifier (Logistic Regression with a binary representation) with 95.58 percent accuracy, without model overfitting and without degenerate classifications due to class imbalance, which is comparable to the state of the art for text classification. This classifier was trained and validated on Catho data, but was also tested on VAGAS.com.br (88.60 percent) and LinkedIn (91.14 percent) data, providing evidence that its learning generalizes to data from other sites. In addition, the classifier was used for semantic segmentation of job openings and obtained a Pk metric of 3.67 percent and a WindowDiff metric of 4.78 percent, which is comparable to the state of the art for text segmentation. Finally, it is worth highlighting two indirect contributions of this work: 1) a structure for thinking about and analyzing job openings, and 2) an indication that classical algorithms can also reach the state of the art and therefore should always be tried.
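The winning configuration reported above (Logistic Regression over a binary bag-of-words representation) can be sketched with scikit-learn as follows; the example sentences and labels are invented stand-ins for the Catho training data.

```python
# Minimal sketch: binary bag-of-words features + logistic regression for
# classifying job-description sentences into the four proposed classes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "Develop and maintain web applications",
    "Bachelor's degree in Computer Science required",
    "Health insurance and meal vouchers",
    "Join a fast-growing company",
]
labels = ["Responsibilities", "Requirements", "Benefits", "Others"]

model = make_pipeline(
    CountVectorizer(binary=True),    # binary representation, as in the study
    LogisticRegression(max_iter=1000),
)
model.fit(sentences, labels)
print(model.predict(["Experience with Python is mandatory"]))
```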
940

TRAFFIC EVENTS MODELING BASED ON CLIPPING OF HUGE QUANTITY OF DATA FROM THE WEB

LUCIANA ROSA REDLICH 28 January 2015 (has links)
This work develops a model to assist the analysis of traffic events in big cities. Drawing on a huge quantity of data published on the Internet by ordinary users, especially on Twitter, it provides an ontology for traffic events reported in online news, together with a prototype application that uses the proposed model to query the modeled events. To do so, news published in natural language is processed: the relevant entities in the text are identified and then structured so that a semantic analysis of the published news can be performed. The news is structured according to the proposed event model, making it possible to query event properties and relationships and thus facilitating the analysis of traffic processes and the events occurring in them.
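A minimal sketch of the idea of mapping short news texts onto structured traffic events; the event fields, keyword lists, and location heuristic below are illustrative assumptions, not the ontology defined in this work.

```python
# Illustrative mapping of a tweet-sized news item onto a simple event record.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrafficEvent:
    event_type: str   # e.g. "accident", "congestion"
    location: str
    source_text: str

EVENT_KEYWORDS = {
    "accident": ["acidente", "colisão"],
    "congestion": ["engarrafamento", "trânsito lento"],
}

def extract_event(text: str) -> Optional[TrafficEvent]:
    lowered = text.lower()
    for event_type, keywords in EVENT_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            # Naive location heuristic: take the text after "na" (Portuguese).
            location = lowered.split(" na ")[-1] if " na " in lowered else "unknown"
            return TrafficEvent(event_type, location, text)
    return None

print(extract_event("Acidente grave na Avenida Brasil"))
```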
