• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • 1
  • Tagged with
  • 3
  • 3
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The crucial parts of text classification with TensorFlow.js and categorisation of news articles

Nordberg, Gustav, Grandien, Jesper January 2020 (has links)
Text classification is a subset of machine learning which is used to classify texts such as tweets, email, news headlines or articles, with tags or categories. As news publishing can have uncertainty in their categorisations, text classification could categorise articles autonomously and distinguish unclear categorisations. The library TensorFlow helps with operations and tools for the machine learning workflow.  This paper takes focus on the crucial parts of working with machine learning using TensorFlow.js and to what extent this model can categorise a news article. The authors evaluates different models to analyse how optimising the settings will affect the accuracy of the model. Results of this paper was researched with a literature study of official documentations and peer reviewed reports. An empirical experiment where machine learning models were trained in TensorFlow.js was also performed. The results showed that the model with the highest accuracy with 87.17% accuracy was trained with 1000 articles using Relu and Softmax activation functions and the Mean squared error loss function. While the model with lowest accuracy had 75.5% using Sigmoid activation functions and Categorical cross-entropy on the 5000 articles training set. Crucial parts for this development were: optimizer function, loss function, batch size, activation functions, training data and test data with labels, normalise function, shapes of layers and computing power. There are several parts and functions to take in consideration when developing a machine learning model with text classification in TensorFlow.js. The training process needs to be performed multiple times as there are many parameters which has an affect on the model results. The model results can be improved by optimising and finding the best combination between different functions and parameters.
2

Disorderclassifier: classificação de texto para categorização de transtornos mentais

NUNES, Francisca Pâmela Carvalho 23 August 2016 (has links)
Submitted by Fabio Sobreira Campos da Costa (fabio.sobreira@ufpe.br) on 2017-04-19T13:35:36Z No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) DISSERTAÇÃO_Franscisca Pamela Carvalho.pdf: 2272114 bytes, checksum: 83ff79a7d05409b93fe71ce4c307dc30 (MD5) / Made available in DSpace on 2017-04-19T13:35:36Z (GMT). No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) DISSERTAÇÃO_Franscisca Pamela Carvalho.pdf: 2272114 bytes, checksum: 83ff79a7d05409b93fe71ce4c307dc30 (MD5) Previous issue date: 2016-08-23 / Nos últimos anos, através da Internet, a comunicação se tornou mais ampla e acessível. Com o grande crescimento das redes sociais, blogs, sites em geral, foi possível estabelecer uma extensa base de conteúdo diversificado, onde os usuários apresentam suas opiniões e relatos pessoais. Esses informes podem ser relevantes para observações futuras ou até mesmo para o auxílio na tomada de decisão de outras pessoas. No entanto, essa massa de informação está esparsa na Web, em formato livre, dificultando a análise manual dos textos para categorização dos mesmos. Tornar esse trabalho automático é a melhor opção, porém a compreensão desses textos em formato livre não é um trabalho simples para o computador, devido a irregularidades e imprecisões da língua natural. Nessas circunstâncias, estão surgindo sistemas que classificam textos, de forma automática, por tema, gênero, características, entre outros, através dos conceitos da área de Mineração de Texto (MT). A MT objetiva extrair informações importantes de um texto, através da análise de um conjunto de documentos textuais. Diversos trabalhos de MT foram sugeridos em âmbitos variados como, por exemplo, no campo da psiquiatria. Vários dos trabalhos propostos, nessa área, buscam identificar características textuais para percepção de distúrbios psicológicos, para análise dos sentimentos de pacientes, para detecção de problemas de segurança de registros médicos ou até mesmo para exploração da literatura biomédica. O trabalho aqui proposto, busca analisar depoimentos pessoais de potenciais pacientes para categorização dos textos por tipo de transtorno mental, seguindo a taxonomia DSM-5. O procedimento oferecido classifica os relatos pessoais coletados, em quatro tipos de transtorno (Anorexia, TOC, Autismo e Esquizofrenia). Utilizamos técnicas de MT para o pré-processamento e classificação de texto, com o auxilio dos pacotes de software do Weka. Resultados experimentais mostraram que o método proposto apresenta alto índice de precisão e que a fase de pré-processamento do texto tem impacto nesses resultados. A técnica de classificação Support Vector Machine (SVM) apresentou melhor desempenho, para os fins apresentados, em comparação a outras técnicas usadas na literatura. / In the last few years, through the internet, communication became broader and more accessible. With the growth of social media, blogs, and websites in general, it became possible to establish a broader, diverse content base, where users present their opinions and personal stories. These data can be relevant to future observations or even to help other people’s decision process. However, this mass information is dispersing on the web, in free format, hindering the manual analysis for text categorization. Automating is the best option. However, comprehension of these texts in free format is not a simple task for the computer, taking into account irregularities and imprecisions of natural language. Giving these circumstances, automated text classification systems, by theme, gender, features, among others, are arising, through Text Mining (MT) concepts. MT aims to extract information from a text, by analyzing a set of text documents. Several MT papers were suggested on various fields, as an example, psychiatric fields. A number of proposed papers, in this area, try to identify textual features to perceive psychological disorders, to analyze patient’s sentiments, to detect security problems in medical records or even biomedical literature exploration. The paper here proposed aim to analyze potential patient’s personal testimonies for text categorization by mental disorder type, according to DSM-5 taxonomy. The offered procedure classifies the collected personal testimonies in four disorder types (anorexia, OCD, autism, and schizophrenia). MT techniques were used for pre-processing and text classification, with the support of software packages of Weka. Experimental results showed that the proposed method presents high precision values and the text pre-processing phase has impact in these results. The Support Vector Machine (SVM) classification technique presented better performance, for the presented ends, in comparison to other techniques used in literature.
3

Classifying Urgency : A Study in Machine Learning for Classifying the Level of Medical Emergency of an Animal’s Situation

Strallhofer, Daniel, Ahlqvist, Jonatan January 2018 (has links)
This paper explores the use of Naive Bayes as well a Linear Support Vector Machines in order to classify a text based on the level of medical emergency. The primary source of testing will be an online veterinarian service’s customer data. The aspects explored are whether a single text gives enough information for a medical decision to be made and if there are alternative data gathering processes that would be preferred. Past research has proven that text classifiers based on Naive Bayes and SVMs can often give good results. We show how to optimize the results so that important decisions can be made with these classifications as a basis. Optimal data gathering procedures will be a part of this optimization process. The business applications of such a venture will also be discussed since implementing such a system in an online medical service will possibly affect customer flow, goodwill, cost/revenue, and online competitiveness. / Denna studie utforskar användandet av Naive Bayes samt Linear Support Vector Machines för att klassificera en text på en medicinsk skala. Den huvudsakliga datamängden som kommer att användas för att göra detta är kundinformation från en online veterinär. Aspekter som utforskas är om en enda text kan innehålla tillräckligt med information för att göra ett medicinskt beslut och om det finns alternativa metoder för att samla in mer anpassade datamängder i framtiden. Tidigare studier har bevisat att både Naive Bayes och SVMs ofta kan nå väldigt bra resultat. Vi visar hur man kan optimera resultat för att främja framtida studier. Optimala metoder för att samla in datamängder diskuteras som en del av optimeringsprocessen. Slutligen utforskas även de affärsmässiga aspekterna utigenom implementationen av ett datalogiskt system och hur detta kommer påverka kundflödet, goodwill, intäkter/kostnader och konkurrenskraft.

Page generated in 0.0824 seconds