Global ETD Search

41	Email Classification : An evaluation of Deep Neural Networks with Naive Bayes Michailoff, John January 2019 (has links) Machine learning (ML) is an area of computer science that gives computers the ability to learn data patterns without being explicitly programmed for those patterns. Using neural networks in this area is based on simulating the biological functions of neurons in brains to learn patterns in data, giving computers a predictive ability to comprehend how data can be clustered. This research investigates the possibilities of using neural networks for classifying email, i.e. working as an email case manager. A Deep Neural Network (DNN) consists of multiple layers of neurons connected to each other by trainable weights. The main objective of this thesis was to evaluate how the three input arguments (data size, training time and neural network structure) affect the accuracy of the DNN's pattern recognition; to evaluate how the DNN performs compared to the statistical ML method Naïve Bayes (NB) in terms of prediction accuracy and complexity; and finally to assess the viability of the resulting DNN as a case manager. Results show that the accuracy of our networks improves as training time and data size increase. By testing increasingly complex network structures (larger networks of neurons with more layers), it is observed that overfitting becomes a problem with increased training time, i.e. accuracy decreases after a certain threshold of training time. Naïve Bayes classifiers perform worse than the DNN in terms of accuracy, but better in terms of complexity, making NB viable on mobile platforms. We conclude that our developed prototype may work well in tandem with existing case management systems, which remains to be tested by future research. Machine learning neural network DNN Naive Bayes network complexity Software Engineering Programvaruteknik
42	Predicting SNI Codes from Company Descriptions : A Machine Learning Solution Lindholm, Erik, Nilsson, Jonas January 2023 (has links) This study aims to develop an automated solution for assigning area of industry codes to businesses based on the contents of their business descriptions. The Swedish standard industrial classification (SNI) is a system used by Statistics Sweden (SCB) for categorizing businesses for their statistics reports. Assignment of SNI codes has so far been done manually by the person registering a new company, but this is a far from optimal solution. Some of the 88 main group areas of industry are hard to tell apart from one another, and this often leads to incorrect assignments. Our approach to this problem was to train a machine learning model using the Naive Bayes and SVM classifier algorithms and conduct an experiment. In 2019, Dahlqvist and Strandlund had attempted this and reached an accuracy score of 52 percent by use of the gradient boosting classifier, but this was considered too low for real-world implementation. Our main goal was to achieve a higher accuracy than that of Dahlqvist and Strandlund, which we eventually succeeded in - our best-performing SVM model reached a score of 60.11 percent. Similarly to Dahlqvist and Strandlund, we concluded that the low quality of the dataset was the main obstacle for achieving higher scores. The dataset we used was severely imbalanced, and much time was spent on investigating and applying oversampling and undersampling as strategies for mitigating this problem. However, we found during the testing phase that none of these strategies had any positive effect on the accuracy scores. Machine learning text classification SNI Naive Bayes SVM oversampling undersampling Computer Sciences Datavetenskap (datalogi)
43	Variant Detection Using Next Generation Sequencing Data Pyon, Yoon Soo 08 March 2013 (has links) No description available. Bioinformatics Computer Science Genetics structural variation SNP next generation sequencing naive Bayes classifier
44	Relationships Among Learning Algorithms and Tasks Lee, Jun won 27 January 2011 (has links) (PDF) Metalearning aims to obtain knowledge of the relationship between the mechanism of learning and the concrete contexts in which that mechanism is applicable. As new mechanisms of learning are continually added to the pool of learning algorithms, the chances of encountering behavior similarity among algorithms increase. Understanding the relationships among algorithms and the interactions between algorithms and tasks helps to narrow down the space of algorithms to search for a given learning task. In addition, this process helps to disclose factors contributing to the similar behavior of different algorithms. We first study general characteristics of learning tasks and their correlation with the performance of algorithms, isolating two metafeatures whose values are fairly distinguishable between easy and hard tasks. We then devise a new metafeature that measures the difficulty of a learning task independently of the performance of learning algorithms on it. Building on these preliminary results, we then investigate more formally how we might measure the behavior of algorithms at a finer grained level than a simple dichotomy between easy and hard tasks. We prove that, among many possible candidates, the Classifier Output Difference (COD) measure is the only one possessing the properties of a metric necessary for further use in our proposed behavior-based clustering of learning algorithms. Finally, we cluster 21 algorithms based on COD and show the value of the clustering in 1) highlighting interesting behavior similarity among algorithms, which leads us to a thorough comparison of Naive Bayes and Radial Basis Function Network learning, and 2) designing more accurate algorithm selection models, by predicting clusters rather than individual algorithms. MetaLearning Classifier Output Difference Naive Bayes radial basis function network clustering algorithm selection model Computer Sciences
45	Identifying Interesting Posts on Social Media Sites Seethakkagari, Swathi, M.S. 21 September 2012 (has links) No description available. Computer Science Social networks k-nearest neighbors Naive Bayes Classification Confusion Matrix
46	Exploring the Noise Resilience of Combined Sturges Algorithm Agarwal, Akrita January 2015 (has links) No description available. Computer Science Noise Resilience Machine Learning Algorithms Combined Sturges Naive Bayes k nearest neighbor
47	A Massively Parallel Algorithm for Cell Classification Using CUDA Schmidt, Samuel January 2015 (has links) No description available. Computer Science Cell Classification Parallel Computing CUDA Machine Learning Naive Bayes CUDA Reduce
48	Using sentiment analysis to craft a narrative of the COVID-19 pandemic from the perspective of social media Ray, Taylor Breanna 06 August 2021 (has links) Throughout the COVID-19 pandemic, people have turned to social media to share their experiences with the coronavirus and their feelings regarding subjects like social distancing, mask-wearing, COVID-19 vaccines, and other related topics. The publicly available nature of these social media posts provides researchers the chance to obtain a consensus on an array of issues, topics, people, and entities. For the COVID-19 pandemic, this is valuable information that can prepare communities and governing bodies for future epidemics or events of a similar magnitude. However, clearly defining such a consensus can be difficult, especially if researchers want to limit the amount of bias they introduce. The process of sentiment analysis helps to address this need by categorizing text sources into one of three distinct polarities. Namely, those polarities are often positive, neutral, and negative. While sentiment analysis can take form as a completely manual task, this becomes incredibly burdensome for projects that involve substantial amounts of data. This thesis attempts to overcome this challenge by programmatically classifying the sentiment of COVID-19 posts from 10 social media and web-based forums using a multinomial Naive Bayes classifier. The unique and contrasting qualities of the social networks being analyzed provide a robust take on the public's perception of the pandemic that has not yet been offered up to the present. sentiment sentiment analysis naive bayes machine learning classification COVID-19 pandemic social media
49	Social media analysis for product safety using text mining and sentiment analysis Isa, H., Trundle, Paul R., Neagu, Daniel January 2014 (has links) The growing incidents of counterfeiting and associated economic and health consequences necessitate the development of active surveillance systems capable of producing timely and reliable information for all stakeholders in the anti-counterfeiting fight. User generated content from social media platforms can provide early clues about product allergies, adverse events and product counterfeiting. This paper reports a work in progress with contributions including: the development of a framework for gathering and analyzing the views and experiences of users of drug and cosmetic products using machine learning, text mining and sentiment analysis; the application of the proposed framework to Facebook comments and data from Twitter for brand analysis; and the description of how to develop a product safety lexicon and training data for modeling a machine learning classifier for drug and cosmetic product sentiment prediction. The initial brand and product comparison results signify the usefulness of text mining and sentiment analysis on social media data, while the use of a machine learning classifier for predicting the sentiment orientation provides a useful tool for users, product manufacturers, and regulatory and enforcement agencies to monitor brand or product sentiment trends in order to act in the event of a sudden or significant rise in negative sentiment.
50	Topological data analysis: applications in machine learning / Análise topológica de dados: aplicações em aprendizado de máquina Calcina, Sabrina Graciela Suárez 05 December 2018 (has links) Recently, computational topology has seen important developments in data analysis, giving rise to the field of Topological Data Analysis. Persistent homology appears as a fundamental tool based on the topology of data that can be represented as points in a metric space. In this work, we apply techniques of Topological Data Analysis; more precisely, we use persistent homology to calculate the more persistent topological features in data. The persistence diagrams are then processed into feature vectors to which Machine Learning algorithms are applied. For classification, we used the following classifiers: Partial Least Squares-Discriminant Analysis, Support Vector Machine, and Naive Bayes. For regression, we used Support Vector Regression and KNeighbors. Finally, we give a statistical approach to analyze the accuracy of each classifier and regressor. Betti numbers KNeighbors regressor Naive Bayes classifier Persistence diagrams Persistent homology PLS-DA classifier Protein classification SVM classifier SVR regressor
