111

Sentiment-Driven Topic Analysis Of Song Lyrics

Sharma, Govind 08 1900
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further subdivided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons. For an unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs; each topic is simply a probability distribution over the vocabulary of words. Some of the topics seem sentiment-based, motivating us to continue with this approach. We evaluate our clusters using a gold dataset collected from an apt website and get positive results. This approach would be useful in the absence of a supervised dataset. In another part of our work, we argue the inescapable existence of supervision in terms of having to manually analyse the topics returned. Further, we have also used explicit supervision in terms of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We get excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that the results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
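As an illustration of the Naive Bayes classification step mentioned in this abstract, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not the thesis's implementation: the function names `train_nb` and `predict_nb`, the toy lyrics, and the labels `happy` and `sad` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count documents per class and words per class from (label, text) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        class_docs[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior for the given text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        # Log prior of the class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (class, word) pairs non-zero.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the thesis.
TOY_LYRICS = [
    ("happy", "sunshine dancing love smile"),
    ("happy", "love laughing sunshine"),
    ("sad", "tears lonely rain goodbye"),
    ("sad", "lonely crying rain"),
]
model = train_nb(TOY_LYRICS)
```

Calling `predict_nb(model, "sunshine and love")` scores each class by its log prior plus the smoothed log likelihood of every known word, and returns the higher-scoring label.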
112

Optimalizace strojového učení pro predikci KPI / Machine Learning Optimization of KPI Prediction

Haris, Daniel January 2018
This thesis aims to optimize machine learning algorithms for predicting KPI metrics for an organization. The organization uses machine learning to predict whether projects will meet the planned deadlines of the last phase of the development process. The work focuses on the analysis of prediction models and sets the goal of selecting new candidate models for the prediction system. We have implemented a system that automatically selects the best feature variables for learning. Trained models were evaluated by several performance metrics and the best candidates were chosen for the prediction. Candidate models achieved higher accuracy, which means that the prediction system provides more reliable responses. We suggested other improvements that could increase the accuracy of the forecast.
113

Analýza klasifikačních metod / Analysis of Classification Methods

Juríček, Jakub January 2019
This work deals with the classification methods used in the process of knowledge discovery from data and discusses the possibilities of their validation and comparison. Through experiments, the work focuses on the analysis of four selected methods: Naive Bayes classifier, decision tree, neural network and SVM. Factors influencing basic characteristics such as training speed, classification speed and accuracy are examined. Part of the thesis is a desktop application that serves as a tool for training, testing and validating the individual methods. Eleven reference data sets are selected for experimental purposes. The work concludes with a summary of the experimental comparison and the observed characteristics of the classification methods.
114

Identifikace zařízení na základě jejich chování v síti / Behaviour-Based Identification of Network Devices

Polák, Michael Adam January 2020
This thesis deals with the identification of network devices based on their behaviour in the network. With the constantly growing number of devices on the network, the ability to identify devices is becoming increasingly important for security reasons. The thesis further discusses the fundamentals of computer networks and the methods used in the past to identify network devices. The algorithms used in machine learning are then described, along with their advantages and disadvantages. Finally, the thesis tests two traditional machine learning algorithms and proposes two new approaches to identifying network devices. The resulting proposed algorithm achieves 89% accuracy in identifying network devices on a real data set with more than 10,000 devices.
115

Využití metod dolování dat pro analýzu sociálních sítí / Using of Data Mining Method for Analysis of Social Networks

Novosad, Andrej January 2013
This thesis discusses data mining of social media. It introduces the topic of data mining and the available mining methods. The thesis also explores social media and social networks: what they are able to offer and what problems they bring. The APIs of three social networking sites are examined, along with the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The implementation of a web application that mines data from the social site Twitter using the SVM algorithm is described. The implemented application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments, executed both in the RapidMiner software and in the implemented web application, are then proposed and their results examined.
116

Inteligentní emailová schránka / Intelligent Mailbox

Pohlídal, Antonín January 2012
This master's thesis deals with the use of text classification for sorting incoming emails. First, Knowledge Discovery in Databases is described, and text classification with selected methods is analyzed in detail. The thesis then describes email communication and the SMTP, POP3 and IMAP protocols. The next part contains the design of a system that classifies incoming emails and describes the related technologies, i.e. Apache James Server, PostgreSQL and RapidMiner. The implementation of all necessary components is then described. The last part contains experiments with the email server using the Enron Dataset.
117

Elektronický modul pro akustickou detekci / Electronic module for acoustic detection

Maršál, Martin January 2016
This diploma thesis deals with the design and implementation of an electronic module for acoustic detection. The module's task is to detect predetermined acoustic signals using a learned classification model. The module is used mainly for security purposes. The proposed model uses machine learning techniques for identification and classification. Because it can be retrained for a different set of sounds, the module becomes a universal sound detector. Sound is captured with a digital MEMS microphone, for which a conversion filter is designed and implemented. The resulting system is implemented as microcontroller firmware with a real-time operating system. The various functions of the system are realized with regard to possible optimization (a less powerful MCU or battery power). The module transmits the detection results to the master station over an Ethernet network. When multiple modules are connected to the network, they form a distributed system, which is designed for precise time synchronization using the PTP protocol defined by the IEEE 1588 standard.
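The PTP synchronization mentioned at the end of this abstract rests on a simple timestamp calculation. The sketch below shows the standard delay request-response arithmetic under the assumption of a symmetric network path; the function name `ptp_offset_and_delay` and the example timestamps are hypothetical and are not taken from the thesis.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay from four timestamps.

    t1: master sends Sync (master clock)
    t2: slave receives Sync (slave clock)
    t3: slave sends Delay_Req (slave clock)
    t4: master receives Delay_Req (master clock)
    Assumes the path delay is the same in both directions.
    """
    master_to_slave = t2 - t1  # path delay + offset
    slave_to_master = t4 - t3  # path delay - offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay


# Hypothetical example: the slave clock runs 5 units ahead of the master
# and the one-way path delay is 2 units.
offset, delay = ptp_offset_and_delay(t1=100, t2=107, t3=110, t4=107)
```

A positive offset means the slave clock is ahead of the master, so the slave would subtract the offset from its own time to align with the master.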
118

Performance Comparison of Public Bike Demand Predictions: The Impact of Weather and Air Pollution

Min Namgung (9380318) 15 December 2020
Many metropolitan cities encourage people to use public bike-sharing programs as alternative transportation for many reasons. Due to their popularity, multiple types of research on optimizing public bike-sharing systems are conducted at the city, neighborhood, station, or user level to predict public bike demand. Previously, research on public bike demand prediction primarily focused on discovering a relationship with weather as an external factor that possibly impacted bike usage, or on analyzing the bike user trend in one aspect. This work hypothesizes two external factors that are likely to affect public bike demand: weather and air pollution. This study uses a public bike data set, daily temperature, precipitation data, and air condition data to discover the trend of bike usage using multiple machine learning techniques such as Decision Tree, Naïve Bayes, and Random Forest. After conducting the research, each algorithm’s output is evaluated with performance measures such as accuracy, precision, or sensitivity. As a result, Random Forest is an efficient classifier for bike demand prediction by weather and precipitation, and Decision Tree performs best for bike demand prediction by air pollutants. Also, the three-class labeling of daily bike demand has high specificity and makes it easy to trace the trend of the public bike system.
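The evaluation measures named in this abstract (accuracy, precision, sensitivity, specificity) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch for one class treated as positive; the function name `binary_metrics`, the labels `high` and `low`, and the toy predictions are hypothetical and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, sensitivity and specificity for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),  # also called recall
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical daily demand labels.
y_true = ["high"] * 4 + ["low"] * 6
y_pred = ["high", "high", "high", "low", "high", "low", "low", "low", "low", "low"]
metrics = binary_metrics(y_true, y_pred, positive="high")
```

For a three-class labeling such as the one the abstract describes, the same function can be applied once per class (one-vs-rest), treating each class in turn as `positive`.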
119

All Negative on the Western Front: Analyzing the Sentiment of the Russian News Coverage of Sweden with Generic and Domain-Specific Multinomial Naive Bayes and Support Vector Machines Classifiers / På västfronten intet gott: attitydanalys av den ryska nyhetsrapporteringen om Sverige med generiska och domänspecifika Multinomial Naive Bayes- och Support Vector Machines-klassificerare

Michel, David January 2021
This thesis explores to what extent Multinomial Naive Bayes (MNB) and Support Vector Machines (SVM) classifiers can be used to determine the polarity of news, specifically the news coverage of Sweden by the Russian state-funded news outlets RT and Sputnik. Three experiments are conducted.  In the first experiment, an MNB and an SVM classifier are trained with the Large Movie Review Dataset (Maas et al., 2011) with a varying number of samples to determine how training data size affects classifier performance.  In the second experiment, the classifiers are trained with 300 positive, negative, and neutral news articles (Agarwal et al., 2019) and tested on 95 RT and Sputnik news articles about Sweden (Bengtsson, 2019) to determine if the domain specificity of the training data outweighs its limited size.  In the third experiment, the movie-trained classifiers are put up against the domain-specific classifiers to determine if well-trained classifiers from another domain perform better than relatively untrained, domain-specific classifiers.  Four different types of feature sets (unigrams, unigrams without stop words removal, bigrams, trigrams) were used in the experiments. Some of the model parameters (TF-IDF vs. feature count and SVM’s C parameter) were optimized with 10-fold cross-validation.  Other than the superior performance of SVM, the results highlight the need for comprehensive and domain-specific training data when conducting machine learning tasks, as well as the benefits of feature engineering, and to a limited extent, the removal of stop words. Interestingly, the classifiers performed the best on the negative news articles, which made up most of the test set (and possibly of Russian news coverage of Sweden in general).
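The 10-fold cross-validation used for parameter tuning in this abstract partitions the training data into folds, each of which serves once as the validation set. The sketch below generates such index splits; it is an illustration only, with the hypothetical function name `k_fold_indices` and a hypothetical sample count, and it omits the shuffling or stratification that a real experiment would usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical example: 25 samples split into 10 folds.
folds = list(k_fold_indices(25, 10))
fold_sizes = [len(test) for _, test in folds]
```

A parameter such as the SVM's C value would then be chosen by training on each `train` split, scoring on the matching `test` split, and averaging the scores over all folds.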
120

Improved in silico methods for target deconvolution in phenotypic screens

Mervin, Lewis January 2018
Target-based screening projects for bioactive (orphan) compounds have been shown in many cases to be insufficiently predictive for in vivo efficacy, leading to attrition in clinical trials. Phenotypic screening has hence undergone a renaissance in both academia and in the pharmaceutical industry, partly due to this reason. One key shortcoming of this paradigm shift is that the protein targets modulated need to be elucidated subsequently, which is often a costly and time-consuming procedure. In this work, we have explored both improved methods and real-world case studies of how computational methods can help in target elucidation of phenotypic screens. One limitation of previous methods has been the ability to assess the applicability domain of the models, that is, when the assumptions made by a model are fulfilled and which input chemicals are reliably appropriate for the models. Hence, a major focus of this work was to explore methods for calibration of machine learning algorithms using Platt Scaling, Isotonic Regression Scaling and Venn-Abers Predictors, since the probabilities from well calibrated classifiers can be interpreted at a confidence level and predictions specified at an acceptable error rate. Additionally, many current protocols only offer probabilities for affinity, thus another key area for development was to expand the target prediction models with functional prediction (activation or inhibition). This extra level of annotation is important since the activation or inhibition of a target may positively or negatively impact the phenotypic response in a biological system. Furthermore, many existing methods do not utilize the wealth of bioactivity information held for orthologue species. We therefore also focused on an in-depth analysis of orthologue bioactivity data and its relevance and applicability towards expanding compound and target bioactivity space for predictive studies. 
The realized protocol was trained with 13,918,879 compound-target pairs and comprises 1,651 targets, which has been made available for public use at GitHub. Consequently, the methodology was applied to aid with the target deconvolution of AstraZeneca phenotypic readouts, in particular for the rationalization of cytotoxicity and cytostaticity in the High-Throughput Screening (HTS) collection. Results from this work highlighted which targets are frequently linked to the cytotoxicity and cytostaticity of chemical structures, and provided insight into which compounds to select or remove from the collection for future screening projects. Overall, this project has furthered the field of in silico target deconvolution, by improving the performance and applicability of current protocols and by rationalizing cytotoxicity, which has been shown to influence attrition in clinical trials.
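Of the calibration methods named in this abstract, Platt Scaling is the simplest to illustrate: it fits a sigmoid that maps raw classifier scores to probabilities by minimizing log loss on held-out data. The following is a simplified sketch that uses plain gradient descent and omits the target smoothing of Platt's original formulation; the names `fit_platt` and `calibrate`, the learning rate, and the toy scores are hypothetical and unrelated to the thesis's protocol.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw classifier score to a calibrated probability."""
    return sigmoid(a * score + b)


# Hypothetical held-out scores (e.g. SVM decision values) and true labels.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 1, 0, 1, 1]
a, b = fit_platt(scores, labels)
```

After fitting, `calibrate(score, a, b)` returns a value between 0 and 1 that increases with the raw score, which is what allows predictions to be thresholded at a chosen confidence level.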
