• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 60
  • 18
  • 13
  • 7
  • 5
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 124
  • 124
  • 66
  • 60
  • 49
  • 43
  • 28
  • 28
  • 27
  • 27
  • 24
  • 21
  • 20
  • 19
  • 17
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
121

All Negative on the Western Front: Analyzing the Sentiment of the Russian News Coverage of Sweden with Generic and Domain-Specific Multinomial Naive Bayes and Support Vector Machines Classifiers / På västfronten intet gott: attitydanalys av den ryska nyhetsrapporteringen om Sverige med generiska och domänspecifika Multinomial Naive Bayes- och Support Vector Machines-klassificerare

Michel, David January 2021 (has links)
This thesis explores to what extent Multinomial Naive Bayes (MNB) and Support Vector Machines (SVM) classifiers can be used to determine the polarity of news, specifically the news coverage of Sweden by the Russian state-funded news outlets RT and Sputnik. Three experiments are conducted.  In the first experiment, an MNB and an SVM classifier are trained with the Large Movie Review Dataset (Maas et al., 2011) with a varying number of samples to determine how training data size affects classifier performance.  In the second experiment, the classifiers are trained with 300 positive, negative, and neutral news articles (Agarwal et al., 2019) and tested on 95 RT and Sputnik news articles about Sweden (Bengtsson, 2019) to determine if the domain specificity of the training data outweighs its limited size.  In the third experiment, the movie-trained classifiers are put up against the domain-specific classifiers to determine if well-trained classifiers from another domain perform better than relatively untrained, domain-specific classifiers.  Four different types of feature sets (unigrams, unigrams without stop words removal, bigrams, trigrams) were used in the experiments. Some of the model parameters (TF-IDF vs. feature count and SVM’s C parameter) were optimized with 10-fold cross-validation.  Other than the superior performance of SVM, the results highlight the need for comprehensive and domain-specific training data when conducting machine learning tasks, as well as the benefits of feature engineering, and to a limited extent, the removal of stop words. Interestingly, the classifiers performed the best on the negative news articles, which made up most of the test set (and possibly of Russian news coverage of Sweden in general).
122

Language identification for typologicallysimilar low-resource languages : A case study of Meänkieli, Kven and Finnish / Språkidentifering för typologiskt närbesläktadelågresursspråk : En fallstudie för meänkieli, kvänska och finska

Larsson, Jacob January 2024 (has links)
This studies examines different methods of language identification for the languages Meänkieli,Kven and Finnish. The methods explored are two n-gram based classifiers; Naive Bayes andTextCat and one word embedding based classifier; fastText. These models were trained on ap-proximately 100 000 sentences taken from the three languages and further divided into fourseparate datasets to examine how the availability of data impacts the final performance of thetrained models. The study found that the best model for the examined dataset was the fastTextclassifier, but for languages with less available material a naive Bayes classifier might be moreappropriate. / Denna studie utforskar olika metoder av språkidentifering för språken meänkieli, kvänska ochfinska. Två metoder baserade på n-gram undersöks; naive Bayes och TextCat samt en metodmed ordinbäddningar; fastText. Dessa modeller tränades på sammanlagt 100 000 meningartaget från dessa tre språk och delades vidare in i fyra delmängder för att utvärdera hur storinverkan storleken av träningsdata har på de tränade modellerna. Studien fann att den bästaimplementationen utifrån den undersökta datamängden var fastText, medans språk med färreresurser skulle förmodligen gynnas bättre av en språkidentifering byggd med en naive Bayesklassifierare.
123

Improved in silico methods for target deconvolution in phenotypic screens

Mervin, Lewis January 2018 (has links)
Target-based screening projects for bioactive (orphan) compounds have been shown in many cases to be insufficiently predictive for in vivo efficacy, leading to attrition in clinical trials. Phenotypic screening has hence undergone a renaissance in both academia and in the pharmaceutical industry, partly due to this reason. One key shortcoming of this paradigm shift is that the protein targets modulated need to be elucidated subsequently, which is often a costly and time-consuming procedure. In this work, we have explored both improved methods and real-world case studies of how computational methods can help in target elucidation of phenotypic screens. One limitation of previous methods has been the ability to assess the applicability domain of the models, that is, when the assumptions made by a model are fulfilled and which input chemicals are reliably appropriate for the models. Hence, a major focus of this work was to explore methods for calibration of machine learning algorithms using Platt Scaling, Isotonic Regression Scaling and Venn-Abers Predictors, since the probabilities from well calibrated classifiers can be interpreted at a confidence level and predictions specified at an acceptable error rate. Additionally, many current protocols only offer probabilities for affinity, thus another key area for development was to expand the target prediction models with functional prediction (activation or inhibition). This extra level of annotation is important since the activation or inhibition of a target may positively or negatively impact the phenotypic response in a biological system. Furthermore, many existing methods do not utilize the wealth of bioactivity information held for orthologue species. We therefore also focused on an in-depth analysis of orthologue bioactivity data and its relevance and applicability towards expanding compound and target bioactivity space for predictive studies. The realized protocol was trained with 13,918,879 compound-target pairs and comprises 1,651 targets, which has been made available for public use at GitHub. Consequently, the methodology was applied to aid with the target deconvolution of AstraZeneca phenotypic readouts, in particular for the rationalization of cytotoxicity and cytostaticity in the High-Throughput Screening (HTS) collection. Results from this work highlighted which targets are frequently linked to the cytotoxicity and cytostaticity of chemical structures, and provided insight into which compounds to select or remove from the collection for future screening projects. Overall, this project has furthered the field of in silico target deconvolution, by improving the performance and applicability of current protocols and by rationalizing cytotoxicity, which has been shown to influence attrition in clinical trials.
124

Porovnání klasifikačních metod / Comparison of Classification Methods

Dočekal, Martin January 2019 (has links)
This thesis deals with a comparison of classification methods. At first, these classification methods based on machine learning are described, then a classifier comparison system is designed and implemented. This thesis also describes some classification tasks and datasets on which the designed system will be tested. The evaluation of classification tasks is done according to standard metrics. In this thesis is presented design and implementation of a classifier that is based on the principle of evolutionary algorithms.

Page generated in 0.0409 seconds