  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

Rozpoznávání pojmenovaných entit pomocí neuronových sítí / Neural Network Based Named Entity Recognition

Straková, Jana January 2017
Title: Neural Network Based Named Entity Recognition. Author: Jana Straková. Institute: Institute of Formal and Applied Linguistics. Supervisor of the doctoral thesis: prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics. Abstract: Czech named entity recognition (the task of automatic identification and classification of proper names in text, such as names of people, locations and organizations) has become a well-established field since the publication of the Czech Named Entity Corpus (CNEC). This doctoral thesis presents the author's research on named entity recognition, mainly in the Czech language. It describes the work carried out during the publication and evaluation of CNEC. It further covers the author's research results, which have advanced the state of the art in Czech named entity recognition in recent years, with a special focus on solutions based on artificial neural networks. Starting with a simple feed-forward neural network with a softmax output layer and a standard set of classification features for the task, the thesis presents the methodology and results that were later used in NameTag, an open-source software solution for named entity recognition. The thesis concludes with a recurrent neural network based recognizer with word embeddings and character-level word embeddings,...
102

Aplikace metody učení bez učitele na hledání podobných grafů / Application of Unsupervised Learning Methods in Graph Similarity Search

Sabo, Jozef January 2021
The goal of this master's thesis, carried out in cooperation with the company Avast, was to design a system that can extract knowledge from a database of graphs. The graphs used for data mining describe the behaviour of computer systems and are inserted anonymously into the company's database from the systems of users of the company's products. Each graph in the database can be assigned one of two labels: clean or malware (malicious). The task of the proposed self-learning system is to find clusters of graphs in the database in which the classes of graphs do not mix. Clusters containing only one class of graphs can be interpreted as different types of clean or malware graphs, and they are a useful starting point for further analysis of the graphs. To evaluate the quality of the clusters, a custom metric named monochromaticity was designed. The metric scores clusters by how much clean and malware graphs are mixed within them. The best values of the metric were obtained when vector representations of the graphs were created by a deep learning model (a variational graph autoencoder with two relation graph convolution operators) and the parameterless MeanShift method was used to cluster the vectors.
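The abstract does not define monochromaticity, so the following is only a stand-in sketch of the general idea of scoring how little two classes mix inside clusters. It uses plain majority-class purity, which is an assumption and not the thesis's metric; the function name `cluster_purity` and the sample data are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, labels):
    """Share of items that carry the majority label of their own cluster.

    Hypothetical stand-in for a "how mixed are the clusters" score:
    1.0 means every cluster holds a single class.
    """
    per_cluster = {}
    for cid, label in zip(cluster_ids, labels):
        per_cluster.setdefault(cid, Counter())[label] += 1
    # Count, for each cluster, the items belonging to its most common class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(labels)


# Made-up assignments: cluster 0 mixes both classes, clusters 1 and 2 are pure.
cluster_ids = [0, 0, 0, 1, 1, 2]
labels = ["clean", "clean", "malware", "malware", "malware", "clean"]
print(cluster_purity(cluster_ids, labels))
```

In this toy input five of the six graphs sit with their cluster's majority class, so the score is 5/6.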
103

On the effect of asymmetry and dimension on computational geometric problems

Sridhar, Vijay, Sridhar 07 November 2018
No description available.
104

Implementation and evaluation of two prediction techniques for the Lorenz time series

Huddlestone, Grant E 03 1900
Thesis (MSc)-- Stellenbosch University, 2003. / ENGLISH ABSTRACT: This thesis implements and evaluates two prediction techniques used to forecast deterministic chaotic time series. Many such techniques require the reconstruction of the phase space attractor associated with the time series. Embedding is presented as the means of reconstructing the attractor from limited data. Methods for obtaining the minimal embedding dimension from the false neighbour heuristic and the optimal time delay from the average mutual information method are discussed. The first prediction algorithm discussed is based on work by Sauer, which applies the singular value decomposition to data obtained from the embedding of the time series being predicted. The second prediction algorithm is based on neural networks. A specific architecture suited to the prediction of deterministic chaotic time series, namely the time dependent neural network architecture, is discussed and implemented. Adaptations to the back propagation training algorithm for use with the time dependent neural networks are also presented. Both algorithms are evaluated by means of predictions made for the well-known Lorenz time series. Different embedding and algorithm-specific parameters are used to obtain predicted time series. Actual values corresponding to the predictions are obtained from the Lorenz time series and are used to evaluate the prediction accuracies. The predicted time series are evaluated in terms of two criteria, prediction accuracy and qualitative behavioural accuracy. Behavioural accuracy refers to the ability of the algorithm to simulate qualitative features of the time series being predicted. It is shown that for both algorithms, choosing an embedding dimension greater than the minimum embedding dimension obtained from the false neighbour heuristic produces greater prediction accuracy.
For the neural network algorithm, values of the embedding dimension greater than the minimum embedding dimension satisfy the behavioural criterion adequately, as expected. Sauer's algorithm has the greatest behavioural accuracy for embedding dimensions smaller than the minimal embedding dimension. In terms of the time delay, it is shown that both algorithms have the greatest prediction accuracy for values of the time delay in a small interval around the optimal time delay. The neural network algorithm is shown to have the greatest behavioural accuracy for time delays close to the optimal time delay, and Sauer's algorithm has the best behavioural accuracy for small values of the time delay. Matlab code is presented for both algorithms. / AFRIKAANS SUMMARY: In this thesis, two prediction techniques suitable for forecasting deterministic chaotic time series are implemented and evaluated. Such techniques usually require the reconstruction of the attractor in phase space associated with the time series. Embedding methods are presented as a way of reconstructing the attractor from limited data. Methods for computing the minimum embedding dimension from the false neighbour heuristic and the optimal time delay from average mutual information are discussed. The first prediction algorithm discussed is based on a technique by Sauer. This algorithm applies the singular value decomposition to the embedded time series being predicted. The second prediction algorithm is based on neural networks. A specific network architecture suited to deterministic chaotic time series, namely the time dependent neural network architecture, is discussed and implemented. A modification of the back propagation learning algorithm for use with the time dependent neural network is also presented. Both algorithms are evaluated by making predictions for the well-known Lorenz time series.
Various embedding parameters and other algorithm-specific parameters are used to make the predictions. The actual values from the Lorenz time series are used to evaluate the predictions and to determine prediction accuracy. The predicted time series are evaluated on the basis of two criteria, namely prediction accuracy and qualitative behavioural accuracy. Behavioural accuracy refers to the ability of the algorithm to simulate the qualitative features of the time series correctly. It is shown that for both algorithms, choosing an embedding dimension greater than the minimum embedding dimension computed from the false neighbour heuristic gives greater accuracy. For the neural network algorithm, an embedding dimension greater than the minimum embedding dimension also gives better behavioural accuracy, as expected. For Sauer's algorithm, however, better behavioural accuracy is found for an embedding dimension smaller than the minimal embedding dimension. In terms of the time delay, it is shown that for both algorithms the greatest prediction accuracy is obtained for time delays in an interval around the optimal time delay. It is also shown that the neural network algorithm gives the best behavioural accuracy for time delays close to the optimal time delay, while Sauer's algorithm gives better behavioural accuracy at smaller values of the time delay. The Matlab code for both algorithms is also presented.
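The embedding step that both algorithms in this abstract rely on can be sketched as a time-delay embedding. This is a minimal Python illustration, not the thesis's Matlab code; the function name `delay_embed` and the forward-delay convention are assumptions, and the embedding dimension and time delay are taken as given rather than estimated.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]).

    dim is the embedding dimension and tau the time delay, both in samples.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau  # samples needed beyond the first coordinate
    return [
        tuple(series[t + k * tau] for k in range(dim))
        for t in range(len(series) - span)
    ]


# Toy series; a real run would use samples of one Lorenz coordinate.
vectors = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
print(vectors)  # [(0, 2, 4), (1, 3, 5)]
```

Each tuple is one reconstructed phase-space point; a predictor then works on these points instead of the raw scalar series.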
105

Using Word Embeddings to Explore the Language of Depression on Twitter

Gopchandani, Sandhya 01 January 2019
How do people discuss mental health on social media? Can we train a computer program to recognize differences between discussions of depression and other topics? Can an algorithm predict that someone is depressed from their tweets alone? In this project, we collect tweets referencing “depression” and “depressed” over a seven-year period and train word embeddings to characterize linguistic structures within the corpus. We find that neural word embeddings capture the contextual differences between “depressed” and “healthy” language. We also examine how the context around words may have changed over time to gain a deeper understanding of contextual shifts in word usage. Finally, we train a deep learning network on a much smaller collection of tweets authored by individuals formally diagnosed with depression. The best-performing model for the prediction task is a Convolutional LSTM (CNN-LSTM) model with an F-score of 69% on test data. The results suggest social media could serve as a valuable screening tool for mental health.
106

Singularity theorems and the abstract boundary construction

Ashley, Michael John Siew Leung, ashley@gravity.psu.edu January 2002
The abstract boundary construction of Scott and Szekeres has proven a practical classification scheme for boundary points of pseudo-Riemannian manifolds. It has also proved its utility in problems associated with the re-embedding of exact solutions containing directional singularities in space-time. Moreover, it provides a model for singularities in space-time: essential singularities. However, the literature has been devoid of abstract boundary results of direct physical applicability.
This thesis presents several theorems on the existence of essential singularities in space-time and on how the abstract boundary allows definition of optimal embeddings for depicting space-time. Firstly, a review of other boundary constructions for space-time is made, with particular emphasis on the deficiencies they possess for describing singularities. The abstract boundary construction is then pedagogically defined and an overview of previous research provided.
We prove that strongly causal, maximally extended space-times possess essential singularities if and only if they possess incomplete causal geodesics. This result creates a link between the Hawking-Penrose incompleteness theorems and the existence of essential singularities. Using this result together with the work of Beem on the stability of geodesic incompleteness, it is possible to prove the stability of existence for essential singularities.
Invariant topological contact properties of abstract boundary points are presented for the first time and used to define partial cross sections, which are a generalization of the notion of embedding for boundary points. Partial cross sections are then used to define a model for an optimal embedding of space-time.
Finally, we end with a presentation of the current research into the relationship between curvature singularities and the abstract boundary. This work proposes that the abstract boundary may provide the correct framework to prove curvature singularity theorems for General Relativity. This exciting development would be the culmination of over 30 years of research into the physical conditions required for curvature singularities in space-time.
107

Klasifikace žánrů pomocí strojového učení / Genres classification by means of machine learning

Bílek, Jan January 2018
In this thesis, we compare the bag-of-words approach with doc2vec document embeddings on the task of classifying book genres. We create 3 datasets with different text lengths by extracting short snippets from books in the Project Gutenberg repository. Each dataset comprises more than 200,000 documents and 14 different genres. For 3200-character documents, we achieve an F1-score of 0.862 when stacking models trained on both bag-of-words and doc2vec representations. We also explore the relationships between documents, genres and words using similarity metrics on their vector representations and report typical words for each genre. As part of the thesis, we also present an online web app for book genre classification.
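As a rough illustration of the bag-of-words representation and of a similarity metric on document vectors, the sketch below counts whitespace-separated tokens and compares two snippets with cosine similarity. The tokenisation, the choice of cosine similarity, the function names and the sample texts are all assumptions; the abstract does not say which preprocessing or metrics the thesis uses.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Made-up snippets standing in for book excerpts.
doc_a = bag_of_words("the ship sailed")
doc_b = bag_of_words("the ship sank")
print(cosine_similarity(doc_a, doc_b))
```

The two snippets share two of their three tokens, so the similarity is 2/3. A learned embedding such as doc2vec replaces the sparse counts with dense vectors, but the same similarity computation applies.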
108

Automatic Poetry Classification Using Natural Language Processing

Kesarwani, Vaibhav January 2018
Poetry, as a special form of literature, is crucial for computational linguistics. It has a high density of emotions, figures of speech, vividness, creativity, and ambiguity. Poetry poses a much greater challenge for the application of Natural Language Processing algorithms than any other literary genre. Our system establishes a computational model that classifies poems based on similarity features like rhyme, diction, and metaphor. For rhyme analysis, we investigate the methods used to classify poems based on rhyme patterns. First, an overview of the different types of rhyme is given, along with a detailed description of how rhyme types and sub-types are detected by applying a pronunciation dictionary to our poetry dataset. We achieve an accuracy of 96.51% in identifying rhymes in poetry by applying a phonetic similarity model. We then derive a rhyme quantification metric, RhymeScore, based on the matching phonetic transcription of each poem. We also develop an application for the visualization of this quantified RhymeScore as a scatter plot in 2 or 3 dimensions. For diction analysis, we investigate the methods used to classify poems based on diction. First, the linguistic quantitative and semantic features that constitute diction are enumerated. Then we investigate the methodology used to compute these features from our poetry dataset. We also build a word embeddings model on our poetry dataset with 1.5 million words in 100 dimensions and do a comparative analysis with GloVe embeddings. Metaphor is a part of diction, but as it is a very complex topic in its own right, we address it as a stand-alone issue and develop several methods for it. Previous work on metaphor detection relies on either rule-based or statistical models, none of them applied to poetry. Our methods focus on metaphor detection in a poetry corpus, but we test on non-poetry data as well. We combine rule-based and statistical models (word embeddings) to develop a new classification system.
Our first metaphor detection method achieves a precision of 0.759 and a recall of 0.804 in identifying one type of metaphor in poetry, by using a Support Vector Machine classifier with various types of features. Furthermore, our deep learning model based on a Convolutional Neural Network achieves a precision of 0.831 and a recall of 0.836 for the same task. We also develop an application for generic metaphor detection in any type of natural text.
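The abstract reports precision and recall separately for the two metaphor detection models. As a worked example, the harmonic mean of the two (the F1 score) can be computed from those figures. The F1 values are derived here for illustration and are not reported in the abstract; the function name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract above.
svm_f1 = f1_score(0.759, 0.804)  # Support Vector Machine classifier
cnn_f1 = f1_score(0.831, 0.836)  # Convolutional Neural Network model
print(round(svm_f1, 3), round(cnn_f1, 3))
```

Under these figures the derived F1 is roughly 0.78 for the SVM and 0.83 for the CNN, which matches the ordering suggested by the individual precision and recall values.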
109

Exploring the Compositionality of German Particle Verbs

Rawein, Carina January 2018
In this thesis we explore the compositionality of particle verbs using distributional similarity and pre-trained word embeddings. We investigate the compositionality of 100 pairs of particle verbs and their base verbs. The ranking produced by our methods is compared to a ranking of human ratings of compositionality. In our distributional approach we vary features such as context window size and content words, and we only use particle verbs with one word sense. We then compare the distributional approach to a ranking done with pre-trained word embeddings. While none of the results are statistically significant, it is shown that word embeddings are not automatically superior to the more traditional distributional approach.
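The abstract does not say how the two rankings are compared. One common choice for this kind of evaluation is Spearman's rank correlation, sketched below under that assumption. The function names and the scores are hypothetical, and the simple formula used here assumes there are no tied scores.

```python
def ranks(scores):
    """Rank positions, 1 for the highest score; assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up similarity scores and human ratings for four verb pairs.
model_scores = [0.9, 0.4, 0.7, 0.1]
human_ratings = [5.5, 3.0, 2.0, 1.0]
print(spearman_rho(model_scores, human_ratings))
```

A value near 1 means the model orders the verb pairs much as the human raters do, while a value near 0 means the two orderings are unrelated.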
110

DiSH: Democracy in State Houses

Russo, Nicholas A 01 February 2019
In our current political climate, state level legislators have become increasingly important. Due to cuts in funding and growing focus at the national level, public oversight for these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a data set containing over 1200 features focused on a legislator’s activity on bills. Second, we created embeddings that represent a legislator’s level of activity and engagement for a given bill using a custom model called Democracy2Vec. Third, we provided a case study focused on the 2015-2016 California State Legislature and had our results verified by a political expert. Our results show that our embeddings can explain relationships between legislators and how they will likely act during the legislative process.
