• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 3
  • Tagged with
  • 3
  • 3
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Extracting conflict-free information from multi-labeled trees

Deepak, Akshay, Fernandez-Baca, David, McMahon, Michelle January 2013 (has links)
BACKGROUND:A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious.RESULTS:We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved.CONCLUSIONS:Our measure of conflict-free information content based on quartets is simple and topologically appealing. In the experiments, the maximally reduced form is often much smaller than the original tree, yet retains most of the taxa. The reduction algorithm is quadratic in the number of leaves and its complexity is unaffected by the multiplicity of leaf labels or the degree of the nodes.
2

Leveraging Sequential Nature of Conversations for Intent Classification

Gotteti, Shree January 2021 (has links)
No description available.
3

OLLDA: Dynamic and Scalable Topic Modelling for Twitter : AN ONLINE SUPERVISED LATENT DIRICHLET ALLOCATION ALGORITHM

Jaradat, Shatha January 2015 (has links)
Providing high quality of topics inference in today's large and dynamic corpora, such as Twitter, is a challenging task. This is especially challenging taking into account that the content in this environment contains short texts and many abbreviations. This project proposes an improvement of a popular online topics modelling algorithm for Latent Dirichlet Allocation (LDA), by incorporating supervision to make it suitable for Twitter context. This improvement is motivated by the need for a single algorithm that achieves both objectives: analyzing huge amounts of documents, including new documents arriving in a stream, and, at the same time, achieving high quality of topics’ detection in special case environments, such as Twitter. The proposed algorithm is a combination of an online algorithm for LDA and a supervised variant of LDA - labeled LDA. The performance and quality of the proposed algorithm is compared with these two algorithms. The results demonstrate that the proposed algorithm has shown better performance and quality when compared to the supervised variant of LDA, and it achieved better results in terms of quality in comparison to the online algorithm. These improvements make our algorithm an attractive option when applied to dynamic environments, like Twitter. An environment for analyzing and labelling data is designed to prepare the dataset before executing the experiments. Possible application areas for the proposed algorithm are tweets recommendation and trends detection. / Tillhandahålla högkvalitativa ämnen slutsats i dagens stora och dynamiska korpusar, såsom Twitter, är en utmanande uppgift. Detta är särskilt utmanande med tanke på att innehållet i den här miljön innehåller korta texter och många förkortningar. Projektet föreslår en förbättring med en populär online ämnen modellering algoritm för Latent Dirichlet Tilldelning (LDA), genom att införliva tillsyn för att göra den lämplig för Twitter sammanhang. Denna förbättring motiveras av behovet av en enda algoritm som uppnår båda målen: analysera stora mängder av dokument, inklusive nya dokument som anländer i en bäck, och samtidigt uppnå hög kvalitet på ämnen "upptäckt i speciella fall miljöer, till exempel som Twitter. Den föreslagna algoritmen är en kombination av en online-algoritm för LDA och en övervakad variant av LDA - Labeled LDA. Prestanda och kvalitet av den föreslagna algoritmen jämförs med dessa två algoritmer. Resultaten visar att den föreslagna algoritmen har visat bättre prestanda och kvalitet i jämförelse med den övervakade varianten av LDA, och det uppnådde bättre resultat i fråga om kvalitet i jämförelse med den online-algoritmen. Dessa förbättringar gör vår algoritm till ett attraktivt alternativ när de tillämpas på dynamiska miljöer, som Twitter. En miljö för att analysera och märkning uppgifter är utformad för att förbereda dataset innan du utför experimenten. Möjliga användningsområden för den föreslagna algoritmen är tweets rekommendation och trender upptäckt.

Page generated in 0.3019 seconds