1. BEAR: Benchmarking the Efficiency of RDF Archiving

Fernandez Garcia, Javier David; Umbrich, Jürgen; Polleres, Axel. January 2015.
There is an emerging demand for techniques that address the problem of efficiently archiving and (temporally) querying different versions of evolving Semantic Web data. While systems for archiving and/or temporal querying are still in their early days, we consider this a good time to discuss benchmarks for evaluating the storage space efficiency of archives, the retrieval functionality they serve, and the performance of various retrieval operations. To this end, we provide a blueprint for benchmarking archives of semantic data by defining a concise set of operators that cover the major aspects of querying and interacting with such archives. Next, we introduce BEAR, which instantiates this blueprint to serve a concrete set of queries on the basis of real-world evolving data. Finally, we perform an empirical evaluation of current archiving techniques that is meant to serve as a first baseline for future developments in querying archives of evolving RDF data. (authors' abstract) / Series: Working Papers on Information Systems, Information Business and Operations
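
As a rough illustration of the kind of operators such a blueprint covers, the sketch below models an RDF archive as a list of versioned triple sets and implements version-materialisation, delta, and cross-version queries in Python. The operator names and their exact semantics are this sketch's own simplification of the abstract's idea, not BEAR's formal definitions.

```python
# Toy model of an RDF archive: each version is a set of (s, p, o) triples and a
# query is a triple pattern with None as a wildcard. The three operators below
# illustrate common retrieval types over such archives; they are a simplified
# sketch, not BEAR's formal query definitions.
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

def matches(triple: Triple, pattern: Pattern) -> bool:
    # A triple matches when every bound position of the pattern agrees with it.
    return all(p is None or p == t for t, p in zip(triple, pattern))

def mat(archive: List[Set[Triple]], pattern: Pattern, i: int) -> Set[Triple]:
    """Version materialisation: solutions of the pattern in version i."""
    return {t for t in archive[i] if matches(t, pattern)}

def diff(archive: List[Set[Triple]], pattern: Pattern, i: int, j: int):
    """Delta materialisation: solutions added and removed between versions i and j."""
    a, b = mat(archive, pattern, i), mat(archive, pattern, j)
    return b - a, a - b

def ver(archive: List[Set[Triple]], pattern: Pattern) -> Dict[Triple, List[int]]:
    """Cross-version query: for each solution, the versions in which it holds."""
    out: Dict[Triple, List[int]] = {}
    for i, version in enumerate(archive):
        for t in version:
            if matches(t, pattern):
                out.setdefault(t, []).append(i)
    return out

if __name__ == "__main__":
    v0 = {("ex:alice", "foaf:knows", "ex:bob")}
    v1 = v0 | {("ex:alice", "foaf:knows", "ex:carol")}
    archive = [v0, v1]
    print(ver(archive, ("ex:alice", "foaf:knows", None)))
```
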
2. A text mining framework in R and its applications

Feinerer, Ingo. 08 1900.
Text mining has become an established discipline both in research and in business intelligence. However, many existing text mining toolkits lack easy extensibility and provide only poor support for interacting with statistical computing environments. We therefore propose a text mining framework for the statistical computing environment R which provides intelligent methods for corpus handling, metadata management, preprocessing, operations on documents, and data export. We present how well-established text mining techniques can be applied in our framework and show how common text mining tasks can be performed using our infrastructure. The second part of this thesis is dedicated to a set of realistic applications using our framework. The first application deals with the implementation of a sophisticated mailing list analysis, whereas the second example identifies the potential of text mining methods for business-to-consumer electronic commerce. The third application shows the benefits of text mining for law documents. Finally, we present an application which deals with authorship attribution on the famous Wizard of Oz book series. (author's abstract)
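
To make the framework concerns listed above concrete (a corpus, per-document metadata, map-style preprocessing, and data export), the following is a minimal Python analogue. The actual framework is implemented as an R package, so none of the names below reflect its API; they are illustrative only.

```python
# Minimal analogue of the concepts described in the abstract: a corpus with
# per-document metadata, a map-style preprocessing transformation, and export
# to a term-document matrix. Illustrative names, not the R framework's API.
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Document:
    content: str
    meta: Dict[str, str] = field(default_factory=dict)   # e.g. author, date, origin

@dataclass
class Corpus:
    docs: List[Document]

    def map(self, transform: Callable[[str], str]) -> "Corpus":
        # Apply a preprocessing transformation to every document's content.
        return Corpus([Document(transform(d.content), dict(d.meta)) for d in self.docs])

    def term_document_matrix(self) -> Dict[str, List[int]]:
        # Export: term -> per-document frequency vector (whitespace tokenisation).
        counts = [Counter(d.content.split()) for d in self.docs]
        terms = sorted(set().union(*counts)) if counts else []
        return {t: [c[t] for c in counts] for t in terms}

if __name__ == "__main__":
    corpus = Corpus([Document("The Wonderful Wizard of Oz", {"author": "Baum"}),
                     Document("The Marvelous Land of Oz", {"author": "Baum"})])
    lowered = corpus.map(str.lower)                 # preprocessing step
    print(lowered.term_document_matrix()["oz"])     # [1, 1]
```
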
3. Scalable Inference in Latent Gaussian Process Models

Wenzel, Florian. 05 February 2020.
Latent Gaussian process (GP) models help scientists to uncover hidden structure in data, express domain knowledge, and form predictions about the future. These models have been successfully applied in many domains, including robotics, geology, genetics, and medicine. A GP defines a distribution over functions and can be used as a flexible building block to develop expressive probabilistic models. The main computational challenge of these models is to make inference about the unobserved latent random variables, that is, to compute their posterior distribution given the data. Currently, most interesting Gaussian process models have limited applicability to big data. This thesis develops a new efficient inference approach for latent GP models. Our new inference framework, which we call augmented variational inference, is based on the idea of considering an augmented version of the intractable GP model that renders the model conditionally conjugate. We show that inference in the augmented model is more efficient and that, unlike in previous approaches, all updates can be computed in closed form. The ideas around our inference framework facilitate novel latent GP models that lead to new results in language modeling, genetic association studies, and uncertainty quantification in classification tasks. (author's abstract)
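
As a hedged illustration of what "conditionally conjugate, with closed-form updates" can look like in practice, the toy NumPy sketch below runs coordinate-ascent variational inference for a dense logit GP classifier under a Pólya-Gamma augmentation. It is one instance of the augmentation idea only, not the thesis's full framework (which additionally covers sparse and stochastic variants); the kernel, hyperparameters, and data are invented for the example.

```python
# Coordinate-ascent variational inference for GP classification with a
# Polya-Gamma augmentation: the augmented model is conditionally conjugate,
# so both factor updates below are available in closed form.
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix.
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def pg_mean(c):
    # Mean of a Polya-Gamma PG(1, c) variable: tanh(c/2) / (2c), with limit 1/4 as c -> 0.
    c = np.asarray(c, dtype=float)
    out = np.full_like(c, 0.25)
    nz = np.abs(c) > 1e-8
    out[nz] = np.tanh(c[nz] / 2.0) / (2.0 * c[nz])
    return out

def cavi_gp_classification(X, y, iters=50, jitter=1e-6):
    """y in {0, 1}. Returns the variational posterior q(f) = N(mu, Sigma)."""
    n = X.shape[0]
    K = rbf_kernel(X, X) + jitter * np.eye(n)
    K_inv = np.linalg.inv(K)
    kappa = y - 0.5                # shift term introduced by the augmentation
    theta = np.full(n, 0.25)       # E[omega_n], initialised at c = 0
    for _ in range(iters):
        # Closed-form update of q(f): Gaussian with precision K^{-1} + diag(E[omega]).
        Sigma = np.linalg.inv(K_inv + np.diag(theta))
        mu = Sigma @ kappa
        # Closed-form update of q(omega_n) = PG(1, c_n) with c_n = sqrt(E[f_n^2]).
        c = np.sqrt(mu**2 + np.diag(Sigma))
        theta = pg_mean(c)
    return mu, Sigma

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 1))
    y = (X[:, 0] + 0.3 * rng.normal(size=40) > 0).astype(float)
    mu, Sigma = cavi_gp_classification(X, y)
    print("training accuracy:", np.mean((mu > 0) == (y > 0.5)))
```
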
