671 |
Model-checking based data retrieval : an application to semistructured and temporal data / Quintarelli, Elisa. January 1900 (has links)
Revised version of: PhD thesis, Politecnico di Milano, 2002. / Bibliography p. [129]-134.
|
672 |
PCM Backfill: Providing PCM to the Control Room Without Dropouts / Morgan, Jon; Jones, Charles H. 10 1900 (has links)
ITC/USA 2014 Conference Proceedings / The Fiftieth Annual International Telemetering Conference and Technical Exhibition / October 20-23, 2014 / Town and Country Resort & Convention Center, San Diego, CA / One of the initial control room capabilities to be demonstrated by the iNET program is the ability to provide data displays in the control room that do not contain data dropouts. This concept is called PCM Backfill: PCM data is both transmitted via traditional SST and recorded onboard via an iNET-compatible recorder. When data dropouts occur, data requests are made over the telemetry network to the recorder for the missing portions of the PCM data stream. The retrieved data is sent over the telemetry network to the backfill application and ultimately delivered to a pristine data display. The integration of traditional SST and the PCM Backfill capability provides real-time safety-of-flight data side by side with pristine data suitable for advanced analysis.
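As a rough illustration of the backfill flow described above, the following sketch detects gaps in a sequence of received PCM frames and fills them by requesting the missing ranges from a recorder; the frame numbering, data layout, and request callback are illustrative assumptions, not the iNET interfaces.

```python
# Illustrative sketch of the backfill idea: detect gaps in the PCM stream
# received over SST and request the missing frame ranges from the onboard
# recorder over the telemetry network (frame numbering and the request
# function are assumptions, not the iNET interfaces).
def find_gaps(received_frames):
    """Return (start, end) ranges of frame numbers missing from the stream."""
    gaps, expected = [], None
    for frame in sorted(received_frames):
        if expected is not None and frame > expected:
            gaps.append((expected, frame - 1))
        expected = frame + 1
    return gaps

def backfill(received, request_from_recorder):
    """Fill dropouts by fetching missing frames, then return a pristine stream."""
    frames = dict(received)                      # frame number -> payload
    for start, end in find_gaps(frames):
        for number, payload in request_from_recorder(start, end):
            frames[number] = payload
    return [frames[n] for n in sorted(frames)]

# Example: frames 3-4 were dropped on the SST link and are re-fetched.
recorder = {n: f"frame-{n}" for n in range(10)}
received = [(n, recorder[n]) for n in (0, 1, 2, 5, 6)]
print(backfill(received, lambda a, b: [(n, recorder[n]) for n in range(a, b + 1)]))
```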
|
673 |
Learning with Markov logic networks : transfer learning, structure learning, and an application to Web query disambiguation / Mihalkova, Lilyana Simeonova 18 March 2011 (has links)
Traditionally, machine learning algorithms assume that training data is provided as a set of independent instances, each of which can be described as a feature vector. In contrast, many domains of interest are inherently multi-relational, consisting of entities connected by a rich set of relations. For example, the participants in a social network are linked by friendships, collaborations, and shared interests. Likewise, the users of a search engine are related by searches for similar items and clicks to shared sites. The ability to model and reason about such relations is essential not only because better predictive accuracy is achieved by exploiting this additional information, but also because frequently the goal is to predict whether a set of entities are related in a particular way.

This thesis falls within the area of Statistical Relational Learning (SRL), which combines ideas from two traditions within artificial intelligence, first-order logic and probabilistic graphical models, to address the challenge of learning from multi-relational data. We build on one particular SRL model, Markov logic networks (MLNs), which consist of a set of weighted first-order-logic formulae and provide a principled way of defining a probability distribution over possible worlds. We develop algorithms for learning MLN structure both from scratch and by transferring a previously learned model, as well as an application of MLNs to the problem of Web query disambiguation. The ideas we present are unified by two main themes: the need to deal with limited training data and the use of bottom-up learning techniques.

Structure learning, the task of automatically acquiring a set of dependencies among the relations in the domain, is a central problem in SRL. We introduce BUSL, an algorithm for learning MLN structure from scratch that proceeds in a more bottom-up fashion, breaking away from the tradition of top-down learning typical in SRL. Our approach first constructs a novel data structure called a Markov network template that is used to restrict the search space for clauses. Our experiments in three relational domains demonstrate that BUSL dramatically reduces the search space for clauses and attains a significantly higher accuracy than a structure learner that follows a top-down approach.

Accurate and efficient structure learning can also be achieved by transferring a model obtained in a source domain related to the current target domain of interest. We view transfer as a revision task and present an algorithm that diagnoses a source MLN to determine which of its parts transfer directly to the target domain and which need to be updated. This analysis focuses the search for revisions on the incorrect portions of the source structure, thus speeding up learning. Transfer learning is particularly important when target-domain data is limited, such as when data on only a few individuals is available from domains with hundreds of entities connected by a variety of relations. We also address this challenging case and develop a general transfer learning approach that makes effective use of such limited target data in several social network domains.

Finally, we develop an application of MLNs to the problem of Web query disambiguation in a more privacy-aware setting, where the only information available about a user is that captured in a short search session of 5-6 previous queries on average. This setting contrasts with previous work that typically assumes the availability of long user-specific search histories.
To compensate for the scarcity of user-specific information, our approach exploits the relations between users, search terms, and URLs. We demonstrate the effectiveness of our approach in the presence of noise and show that it outperforms several natural baselines on a large data set collected from the MSN search engine.
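For readers unfamiliar with MLNs, here is a minimal sketch of the scoring rule behind them: a possible world x gets unnormalized probability exp(sum over formulas i of w_i * n_i(x)), where n_i(x) is the number of true groundings of formula i. The domain, formula, and weight below are invented for illustration and are not taken from the thesis.

```python
# Minimal sketch of Markov logic network scoring: a possible world's
# unnormalized probability is exp(sum of weight * true groundings per formula).
# The domain, formula, and weight here are invented for illustration.
import math
from itertools import product

people = ["ann", "bob"]
world = {  # a possible world: truth values for ground atoms
    ("Smokes", "ann"): True,  ("Smokes", "bob"): False,
    ("Friends", "ann", "bob"): True, ("Friends", "bob", "ann"): True,
    ("Friends", "ann", "ann"): False, ("Friends", "bob", "bob"): False,
}

def n_smoking_friends():
    """Count true groundings of: Friends(x, y) ^ Smokes(x) => Smokes(y)."""
    count = 0
    for x, y in product(people, repeat=2):
        body = world[("Friends", x, y)] and world[("Smokes", x)]
        count += (not body) or world[("Smokes", y)]
    return count

weights = {n_smoking_friends: 1.5}          # one weighted first-order formula
score = math.exp(sum(w * f() for f, w in weights.items()))
print(score)  # unnormalized weight of this world
```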
|
674 |
Conceptual orientation of search on the World Wide Web / Βεργέτη, Δανάη 09 October 2014 (has links)
In recent years, the spread of the internet and the breadth of information available to the user make the use of semantic personalization techniques necessary in order to improve the user's experience on the web. In search engines, users refine their query by adding, removing, or replacing words. Nevertheless, besides interacting with a search engine, a user's experience on the web while seeking the right information also includes browsing the pages of a web site or a series of web sites. During a session, the user reformulates his search. However, both determining the semantics of that search and determining its orientation (generalization or specialization within a semantic domain) on the basis of navigation through pages are not so easy. Each page contains more than one concept. Moreover, selecting the most representative ones is a complex process.

The purpose of this thesis is to present the SOSACT methodology. The SOSACT methodology is a semantic personalization methodology that tracks the user's selections during a session and determines, through semantic analysis of the pages, whether the user is specializing or generalizing his navigation within a conceptual domain. The SOSACT methodology defines the semantic orientation of the user's navigation. In addition, this thesis proposes the SOSACT algorithm, which detects the user's semantic orientation with the help of a taxonomy.

The SOSACT methodology is implemented by the SOSACT system. The SOSACT system applies the SOSACT algorithm and offers useful recommendations to the user for improving his web search. The SOSACT system was evaluated using real user activity on a web site over a certain period of time.

The SOSACT methodology can also be applied to a corpus of texts, not only to web sources. It can become a useful tool for improving navigation on the web. Moreover, the proposed methodology can bridge query disambiguation techniques in search engines and techniques for reformulating the object of browsing. The SOSACT methodology could be used in a comparative study between these two fields and lead to new techniques in both of these Semantic Web research areas. / In recent years, the spread of the World Wide Web, as well as the range of
information available to the user make the use of semantic personalization
techniques a necessity in order to enhance the user experience on the web. In search
engines, users refine their query by adding, removing, or replacing its keywords. Thus, query refinement is easy to detect, and it is easy to tell whether a user is generalizing or specializing his web search. Nevertheless, besides interaction with a search engine, a user's web search involves browsing and navigating through the pages of a web site or a number of web sites while seeking the right information. During this session the user reformulates his search. However, defining search orientation
(generalization or specialization) based on navigation through web pages is not that
easy. Each page contains more than one concept. Furthermore, the concepts may be developed to the same extent, which makes it difficult to identify the representative semantics of a certain page and thus the orientation of a user session.
In order to define the orientation of user navigation, a semantic web personalization methodology is developed, the SOSACT methodology, which tracks the user's hits through a session and determines whether a user specializes or generalizes his navigation through semantic analysis of the pages in his session window. Moreover, the SOSACT algorithm is proposed for capturing user session orientation based on a concept taxonomy.
The SOSACT methodology is implemented by the SOSACT system. The SOSACT
system applies the SOSACT algorithm and proposes useful recommendations to the user to improve his web search. The SOSACT system was evaluated on real user activity in a web site over a certain period of time. The experimental outcomes matched the expected results.
The SOSACT methodology could become a useful tool for navigation refinement.
Furthermore, this work bridges search engine query refinement and browsing reformulation techniques. It could serve as the basis for a comparative study between these two fields and lead to new techniques in both areas, or to techniques that migrate between them.
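A minimal sketch of the kind of taxonomy-based orientation check described above, comparing the taxonomy depth of the dominant concept on consecutive pages; the toy taxonomy, concept labels, and depth heuristic are illustrative assumptions, not the actual SOSACT algorithm.

```python
# Illustrative sketch: deciding whether a session step generalizes or
# specializes, using depth in a concept taxonomy (hypothetical data).
TAXONOMY = {                      # child -> parent; "root" has no parent
    "science": "root",
    "computer science": "science",
    "databases": "computer science",
    "query processing": "databases",
}

def depth(concept: str) -> int:
    """Number of edges from the concept up to the taxonomy root."""
    d = 0
    while concept != "root":
        concept = TAXONOMY[concept]
        d += 1
    return d

def orientation(prev_concept: str, next_concept: str) -> str:
    """Classify one navigation step by comparing taxonomy depths."""
    prev_d, next_d = depth(prev_concept), depth(next_concept)
    if next_d > prev_d:
        return "specialization"
    if next_d < prev_d:
        return "generalization"
    return "lateral"

# Example session step: the user moves from a general page to a specific one.
print(orientation("computer science", "query processing"))  # specialization
```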
|
675 |
Automatic Concept-Based Query Expansion Using Term Relational Pathways Built from a Collection-Specific Association Thesaurus / Lyall-Wilson, Jennifer Rae January 2013 (has links)
The dissertation research explores an approach to automatic concept-based query expansion to improve search engine performance. It uses a network-based approach for identifying the concept represented by the user's query and is founded on the idea that a collection-specific association thesaurus can be used to create a reasonable representation of all the concepts within the document collection, as well as the relationships these concepts have to one another. Because the representation is generated using data from the association thesaurus, a mapping will exist between the representation of the concepts and the terms used to describe these concepts. The research applies to search engines designed for use on an individual website with content focused on a specific conceptual domain. Therefore, both the document collection and the subject content must be well-bounded, which affords the ability to make use of techniques not currently feasible for general-purpose search engines used on the entire web.
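A small sketch of the general idea of expanding a query from a collection-specific association thesaurus built on term co-occurrence; the toy corpus, weighting, and cutoff are assumptions and do not reproduce the dissertation's term relational pathways.

```python
# Illustrative sketch of concept-based expansion from a collection-specific
# association thesaurus (the corpus, weights, and cutoff are assumptions).
from collections import defaultdict
from itertools import combinations

docs = [
    ["solar", "panel", "inverter"],
    ["solar", "panel", "battery"],
    ["battery", "inverter", "grid"],
]

# Association thesaurus: co-occurrence counts between term pairs.
assoc = defaultdict(lambda: defaultdict(int))
for terms in docs:
    for a, b in combinations(set(terms), 2):
        assoc[a][b] += 1
        assoc[b][a] += 1

def expand(query_terms, k=2):
    """Add the k terms most strongly associated with the whole query."""
    scores = defaultdict(int)
    for q in query_terms:
        for term, w in assoc[q].items():
            if term not in query_terms:
                scores[term] += w
    expansion = sorted(scores, key=scores.get, reverse=True)[:k]
    return list(query_terms) + expansion

print(expand(["solar"]))  # e.g. ['solar', 'panel', 'battery']
```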
|
676 |
Extracting and exploiting word relationships for information retrieval / Cao, Guihong January 2008 (has links)
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal.
|
677 |
Approximation of OLAP queries on data warehouses / Cao, Phuong Thao 20 June 2013 (has links) (PDF)
We study approximate answers to OLAP queries on data warehouses. We consider the relative answers to OLAP queries on a schema, viewed as distributions with the L1 distance, and approximate the answers without storing the entire data warehouse. We first introduce three specific methods: uniform sampling, measure-based sampling, and the statistical model. We also introduce an edit distance between data warehouses, with edit operations adapted to data warehouses. Then, in the setting of OLAP data exchange, we study how to sample each source and combine the samples to approximate any OLAP query. We next consider a streaming context, where a data warehouse is built from streams of different sources. We show a lower bound on the size of the memory necessary to approximate queries, and in this case we approximate OLAP queries with a finite memory. We also describe a method to discover statistical dependencies, a new notion we introduce, by searching for them with decision trees. We apply the method to two data warehouses. The first simulates sensor data, providing weather parameters over time and location from different sources. The second is a collection of RSS feeds from web sites on the Internet.
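As a rough illustration of the uniform-sampling method, the sketch below approximates a grouped SUM over a fact table from a small random sample and scales it back up; the table, sampling rate, and measure are invented for illustration.

```python
# Illustrative sketch of the uniform-sampling idea: approximate a
# GROUP BY SUM over a fact table from a small random sample
# (table, columns, and sampling rate are assumptions, not the thesis's data).
import random
from collections import defaultdict

fact_table = [("Paris", 12.0), ("Lyon", 7.5), ("Paris", 3.0)] * 10_000
rate = 0.01  # keep roughly 1% of the rows

sample = [row for row in fact_table if random.random() < rate]

approx = defaultdict(float)
for city, measure in sample:
    approx[city] += measure / rate   # scale each sampled row back up

exact = defaultdict(float)
for city, measure in fact_table:
    exact[city] += measure

for city in exact:
    print(city, round(approx[city], 1), exact[city])  # approximate vs exact SUM
```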
|
678 |
Scalable Preservation, Reconstruction, and Querying of Databases in terms of Semantic Web Representations / Stefanova, Silvia January 2013 (has links)
This thesis addresses how Semantic Web representations, in particular RDF, can enable flexible and scalable preservation, recreation, and querying of databases. An approach has been developed for selective, scalable long-term archival of relational databases (RDBs) as RDF, implemented in the SAQ (Semantic Archive and Query) system. The archival of user-specified parts of an RDB is specified using an extension of SPARQL, A-SPARQL. SAQ automatically generates an RDF view of the RDB, the RD-view. The result of an archival query is RDF triples stored in: i) a data archive file containing the preserved RDB content, and ii) a schema archive file containing sufficient meta-data to reconstruct the archived database. To achieve scalable data preservation and recreation, SAQ uses special query rewriting optimizations for the archival queries. It was experimentally shown that they improve query execution and archival time compared with naïve processing. The performance of SAQ was compared with that of other systems supporting SPARQL queries to views of existing RDBs.

When an archived RDB is to be recreated, the reloader module of SAQ first reads the schema archive file and executes a schema reconstruction algorithm to automatically construct the RDB schema. The RDB thus created is populated by reading the data archive and converting the read data into relational attribute values. For scalable recreation of RDF-archived data we have developed the Triple Bulk Load (TBL) approach, where the relational data is reconstructed by using the bulk-load facility of the RDBMS. Our experiments show that the TBL approach is substantially faster than the naïve Insert Attribute Value (IAV) approach, despite the added sorting and post-processing.

To view and query semi-structured Topic Maps data as RDF, the prototype system TM-Viewer was implemented. A declarative RDF view of Topic Maps, the TM-view, is automatically generated by TM-Viewer using a conceptual schema developed for the Topic Maps data model. To achieve efficient processing of SPARQL queries to the TM-view, query rewrite transformations were developed and evaluated. It was shown that they significantly improve the query execution time.
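A loose sketch of the reconstruction step that a bulk-load style approach like TBL relies on, pivoting archived triples back into relational rows before handing them to the RDBMS bulk loader; the predicate-to-column mapping and CSV output are assumptions, not SAQ's actual implementation.

```python
# Illustrative sketch of reconstructing relational rows from archived RDF
# triples before a bulk load (predicate/column mapping is an assumption).
from collections import defaultdict

triples = [
    ("emp:1", "schema:name", "Ada"),
    ("emp:1", "schema:dept", "R&D"),
    ("emp:2", "schema:name", "Lin"),
    ("emp:2", "schema:dept", "Sales"),
]
columns = {"schema:name": "name", "schema:dept": "dept"}

rows = defaultdict(dict)
for subject, predicate, value in triples:          # pivot triples per subject
    rows[subject][columns[predicate]] = value

# Emit one CSV line per reconstructed row, ready for the RDBMS bulk loader
# (e.g. PostgreSQL COPY); sorting by subject mirrors the "sort then load" idea.
for subject in sorted(rows):
    print(",".join(rows[subject][c] for c in ("name", "dept")))
```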
|
679 |
Automata methods and techniques for graph-structured data / Shoaran, Maryam 23 April 2011 (has links)
Graph-structured data (GSD) is a popular model to represent complex information
in a wide variety of applications such as social networks, biological data management,
digital libraries, and traffic networks. The flexibility of this model allows
the information to evolve and easily integrate with heterogeneous data from many
sources.
In this dissertation we study three important problems on GSD. A consistent
theme of our work is the use of automata methods and techniques to process and
reason about GSD.
First, we address the problem of answering queries on GSD in a distributed environment.
We focus on regular path queries (RPQs) – given by regular expressions
matching paths in graph-data. RPQs are the building blocks of almost any mechanism
for querying GSD. We present a fault-tolerant, message-efficient, and truly
distributed algorithm for answering RPQs. Our algorithm works for the larger class
of weighted RPQs on weighted GSDs.
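As a concrete illustration of what answering an RPQ involves, the sketch below evaluates a small regular path query by searching the product of the data graph with the query's automaton; the graph, query, and hand-built automaton are made-up examples, not the thesis's distributed algorithm (which additionally handles distribution, faults, and weights).

```python
# Illustrative sketch of answering a regular path query on graph data by
# searching the product of the graph with the query's automaton
# (graph, query, and start node are made-up examples).
from collections import deque

graph = {                       # node -> list of (edge label, neighbor)
    "a": [("knows", "b")],
    "b": [("knows", "c"), ("worksAt", "d")],
    "c": [("worksAt", "d")],
    "d": [],
}

# Hand-built automaton for the RPQ  knows+ . worksAt
start_state, accept = 0, {2}
delta = {(0, "knows"): 1, (1, "knows"): 1, (1, "worksAt"): 2}

def rpq(source: str) -> set:
    """Return every node reachable from `source` along a path whose edge
    labels spell a word accepted by the query automaton."""
    answers, seen = set(), {(source, start_state)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        if state in accept:
            answers.add(node)
        for label, nxt in graph[node]:
            if (state, label) in delta:
                pair = (nxt, delta[(state, label)])
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers

print(rpq("a"))  # {'d'}: reachable via knows.worksAt or knows.knows.worksAt
```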
Second, we consider the problem of answering RPQs on incomplete GSD, where
different data sources are represented by materialized database views. We explore the
connection between “certain answers” (CAs) and answers obtained from “view-based
rewritings” (VBRs) for RPQs. CAs are answers that can be obtained on each database
consistent with the views. Computing all of the CAs for RPQs is NP-hard, and one has to
resort to an algorithm that is exponential in the size of the data (the view materializations). On
the other hand, VBRs are query reformulations in terms of the view definitions. They
can be used to obtain query answers in polynomial time in the size of the data. These
answers are CAs, but unfortunately for RPQs, not all of the CAs can be obtained
in this way. In this work, we show the surprising result that for RPQs under local
semantics, using VBRs to answer RPQs gives all the CAs. The importance of this
result is that under such semantics, the CAs can be obtained in polynomial time in
the size of the data.
Third, we focus on XML–an important special case of GSD. The scenario we consider
is streaming XML between exchanging parties. The problem we study is flexible
validation of streaming XML under the realistic assumption that the schemas of the
exchanging parties evolve, and thus diverge from one another. We represent schemas
by using Visibly Pushdown Automata (VPAs), which recognize Visibly Pushdown
Languages (VPLs). We model evolution for XML by defining formal language operators
on VPLs. We show that VPLs are closed under the defined language operators
and this enables us to expand the schemas (for XML) in order to account for flexible
or constrained evolution.
|
680 |
Data Distribution Management In Large-scale Distributed Environments / Gu, Yunfeng 15 February 2012 (has links)
Data Distribution Management (DDM) deals with two basic problems: how to distribute data generated at the application layer among the underlying nodes in a distributed system, and how to retrieve that data whenever it is necessary. This thesis explores DDM in two different network environments: peer-to-peer (P2P) overlay networks and cluster-based network environments. DDM in P2P overlay networks is treated as a more complete concept of building and maintaining a P2P overlay architecture than a simple data-fetching scheme, and is closely related to the more commonly known associative searching or queries. DDM in the cluster-based network environment is one of the important services provided by simulation middleware to support real-time distributed interactive simulations. The only common feature shared by DDM in both environments is that both are built to provide a data indexing service. Because of these fundamental differences, we have designed and developed a novel distributed data structure, the Hierarchically Distributed Tree (HD Tree), to support range queries in P2P overlay networks. All the relevant problems of a distributed data structure, including scalability, self-organization, fault tolerance, and load balancing, have been studied. Both theoretical analysis and experimental results show that the HD Tree is able to give a complete view of system states when processing multi-dimensional range queries at different levels of selectivity and in various error-prone routing environments. On the other hand, a novel DDM scheme, the Adaptive Grid-based DDM scheme, is proposed to improve DDM performance in the cluster-based network environment. This new DDM scheme evaluates the input size of a simulation based on probability models. Optimum DDM performance is approached by adapting the simulation to run in the mode that is most appropriate to its size.
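As a rough illustration of grid-based DDM matching (the non-adaptive baseline that the proposed scheme builds on), the sketch below maps publisher and subscriber regions to fixed grid cells and matches those that share a cell; the cell size and regions are assumptions, and the adaptive tuning from probability models is not shown.

```python
# Illustrative sketch of basic grid-based DDM matching: publisher and
# subscriber regions are mapped to fixed grid cells and matched when they
# share a cell (cell size and regions are assumptions; an adaptive scheme
# would tune this layout, e.g. from probability models of the input).
CELL = 10.0  # grid cell size along each axis

def cells(region):
    """All (i, j) grid cells overlapped by an axis-aligned region."""
    (x1, y1), (x2, y2) = region
    return {(i, j)
            for i in range(int(x1 // CELL), int(x2 // CELL) + 1)
            for j in range(int(y1 // CELL), int(y2 // CELL) + 1)}

publishers = {"tank": ((12.0, 3.0), (18.0, 9.0))}
subscribers = {"radar": ((15.0, 0.0), (35.0, 20.0)),
               "sonar": ((40.0, 40.0), (55.0, 50.0))}

# Route data only between publishers and subscribers whose regions share a cell.
for pub, p_region in publishers.items():
    for sub, s_region in subscribers.items():
        if cells(p_region) & cells(s_region):
            print(f"{pub} -> {sub}")   # tank -> radar
```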
|