Global ETD Search

151	Searching and ranking structured documents Trotman, Andrew, n/a January 2007 It is common to see documents with explicit structure marked up in languages such as XML. Queries, on the other hand, typically have no structure. There is a clear mismatch: although documents contain structure, it is typically not used in information retrieval. An efficient index structure for document-centric searching is proposed and its efficiency is discussed. It is shown to be at worst linear with respect to the number of occurrences of a given search term. The algorithm is then extended to accommodate element-centric information retrieval. Ranking algorithms for structured documents are examined. Genetic Algorithms are used to learn different weights for each structure present in a document. Applying these weights as part of a function is shown to yield significant precision improvements in some functions. Genetic Programming is then used to learn an entire ranking function. This function is shown to be portable between document collections. A query language for structured information retrieval is proposed. Use of this language in the 2004 INEX workshop resulted in a large decrease in query errors. Structured information retrieval is now a viable alternative to its unstructured counterpart. A successful query language, efficient indexing structures, and improved ranking functions are all presented. information retrieval query languages (computer science) algorithms computer programs
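The idea of weighting each document structure in a ranking function can be illustrated with a minimal sketch. The element names, the weight values, and the helper names `weighted_tf` and `rank` are all hypothetical; the thesis learns its weights with a genetic algorithm, whereas the values here are fixed by hand purely for illustration.

```python
from collections import Counter

# Hypothetical per-element weights. The thesis learns such weights with a
# genetic algorithm; these values are made up for illustration only.
STRUCTURE_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def weighted_tf(doc, term, weights=STRUCTURE_WEIGHTS):
    """Sum the term's occurrences in each element, scaled by the element's weight."""
    total = 0.0
    for element, text in doc.items():
        count = Counter(text.lower().split())[term.lower()]
        # Elements without a configured weight fall back to a weight of 1.0.
        total += weights.get(element, 1.0) * count
    return total


def rank(docs, term):
    """Return document ids ordered by descending weighted term frequency."""
    return sorted(docs, key=lambda doc_id: weighted_tf(docs[doc_id], term), reverse=True)


docs = {
    "d1": {"title": "xml retrieval", "body": "ranking structured documents"},
    "d2": {"title": "databases", "body": "xml xml storage"},
}
print(rank(docs, "xml"))
```

In this toy collection a single title occurrence outweighs two body occurrences, which is the kind of effect structure weights are meant to capture.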
152	Langages de requêtes pour XML à base de patterns : conception, optimisation et implantation [Pattern-based query languages for XML: design, optimization and implementation] Miachon, Cédric 13 December 2006 In recent years XML has become a genuine database model for representing, storing and exchanging semi-structured data. It has therefore become necessary to develop efficient query languages for this model. Various existing query languages use a deconstruction primitive to capture parts of XML documents, which can be viewed as trees. There are two deconstructors: (i) path navigation, which navigates in depth (through projections) inside a tree to capture a subtree (XPath), and (ii) pattern matching, which captures several subtrees in breadth (XDuce, CDuce). The goal of this thesis is to give the CDuce language a declarative query language that can take advantage of CDuce's strong, static typing. This query language (called CQL) is formally defined and allows both deconstructors to be used and combined in a single query, in order to write concise and expressive queries. Starting from the premise that pattern matching performs better than downward navigation, we give an optimizing translation that rewrites all the projections of a query into patterns. This translation and other optimizations were validated by test suites and "microbenchmarks", and compared with other query engines. Because writing queries with patterns can be laborious for an inexperienced user, a graphical interface (called PBE) is proposed that eases this task by being guided by the types of the DTD. [INFO] Computer Science xml pattern cduce query language
153	Query Processing for Peer Mediator Databases Katchaounov, Timour January 2003 The ability to physically interconnect many distributed, autonomous and heterogeneous software systems on a large scale presents new opportunities for sharing and reuse of existing, and for the creation of new information and new computational services. However, finding and combining information in many such systems is a challenge even for the most advanced computer users. To address this challenge, mediator systems logically integrate many sources to hide their heterogeneity and distribution and give the users the illusion of a single coherent system. Many new areas, such as scientific collaboration, require cooperation between many autonomous groups willing to share their knowledge. These areas require that the data integration process can be distributed among many autonomous parties, so that large integration solutions can be constructed from smaller ones. For this we propose a decentralized mediation architecture, peer mediator systems (PMS), based on the peer-to-peer (P2P) paradigm. In a PMS, reuse of human effort is achieved through logical composability of the mediators in terms of other mediators and sources by defining mediator views in terms of views in other mediators and sources. Our thesis is that logical composability in a P2P mediation architecture is an important requirement and that composable mediators can be implemented efficiently through query processing techniques. In order to compute answers of queries in a PMS, logical mediator compositions must be translated to query execution plans, where mediators and sources cooperate to compute query answers. 
The focus of this dissertation is on query processing methods to realize composability in a PMS architecture in an efficient way that scales over the number of mediators. Our contributions consist of an investigation of the interfaces and capabilities for peer mediators, and the design, implementation and experimental study of several query processing techniques that realize composability in an efficient and scalable way. data integration mediators query processing Computer science
154	FoXQ : a visual query language for XML Abraham, Robin 26 September 2003 XML is a very versatile data format that has been used to represent many different kinds of data, including web pages, books, business and accounting data, programming interfaces, vector graphics, system logs, and games. In a short span of time, it has gained wide acceptance as the document and data standard on the web. As more and more XML data is generated every day, much research has focused on query languages for XML. The World-Wide Web Consortium (W3C) has chosen XQuery as the standard language for querying XML. From an end-user point of view, XQuery sacrifices usability for expressiveness. We introduce FoXQ, a visual language that enables end users to query XML. FoXQ brings much of the functionality of XQuery within the reach of end users without embroiling them in the intricacies of XQuery syntax. The query interface is form-based and the query model is based on a document metaphor in which the users formulate queries by filling out forms. / Graduation date: 2004 XML (Document markup language) Query languages (Computer science) FoXQ
155	A browser-based tool for designing query interfaces to scientific databases Newsome, Mark Ronald, 1960- 15 November 1996 Scientists in the biological sciences need to retrieve information from a variety of data collections, traditionally maintained in SQL databases, in order to conduct research. Because current assistant tools are designed primarily for business and financial users, scientists have been forced to use the notoriously difficult command-line SQL interface, supplied as standard by most database vendors. The goal of our research has been to establish the requirements of scientific researchers and develop specialized query assistance tools to help them query data collections across the Internet. This thesis describes our work in developing HyperSQL, a Web-to-database scripting language, and most importantly, Query Designer, a user-oriented tool for designing query interfaces directly on Web browsers. Current browsers (e.g., Netscape, Internet Explorer) do not easily interoperate with databases without extensive "CGI" (Common Gateway Interface) programming. HyperSQL is a scripting language that enables database administrators to construct forms-based query interfaces intended for end-users who are not proficient with SQL. Query results are formatted as hypertext-clickable links which can be used to browse the database for related information, bring up Web pages, or access remote search engines. HyperSQL query interfaces are independent of the database computer, making it possible to construct different interfaces targeting distinct groups of users. Capitalizing on our experience with HyperSQL, we developed Query Designer, a user-oriented tool for building query interfaces directly on Web browsers. No experience in SQL or HTML programming is necessary. 
After choosing a target database, the user can build a personalized query interface by making menu selections and filling out forms--the tool automatically establishes network connections, and composes HTML and SQL code. The automatically generated query form can be used immediately to issue a query, customized, or saved for later use. Results returned from the database are dynamically formatted into hypertext for navigating related information in the database. / Graduation date: 1997 Science -- Databases Query languages (Computer science) User interfaces (Computer systems)
156	Keyword Join: Realizing Keyword Search in P2P-based Database Systems Yu, Bei, Liu, Ling, Ooi, Beng Chin, Tan, Kian Lee 01 1900 In this paper, we present a P2P-based database sharing system that provides information sharing capabilities through keyword-based search techniques. Our system requires neither a global schema nor schema mappings between different databases, and our keyword-based search algorithms are robust in the presence of frequent changes in the content and membership of peers. To facilitate data integration, we introduce a keyword join operator to combine partial answers containing different keywords into complete answers. We also present an efficient algorithm that optimizes the keyword join operations for partial answer integration. Our experimental study on both real and synthetic datasets demonstrates the effectiveness of our algorithms, and the efficiency of the proposed query processing strategies. / Singapore-MIT Alliance (SMA) keyword join keyword query Peer-to-Peer database
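Combining partial answers that contain different keywords into complete answers can be sketched as follows. This is only one possible reading of the abstract, not the paper's operator: the representation of a partial answer as a pair of frozensets (tuple ids, covered keywords), the rule that two partial answers are joinable when they share a tuple, and the function name `keyword_join` are all assumptions made for illustration.

```python
from itertools import product


def keyword_join(left, right, query):
    """Combine pairs of partial answers into complete answers.

    Assumption for this sketch: two partial answers are joinable when they
    share at least one tuple id, and the result is complete when the union
    of their keywords covers every keyword in the query.
    """
    results = []
    for (tuples_a, kws_a), (tuples_b, kws_b) in product(left, right):
        if tuples_a & tuples_b and (kws_a | kws_b) >= query:
            results.append((tuples_a | tuples_b, kws_a | kws_b))
    return results


# Hypothetical partial answers: (tuple ids, keywords they contain).
left = [(frozenset({"t1", "t2"}), frozenset({"xml"}))]
right = [
    (frozenset({"t2", "t3"}), frozenset({"index"})),
    (frozenset({"t9"}), frozenset({"index"})),
]
query = frozenset({"xml", "index"})

answers = keyword_join(left, right, query)
print(answers)
```

Only the pair sharing tuple `t2` is combined; the partial answer built on `t9` has nothing in common with the left side and is dropped.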
157	Exploring the Use of Evidence Based Practice Questions to Improve the Search Process Elizabeth A. Appleton 10 April 2007 Evidence Based Practice (EBP) is a relatively new approach that professionals are using to cope with the ever-growing body of literature in their fields. The goal of EBP is to effectively use this body of literature to improve professional practice, thus improving the quality of services. A major component of EBP is asking a focused, well-built question, referred to in this paper as an Evidence Based Practice Question (EBPQ). This paper reports the findings of an exploratory study that examines the use of an EBPQ to respond to reference questions emailed to a university library reference desk. A purposive sample of 30 randomly selected reference emails was divided into two groups, the EBPQ group and the control group. The professional searcher who conducted the searches used the same approach in responding to each emailed reference question, except that the EBPQ group's searches were guided by EBPQs and the control group's searches were not. The results indicate that searches guided by EBPQs are more focused, apply more resources to the search process, and take less time than searches not guided by EBPQs. These conclusions suggest that EBPQs appear to be useful for improving the search process and that further research is warranted.
158	Query Optimization for On-Demand Information Extraction Tasks over Text Databases Farid, Mina H. 12 March 2012 Many modern applications involve analyzing large amounts of data that come from unstructured text documents. In its original format, data contains information that, if extracted, can give more insight and help in the decision-making process. The ability to answer structured SQL queries over unstructured data allows for more complex data analysis. Querying unstructured data can be accomplished with the help of information extraction (IE) techniques. The traditional way is by using the Extract-Transform-Load (ETL) approach, which performs all possible extractions over the document corpus and stores the extracted relational results in a data warehouse. Then, the extracted data is queried. The ETL approach produces results that are out of date and causes an explosion in the number of possible relations and attributes to extract. Therefore, new approaches to perform extraction on-the-fly were developed; however, previous efforts relied on specialized extraction operators, or particular IE algorithms, which limited the optimization opportunities of such queries. In this work, we propose an on-line approach that integrates the engine of the database management system with IE systems using a new type of view called extraction views. Queries on text documents are evaluated using these extraction views, which get populated at query-time with newly extracted data. Our approach enables the optimizer to apply all well-defined optimization techniques. The optimizer selects the best execution plan using a defined cost model that considers a user-defined balance between the cost and quality of extraction, and we explain the trade-off between the two factors. The main contribution is the ability to run on-demand information extraction to consider the latest changes in the data, while avoiding unnecessary extraction from irrelevant text documents. 
Database Query Optimization Information Extraction Data Quality Computer Science
159	Query-Based data mining for the web Poblete Labra, Bárbara 01 October 2009 The objective of this thesis is to study different applications of Web query mining for the improvement of search engine ranking, Web information retrieval and Web site enhancement. The main motivation of this work is to take advantage of the implicit feedback left in the trail of users while navigating through the Web. Throughout this work we seek to demonstrate the value of queries to extract interesting rules, patterns and information about the documents they reach. The models created in this doctoral work show that the "wisdom of the crowds" conveyed in queries has many applications that overall provide a better understanding of users' needs in the Web. This makes it possible to improve the general interaction of visitors with Web sites and search engines in a straightforward way. query mining web data mining information retrieval 316
160	CGU: A common graph utility for DL Reasoning and Conjunctive Query Optimization Palacios Villa, Jesus Alejandro January 2005 We consider the overlap between reasoning involved in conjunctive query optimization (CQO) and in tableaux-based approaches to reasoning about subsumption in description logics (DLs). In both cases, an underlying graph is created, searched and modified. This process is determined by a given query and database schema in the first case and by a given description and terminology in the second. The opportunities for overlap derive from an abundance of reductions of various schema languages to terminologies for common DL dialects, and from the fact that descriptions can in turn be viewed as queries that compute a single column. Our main contributions are as follows. We present the design and implementation of a common graph utility that integrates the requirements for both CQO and DL reasoning. We then verify this model by also presenting the design and implementation for two drivers, one that implements a query optimizer for a conjunctive query language extended with descriptions, and one that implements a complete DL reasoner for a feature-based DL dialect. Computer Science Description Logics Databases Conjunctive Query Optimization Tableaux Algorithms
