Global ETD Search

251	A C++ Distributed Database Select-project-join Queryprocessor On A Hpc Cluster Ceran, Erhan 01 May 2012 High performance computer clusters have become popular as they are more scalable, affordable and reliable than their centralized counterparts. Database management systems are particularly suitable for distributed architectures; however, distributed DBMSs are still not widely used because of the design difficulties. In this study, we aim to help overcome these difficulties by implementing a simulation testbed for a distributed query plan processor. This testbed works on our departmental HPC cluster machine and is able to perform select, project and join operations. A data generation module has also been implemented which preserves the foreign key and primary key constraints in the database schema. The testbed is able to measure, simulate and estimate the response time of a given query execution plan using specified communication network parameters. Extensive experimental work is performed to show the correctness of the produced results. The estimated execution time costs are also compared with the actual run-times obtained from the testbed to verify the proposed estimation functions. Thus, we make sure that these estimation functions can be used in distributed database query optimization and distributed database design tools. QA Computer Software 76.75-76.765
252	Cluster-based Query Expansion Technique Huang, Chun-Neng 14 August 2003 With advances in information and networking technologies, a huge amount of information, typically in the form of text documents, is available online. To facilitate efficient and effective access to documents relevant to users' information needs, information retrieval systems play a more significant role than ever. One challenging issue in information retrieval is word mismatch, the phenomenon that concepts may be described by different words in user queries and/or documents. The word mismatch problem, if not appropriately addressed, would critically degrade the retrieval effectiveness of an information retrieval system. In this thesis, we develop a cluster-based query expansion technique to solve the word mismatch problem. Using the traditional query expansion techniques (i.e., global analysis and local feedback) as performance benchmarks, the empirical results suggest that when a user query consists of only one query term, the global analysis technique is more effective. However, if a user query consists of two or more query terms, the cluster-based query expansion technique can provide a more accurate query result, especially within the first few top-ranked documents retrieved. Term Association Query Expansion Thesaurus Text Mining Information Retrieval Word Mismatch Cluster-based Query Expansion Document Clustering
253	A visual query language served by a multi-sensor environment Camara (Silvervarg), Karin January 2007 A problem in modern command and control situations is that much data is available from different sensors. Several sensor data sources also require that the user has knowledge about the specific sensor types to be able to interpret the data. To alleviate the working situation for a commander, we have designed and constructed a system that will take input from several different sensors and subsequently present the relevant combined information to the user. The users specify what kind of information is of interest at the moment by means of a query language. The main issues when designing this query language have been that (a) the users should not have to have any knowledge about sensors or sensor data analysis, and (b) that the query language should be powerful and flexible, yet easy to use. The solution has been to (a) use sensor data independence and (b) have a visual query language. A visual query language was developed with a two-step interface. First, the users pose a “rough”, simple query that is evaluated by the underlying knowledge system. The system returns the relevant information that can be found in the sensor data. Then, the users have the possibility to refine the result by setting conditions on it. These conditions are formulated by specifying attributes of objects or relations between objects. The problem of uncertainty in spatial data (i.e. location, area) has been considered. The question of how to represent potential uncertainties is dealt with. An investigation has been carried out to find which relations are practically useful when dealing with uncertain spatial data. The query language has been evaluated by means of a scenario. The scenario was inspired by real events and was developed in cooperation with a military officer to ensure that it was fairly realistic. The scenario was simulated using several tools, of which the query language was one of the more central ones. It proved that the query language can be of use in realistic situations. / Report code: LiU-Tek-Lic-2007:42. query language command and control spatial temporal uncertainty sensor data independence visual query language sensor data source Computer science Datavetenskap
254	SQL mokymosi sistema / SQL learning system Čebanauskas, Saulius 26 August 2010 Šiais laikais, kai praktiškai visų sričių „popieriniai“ duomenys jau baigia išnykti, didelę dalį IT visuomenėje sudaro vienokio ar kitokio tipo duomenų bazės. Paprastam vartotojui, kuris naudojasi tik vartojamojo tipo programomis, SQL užklausų mokėjimas nėra reikalingas, tačiau bet kuris programuotojas, kuris dirba su duomenimis, savo darbo ko gero nebeįsivaizduoja be duomenų bazių ir SQL užklausų. SQL užklausa (SQL Query) – tai užklausa, atliekama SQL kalbos komandų pagalba. SQL (Structured Query Language) kalba – struktūrizuota užklausų kalba, skirta duomenų, esančių duomenų bazėje apdorojimui. SQL kalba dirba tik su reliacinėmis DB. Vartotojas SQL pagalba kreipiasi į DBVS, kuri apdoroja užklausą, randa reikalingus duomenis ir pateikia juos vartotojui. SQL nėra nei DBVS, nei atskiras programinis produktas, tai yra neatsiejama DBVS dalis, instrumentas, kurio pagalba realizuojamas vartotojo ryšys su DBVS. SQL kalbos lankstumas ir nepriklausomumas nuo kompiuterinių technologijų specifikos, o taip pat jos palaikymas pagrindiniais lyderiais reliacinių duomenų bazių technologijų srityje padarė SQL kalbą pagrindine standartine duomenų bazių programavimo kalba [1]. SQL užklausos yra naudojamos visose dabartinėse duomenų bazėse, tokiose kaip MsSQL, MySQL, Firebird ir kitose. SQL sakiniai gali būti įterpiami į programas, sudaromas bazine programavimo kalba. Taigi SQL užklausos yra naudojamos visur, kur yra naudojamos ir duomenų bazės. Tą pačią SQL užklausą... [toliau žr. visą tekstą] / Nowadays, when “paper” data in practically every field is on the verge of extinction, a large part of the IT world consists of databases of one type or another. SQL queries are used to work with database information. SQL (Structured Query Language) is designed for processing database information. 
For fast database-related systems, it is necessary to write and optimize SQL queries correctly. To learn to write SQL queries correctly and optimize them, theoretical knowledge alone is not enough; it is necessary to do many practical tasks. When learning to write SQL queries, the basic problems are a poor choice of practical tasks and the lack of a good testing system that allows SQL queries to be written easily. In this work, existing SQL learning systems and learning materials were analysed; all of the systems found have their own problems, and none gives the user full freedom to write SQL queries. During the design phase, methods were devised that allow the learner to easily write various types of queries, view the results of executed queries and find out whether a written query is correct. These methods were used in the design and implementation of a remote SQL query learning system. Aim: the algorithms developed and implemented allow SQL queries to be tested based on simple syntax and allow queries of different types to be executed. The object of research: execution of various SQL queries. Informatics SQL užklausos SQL mokymosi sistema SQL vykdymo algoritmai SQL query SQL learning system Algorithm of SQL query executing
255	Gestion de flux de données pour l'observation de systèmes / Data stream management for systems monitoring Petit, Loïc 10 December 2012 La popularisation de la technologie a permis d'implanter des dispositifs et des applications de plus en plus développés à la portée d'utilisateurs non experts. Ces systèmes produisent des flux ainsi que des données persistantes dont les schémas et les dynamiques sont hétérogènes. Cette thèse s'intéresse à pouvoir observer les données de ces systèmes pour aider à les comprendre et à les diagnostiquer. Nous proposons tout d'abord un modèle algébrique Astral capable de traiter sans ambiguïtés sémantiques des données provenant de flux ou relations. Le moteur d'exécution Astronef a été développé sur l'architecture à composants orientés services pour permettre une grande adaptabilité. Il est doté d'un constructeur de requête permettant de choisir un plan d'exécution efficace. Son extension Asteroid permet de s'interfacer avec un SGBD pour gérer des données persistantes de manière intégrée. Nos contributions sont confrontées à la pratique par la mise en œuvre d'un système d'observation du réseau domestique ainsi que par l'étude des performances. Enfin, nous nous sommes intéressés à la mise en place de la personnalisation des résultats dans notre système par l'introduction d'un modèle de préférences top-k. / Due to the popularization of technology, non-expert people can now use more and more advanced devices and applications. Such systems produce data streams as well as persistent data with heterogeneous schemas and dynamics. This thesis focuses on monitoring data coming from those systems to help users understand and diagnose them. We propose an algebraic model, Astral, able to treat data coming from streams or relations without semantic ambiguity. The engine Astronef has been developed on top of a service-oriented component framework to enable high adaptability. 
It embeds a query builder which can select a composition of components to provide an efficient query plan. Its extension Asteroid interfaces with a DBMS in order to manage persistent data in an integrated manner. Our contributions have been put into practice with the deployment of a monitoring system for the digital home and with a performance study. Finally, we extend our approach with an operator to personalize the results by introducing a top-k preference model. Flux de données Observation Algèbre Optimisation de requête Équivalence de requêtes Base de données Data stream Monitoring Algebra Query optimization Query equivalence Databases
256	Declarative parallel query processing on large scale astronomical databases / Traitement parallèle et déclaratif de requêtes sur des masses de données issues d'observations astronomiques Mesmoudi, Amin 03 December 2015 Les travaux de cette thèse s'inscrivent dans le cadre du projet Petasky. Notre objectif est de proposer des outils permettant de gérer des dizaines de Peta-octets de données issues d'observations astronomiques. Nos travaux se focalisent essentiellement sur la conception des nouveaux systèmes permettant de garantir le passage à l'échelle. Dans cette thèse, nos contributions concernent trois aspects : Benchmarking des systèmes existants, conception d'un nouveau système et optimisation du système. Nous avons commencé par analyser la capacité des systèmes fondés sur le modèle MapReduce et supportant SQL à gérer les données LSST et leurs capacités d'optimisation de certains types de requêtes. Nous avons pu constater qu'il n'y a pas de technique « magique » pour partitionner, stocker et indexer les données mais l'efficacité des techniques dédiées dépend essentiellement du type de requête et de la typologie des données considérées. Suite à notre travail de Benchmarking, nous avons retenu quelques techniques qui doivent être intégrées dans un système de gestion de données à large échelle. Nous avons conçu un nouveau système de façon à garantir la capacité dudit système à supporter plusieurs mécanismes de partitionnement et plusieurs opérateurs d'évaluation. Nous avons utilisé BSP (Bulk Synchronous Parallel) comme modèle de calcul. Les données sont représentées logiquement par des graphes. L'évaluation des requêtes est donc faite en explorant le graphe de données en utilisant les arcs entrants et les arcs sortants. Les premières expérimentations ont montré que notre approche permet une amélioration significative des performances par rapport aux systèmes Map/Reduce / This work is carried out in the framework of the PetaSky project. 
The objective of this project is to provide a set of tools for managing petabytes of data from astronomical observations. Our work is concerned with the design of a scalable approach. We first analyzed the ability of MapReduce-based systems supporting SQL to manage the LSST data and their capabilities for optimizing certain types of queries. We analyzed the impact of data partitioning, indexing and compression on query performance. From our experiments, it follows that there is no “magic” technique to partition, store and index data; rather, the efficiency of dedicated techniques depends mainly on the type of queries and the typology of the data considered. Based on our benchmarking work, we identified some techniques to be integrated into large-scale data management systems. We designed a new system that supports multiple partitioning mechanisms and several evaluation operators. We used the BSP (Bulk Synchronous Parallel) model as a parallel computation paradigm. Unlike the MapReduce model, we send intermediate results to workers that can continue their processing. Data is logically represented as a graph. The evaluation of queries is performed by exploring the data graph using forward and backward edges. We also offer a semi-automatic partitioning approach, i.e., we provide the system administrator with a set of tools allowing her/him to choose the manner of partitioning data using the schema of the database and domain knowledge. The first experiments show that our approach provides a significant performance improvement with respect to Map/Reduce systems. Benchmarking Passage à échelle Traitement parallèle Optimisation Graphes Astronomie Benchmark Scalability Parallel query evaluation Query optimization Graphs Astronomy 004.21
257	Uma técnica para ranqueamento de interpretações SQL oriundas de consultas com palavras-chave / A technique for ranking SQL interpretations from keyword queries Sousa, Walisson Pereira de 11 December 2017 Retrieving information using words from a natural language is a simple and already consolidated way to access data on the Web. It would be highly desirable that a similar method could be used to submit queries on databases, thereby freeing the user from learning a query language and knowing the searched database structure. In this sense, a great research effort has been dedicated by the database community to developing more efficient keyword query techniques for database access. However, a keyword query can result in a large number of SQL interpretations, most of them irrelevant to the initial query. This work carries out a study of different techniques for ranking query interpretations and, finally, proposes a ranking methodology that maximizes the amount of relevant results for keyword queries submitted to relational databases. 
/ Recuperar informações utilizando palavras de uma linguagem natural é uma maneira simples e já consolidada para acessar dados na Web. Seria altamente desejável que um método semelhante fosse utilizado para executar consultas em bancos de dados, liberando assim o usuário do aprendizado de uma linguagem de consulta e o conhecimento da estrutura do banco de dados a ser consultado. Nesse sentido, um grande esforço de pesquisa vem sendo dedicado pela comunidade de Banco de dados, a fim de desenvolver técnicas de consultas com palavras-chave mais eficientes para acesso a bancos de dados. No entanto, uma consulta com palavras-chave pode originar uma grande quantidade de interpretações SQL, boa parte delas resultando em dados irrelevantes para a consulta inicial. Este trabalho realiza um estudo de diferentes técnicas para ranqueamento de interpretações de consultas e, ao final, propõe uma metodologia de ranqueamento que maximiza a quantidade de resultados relevantes para consultas com palavras-chave submetidas a bancos de dados relacionais. Consulta a banco de dados Interpretação de consulta Banco de dados relacional Database query Query interpretation Relational database
258	Principles for Distributed Databases in Telecom Environment / Principer för distribuerade databaser inom Telecom Miljö Ashraf, Imran, Khokhar, Amir Shahzed January 2010 Centralized databases are becoming a bottleneck for organizations that are physically distributed and access data remotely. Data management is easy in centralized databases. However, it carries a high communication cost and, most importantly, a high response time. The concept of distributing the data over various locations is very attractive for such organizations. In such cases the database is divided into fragments and distributed to the locations where it is needed. This kind of distribution provides local control of data, and data access is also very fast in such databases. However, concurrency control, query optimization and data allocation are factors that affect the response time and must be investigated prior to implementing distributed databases. This thesis uses a mixed-method approach to meet its objective. In the quantitative section, we performed an experiment to compare the response time of two databases: centralized and fragmented/distributed. The experiment was performed at Ericsson. A literature review was also done to find other important response-time-related issues such as query optimization, concurrency control and data allocation. The literature review revealed that these factors can further improve the response time in a distributed environment. Results of the experiment showed a substantial decrease in the response time due to the fragmentation and distribution. / Centraliserade databaser blir flaskhals för organisationer som är fysiskt distribuerade och tillgång till data på distans. Datahantering är lätt i centrala databaser. Men bär den höga kostnaden kommunikation och viktigast av hög svarstid. Konceptet att distribuera data över olika orter är mycket attraktiv för sådana organisationer. 
I sådana fall databasen är splittrade fragment och distribueras till de platser där det behövs. Denna typ av distribution ger lokal kontroll av uppgifter och dataåtkomst är också mycket snabb i dessa databaser. Men, samtidighet kontroll, frågeoptimering och data anslagen är de faktorer som påverkar svarstiden och måste utredas innan genomförandet distribuerade databaser. Denna avhandling gör användningen av blandade metod strategi för att nå sitt mål. I kvantitativa delen utförde vi ett experiment för att jämföra svarstid på två databaser, centraliserad och fragmenterad / distribueras. Försöket utfördes på Ericsson. En litteraturstudie har gjorts för att ta reda på andra viktiga svarstid liknande frågor som frågeoptimering, samtidighet kontroll och data tilldelning. Litteraturgenomgången visade att dessa faktorer ytterligare kan förbättra svarstiden i distribuerad miljö. Resultaten av försöket visade en betydande minskning av den svarstid på grund av splittring och distribution. distributed databases centralized database fragmentation data allocation query processing query optimization concurrency control Computer Sciences Datavetenskap (datalogi)
259	Graph Models For Query Focused Text Summarization And Assessment Of Machine Translation Using Stopwords Rama, B 06 1900 Text summarization is the task of generating a shortened version of the original text where core ideas of the original text are retained. In this work, we focus on query focused summarization. The task is to generate a summary from a set of documents which answers the query. Query focused summarization is a hard task because it expects the summary to be biased towards the query while at the same time important concepts in the original documents must be preserved with a high degree of novelty. Graph-based ranking algorithms that use a biased random surfer model, such as Topic-sensitive LexRank, have been applied to query focused summarization. In our work, we propose a look-ahead version of Topic-sensitive LexRank. We incorporate the option of look-ahead in the random walk model and we show that it helps in generating better quality summaries. Next, we consider assessment of machine translation. Assessment of a machine translation output is important for establishing benchmarks for translation quality. An obvious way to assess the quality of machine translation is through the perception of human subjects. Though highly reliable, this approach is not scalable and is time consuming. Hence mechanisms have been devised to automate the assessment process. All such assessment methods are essentially a study of correlations between human translation and the machine translation. In this work, we present a scalable approach to assess the quality of machine translation that borrows features from the study of writing styles, popularly known as Stylometry. Towards this, we quantify the characteristic styles of individual machine translators and compare them with that of human generated text. The translator whose style is closest to human style is deemed to generate a higher quality translation. 
We show that our approach is scalable and does not require actual source text translations for evaluation. Natural Language Processing Abstracting Query Optimization Machine Translation Text Summarization Query Focused Summarization Machine Translators Computer Science
260	A visual query language served by a multi-sensor environment Camara (Silvervarg), Karin January 2007 A problem in modern command and control situations is that much data is available from different sensors. Several sensor data sources also require that the user has knowledge about the specific sensor types to be able to interpret the data. To alleviate the working situation for a commander, we have designed and constructed a system that will take input from several different sensors and subsequently present the relevant combined information to the user. The users specify what kind of information is of interest at the moment by means of a query language. The main issues when designing this query language have been that (a) the users should not have to have any knowledge about sensors or sensor data analysis, and (b) that the query language should be powerful and flexible, yet easy to use. The solution has been to (a) use sensor data independence and (b) have a visual query language. A visual query language was developed with a two-step interface. First, the users pose a “rough”, simple query that is evaluated by the underlying knowledge system. The system returns the relevant information that can be found in the sensor data. Then, the users have the possibility to refine the result by setting conditions on it. These conditions are formulated by specifying attributes of objects or relations between objects. The problem of uncertainty in spatial data (i.e. location, area) has been considered. The question of how to represent potential uncertainties is dealt with. An investigation has been carried out to find which relations are practically useful when dealing with uncertain spatial data. The query language has been evaluated by means of a scenario. The scenario was inspired by real events and was developed in cooperation with a military officer to ensure that it was fairly realistic. The scenario was simulated using several tools, of which the query language was one of the more central ones. It proved that the query language can be of use in realistic situations. / Report code: LiU-Tek-Lic-2007:42. query language command and control spatial temporal uncertainty sensor data independence visual query language sensor data source Computer Sciences Datavetenskap (datalogi)
