331

Proposition d'approches de routage de requêtes dans les systèmes pair-à-pair non structurés / Query routing approaches for peer to peer systems

Yeferny, Taoufik 15 January 2014 (has links)
Over the last two decades, P2P file-sharing systems have become very popular thanks to the access they provide to diverse resources distributed over the Internet. In parallel with the evolution of this class of systems, mobile devices (cell phones, PDAs and other handheld devices) have enjoyed great commercial success. Equipped with wireless communication technology (Bluetooth, Wi-Fi), they can communicate without requiring any particular infrastructure by forming a mobile ad hoc network (MANET). P2P systems can likewise be deployed over this type of network, becoming mobile P2P systems. This thesis is concerned with information retrieval in P2P systems, and more precisely with the query routing problem: search efficiency and effectiveness can be improved by making smart routing decisions. The first part of the thesis focuses on query routing in P2P systems over the Internet. We propose (i) a semantic routing model based on the history of past queries, which is then instantiated to define a new learning-based routing method; (ii) to mitigate the cold-start (bootstrapping) problem, a method that predicts the user's intention and builds an a priori knowledge base for each peer; and (iii) a hybrid routing method that addresses the selection-failure problem (no relevant peers found), based on the query history and on grouping peers into semantic clusters. The second part of the thesis focuses on query routing in mobile P2P systems. MANETs raise new routing challenges: they suffer from several constraints related to the wireless medium and to the energy-limited mobile devices, so routing methods designed for P2P systems over the Internet cannot be applied directly. In this context, we propose a context-aware routing method for unstructured mobile P2P file-sharing systems. From a technical point of view, all of these proposals were developed, validated and evaluated with the PeerSim and NS2 network simulators.
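The abstract does not give implementation details; purely as an illustration of history-based semantic routing — forwarding a query to the neighbours whose previously answered queries look most similar to it, and flooding on cold start — the following Python sketch uses assumed peer names, a Jaccard similarity measure and a fan-out parameter that are not taken from the thesis:

```python
from collections import defaultdict

class HistoryBasedRouter:
    """Toy sketch: route a query to the neighbours whose past answered
    queries are most similar to it; fall back to flooding on cold start."""

    def __init__(self, neighbours, fanout=2):
        self.neighbours = list(neighbours)
        self.fanout = fanout
        # neighbour -> set of keywords from queries it answered successfully
        self.history = defaultdict(set)

    def record_hit(self, neighbour, query_terms):
        """Remember that `neighbour` returned results for these query terms."""
        self.history[neighbour].update(query_terms)

    def route(self, query_terms):
        """Pick up to `fanout` neighbours by Jaccard similarity with their history."""
        query = set(query_terms)
        scored = []
        for n in self.neighbours:
            hist = self.history[n]
            if hist:
                score = len(query & hist) / len(query | hist)
                scored.append((score, n))
        relevant = [n for s, n in sorted(scored, reverse=True) if s > 0]
        # Cold start / selection failure: no informative history -> flood all neighbours.
        return relevant[:self.fanout] if relevant else self.neighbours

router = HistoryBasedRouter(["peerA", "peerB", "peerC"])
router.record_hit("peerA", {"jazz", "mp3"})
print(router.route({"jazz", "blues"}))   # -> ['peerA']
```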
332

NOVEL COMPUTATIONAL METHODS FOR SEQUENCING DATA ANALYSIS: MAPPING, QUERY, AND CLASSIFICATION

Liu, Xinan 01 January 2018 (has links)
Over the past decade, the evolution of next-generation sequencing technology has considerably advanced genomics research. As a consequence, fast and accurate computational methods are needed to analyze the large volumes of data arising in different applications. The research presented in this dissertation focuses on three areas: RNA-seq read mapping, large-scale data query, and metagenomics sequence classification. A critical step of RNA-seq data analysis is to map the RNA-seq reads onto a reference genome. This dissertation presents a novel splice alignment tool, MapSplice3. It achieves high read alignment and base mapping yields and is able to detect splice junctions, gene fusions, and circular RNAs comprehensively at the same time. Based on MapSplice3, we further developed a novel lightweight approach called iMapSplice that enables personalized mRNA transcriptional profiling. As huge amounts of RNA-seq data have been shared through public datasets, they provide invaluable resources for researchers to test hypotheses by reusing existing datasets. To meet the need to efficiently query large-scale sequencing data, a novel method called SeqOthello has been developed. It efficiently queries sequence k-mers against large-scale datasets and determines whether the given sequence is present. Metagenomics studies often generate tens of millions of reads to capture the presence of microbial organisms, so efficient and accurate algorithms are in high demand. In this dissertation, we introduce MetaOthello, a probabilistic hashing classifier for metagenomic sequences. It supports efficient query of a taxon using its k-mer signatures.
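SeqOthello's actual compressed hashing structures are not reproduced here; as a rough illustration of the underlying operation only — decompose a query sequence into k-mers, then test each indexed dataset for k-mer containment — a naive Python sketch using plain sets (the dataset names, sequences and threshold are assumptions) could be:

```python
def kmers(seq, k=31):
    """Decompose a read or transcript into its overlapping k-mers."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

# Hypothetical pre-built index: dataset name -> set of k-mers it contains.
# (A real tool would replace these plain sets with a compressed structure.)
index = {
    "sample_1": kmers("ACGTACGTACGTACGTACGTACGTACGTACGTACG"),
    "sample_2": kmers("TTTTACGTACGTACGTACGTACGTACGTACGTTTT"),
}

def query(sequence, index, k=31, threshold=0.8):
    """Report datasets containing at least `threshold` of the query's k-mers."""
    q = kmers(sequence, k)
    return [name for name, km in index.items()
            if q and len(q & km) / len(q) >= threshold]

print(query("ACGTACGTACGTACGTACGTACGTACGTACGTACG", index))   # -> ['sample_1']
```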
333

METADATA-BASED IMAGE COLLECTING AND DATABASING FOR SHARING AND ANALYSIS

Wu, Xi 01 January 2019 (has links)
Data collection and preparation are generally considered crucial steps in data science projects. For image data especially, adding semantic attributes during preparation gives data scientists much more insight. In this project, we aim to implement a general-purpose central image data repository that allows researchers to collect image data with semantic properties and to query it. One of our researchers posed the specific challenge of collecting images, together with weight data, of infants in least-developed countries with limited internet access. The rationale is to predict infant weights from image data by applying machine learning techniques. To address the data collection issue, I implemented a mobile application that supports online and offline upload of images and annotations, and a web application that provides image query functionality. This work is derived from, and partly decoupled from, a previous project, ImageSfERe (Image Sharing for Epilepsy Research), a web-based platform for collecting and sharing epilepsy patient imaging.
334

Towards Semantically Enabled Complex Event Processing

Keskisärkkä, Robin January 2017 (has links)
The Semantic Web provides a framework for semantically annotating data on the web, and the Resource Description Framework (RDF) supports the integration of structured data represented in heterogeneous formats. Traditionally, the Semantic Web has focused primarily on more or less static data, but information on the web today is becoming increasingly dynamic. RDF Stream Processing (RSP) systems address this issue by adding support for streaming data and continuous query processing. To some extent, RSP systems can be used to perform complex event processing (CEP), where meaningful high-level events are generated based on low-level events from multiple sources; however, there are several challenges with respect to using RSP in this context. Event models designed to represent static event information lack several features required for CEP, and are typically not well suited for stream reasoning. The dynamic nature of streaming data also greatly complicates the development and validation of RSP queries. Therefore, reusing queries that have been prepared ahead of time is important to be able to support real-time decision-making. Additionally, there are limitations in existing RSP implementations in terms of both scalability and expressiveness, where some features required in CEP are not supported by any of the current systems. The goal of this thesis work has been to address some of these challenges and the main contributions of the thesis are: (1) an event model ontology targeted at supporting CEP; (2) a model for representing parameterized RSP queries as reusable templates; and (3) an architecture that allows RSP systems to be integrated for use in CEP. The proposed event model tackles issues specifically related to event modeling in CEP that have not been sufficiently covered by other event models, includes support for event encapsulation and event payloads, and can easily be extended to fit specific use-cases. The model for representing RSP query templates was designed as an extension to SPIN, a vocabulary that supports modeling of SPARQL queries as RDF. The extended model supports the current version of the RSP Query Language (RSP-QL) developed by the RDF Stream Processing Community Group, along with some of the most popular RSP query languages. Finally, the proposed architecture views RSP queries as individual event processing agents in a more general CEP framework. Additional event processing components can be integrated to provide support for operations that are not supported in RSP, or to provide more efficient processing for specific tasks. We demonstrate the architecture in implementations for scenarios related to traffic-incident monitoring, criminal-activity monitoring, and electronic healthcare monitoring.
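The thesis's template model extends SPIN and is not reproduced here; purely as an illustration of the idea of a parameterized continuous query that is prepared ahead of time and instantiated on demand — the RSP-QL-flavoured query text, parameter names and Python representation below are all assumptions, not the thesis's vocabulary — one might sketch:

```python
from string import Template

# Hypothetical RSP-QL-style template for a traffic-monitoring event query.
# The window size and speed limit are parameters bound at instantiation time.
TEMPLATE = Template("""
REGISTER STREAM :HighSpeedEvents AS
CONSTRUCT { ?vehicle a :SpeedingEvent . }
FROM NAMED WINDOW :win ON :trafficStream [RANGE PT${window_seconds}S STEP PT1S]
WHERE {
  WINDOW :win { ?vehicle :hasSpeed ?speed . FILTER(?speed > ${speed_limit}) }
}
""")

def instantiate(window_seconds, speed_limit):
    """Bind the template's parameters to produce a concrete continuous query."""
    return TEMPLATE.substitute(window_seconds=window_seconds,
                               speed_limit=speed_limit)

print(instantiate(window_seconds=10, speed_limit=110))
```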
335

Medical Data Management on the cloud / Gestion de données médicales sur le cloud

Mohamad, Baraa 23 June 2015 (has links)
French abstract unavailable / Medical data management has become a real challenge due to the emergence of new imaging technologies providing high image resolutions. This thesis focuses in particular on the management of DICOM files. DICOM is one of the most important medical standards. DICOM files have a special data format in which one file may contain regular data, multimedia data and services. These files are extremely heterogeneous (the schema of a file cannot be predicted) and large. The characteristics of DICOM files, added to the requirements of medical data management in general in terms of availability and accessibility, led us to formulate our research question as follows: is it possible to build a system that (1) is highly available, (2) supports any medical images (different specialties, modalities and physicians' practices), (3) can store extremely large and ever-increasing volumes of data, (4) provides expressive access and (5) is cost-effective? In order to answer this question we built a hybrid (row-column) cloud-enabled storage system. The idea of this solution is to disperse DICOM attributes thoughtfully, depending on their characteristics, over both data layouts, in a way that provides the best of the row-oriented and column-oriented storage models in one system, while exploiting the features of the cloud that enable us to ensure the availability and portability of medical data. Storing data in such a hybrid layout opens the door to a second research question: how to process queries efficiently over this hybrid storage while enabling new and more efficient query plans. The originality of our proposal comes from the fact that there is currently no system that stores data in such a hybrid fashion (i.e., each attribute resides either in the row-oriented database or in the column-oriented one, and a given query may interrogate both storage models at the same time) and studies query processing over it. The experimental prototypes implemented in this thesis show interesting results and open the door to multiple optimizations and research questions.
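The thesis's actual dispersion rule and query planner are not described in the abstract; as a toy illustration of the general idea — routing each DICOM attribute to either a row-oriented or a column-oriented layout and letting a single query touch both — the following Python sketch uses an assumed rule of thumb and made-up attribute names:

```python
# Toy illustration: disperse DICOM attributes over a row store and a column store.
# Assumed rule of thumb: small, always-present, frequently co-accessed attributes
# go to the row store; sparse or rarely queried ones go to the column store.
ROW_STORE_ATTRS = {"PatientID", "StudyDate", "Modality"}

row_store = {}                       # file_id -> {attr: value}
column_store = {}                    # attr    -> {file_id: value}

def insert_dicom(file_id, attributes):
    row_store[file_id] = {}
    for attr, value in attributes.items():
        if attr in ROW_STORE_ATTRS:
            row_store[file_id][attr] = value
        else:
            column_store.setdefault(attr, {})[file_id] = value

def query_attr(attr, predicate):
    """A query may touch both layouts; pick the right one per attribute."""
    if attr in ROW_STORE_ATTRS:
        return [fid for fid, attrs in row_store.items()
                if attr in attrs and predicate(attrs[attr])]
    return [fid for fid, value in column_store.get(attr, {}).items()
            if predicate(value)]

insert_dicom("f1", {"PatientID": "P-7", "Modality": "MR", "SliceThickness": 1.2})
insert_dicom("f2", {"PatientID": "P-9", "Modality": "CT"})
print(query_attr("Modality", lambda v: v == "MR"))        # ['f1']  (row store)
print(query_attr("SliceThickness", lambda v: v < 2.0))    # ['f1']  (column store)
```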
336

Distributed indexing and scalable query processing for interactive big data explorations

Guzun, Gheorghi 01 August 2016 (has links)
The past few years have brought a major surge in the volumes of collected data. More and more enterprises and research institutions find tremendous value in data analysis and exploration. Big Data analytics is used to improve customer experience, perform complex weather data integration and model prediction, and support personalized medicine among many other services. Advances in technology, along with high interest in big data, can only increase the demand for data collection and mining in the years to come. As a result, and in order to keep up with the data volumes, data processing has become increasingly distributed. However, most distributed processing of large data is done in batch mode, and interactive exploration is hardly an option. To efficiently support queries over large amounts of data, appropriate indexing mechanisms must be in place. This dissertation proposes an indexing and query processing framework that can run on top of a distributed computing engine to support fast, interactive data exploration in data warehouses. Our data processing layer is built around bit-vector based indices. This type of indexing features fast bit-wise operations and scales well for high-dimensional data. Additionally, compression can be applied to reduce the index size, and thus use less memory and network communication. Our work can be divided into two areas: index compression and query processing. Two compression schemes are proposed, for sparse and dense bit-vectors respectively. The design of these encoding methods is hardware-driven, and the query processing is optimized for the available computing hardware. Query algorithms are proposed for selection, aggregation, and other specialized queries. Query processing is supported on single machines as well as on computer clusters.
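The dissertation's hardware-driven encodings are not reproduced here; as a minimal sketch of why bit-vector indices make selection queries cheap — one bit-vector per (attribute, value) pair, with selections reduced to bit-wise ANDs — the following Python example uses illustrative attributes and no compression:

```python
# Minimal bitmap-index sketch: one bit-vector per (attribute, value) pair,
# stored as Python integers; a conjunctive selection is a chain of bit-wise ANDs.
rows = [
    {"country": "US", "device": "mobile"},
    {"country": "DE", "device": "desktop"},
    {"country": "US", "device": "desktop"},
    {"country": "US", "device": "mobile"},
]

index = {}                                   # (attribute, value) -> bit-vector
for i, row in enumerate(rows):
    for attr, value in row.items():
        index[(attr, value)] = index.get((attr, value), 0) | (1 << i)

def select(**criteria):
    """AND together the bit-vectors of all requested (attribute, value) pairs."""
    bv = (1 << len(rows)) - 1                # start with all rows set
    for attr, value in criteria.items():
        bv &= index.get((attr, value), 0)
    return [i for i in range(len(rows)) if bv >> i & 1]

print(select(country="US", device="mobile"))   # -> [0, 3]
```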
337

Optimal multidimensional storage organisation for efficient query processing in databases

Mohammed, Salahadin, 1959- January 2001 (has links)
Abstract not available
338

Access Methods for Temporal Databases

Stantic, Bela, n/a January 2005 (has links)
A temporal database is one that supports some aspect of time distinct from user-defined time. Over the last two decades, interest in the field of temporal databases has increased significantly, with contributions from many researchers. However, the lack of efficient access methods is perhaps one of the reasons why commercial RDBMS vendors have been reluctant to adopt the advances in temporal database research. An obvious research question is therefore: can we develop more robust and more efficient access methods for temporal databases than the existing ones? This thesis attempts to address this question, and its main contributions are summarised as follows. We investigated different representations of 'now' and how the modelling of the current time influences the efficiency of accessing 'now-relative' temporal data. A new method, called the 'Point' approach, is proposed; it not only models the current time elegantly but also significantly outperforms the existing methods. We proposed a new index structure, called the Virtual Binary tree (VB-tree), based on a spatial representation of interval data and a regular triangular decomposition of this space. Further, we described a sound and complete query algorithm, whose performance is evaluated both asymptotically and experimentally against the state of the art in the field. We claim that the VB-tree requires less space and uses fewer disk accesses than the currently best-known structure, the RI-tree.
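The specifics of the Point approach and the VB-tree are in the thesis itself and are not reproduced here; as a generic illustration of two ingredients the abstract mentions — viewing a valid-time interval [start, end] as a point in 2D space, and substituting a concrete value for 'now' only when a query is evaluated — a naive, index-free Python sketch (tuple names and timestamps are assumptions) could be:

```python
import time

NOW = None   # sentinel for 'now-relative' tuples: the end of validity is still open

tuples = [
    ("contract-1", 100, 200),     # closed valid-time interval [100, 200]
    ("contract-2", 150, NOW),     # still valid: end is 'now-relative'
]

def as_point(start, end, now):
    """Spatial view: an interval becomes the 2D point (start, end),
    with 'now' substituted only at query-evaluation time."""
    return (start, end if end is not None else now)

def valid_at(t, now=None):
    """Return the keys of tuples whose valid-time interval contains time t."""
    now = time.time() if now is None else now
    result = []
    for key, start, end in tuples:
        x, y = as_point(start, end, now)
        if x <= t <= y:
            result.append(key)
    return result

print(valid_at(180, now=300))   # -> ['contract-1', 'contract-2']
print(valid_at(250, now=300))   # -> ['contract-2']
```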
339

Contextual information retrieval from the WWW

Limbu, Dilip Kumar January 2008 (has links)
Contextual information retrieval (CIR) is a critical technique for today’s search engines in terms of facilitating queries and returning relevant information. Despite its importance, little progress has been made in its application, due to the difficulty of capturing and representing contextual information about users. This thesis details the development and evaluation of the contextual SERL search, designed to tackle some of the challenges associated with CIR from the World Wide Web. The contextual SERL search utilises a rich contextual model that exploits implicit and explicit data to modify queries so that they more accurately reflect the user’s interests, as well as to continually build the user’s contextual profile and a shared contextual knowledge base. These profiles are used to filter results from a standard search engine to improve the relevance of the pages displayed to the user. The contextual SERL search has been tested in an observational study that captured both qualitative and quantitative data about the ability of the framework to improve the user’s web search experience. A total of 30 subjects, with different levels of search experience, participated in the observational study. The results demonstrate that when the contextual profile and the shared contextual knowledge base are used, the contextual SERL search improves search effectiveness, efficiency and subjective satisfaction. Effectiveness improves because subjects entered fewer queries to reach the target information than with the contemporary search engine. In the case of a particularly complex search task, efficiency improves because subjects browsed fewer hits, visited fewer URLs, made fewer clicks and took less time to reach the target information than with the contemporary search engine. Finally, subjects expressed a higher degree of satisfaction with the quality of contextual support when using the shared contextual knowledge base than when using their contextual profile alone. These results suggest that the integration of a user’s contextual factors and information-seeking behaviours is very important for the successful development of a CIR framework. It is believed that this framework and other similar projects will help provide the basis for the next generation of contextual information retrieval from the Web.
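The contextual SERL search itself is not reproduced here; as a toy sketch of the general idea the abstract describes — re-ranking results from a standard search engine by their overlap with a user's contextual profile — the following Python example uses illustrative profile terms, result records and a simple additive score that are assumptions, not the thesis's model:

```python
def rerank(results, profile_terms, weight=1.0):
    """Re-rank baseline search results by boosting those that overlap the user's
    contextual profile (a bag of terms built from implicit/explicit feedback)."""
    profile = {t.lower() for t in profile_terms}

    def score(result):
        terms = set(result["snippet"].lower().split())
        overlap = len(terms & profile)
        return result["engine_score"] + weight * overlap

    return sorted(results, key=score, reverse=True)

results = [
    {"url": "a.example", "snippet": "python pandas dataframe tutorial", "engine_score": 1.0},
    {"url": "b.example", "snippet": "pandas the animal in china", "engine_score": 1.2},
]
profile = {"python", "dataframe", "numpy"}
print([r["url"] for r in rerank(results, profile)])   # -> ['a.example', 'b.example']
```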
340

IDQ Viewer: För enklare visning av databasstruktur / IDQ Viewer: For easier viewing of database structure

Widinghoff, Steve, Andersson, Michael January 2008 (has links)
The social community is structured around data, and a lot of that data is stored in different types of databases. The purpose of this essay is to develop an interactive tool that will change the way database visualization is done, and to research the fields in which a tool of this type can be used. With the help of different methods we identified areas where it could be of use, such as presentation, development and documentation of a database. The primary goal of the prototype was to make the viewing of database structures easier: even people without prior database knowledge should be able to understand the structure. During development we used prototypes; there are two types of prototyping, high-fidelity and low-fidelity, and both were used. IDQ Viewer is a prototype that shows the tables and columns in a database. One of the thoughts behind the tool's functionality was that it should be platform-independent as well as location-independent. The tool proved useful in different roles: as a learning tool, a political tool and a technical tool. As a result, we can see that the usability of this tool has broadened. There is a strong public need for tools that generate this type of database visualization, and further research and development of new tools are needed in the area of database visualization.
