Global ETD Search

101	Supporting Advanced Queries on Scientific Array Data Ebenstein, Roee A. 18 December 2018 (has links) No description available. Computer Science DBMS Scientific Array Data Analytical Querying SDBMS Join Array Join Array Data Join Databases Array Databases NoDB Scientific Databases Query Optimization Plan Pruning Bitmap Join Bitmap Indexes Range Joins Data Skew Querying Skewed data
102	Visualisation de données dynamiques et complexes : des séries temporelles hiérarchiques aux graphes multicouches / Visualization of Dynamic and Complex Data : from Hierarchical Time Series to Multilayer Graphs Cuenca Pauta, Erick 12 November 2018 (has links) L'analyse de données de plus en plus complexes, volumineuses et issues de différentes sources (e.g. internet, médias sociaux, etc.) est une tâche difficile. Elle reste cependant cruciale dans de très nombreux domaines d'application. Elle implique, pour pouvoir en extraire des connaissances, de mieux comprendre la nature des données, leur évolution ou les nombreuses relations complexes qu'elles peuvent contenir. La visualisation d'informations s'intéresse aux méthodes de représentations visuelles et interactives permettant d'aider un utilisateur à extraire des connaissances. C'est dans ce contexte que se situe le travail présenté dans ce mémoire. Dans un premier temps, nous nous intéressons à la visualisation de longues séries temporelles hiérarchiques. Après avoir analysé les différentes approches existantes, nous présentons le système MultiStream permettant de visualiser, explorer et comparer l'évolution de séries organisées dans une structure hiérarchique. Nous illustrons son utilisation par deux exemples d'utilisation : émotions exprimées dans des médias sociaux et évolution des genres musicaux. Dans un second temps nous abordons la problématique de données complexes modélisées sous la forme de graphes multicouches (différentes types d'arêtes peuvent relier les n÷uds). Plus particulièrement nous nous intéressons au requêtage visuel de graphes volumineux en présentant VERTIGo un système qui permet de construire des requêtes, d'interroger un moteur spécifique, de visualiser/explorer les résultats à différentes niveaux de détail et de suggérer de nouvelles extensions de requêtes. Nous illustrons son utilisation à l'aide d'un graphe d'auteurs provenant de différentes communautés. / The analysis of data that is increasingly complex, large and from different sources (e.g. internet, social medias, etc.) is a dificult task. However, it remains crucial for many fields of application. It implies, in order to extract knowledge, to better understand the nature of the data, its evolution or the many complex relationships it may contain. Information visualization is about visual and interactive representation methods to help a user to extract knowledge. The work presented in this document takes place in this context. At first, we are interested in the visualization of large hierarchical time series. After analyzing the different existing approaches, we present the MultiStream system for visualizing, exploring and comparing the evolution of the series organized into a hierarchical structure. We illustrate its use by two examples: emotions expressed in social media and the evolution of musical genres. In a second time, we tackle the problem of complex data modeled in the form of multilayer graphs (different types of edges can connect the nodes). More specifically, we are interested in the visual querying of large graphs and we present VERTIGo, a system which makes it possible to build queries, to launch them on a specific engine, to visualize/explore the results at different levels of details and to suggest new query extensions. We illustrate its use with a graph of co-authors from different communities. Visualisation d'informations Séries temporelles hiérarchiques Requêtage visuel Graphes multicouches Information visualization Visual querying Hierarchical time series Multilayer graphs
103	Dynamic Optimization and Migration of Continuous Queries Over Data Streams Zhu, Yali 23 August 2006 (has links) "Continuous queries process real-time streaming data and output results in streams for a wide range of applications. Due to the fluctuating stream characteristics, a streaming database system needs to dynamically adapt query execution. This dissertation proposes novel solutions to continuous query adaptation in three core areas, namely dynamic query optimization, dynamic plan migration and partitioned query adaptation. Runtime query optimization needs to efficiently generate plans that satisfy both CPU and memory resource constraints. Existing work focus on minimizing intermediate query results, which decreases memory and CPU usages simultaneously. However, doing so cannot assure that both resource constraints are being satisfied, because memory and CPU can be either positively or negatively correlated. This part of the dissertation proposes efficient optimization strategies that utilize both types of correlations to search the entire query plan space in polynomial time when a typical exhaustive search would take at least exponential time. Extensive experimental evaluations have demonstrated the effectiveness of the proposed strategies. Dynamic plan migration is concerned with on-the-fly transition from one continuous plan to a semantically equivalent yet more efficient plan. It is a must to guarantee the continuation and repeatability of dynamic query optimization. However, this research area has been largely neglected in the current literature. The second part of this dissertation proposes migration strategies that dynamically migrate continuous queries while guaranteeing the integrity of the query results, meaning there are no missing, duplicate or incorrect results. The extensive experimental evaluations show that the proposed strategies vary significantly in terms of output rates and memory usages given distinct system configurations and stream workloads. Partitioned query processing is effective to process continuous queries with large stateful operators in a distributed system. Dynamic load redistribution is necessary to balance uneven workload across machines due to changing stream properties. However, existing solutions generally assume static query plans without runtime query optimization. This part of the dissertation evaluates the benefits of applying query optimization in partitioned query processing and shows dramatic performance improvement of more than 300%. Several load balancing strategies are then proposed to consider the heterogeneity of plan shapes across machines caused by dynamic query optimization. The effectiveness of the proposed strategies is analyzed through extensive experiments using a cluster." query optimization data streams runtime query adaptations continuous queries plan migration distributed query processing window constraints Querying (Computer science)
104	Avaliação experimental de uma técnica de padronização de escores de similaridade / Experimental evaluation of a similarity score standardization technique Nunes, Marcos Freitas January 2009 (has links) Com o crescimento e a facilidade de acesso a Internet, o volume de dados cresceu muito nos últimos anos e, consequentemente, ficou muito fácil o acesso a bases de dados remotas, permitindo integrar dados fisicamente distantes. Geralmente, instâncias de um mesmo objeto no mundo real, originadas de bases distintas, apresentam diferenças na representação de seus valores, ou seja, os mesmos dados no mundo real podem ser representados de formas diferentes. Neste contexto, surgiram os estudos sobre casamento aproximado utilizando funções de similaridade. Por consequência, surgiu a dificuldade de entender os resultados das funções e selecionar limiares ideais. Quando se trata de casamento de agregados (registros), existe o problema de combinar os escores de similaridade, pois funções distintas possuem distribuições diferentes. Com objetivo de contornar este problema, foi desenvolvida em um trabalho anterior uma técnica de padronização de escores, que propõe substituir o escore calculado pela função de similaridade por um escore ajustado (calculado através de um treinamento), o qual é intuitivo para o usuário e pode ser combinado no processo de casamento de registros. Tal técnica foi desenvolvida por uma aluna de doutorado do grupo de Banco de Dados da UFRGS e será chamada aqui de MeaningScore (DORNELES et al., 2007). O presente trabalho visa estudar e realizar uma avaliação experimental detalhada da técnica MeaningScore. Com o final do processo de avaliação aqui executado, é possível afirmar que a utilização da abordagem MeaningScore é válida e retorna melhores resultados. No processo de casamento de registros, onde escores de similaridades distintos devem ser combinados, a utilização deste escore padronizado ao invés do escore original, retornado pela função de similaridade, produz resultados com maior qualidade. / With the growth of the Web, the volume of information grew considerably over the past years, and consequently, the access to remote databases became easier, which allows the integration of distributed information. Usually, instances of the same object in the real world, originated from distinct databases, present differences in the representation of their values, which means that the same information can be represented in different ways. In this context, research on approximate matching using similarity functions arises. As a consequence, there is a need to understand the result of the functions and to select ideal thresholds. Also, when matching records, there is the problem of combining the similarity scores, since distinct functions have different distributions. With the purpose of overcoming this problem, a previous work developed a technique that standardizes the scores, by replacing the computed score by an adjusted score (computed through a training), which is more intuitive for the user and can be combined in the process of record matching. This work was developed by a Phd student from the UFRGS database research group, and is referred to as MeaningScore (DORNELES et al., 2007). The present work intends to study and perform an experimental evaluation of this technique. As the validation shows, it is possible to say that the usage of the MeaningScore approach is valid and return better results. In the process of record matching, where distinct similarity must be combined, the usage of the adjusted score produces results with higher quality. Armazenamento : Dados Banco : Dados Métricas : Similaridade Consulta : Similaridade Similarity querying Data integration Data cleaning Record matching Adjusted score Data quality
105	Metadata Extraction From Text In Soccer Domain Gokturk, Ozkan Ziya 01 September 2008 (has links) (PDF) Video databases and content based retrieval in these databases have become popular with the improvements in technology. Metadata extraction techniques are used for providing data to video content. One popular metadata extraction technique for mul- timedia is information extraction from text. For some domains, it is possible to &amp / #64257 / nd accompanying text with the video, such as soccer domain, movie domain and news domain. In this thesis, we present an approach of metadata extraction from match reports for soccer domain. The UEFA Cup and UEFA Champions League Match Reports are downloaded from the web site of UEFA by a web-crawler. These match reports are preprocessed by using regular expressions and then important events are extracted by using hand-written rules. In addition to hand-written rules, two di&amp / #64256 / erent machine learning techniques are applied on match corpus to learn event patterns and automatically extract match events. Extracted events are saved in an MPEG-7 &amp / #64257 / le. A user interface is implemented to query the events in the MPEG-7 match corpus and view the corresponding video segments.
106	A framework for spatio-temporal querying amongst mobile devices Cochran, Benjamin Mark, 1982- 13 August 2012 (has links) With mobile web browsers holding around eight percent of the global browser market share in terms of usage, web development for these platforms is becoming critically important as usage moves from the desktop towards mobile devices. Recent advances in client side browser technology like HTML5 and WebSockets have allowed web browser applications to approach feature parity with thick client desktop applications. This paper explores the possibility of a real-time online multiplayer game playable from just a mobile device's web browser. It does not focus on gameplay or graphics, rather it focuses on the backend infrastructure needed to support such a game. The framework devised to support this sort of interaction, Marionette, is well suited towards addressing sharing of location-specific, short-lived information between people using their smartphones without the use of any external software or proprietary software packages on the client side. / text iOS Mobile Web Gaming HTML5 WebSockets Juggernaut Marionette Spatio-temporal Querying Space Android Sinatra Ruby Voronoi Tesselation
107	Natural Language Interface On A Video Data Model Erozel, Guzen 01 July 2005 (has links) (PDF) The video databases and retrieval of data from these databases have become popular in various business areas of work with the improvements in technology. As a kind of video database, video archive systems need user-friendly interfaces to retrieve video frames. In this thesis, an NLP based user interface to a video database system is developed using a content-based spatio-temporal video data model. The data model is focused on the semantic content which includes objects, activities, and spatial properties of objects. Spatio-temporal relationships between video objects and also trajectories of moving objects can be queried with this data model. In this video database system, NL interface enables flexible querying. The queries, which are given as English sentences, are parsed using Link Parser. Not only exact matches but similar objects and activities are also returned from the database with the help of the conceptual ontology module to return all related frames to the user. This module is implemented using a distance-based method of semantic similarity search on the semantic domain-independent ontology, WordNet. The semantic representations of the given queries are extracted from their syntactic structures using information extraction techniques. The extracted semantic representations are used to call the related parts of the underlying spatio-temporal video data model to calculate the results of the queries. QA Computer Software 76.75-76.765
108	Human-centered semantic retrieval in multimedia databases Chen, Xin. January 2008 (has links) (PDF) Thesis (Ph. D.)--University of Alabama at Birmingham, 2008. / Additional advisors: Barrett R. Bryant, Yuhua Song, Alan Sprague, Robert W. Thacker. Description based on contents viewed Oct. 8, 2008; title from PDF t.p. Includes bibliographical references (p. 172-183).
109	A personalised query expansion approach using context Seher, Indra. January 2007 (has links) Thesis (Ph.D.)--University of Western Sydney, 2007. / A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy to the College of Health & Science, School of Computing and Mathematics, University of Western Sydney. Includes bibliography.
110	Towards effective analysis of big graphs : from scalability to quality Tian, Chao January 2017 (has links) This thesis investigates the central issues underlying graph analysis, namely, scalability and quality. We first study the incremental problems for graph queries, which aim to compute the changes to the old query answer, in response to the updates to the input graph. The incremental problem is called bounded if its cost is decided by the sizes of the query and the changes only. No matter how desirable, however, our first results are negative: for common graph queries such as graph traversal, connectivity, keyword search and pattern matching, their incremental problems are unbounded. In light of the negative results, we propose two new characterizations for the effectiveness of incremental computation, and show that the incremental computations above can still be effectively conducted, by either reducing the computations on big graphs to small data, or incrementalizing batch algorithms by minimizing unnecessary recomputation. We next study the problems with regards to improving the quality of the graphs. To uniquely identify entities represented by vertices in a graph, we propose a class of keys that are recursively defined in terms of graph patterns, and are interpreted with subgraph isomorphism. As an application, we study the entity matching problem, which is to find all pairs of entities in a graph that are identified by a given set of keys. Although the problem is proved to be intractable, and cannot be parallelized in logarithmic rounds, we provide two parallel scalable algorithms for it. In addition, to catch numeric inconsistencies in real-life graphs, we extend graph functional dependencies with linear arithmetic expressions and comparison predicates, referred to as NGDs. Indeed, NGDs strike a balance between expressivity and complexity, since if we allow non-linear arithmetic expressions, even of degree at most 2, the satisfiability and implication problems become undecidable. A localizable incremental algorithm is developed to detect errors using NGDs, where the cost is determined by small neighbors of nodes in the updates instead of the entire graph. Finally, a rule-based method to clean graphs is proposed. We extend graph entity dependencies (GEDs) as data quality rules. Given a graph, a set of GEDs and a block of ground truth, we fix violations of GEDs in the graph by combining data repairing and object identification. The method finds certain fixes to errors detected by GEDs, i.e., as long as the GEDs and the ground truth are correct, the fixes are assured correct as their logical consequences. Several fundamental results underlying the method are established, and an algorithm is developed to implement the method. We also parallelize the method and guarantee to reduce its running time with the increase of processors.

Search results