Global ETD Search

131	CircularTrip and ArcTrip: effective grid access methods for continuous spatial queries
Cheema, Muhammad Aamir, Computer Science & Engineering, Faculty of Engineering, UNSW. January 2007
A k nearest neighbor (kNN) query q retrieves the k objects that lie closest to the query point q among a given set of objects P. With the availability of inexpensive location-aware mobile devices, the continuous monitoring of such queries has gained a lot of attention, and many methods have been proposed for continuously monitoring kNNs in highly dynamic environments. Multiple continuous queries require real-time results, and both the objects and the queries issue frequent location updates. The most popular spatial index, the R-tree, is not suitable for continuous monitoring of these queries because it handles frequent updates inefficiently. Recently, the interest of the database community has been shifting towards grid-based indexes for continuous queries because of their simplicity and efficient update handling. For kNN queries, the order in which the cells of the grid are accessed is very important. In this research, we present two efficient and effective grid access methods, CircularTrip and ArcTrip, that ensure that the number of cells visited for any continuous kNN query is minimum. Our extensive experimental study demonstrates that the CircularTrip-based continuous kNN algorithm outperforms existing approaches in terms of both efficiency and space requirement. Moreover, we show that CircularTrip and ArcTrip can be used for many other variants of nearest neighbor queries, such as constrained nearest neighbor queries, farthest neighbor queries and (k + m)-NN queries. All the algorithms presented for these queries preserve the properties that they visit the minimum number of cells for each query and that the space requirement is low. Our proposed techniques are flexible and efficient and can be used to answer any query that is a hybrid of the above-mentioned queries. For example, our algorithms can easily be used to efficiently monitor a (k + m) farthest neighbor query in a constrained region, with the flexibility that the spatial conditions that constrain the region can be changed by the user at any time.
Keywords: Nearest neighbor analysis (Statistics); Querying (Computer science); Query languages (Computer science); Data structures (Computer science); Real-time data processing; Computational grids (Computer systems); Computer algorithms
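Illustrative note (not part of the thesis record): the abstract for record 131 stresses that the order in which grid cells are accessed determines how many cells a kNN query touches. The sketch below is not CircularTrip or ArcTrip; it is a generic best-first traversal, assumed here only to show the idea of visiting cells in increasing order of their minimum distance to q and stopping once no remaining cell can hold a closer object. All function names and the `cell_size` parameter are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def cell_of(p, cell_size):
    """Index of the square grid cell that contains point p."""
    return (int(p[0] // cell_size), int(p[1] // cell_size))


def build_grid(points, cell_size):
    """Hash every (x, y) point into its grid cell."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def min_dist(cell, q, cell_size):
    """Smallest possible distance from q to any location inside the cell."""
    cx, cy = cell
    dx = max(cx * cell_size - q[0], 0.0, q[0] - (cx + 1) * cell_size)
    dy = max(cy * cell_size - q[1], 0.0, q[1] - (cy + 1) * cell_size)
    return math.hypot(dx, dy)


def knn(grid, q, k, cell_size):
    """Return (the k nearest points sorted by distance, number of cells visited)."""
    start = cell_of(q, cell_size)
    # Bounding box of occupied cells plus the query cell keeps the search finite.
    xs = [c[0] for c in grid] + [start[0]]
    ys = [c[1] for c in grid] + [start[1]]
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)

    frontier = [(0.0, start)]   # min-heap of (min_dist, cell)
    seen = {start}
    best = []                   # max-heap via negated distance: (-dist, point)
    visited = 0
    while frontier:
        d, cell = heapq.heappop(frontier)
        # Every unvisited cell is at least d away; stop if that cannot help.
        if len(best) == k and d > -best[0][0]:
            break
        visited += 1
        for p in grid.get(cell, []):
            dist = math.dist(p, q)
            if len(best) < k:
                heapq.heappush(best, (-dist, p))
            elif dist < -best[0][0]:
                heapq.heapreplace(best, (-dist, p))
        cx, cy = cell
        for n in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if n not in seen and lo_x <= n[0] <= hi_x and lo_y <= n[1] <= hi_y:
                seen.add(n)
                heapq.heappush(frontier, (min_dist(n, q, cell_size), n))
    return [p for _, p in sorted(best, key=lambda t: -t[0])], visited
```

With a handful of points on a 10 x 10 grid and q at the origin, this traversal touches only the few cells around q rather than all 100, which is the kind of saving the abstract refers to.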
132	Contribution à l'interrogation flexible et personnalisée d'objets complexes modélisés par des graphes / Flexible and Personalized Querying of Complex Objects Modeled by Graphs
Abbaci, Katia. 12 December 2013
Several application domains deal with complex objects and data for which the structure and semantics of the components are crucial to handling and exploiting them. Graph structures have therefore often been adopted as the representation model in these domains, since they capture a maximum of information about the structure, semantics and behavior of such objects, which is necessary for effective representation and processing. Thus, when two complex objects are compared, a matching operation is applied between the graphs that model them. In this thesis, we are interested in approximate matching, which selects the graphs most similar to a query graph. The aim of our work is to contribute to flexible and personalized querying of complex objects modeled as graphs, in order to identify the graphs most relevant to the user's needs, which are often expressed partially or imprecisely. First, we propose a framework for selecting Web services modeled as graphs, which relies on both preference satisfiability and structural similarity between process model graphs. This approach makes it possible (i) to improve the matching process by integrating user preferences and the structural aspect of the compared graphs, and (ii) to return the most relevant services. A second method for evaluating graph similarity queries is also presented. It computes the graph similarity skyline of a user query by considering a vector of several graph distance measures instead of a single measure. Thus, the graphs that are maximally similar to the query graph are returned in an ordered way. Finally, refinement methods have been developed to reduce the size of the skyline, which is often large. They aim to identify and order the skyline points that best match the user query.
Keywords: Querying databases; Research strategies in databases; Information retrieval; Fuzzy set theory; Linguistic quantifiers; Queries; Web services; Graph theory
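Illustrative note (not part of the thesis record): record 132 describes a skyline computed over a vector of several graph distance measures. A minimal sketch of that idea, assuming the distances from each candidate graph to the query graph have already been computed by some means, is the standard Pareto-dominance filter below. The function names and the sample distance vectors are hypothetical, and the thesis's own refinement methods are not reproduced.

```python
def dominates(a, b):
    """a dominates b: no larger on every distance, strictly smaller on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(candidates):
    """candidates maps a graph id to its tuple of distances to the query graph.

    Returns the candidates that no other candidate dominates.
    """
    return {
        g: d
        for g, d in candidates.items()
        if not any(dominates(other, d) for h, other in candidates.items() if h != g)
    }


# Hypothetical distance vectors (for example, an edit distance and a
# structural distance, both normalised to [0, 1]).
distances = {
    "g1": (0.1, 0.5),
    "g2": (0.3, 0.2),
    "g3": (0.4, 0.6),  # worse than g1 and g2 on both measures
    "g4": (0.1, 0.5),  # identical to g1, so neither dominates the other
}
result = skyline(distances)
```

Here g3 is dropped because g1 is closer to the query on every measure, while g1, g2 and g4 remain incomparable. This quadratic filter is only meant for small candidate sets.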
133	Semantic Process Engineering – Konzeption und Realisierung eines Werkzeugs zur semantischen Prozessmodellierung [Semantic Process Engineering: design and implementation of a tool for semantic process modelling]
Fellmann, Michael. 23 October 2013
Semi-formal, graphical representations have become established in business process modelling. The elements in these models are labelled in natural language using terms borrowed from business terminology, which leaves room for interpretation. The semantics of the individual model elements therefore cannot be interpreted unambiguously by either humans or machines. This dissertation therefore presents the design and implementation of a semantic process modelling approach that links semi-formal process modelling with formal concept systems (ontologies) and supports this link with tooling. Through this link, the individual model elements gain unambiguous, machine-processable semantics. This makes it possible to apply the inferences that formal ontologies allow, for example to obtain more precise or more complete results when searching model repositories or checking correctness. As a result, the knowledge representation approaches established in computer science and artificial intelligence, in particular description logic, are embedded in domain-level process modelling. The concept is tested by means of a prototypical implementation, which demonstrates technical feasibility and was also used in a laboratory experiment for evaluation.
Keywords: Process Modelling; Business Process Management; SPARQL; OWL; Semantic Wiki; Validation; Verification; Querying; Annotation; Ontologies; Semantic Web
Classification: 85.00 - Business administration: general; 54.72 - Artificial intelligence; ddc:330; ddc:000
134	Pharmacodynamics miner: an automated extraction of pharmacodynamic drug interactions
Lokhande, Hrishikesh. Indiana University-Purdue University Indianapolis (IUPUI). 11 December 2013
Pharmacodynamics (PD) studies the relationship between drug concentration and drug effect on target sites. This field has recently gained attention because studies involving PD drug-drug interactions (DDI) promise the discovery of multi-targeted drug agents and novel efficacious drug combinations. A PD drug combination can be synergistic, additive or antagonistic, depending on the summed effect of the drug combination at a target site. The PD literature has grown immensely and most of its knowledge is dispersed across different scientific journals, so the manual identification of PD DDI is a challenge. To support an automated means of extracting PD DDI, we propose Pharmacodynamics Miner (PD-Miner). PD-Miner is a text-mining tool that is capable of identifying PD DDI from in vitro PD experiments. It is powered by two major features: a collection of full-text articles and an in vitro PD ontology. The in vitro PD ontology currently has four classes and more than a hundred subclasses; the full-text corpus is annotated based on these classes and subclasses. The annotated full-text corpus forms a database of articles, which can be queried by drug keywords and ontology subclasses. Since the ontology covers term and concept meanings, the system is capable of formulating semantic queries. PD-Miner extracts in vitro PD DDI based on references to cell lines and cell phenotypes. The results take the form of sentence fragments in which important concepts are visually highlighted. To determine the accuracy of the system, we used a gold standard of five expert-curated articles. PD-Miner identified DDI with a recall of 75% and a precision of 46.55%. Along with the development of PD-Miner, we also report the development of a semantically annotated in vitro PD corpus. This corpus includes term-level and sentence-level annotations and serves as a gold standard for future text mining.
Keywords: Drug interactions -- Research; Drugs -- Physiological effect; Bioinformatics -- Research -- Analysis; Semantics -- Data processing; Semantics -- Network analysis; Text processing (Computer science); Computational linguistics; Computational complexity; Pharmacokinetics; Querying (Computer science) -- Research
135	Big Graph Processing : Partitioning and Aggregated Querying / Traitement des graphes massifs : partitionnement et requêtage agrégatif
Echbarthi, Ghizlane. 23 October 2017
With the advent of big data, there have been many repercussions in all fields of information technology, calling for innovative solutions that strike the best compromise between cost and accuracy. In graph theory, where graphs provide a powerful modeling tool for formalizing problems ranging from the simplest to the most complex, research on NP-complete or NP-hard problems is turning instead towards approximate solutions, bringing approximation algorithms and heuristics to the fore, while exact solutions become extremely expensive and impractical. In this thesis we address two main problems. First, the graph partitioning problem is approached from a big data perspective, where massive graphs are partitioned in streaming fashion. We study and propose several streaming partitioning models and evaluate their performance both theoretically and empirically. Second, we are interested in querying distributed/partitioned graphs. In this context, we study the problem of aggregated search in graphs, which aims to answer queries that span several graph fragments and to reconstruct the final answer so as to obtain an approximate match with the initial query.
Keywords: Graph querying; Graph matching; Graph similarity metric; Aggregated search; Graph partitioning; Streaming partitioning; Streaming heuristics; Balanced graph partitioning
Classification: 004
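Illustrative note (not part of the thesis record): record 135 mentions partitioning massive graphs in streaming fashion but does not name the heuristics it studies. As a stand-in, the sketch below uses the well-known Linear Deterministic Greedy rule, which places each arriving vertex in the partition holding most of its already-seen neighbours, discounted by how full that partition is. It should not be read as the thesis's own models; the function name, the `capacity` parameter and the sample stream are assumptions.

```python
def ldg_partition(stream, num_parts, capacity):
    """One-pass streaming partitioner (Linear Deterministic Greedy).

    stream yields (vertex, neighbours) pairs in arrival order.
    Assumes num_parts * capacity is at least the number of vertices.
    """
    assignment = {}
    sizes = [0] * num_parts

    for vertex, neighbours in stream:
        def score(i):
            # Neighbours already placed in partition i, damped by its load.
            shared = sum(1 for n in neighbours if assignment.get(n) == i)
            return shared * (1.0 - sizes[i] / capacity)

        open_parts = [i for i in range(num_parts) if sizes[i] < capacity]
        # Highest score wins; ties go to the least-loaded partition.
        best = max(open_parts, key=lambda i: (score(i), -sizes[i]))
        assignment[vertex] = best
        sizes[best] += 1
    return assignment


# Two disjoint triangles arriving one vertex at a time.
stream = [
    ("a", ["b", "c"]), ("b", ["a", "c"]), ("c", ["a", "b"]),
    ("d", ["e", "f"]), ("e", ["d", "f"]), ("f", ["d", "e"]),
]
parts = ldg_partition(stream, num_parts=2, capacity=3)
```

On this toy stream each triangle ends up in its own partition, so no edge is cut and both partitions are balanced. Real streams rarely arrive in such a convenient order, which is why heuristics of this kind are evaluated empirically.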
136	Evaluation of Queries on Linked Distributed XML Data / Auswertung von Anfragen an verteilte, verlinkte XML Daten
Behrends, Erik. 18 December 2006
No description available.
Keywords: XML; XLink; XPointer; XML Querying; XPath; XQuery
Classification: 004 Computer science; Mathematics and Computer Science; 54.55; 54.64
137	Agrégation de classements avec égalités : algorithmes, guides à l'utilisateur et applications aux données biologiques / Rank aggregation with ties : algorithms, user guidance et applications to biologicals data
Brancotte, Bryan. 25 September 2015
The rank aggregation problem is to build a consensus among a set of rankings (ordered elements). Although this problem has numerous applications (consensus among user votes, consensus between results ordered differently by different search engines, and so on), computing an exact consensus is rarely feasible in real applications (the problem is NP-hard). Many approximation algorithms and heuristics have therefore been designed. However, their performance (in time and in quality of the result produced) varies widely and depends on the datasets to be aggregated. Several studies have compared these algorithms, but they have generally not considered the case (although common in real datasets) of ties between elements in rankings (elements placed at the same rank). Choosing a suitable consensus algorithm for a given dataset is therefore a particularly important problem to study (many applications), and it is an open problem in the sense that none of the existing studies answers it. More formally, a consensus ranking is a ranking that minimizes the sum of the distances between this consensus and each of the input rankings. Like much of the state of the art, we considered the generalized Kendall-Tau distance, as well as variants, in our studies.
More precisely, this thesis makes three contributions. First, we propose new complexity results for the cases encountered in real data, where rankings may be incomplete and where several elements may be tied. We isolate the different "parameters" that can explain variations in the results produced by aggregation algorithms (for example, use of the generalized Kendall-Tau distance or of variants, pre-processing of the datasets by unification or projection). We propose a guide to characterize the user's context and needs, in order to guide the choice of both a pre-processing of the data and the distance to use to compute the consensus. Finally, we adapt existing algorithms to this new context.
Second, we evaluate these algorithms on a large and varied collection of datasets, both real and synthetic, the latter reproducing real characteristics such as similarity between rankings, the presence of ties, and different pre-processings. This large evaluation relies on a new method for generating synthetic data with similarities, based on Markov chain modeling. The evaluation made it possible to isolate the dataset characteristics that affect the performance of aggregation algorithms and to design a guide that characterizes a user's needs and advises on which algorithm to prefer. A web platform for reproducing and extending these analyses is available (rank-aggregation-with-ties.lri.fr).
Finally, we demonstrate the value of the rank aggregation approach in two use cases. We propose a tool that reformulates users' textual queries on the fly using biomedical terminologies, then queries biological databases, and finally produces a consensus of the results obtained for each reformulation (conqur-bio.lri.fr). We compare the tool with the reference platform and show a clear improvement in the quality of the results. We also compute consensuses between lists of workflows established by experts in the context of similarity between scientific workflows. We observe that the computed consensuses agree with the experts in a large proportion of cases.
Keywords: Rank aggregation; Preference aggregation; Top-k; Topk; Optimal Kemeny ranking; Exact solution; Guidance; Benchmark; Complexity results; Querying biomedical sources; NP-Hard
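Illustrative note (not part of the thesis record): record 137 defines a consensus as the ranking that minimizes the sum of distances to the input rankings, measured with the generalized Kendall-Tau distance. The sketch below shows one common formulation of that distance for complete rankings with ties: a pair of elements costs 1 when the two rankings order it in opposite ways and costs `tie_penalty` when it is tied in exactly one of them. The default of 1.0 for `tie_penalty`, the list-of-buckets encoding and the function names are assumptions made for illustration; the thesis's variants and its handling of incomplete rankings are not covered.

```python
from itertools import combinations


def positions(ranking):
    """Map each element to the index of its bucket.

    A ranking is a list of buckets; elements in the same bucket are tied.
    """
    return {e: i for i, bucket in enumerate(ranking) for e in bucket}


def generalized_kendall_tau(r1, r2, tie_penalty=1.0):
    """Distance between two rankings with ties over the same elements."""
    p1, p2 = positions(r1), positions(r2)
    dist = 0.0
    for a, b in combinations(sorted(p1), 2):
        d1 = p1[a] - p1[b]
        d2 = p2[a] - p2[b]
        if d1 * d2 < 0:                 # ordered in opposite ways
            dist += 1.0
        elif (d1 == 0) != (d2 == 0):    # tied in exactly one ranking
            dist += tie_penalty
    return dist


def kemeny_score(candidate, rankings, tie_penalty=1.0):
    """Sum of distances from a candidate consensus to every input ranking."""
    return sum(generalized_kendall_tau(candidate, r, tie_penalty) for r in rankings)


r1 = [["A"], ["B", "C"]]       # A first, then B and C tied
r2 = [["B"], ["A"], ["C"]]     # B, then A, then C
d = generalized_kendall_tau(r1, r2)
```

For r1 and r2 the pair (A, B) is inverted and the pair (B, C) is tied only in r1, so the distance is 2.0 with the default penalty. An exact consensus would be the candidate with the lowest `kemeny_score` over all possible rankings with ties, which is the search the abstract describes as NP-hard.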
