21

Editing model based on the object-oriented approach

Watanabe, Toyohide, Yoshida, Yuuji, Fukumura, Teruo 10 1900 (has links)
No description available.
22

Learning with Markov logic networks : transfer learning, structure learning, and an application to Web query disambiguation

Mihalkova, Lilyana Simeonova 18 March 2011 (has links)
Traditionally, machine learning algorithms assume that training data is provided as a set of independent instances, each of which can be described as a feature vector. In contrast, many domains of interest are inherently multi-relational, consisting of entities connected by a rich set of relations. For example, the participants in a social network are linked by friendships, collaborations, and shared interests. Likewise, the users of a search engine are related by searches for similar items and clicks to shared sites. The ability to model and reason about such relations is essential not only because better predictive accuracy is achieved by exploiting this additional information, but also because frequently the goal is to predict whether a set of entities are related in a particular way.

This thesis falls within the area of Statistical Relational Learning (SRL), which combines ideas from two traditions within artificial intelligence, first-order logic and probabilistic graphical models, to address the challenge of learning from multi-relational data. We build on one particular SRL model, Markov logic networks (MLNs), which consist of a set of weighted first-order logic formulae and provide a principled way of defining a probability distribution over possible worlds. We develop algorithms for learning MLN structure both from scratch and by transferring a previously learned model, as well as an application of MLNs to the problem of Web query disambiguation. The ideas we present are unified by two main themes: the need to deal with limited training data and the use of bottom-up learning techniques.

Structure learning, the task of automatically acquiring a set of dependencies among the relations in the domain, is a central problem in SRL. We introduce BUSL, an algorithm for learning MLN structure from scratch that proceeds in a more bottom-up fashion, breaking away from the tradition of top-down learning typical in SRL. Our approach first constructs a novel data structure called a Markov network template that is used to restrict the search space for clauses. Our experiments in three relational domains demonstrate that BUSL dramatically reduces the search space for clauses and attains significantly higher accuracy than a structure learner that follows a top-down approach.

Accurate and efficient structure learning can also be achieved by transferring a model obtained in a source domain related to the current target domain of interest. We view transfer as a revision task and present an algorithm that diagnoses a source MLN to determine which of its parts transfer directly to the target domain and which need to be updated. This analysis focuses the search for revisions on the incorrect portions of the source structure, thus speeding up learning. Transfer learning is particularly important when target-domain data is limited, such as when data on only a few individuals is available from domains with hundreds of entities connected by a variety of relations. We also address this challenging case and develop a general transfer learning approach that makes effective use of such limited target data in several social network domains.

Finally, we develop an application of MLNs to the problem of Web query disambiguation in a more privacy-aware setting, where the only information available about a user is that captured in a short search session of 5-6 previous queries on average. This setting contrasts with previous work, which typically assumes the availability of long user-specific search histories. To compensate for the scarcity of user-specific information, our approach exploits the relations between users, search terms, and URLs. We demonstrate the effectiveness of our approach in the presence of noise and show that it outperforms several natural baselines on a large data set collected from the MSN search engine.
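For context, the probability distribution that a Markov logic network defines over possible worlds is usually written as follows; this is the standard published formulation of MLNs, given here only as background and not quoted from the thesis itself:

```latex
% Standard MLN joint distribution (general background, assumed formulation):
% each first-order formula F_i carries a weight w_i, n_i(x) counts the true
% groundings of F_i in the possible world x, and Z normalizes over all worlds.
P(X = x) \;=\; \frac{1}{Z}\,\exp\!\Big(\sum_i w_i\, n_i(x)\Big),
\qquad
Z \;=\; \sum_{x'} \exp\!\Big(\sum_i w_i\, n_i(x')\Big)
```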
23

Modèles d'embeddings à valeurs complexes pour les graphes de connaissances / Complex-Valued Embedding Models for Knowledge Graphs

Trouillon, Théo 29 September 2017 (has links)
The explosion of widely available relational data in the form of knowledge graphs has enabled many applications, including automated personal agents, recommender systems and enhanced web search results. The very large size and notorious incompleteness of these databases call for automatic knowledge graph completion methods to make these applications viable.

Knowledge graph completion, also known as link prediction, deals with automatically understanding the structure of large knowledge graphs (labeled directed graphs) to predict missing entries (labeled edges). An increasingly popular approach consists in representing knowledge graphs as third-order tensors and using tensor factorization methods to predict their missing entries. State-of-the-art factorization models propose different trade-offs between modeling expressiveness and time and space complexity. We introduce a new model, ComplEx (for Complex Embeddings), to reconcile both expressiveness and complexity through the use of complex-valued factorization, and explore its link with unitary diagonalization. We corroborate our approach theoretically and show that all possible knowledge graphs can be exactly decomposed by the proposed model. Our approach based on complex embeddings is arguably simple, as it only involves a complex-valued trilinear product, whereas other methods resort to more and more complicated composition functions to increase their expressiveness. The proposed ComplEx model is scalable to large data sets as it remains linear in both space and time, while consistently outperforming alternative approaches on standard link-prediction benchmarks. We also demonstrate its ability to learn useful vectorial representations for other tasks, by enhancing word embeddings that improve performance on the natural language problem of entailment recognition between pairs of sentences.

In the last part of this thesis, we explore factorization models' ability to learn relational patterns from observed data. Because of their vectorial nature, it is not only hard to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to gain insight into their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit, first, atomic properties of knowledge graph relations, and then, common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models, including ComplEx.
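To make the complex-valued trilinear product concrete, here is a minimal numpy sketch of a ComplEx-style scoring function; the rank and the random embeddings are invented for illustration, and no training is shown:

```python
import numpy as np

def complex_trilinear_score(e_s, w_r, e_o):
    """ComplEx-style score: the real part of the complex trilinear product
    <w_r, e_s, conj(e_o)>. All arguments are complex vectors of equal length."""
    return float(np.real(np.sum(w_r * e_s * np.conj(e_o))))

# Toy example with random rank-4 complex embeddings (illustrative values only).
rng = np.random.default_rng(0)
rank = 4
e_s = rng.normal(size=rank) + 1j * rng.normal(size=rank)  # subject entity embedding
w_r = rng.normal(size=rank) + 1j * rng.normal(size=rank)  # relation embedding
e_o = rng.normal(size=rank) + 1j * rng.normal(size=rank)  # object entity embedding

# A higher score ranks the triple (subject, relation, object) as more plausible.
print(complex_trilinear_score(e_s, w_r, e_o))
```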
24

COMOVI: um framework para transformação de dados em aplicações de credit behavior scoring baseado no desenvolvimento dirigido por modelos / COMOVI: a framework for data transformation in credit behavior scoring applications based on model-driven development

OLIVEIRA NETO, Rosalvo Ferreira de 11 December 2015 (has links)
The pre-processing stage in knowledge discovery projects is costly, generally taking between 50 and 80% of the total project time. It is in this stage that data in a relational database are transformed for the application of a data mining technique. This stage is a complex task that demands strong interaction between database designers and experts with broad knowledge of the application domain. The frameworks that aim to systematize the data transformation stage have significant limitations when applied to behavioral solutions such as Credit Behavior Scoring. The goal of these solutions is to help financial institutions decide whether to grant credit to consumers based on the credit risk of their requests. This work proposes a framework based on Model-Driven Development to systematize this stage in Credit Behavior Scoring solutions. It is composed of a meta-model that maps the domain concepts and a set of transformation rules. The work has three main contributions: 1) improving the discriminant power of data mining techniques by constructing new input variables that embed new knowledge for the technique; 2) reducing the time of data transformation through automatic code generation; and 3) allowing artificial intelligence and statistics modelers to perform the data transformation without the help of database experts.

In order to validate the proposed framework, two comparative studies were conducted. First, the performance of the main frameworks found in the literature and of the proposed framework was compared on two databases: one from a well-known benchmark of an international competition organized by PKDD, and another obtained from one of the biggest retail companies in Brazil, which has its own private-label credit card. The RelAggs and Correlation-based Multiple View Validation frameworks were chosen as representatives of the propositional and relational data mining approaches, respectively. The comparison was carried out through a stratified 10-fold cross-validation process in order to define the confidence intervals. The results show that the proposed framework delivers performance equivalent or superior to that of the existing frameworks, as measured by the area under the ROC curve, using a Multilayer Perceptron neural network, k-nearest neighbors and Random Forest as classifiers, with a confidence level of 95%. The second study verified the reduction in time required for data transformation when using the proposed framework. For this, seven teams composed of students from a Brazilian university measured the duration of this stage with and without the proposed framework. The paired Wilcoxon signed-rank test showed that the proposed framework reduces the transformation time, with a confidence level of 95%.
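The evaluation protocol can be sketched generically with scikit-learn and scipy; the sketch below uses synthetic data and an arbitrary classifier as stand-ins (the thesis's real experiments used the PKDD and retail databases with MLP, k-NN and Random Forest), and it pairs fold-wise AUC estimates with a Wilcoxon signed-rank test purely for illustration:

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-ins for two alternative data transformations of the same problem
# (e.g. a baseline feature set vs. one enriched with framework-generated variables).
X_baseline, y = make_classification(n_samples=500, n_features=10, random_state=0)
extra = np.random.default_rng(0).normal(size=(500, 5))
X_enriched = np.hstack([X_baseline, extra])

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
clf = RandomForestClassifier(random_state=0)

# Fold-wise area under the ROC curve for each transformation.
auc_baseline = cross_val_score(clf, X_baseline, y, cv=cv, scoring="roc_auc")
auc_enriched = cross_val_score(clf, X_enriched, y, cv=cv, scoring="roc_auc")

# Paired Wilcoxon signed-rank test over the matched fold scores.
stat, p_value = wilcoxon(auc_enriched, auc_baseline)
print(auc_baseline.mean(), auc_enriched.mean(), p_value)
```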
25

Temporal Change in the Power Production of Real-world Photovoltaic Systems Under Diverse Climatic Conditions

Hu, Yang 08 February 2017 (has links)
No description available.
26

An approach to automate the adaptor software generation for tool integration in Application/Product Lifecycle Management tool chains.

Singh, Shikhar January 2016 (has links)
An emerging problem in organisations is that a large number of tools store data and need to communicate with each other frequently throughout the application or product development process. However, there is no means of communication between them without the intervention of a central entity (usually a server) or without storing the schemas in a central repository. Accessing data across tools and linking them is difficult and resource intensive. As part of this thesis, we develop a piece of software (also referred to as the 'adaptor'), which, when implemented in the lifecycle management systems, integrates data seamlessly. This eliminates the need to store database schemas in a central repository and makes the process of accessing data within tools less resource intensive. The adaptor acts as a wrapper around the tools and allows them to communicate directly with each other and exchange data. When the developed adaptor is used to communicate data between tools, the data in the relational databases is first converted into RDF format and then sent or received. Hence, RDF forms the crucial underlying concept on which the software is based. The Resource Description Framework (RDF) provides data integration irrespective of the underlying schemas by treating data as resources and representing them as URIs. RDF is a data model used for the exchange and communication of data on the Internet, and it can be applied to other real-world problems such as tool integration and the automation of communication between relational databases.

However, developing this adaptor for every tool requires understanding the individual schema and structure of each tool's database, which demands considerable effort from the adaptor's developer. The main aim of the thesis is therefore to automate the development of such adaptors. With this automation, the need to manually assess each database and then develop an adaptor specific to it is eliminated. Such adaptors and concepts can be used to implement similar solutions in other organisations facing similar problems. The output of the thesis is an approach which automates the process of generating these adaptors.
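To illustrate the relational-to-RDF mapping described above, here is a minimal rdflib sketch; the example.org namespace, the table name and the column values are invented for the example, and the thesis's actual adaptor-generation logic is not reproduced:

```python
from rdflib import Graph, Literal, Namespace

# Hypothetical base namespace for resources exposed by one tool.
EX = Namespace("http://example.org/toolA/")

def row_to_rdf(graph, table, primary_key, row):
    """Map one relational row (a dict of column -> value) to RDF triples:
    the row becomes a resource identified by a URI, each column a predicate."""
    subject = EX[f"{table}/{row[primary_key]}"]
    for column, value in row.items():
        graph.add((subject, EX[column], Literal(value)))
    return subject

g = Graph()
row_to_rdf(g, "requirements", "id", {"id": 42, "title": "Brake latency", "status": "open"})
print(g.serialize(format="turtle"))  # RDF payload that an adaptor could exchange
```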
27

Organisation et exploitation des connaissances sur les réseaux d'intéractions biomoléculaires pour l'étude de l'étiologie des maladies génétiques et la caractérisation des effets secondaires de principes actifs / Organization and exploitation of biological molecular networks for studying the etiology of genetic diseases and for characterizing drug side effects

Bresso, Emmanuel 25 September 2013 (has links)
Understanding human diseases and drug mechanisms today requires taking molecular interaction networks into account. Recent studies on biological systems are producing increasing amounts of data. However, the complexity and heterogeneity of these datasets make it difficult to exploit them for understanding atypical phenotypes or drug side-effects. This thesis presents two knowledge-based integrative approaches that combine data management, graph visualization and data mining techniques in order to improve our understanding of phenotypes associated with genetic diseases or of drug side-effects. Data management relies on a generic data warehouse, NetworkDB, that integrates data on proteins and their properties. Customization of the NetworkDB model and regular updates are semi-automatic. Graph visualization techniques have been coupled with NetworkDB. This approach has facilitated access to biological network data in order to study genetic disease etiology, including X-linked intellectual disability (XLID). Meaningful sub-networks of genes have thus been identified and characterized. Drug side-effect profiles have been extracted from NetworkDB and subsequently characterized by a relational learning procedure coupled with NetworkDB. The resulting rules indicate which properties of drugs and their targets (including membership in biological networks) preferentially associate with a particular side-effect profile.
28

Enhancing supervised learning with complex aggregate features and context sensitivity / Amélioration de l'apprentissage supervisé par l'utilisation d'agrégats complexes et la prise en compte du contexte

Charnay, Clément 30 June 2016 (has links)
In this thesis, we study model adaptation in supervised learning. Firstly, we adapt existing learning algorithms to the relational representation of data. Secondly, we adapt learned prediction models to context change. In the relational setting, data is modeled by multiple entities linked by relationships. We handle these relationships using complex aggregate features. We propose stochastic optimization heuristics to include complex aggregates in relational decision trees and Random Forests, and assess their predictive performance on real-world datasets. We adapt prediction models to two kinds of context change. Firstly, we propose an algorithm to tune thresholds on pairwise scoring models to adapt to a change of misclassification costs. Secondly, we reframe numerical attributes with affine transformations to adapt to a change of attribute distribution between a learning and a deployment context. Finally, we extend these transformations to complex aggregates.
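As an illustration of what a complex aggregate feature is (the entities and attributes below are invented for the example), the sketch applies an aggregate function to the related rows that satisfy a selection condition; features of this kind are what such heuristics search for when growing relational decision trees and Random Forests:

```python
from dataclasses import dataclass
from typing import List

# Toy relational setting: a main entity (customer) linked to many secondary
# entities (transactions). Names are hypothetical.
@dataclass
class Transaction:
    amount: float
    category: str

def complex_aggregate(transactions: List[Transaction], category: str, threshold: float) -> int:
    """Complex aggregate: COUNT over the related rows satisfying a condition
    (here: category matches and amount exceeds a threshold)."""
    return sum(1 for t in transactions if t.category == category and t.amount > threshold)

history = [Transaction(120.0, "electronics"),
           Transaction(15.5, "food"),
           Transaction(300.0, "electronics")]

# A node of a relational decision tree could test this feature, e.g.
# "number of electronics transactions above 100" -> 2 for this customer.
print(complex_aggregate(history, "electronics", 100.0))
```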
29

La visualisation d’information pour les données massives : une approche par l’abstraction de données / Information visualization for big data : a data abstraction approach

Sansen, Joris 04 July 2017 (has links)
The evolution and spread of technologies have led to a real explosion of information: our capacity to generate data and our need to analyze them have never been this strong. Still, the problems raised by such accumulation (storage, computation delays, diversity, speed of gathering/generation, etc.) are as severe as the data are big, complex and varied. Information visualization, by its ability to summarize and abridge data, was naturally established as an appropriate approach. However, it does not by itself solve the problems raised by Big Data. Classical visualization techniques are rarely designed to handle such a mass of information. Moreover, the problems raised by data storage and computation time have repercussions on the analysis system; for example, the distance between the data and the analyst keeps increasing: the place where the data is stored and processed and the user interface used for the analysis are rarely close. In this thesis, we focus on these issues, and more particularly on adapting information visualization techniques to Big Data. We first focus on relational data: how the existence of a relation between entities is conveyed, and how to improve this transmission for hierarchical data. Then, we focus on multivariate data and how to handle their complexity for the required computations. Finally, we present the methods we designed to make our techniques compatible with Big Data.
30

Integracija šema modula baze podataka informacionog sistema / Integration of Information System Database Module Schemas

Luković Ivan 18 January 1996 (has links)
Parallel and independent work of a number of designers on different information system modules (i.e. subsystems), identified by the initial real-system functional decomposition, necessarily leads to mutually inconsistent database (db) module schemas. The thesis considers the problems concerning automatic detection of collisions that can appear during the simultaneous design of different db module schemas, and the integration of db module schemas into the unique information system db schema.

All possible types of db module schema collisions have been identified. A necessary and sufficient condition of strong and intensional db module schema compatibility has been formulated and proved. This made it possible to formalize the process of checking strong and intensional compatibility of db module schemas and to construct the appropriate algorithms. The integration process of the unique (strong covering) db schema, on the basis of compatible db module schemas, is formalized as well. The methodology of applying the algorithms for compatibility checking and unique db schema integration is also presented.
