  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Data Modelling of Electricity Data in Sweden : Pre-study of the Envolve Project

Do, Yen Thi Kim January 2011 (has links)
Electricity has always had a great impact on our daily life. It plays an important role in every aspect of society, economy, and technology of every nation. Sweden, like other Nordic countries, has always strived to improve its energy landscape. Currently, nuclear power and hydroelectricity are the main methods of energy generation in this country. Together with exploring new ways of generating energy without dependency on nuclear power, Sweden also expresses an interest in encouraging households and companies to use energy in an efficient way in order to reduce energy consumption and its associated costs. The scope of this thesis is to review and evaluate various state-of-the-art data analysis tools and algorithms to generate a meaningful consumer behaviour model based on the electricity usage data collected from households in several areas of Sweden. Understanding the demand characteristics for electricity would give electric suppliers more power in shaping their marketing strategies as well as setting appropriate electricity pricing.
2

A study of the administrative provisions governing personal data protection in the Hong Kong Government /

Ng, Chi-kwan, Miranda. January 1987 (has links)
Thesis (M. Soc. Sc.)--University of Hong Kong, 1987.
3

A strategy for reducing I/O and improving query processing time in an Oracle data warehouse environment

Titus, Chris. January 2009 (has links) (PDF)
Thesis (M.S.C.I.T.)--Regis University, Denver, Colo., 2009. / Title from PDF title page (viewed on May 28, 2009). Includes bibliographical references.
4

A feature-based approach to visualizing and mining simulation data

Jiang, Ming, January 2005 (has links)
Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Document formatted into pages; contains xvi, p. 116; also includes graphics. Includes bibliographical references (p. 108-116). Available online via OhioLINK's ETD Center
5

A study of the administrative provisions governing personal data protection in the Hong Kong Government

Ng, Chi-kwan, Miranda. January 1987 (has links)
Thesis (M.Soc.Sc.)--University of Hong Kong, 1987. / Also available in print.
6

Prioritized data synchronization with applications

Jin, Jiaxi January 2013 (has links)
We are interested in the problem of synchronizing data on two distinct devices with different priorities using minimum communication. A variety of distributed systems require communication-efficient and prioritized synchronization, for example, where the bandwidth is limited or certain information is more time sensitive than others. Our particular approach, P-CPI, involving the interactive synchronization of prioritized data, is efficient both in communication and computation. This protocol sports some desirable features, including (i) communication and computational complexity primarily tied to the number of differences between the hosts rather than the amount of the data overall and (ii) a memoryless fast restart after interruption. We provide a novel analysis of this protocol, with a proven high-probability performance bound and fast restart in logarithmic time. We also provide an empirical model for predicting the probability of complete synchronization as a function of time and symmetric differences. We then consider two applications of our core algorithm. The first is a string reconciliation protocol, for which we propose a novel algorithm with online time complexity that is linear in the size of the string. Our experimental results show that our string reconciliation protocol can potentially outperform existing synchronization tools such as rsync in some cases. We also look into the benefit brought by our algorithm to delay-tolerant networks (DTNs). We propose an optimized DTN routing protocol with P-CPI implemented as middleware. As a proof of concept, we demonstrate improved delivery rate, reduced metadata and reduced average delay.
7

Data analytics on Yelp data set

Tata, Maitreyi January 1900 (has links)
Master of Science / Department of Computing and Information Sciences / William H. Hsu / In this report, I describe a query-driven system which helps in deciding which restaurant to invest in or which area is good to open a new restaurant in a specific place. Analysis is performed on already existing businesses in every state. This is based on certain factors such as the average star rating, the total number of reviews associated with a specific restaurant, the price range of the restaurant etc. The results will give an idea of successful restaurants in a city, which helps you decide where to invest and what are the things to be kept in mind while starting a new business. The main scope of the project is to concentrate on Analytics and Data Visualization.
8

Data Sharing and Exchange: Semantics and Query Answering

Awada, Rana January 2015 (has links)
Exchanging and integrating data that belong to worlds of different vocabularies are two prominent problems in the database literature. While data coordination deals with managing and integrating data between autonomous yet related sources with possibly distinct vocabularies, data exchange is defined as the problem of extracting data from a source and materializing it in an independent target to conform to the target schema. These two problems, however, have never been studied in a unified setting which allows both the exchange of the data as well as the coordination of different vocabularies between different sources. Our thesis shows that such a unified setting exhibits data integration capabilities that are beyond the ones provided by data exchange and data coordination separately. In this thesis, we propose a new setting – called DSE, for Data Sharing and Exchange – which allows the exchange of data between independent source and target applications that possess independent schemas, as well as independent yet related domains of constants. To facilitate this type of exchange, we extend the source-to-target dependencies used in the ordinary data exchange setting which allow the association between the source and the target at the schema level, with the mapping table construct introduced in the classical data coordination setting which defines the association between the source and the target at the instance level. A mapping table construct defines for each source element, the set of associated (or corresponding) elements in the domain of the target. The semantics of this association relationship between source and target elements change with different requirements of different applications. Ordinary DE settings can represent DSE settings; however, we show that there exist DSE settings with particular semantics of related values in mapping tables where DE is not the best exchange solution to adopt. The thesis introduces two DSE settings with such a property. 
We call the first DSE with unique identity semantics. The semantics of a mapping table in this DSE setting specifies that each source element should be uniquely mapped to at least one target element that is associated with it in the mapping table. In this setting, classical DE is one method to perform a data exchange; however, it is not the best method to adopt, since it cannot represent exchange applications that, like DC applications, need to compute both portions and complete sets of certain answers for conjunctive queries. In addition, we show that adopting known DE universal solutions as semantics for such DSE settings is not the best in terms of efficiency when computing certain answers for conjunctive queries. The second DSE setting that the thesis introduces with the same property is called DSE with equality semantics. This setting captures an interesting meaning of related data in a mapping table. Such semantics impose that each source element in a mapping table is related to a target element only if both elements are equivalent (i.e., they have the same meaning). We show in our thesis that this DSE setting differs from ordinary DE settings in the sense that additional information could be entailed under such interpretation of related data. Also, this added information needs to be augmented to both the source instance and the mapping table in order to generate target instances that correctly reflect both in a DSE scenario. In other words, we can say that in such a DSE setting, a source instance and a mapping table can be incomplete with respect to the semantics of the mapping table.
We formally define the two aforementioned semantics of a DSE setting and we distinguish between two types of solutions for this setting, namely universal DSE solutions, which contain the complete set of exchanged information, and universal DSE KB-solutions, which store a portion of the exchanged information with implicit information in the form of a set of rules over the target. DSE KB-solutions allow applications to compute on demand both a portion and the complete set of certain answers for conjunctive queries. In addition, we define the semantics of conjunctive query answering, we distinguish between sound and complete certain answers for conjunctive queries, and we define the algorithms to compute these efficiently. Finally, we provide experimental results which compare the run times to generate DSE solutions versus DSE KB-solutions, and compare the performance of computing sound and complete certain answers for conjunctive queries using both types of solutions.
9

Extending dependencies for improving data quality

Ma, Shuai January 2011 (has links)
This doctoral thesis presents the results of my work on extending dependencies for improving data quality, both in a centralized environment with a single database and in a data exchange and integration environment with multiple databases. The first part of the thesis proposes five classes of data dependencies, referred to as CINDs, eCFDs, CFDcs, CFDps and CINDps, to capture data inconsistencies commonly found in practice in a centralized environment. For each class of these dependencies, we investigate two central problems: the satisfiability problem and the implication problem. The satisfiability problem is to determine, given a set Σ of dependencies defined on a database schema R, whether or not there exists a nonempty database D of R that satisfies Σ. The implication problem is to determine whether or not a set Σ of dependencies defined on a database schema R entails another dependency φ on R. That is, each database D of R that satisfies Σ must satisfy φ as well. These are important for the validation and optimization of data-cleaning processes. We establish complexity results of the satisfiability problem and the implication problem for all these five classes of dependencies, both in the absence of finite-domain attributes and in the general setting with finite-domain attributes. Moreover, SQL-based techniques are developed to detect data inconsistencies for each class of the proposed dependencies, which can be easily implemented on top of current database management systems. The second part of the thesis studies three important topics for data cleaning in a data exchange and integration environment with multiple databases. One is the dependency propagation problem, which is to determine, given a view defined on data sources and a set of dependencies on the sources, whether another dependency is guaranteed to hold on the view.
We investigate dependency propagation for views defined in various fragments of relational algebra, conditional functional dependencies (CFDs) [FGJK08] as view dependencies, and for source dependencies given as either CFDs or traditional functional dependencies (FDs). And we establish lower and upper bounds, all matching, ranging from PTIME to undecidable. These not only provide the first results for CFD propagation, but also extend the classical work of FD propagation by giving new complexity bounds in the presence of a setting with finite domains. We finally provide the first algorithm for computing a minimal cover of all CFDs propagated via SPC views. The algorithm has the same complexity as one of the most efficient algorithms for computing a cover of FDs propagated via a projection view, despite the increased expressive power of CFDs and SPC views. Another one is matching records from unreliable data sources. A class of matching dependencies (MDs) is introduced for specifying the semantics of unreliable data. As opposed to static constraints for schema design such as FDs, MDs are developed for record matching, and are defined in terms of similarity metrics and a dynamic semantics. We identify a special case of MDs, referred to as relative candidate keys (RCKs), to determine what attributes to compare and how to compare them when matching records across possibly different relations. We also propose a mechanism for inferring MDs with a sound and complete system, a departure from traditional implication analysis, such that when we cannot match records by comparing attributes that contain errors, we may still find matches by using other, more reliable attributes. We finally provide a quadratic time algorithm for inferring MDs, and an effective algorithm for deducing quality RCKs from a given set of MDs. 
The last one is finding certain fixes for data monitoring [CGGM03, SMO07], which is to find and correct errors in a tuple when it is created, either entered manually or generated by some process. That is, we want to ensure that a tuple t is clean before it is used, to prevent errors introduced by adding t. As noted by [SMO07], it is far less costly to correct a tuple at the point of entry than fixing it afterward. Data repairing based on integrity constraints may not find certain fixes that are absolutely correct, and worse, may introduce new errors when repairing the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that are assured correct by the users. Given a certain region and master data, editing rules tell us what attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. We also provide an algorithm to identify minimal certain regions, such that a certain fix is warranted by editing rules and master data as long as one of the regions is correct.
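As an illustration of what detecting a dependency violation can look like, here is a minimal Python sketch for a plain conditional functional dependency (CFD), the class the abstract cites as [FGJK08]. It is not the thesis's SQL-based technique and does not cover the extended classes (CINDs, eCFDs, CFDcs, CFDps, CINDps). The function names, the `_` wildcard convention, and the sample address rows are all assumptions made for this example.

```python
from collections import defaultdict

WILDCARD = "_"  # hypothetical convention: a pattern entry that matches any value


def matches(row, attrs, pattern):
    """True if the row agrees with every constant of the pattern on attrs."""
    return all(pattern[a] == WILDCARD or row[a] == pattern[a] for a in attrs)


def cfd_violations(rows, lhs, rhs, pattern):
    """Return sorted indices of rows that violate the CFD (lhs -> rhs, pattern)."""
    bad = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if not matches(row, lhs, pattern):
            continue  # the CFD only constrains rows selected by the pattern
        # Single-tuple violation: the pattern fixes a constant for rhs
        # and the row carries a different value.
        if pattern[rhs] != WILDCARD and row[rhs] != pattern[rhs]:
            bad.add(i)
        groups[tuple(row[a] for a in lhs)].append(i)
    # Pair violation: selected rows agree on lhs but disagree on rhs.
    for members in groups.values():
        if len({rows[i][rhs] for i in members}) > 1:
            bad.update(members)
    return sorted(bad)


# Made-up sample data: within the UK, the zip code should determine the street.
rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mtn Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
uk_pattern = {"country": "UK", "zip": WILDCARD, "street": WILDCARD}

print(cfd_violations(rows, ["country", "zip"], "street", uk_pattern))  # [0, 1]
```

The two US rows also disagree on street, but the pattern restricts the dependency to UK rows, so only the first two rows are reported. That conditioning on a pattern is what separates a CFD from a traditional functional dependency.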
10

A More Decentralized Vision for Linked Data

Polleres, Axel, Kamdar, Maulik R., Fernandez Garcia, Javier David, Tudorache, Tania, Musen, Mark A. 25 June 2018 (has links) (PDF)
In this deliberately provocative position paper, we claim that ten years into Linked Data there are still (too?) many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. We take a deeper look at the biomedical domain - currently, one of the most promising "adopters" of Linked Data - if we believe the ever-present "LOD cloud" diagram. Herein, we try to highlight and exemplify key technical and non-technical challenges to the success of LOD, and we outline potential solution strategies. We hope that this paper will serve as a discussion basis for a fresh start towards more actionable, truly decentralized Linked Data, and as a call to the community to join forces. / Series: Working Papers on Information Systems, Information Business and Operations
