351 |
Recommender System for Audio Recordings
Lee, Jong Seo 01 January 2010
Today the largest e-commerce and e-service websites offer millions of products for sale. A recommender system is software that such websites use to recommend commercial or noncommercial items to users according to the users' tastes. In this project, we develop a recommender system for a private multimedia web service company. In particular, we devise three recommendation engines that predict user preference with different data filtering methods: weighted-average, K-nearest neighbors, and item-based. All three are based on collaborative filtering, which records user preferences on items and anticipates users' future likes and dislikes by comparing those records. To acquire proper input data for the three engines, we retrieve data from the database using three data collection techniques: active filtering, passive filtering, and item-based filtering. For the experiments, we compare the prediction accuracy of the three recommendation engines using the results from each engine, and we additionally evaluate the performance of the weighted-average method using an empirical analysis approach, a methodology devised for verifying predictive accuracy.
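As an illustration of the item-based method named in the abstract, the following is a minimal sketch and not the project's implementation. It assumes cosine similarity between items and a similarity-weighted average for prediction; the `ratings` data and the function names `item_vector`, `cosine`, and `predict` are hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the project.
ratings = {
    "ann": {"a": 5.0, "b": 3.0, "c": 4.0},
    "bob": {"a": 4.0, "b": 2.0, "c": 5.0},
    "cat": {"a": 1.0, "b": 5.0},
}

def item_vector(item):
    """Ratings given to `item`, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(x, y):
    """Cosine similarity restricted to the users who rated both items."""
    common = set(x) & set(y)
    if not common:
        return 0.0
    dot = sum(x[u] * y[u] for u in common)
    nx = sqrt(sum(x[u] ** 2 for u in common))
    ny = sqrt(sum(y[u] ** 2 for u in common))
    return dot / (nx * ny) if nx and ny else 0.0

def predict(user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    target = item_vector(item)
    num = den = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = cosine(target, item_vector(other))
        num += sim * rating
        den += abs(sim)
    # None signals that no similar rated item was available.
    return num / den if den else None

print(predict("cat", "c"))
```

The prediction for an unrated item always falls between the lowest and highest ratings the user has already given, because it is a weighted average of those ratings with non-negative weights.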
|
352 |
Energy-Efficient Databases Using Sweet Spot Frequencies
Lehner, Wolfgang, Götz, Sebastian, Ilsche, Thomas, Cardoso, Jorge, Spillner, Josef, Kissinger, Thomas, Aßmann, Uwe, Nagel, Wolfgang E., Schill, Alexander 12 January 2023
Database management systems (DBMS) are typically tuned for high performance and scalability. Nevertheless, carbon footprint and energy efficiency are also of increasing concern. Unfortunately, existing studies mainly present theoretical contributions but fall short of proposing practical techniques that administrators or query optimizers could use to increase the energy efficiency of the DBMS. Thus, this paper explores the effect of so-called sweet spots, which are energy-efficient CPU frequencies, on the energy required to execute queries. From our findings, we derive the Sweet Spot Technique, which relies on identifying energy-efficient sweet spots and the optimal number of threads that minimizes energy consumption for a query or an entire database workload. The technique is simple and has a practical implementation, leading to energy savings of up to 50% compared to using the nominal frequency and maximum number of threads.
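The selection step described in the abstract can be pictured as a search over measured configurations. The sketch below assumes energy has already been measured for each (frequency, thread count) pair and simply picks the minimum; the paper may identify sweet spots differently. The `measurements` values and the function names are made up for illustration.

```python
# Hypothetical measurements: (cpu_frequency_ghz, threads) -> energy in joules
# for one query. The numbers are invented for illustration only.
measurements = {
    (1.2, 4): 95.0,
    (1.2, 8): 80.0,
    (1.8, 4): 70.0,
    (1.8, 8): 62.0,
    (2.6, 8): 90.0,
    (2.6, 16): 110.0,
}

def sweet_spot(energy_by_config):
    """Return the (frequency, threads) pair with the lowest measured energy."""
    return min(energy_by_config, key=energy_by_config.get)

def savings_vs_baseline(energy_by_config, baseline):
    """Fraction of energy saved by the sweet spot relative to `baseline`."""
    best = energy_by_config[sweet_spot(energy_by_config)]
    return 1.0 - best / energy_by_config[baseline]

# Baseline here is the highest frequency with the most threads.
print(sweet_spot(measurements), savings_vs_baseline(measurements, (2.6, 16)))
```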
|
353 |
Anti-Tamper Databases: Querying Encrypted Databases
Chung, Sun S. 09 January 2006
No description available.
|
354 |
Computing Label-Constraint Reachability in Graph Databases
HONG, HUI 16 April 2012
No description available.
|
355 |
FUNCTION COMPUTING IN VERTICALLY PARTITIONED DISTRIBUTED DATABASES
SHINDE, KAUSTUBH ARUN January 2006
No description available.
|
356 |
A Novel Index Method for Write Optimization on Out-of-Core Column-Store Databases
Matacic, Tyler Joseph January 2016
No description available.
|
357 |
Query processing in distributed database systems
Unnava, Vasundhara January 1992
No description available.
|
358 |
Design and Performance analysis of a relational replicated database systems
Hanson, Jon Gregory 01 January 1987
The hardware organization and software structure of a new database system are presented. This system, the relational replicated database system (RRDS), is based on a set of replicated processors operating on a partitioned database. Performance improvements and capacity growth can be obtained by adding more processors to the configuration. Based on the design goals, a set of hardware and software design questions was developed. The system then evolved according to a five-phase process, based on simulation and analysis, which addressed and resolved the design questions. Strategies and algorithms were developed for data access, data placement, and directory management for the hardware organization. A predictive performance analysis was conducted to determine the extent to which the original design goals were satisfied. The predictive performance results, along with an analytical comparison with three other relational multi-backend systems, provided information about the strengths and weaknesses of our design as well as a basis for future research.
|
359 |
Relational Computing Using HPC Resources: Services and Optimizations
Soundarapandian, Manikandan 15 September 2015
Computational epidemiology involves processing, analysing and managing large volumes of data. Such massive datasets cannot be handled efficiently by traditional standalone database management systems, owing to their limited computational efficiency and bandwidth for scaling to large volumes of data. In this thesis, we address the management and processing of large volumes of data for modeling, simulation and analysis in epidemiological studies. Traditionally, compute-intensive tasks are processed using high performance computing resources and supercomputers, whereas data-intensive tasks are delegated to standalone databases and some custom programs. The DiceX framework is a one-stop solution for distributed database management and processing, and its main mission is to leverage supercomputing resources for data-intensive computing, in particular relational data processing.
While standalone databases are always on and a user can submit queries at any time for required results, supercomputing resources must be acquired and are available for a limited time period. These resources are relinquished either upon completion of execution or at the expiration of the allocated time period. This kind of reservation-based usage style poses critical challenges, including building and launching a distributed data engine onto the supercomputer, saving the engine and resuming from the saved image, devising efficient optimization upgrades to the data engine, and enabling other applications to seamlessly access the engine. These challenges and requirements cause us to align our approach more closely with the cloud computing paradigms of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). In this thesis, we propose cloud-computing-like workflows that use supercomputing resources to manage and process relational data-intensive tasks. We propose and implement several services, including database freeze, migrate, and resume; ad-hoc resource addition; and table redistribution. These services assist in carrying out the workflows defined.
We also propose an optimization upgrade to the query planning module of Postgres-XC, the core relational data processing engine of the DiceX framework. With knowledge of domain semantics, we have devised a more robust data distribution strategy that makes it possible to forcefully push down the most time-consuming SQL operations to the Postgres-XC data nodes, bypassing the query planner's default shippability criteria without compromising correctness. Forcing query push-down reduces the query processing time by almost 40%-60% for certain complex spatio-temporal queries on our epidemiology datasets.
As part of this work, a generic broker service has also been implemented. It acts as an interface to the DiceX framework by exposing RESTful APIs, which applications can use to query and retrieve results irrespective of the programming language or environment. / Master of Science
|
360 |
Multicore Scalability Through Asynchronous Work
Mathew, Ajit 13 January 2020
With the end of Moore's Law, computer architects have turned to multicore architecture to provide high performance. Unfortunately, to achieve higher performance, multicores require programs to be parallelized, which is an untamed problem. Amdahl's law states that the maximum theoretical speedup of a program is dictated by the size of its non-parallelizable section. Hence, to achieve higher performance, programmers need to reduce the size of the sequential code in the program. This thesis explores asynchronous work as a means to reduce the sequential portions of a program. Using asynchronous work, a programmer can take tasks that do not affect data consistency off the critical path and perform them using a background thread. Using this idea, the thesis introduces two systems. The first is a synchronization mechanism, Multi-Version Read-Log-Update (MV-RLU), which extends Read-Log-Update (RLU) through multi-versioning. At the core of the MV-RLU design is a concurrent garbage collection algorithm that reclaims obsolete versions asynchronously, reducing blocking of threads. The second is a concurrent and highly scalable index structure for multicore machines called Hydralist. The key idea behind the design of Hydralist is that an index structure can be divided into two components (a search layer and a data layer), and that updates to the data layer can be done synchronously while updates to the search layer can be propagated asynchronously using background threads. / Master of Science / Up until the mid-2000s, Moore's law predicted that CPU performance doubled every two years. This is because improvements in transistor technology allowed smaller transistors that can switch at higher frequencies, leading to faster CPU clocks. But faster clocks lead to higher heat dissipation, and as chips reached their thermal limits, computer architects could no longer increase clock speeds. Hence they moved to multicore architecture, wherein a single die contains multiple CPUs, to allow higher performance.
Now programmers must parallelize their code to take advantage of all the CPUs in a chip, which is a nontrivial problem. The theoretical speedup achieved by a program on a multicore architecture is dictated by Amdahl's law, which describes the non-parallelizable code in a program as the limiting factor for speedup. For example, a program with 99% parallelizable code can achieve a speedup of at most 100, whereas a program with 50% parallelizable code can achieve a speedup of at most 2. Therefore, to achieve high speedup, programmers need to reduce the size of the serial section in their program. One way to do so is to remove non-critical tasks from the sequential section and perform them asynchronously using a background thread. This thesis explores this technique in two systems. The first is a synchronization mechanism, called Multi-Version Read-Log-Update (MV-RLU), which is used to coordinate access to shared resources. MV-RLU achieves high performance by removing garbage collection from the critical path and performing it asynchronously using a background thread. The second is an index structure, Hydralist, which is based on the insight that an index structure can be decomposed into two components, a search layer and a data layer, and which decouples updates to the two layers to allow higher performance. Updates to the data layer are done synchronously, while updates to the search layer are propagated asynchronously using background threads. Evaluation shows that both systems perform better than state-of-the-art competitors in a variety of workloads.
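Amdahl's law, as referenced in the abstract, can be written as S(n) = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the program and n is the number of cores; as n grows, S(n) approaches 1 / (1 - p). The short sketch below evaluates both expressions. The function names are illustrative and do not come from the thesis.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup on `cores` cores when `parallel_fraction` of the work parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

def amdahl_limit(parallel_fraction):
    """Upper bound on speedup as the core count grows without bound."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)

# A half-parallel program can never run more than twice as fast.
print(amdahl_limit(0.5))        # 2.0
print(amdahl_speedup(0.5, 64))  # stays below the limit of 2.0
```

The bound depends only on the serial fraction, which is why shrinking the sequential section matters more than adding cores once the core count is large.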
|