Global ETD Search

41	Mining of High-Utility Patterns in Big IoT-based Databases Wu, Jimmy M. T., Srivastava, Gautam, Lin, Jerry C., Djenouri, Youcef, Wei, Min, Parizi, Reza M., Khan, Mohammad S. 01 February 2021 (has links) When focusing on the general area of data mining, high-utility itemset mining (HUIM) can be defined as an offset of frequent itemset mining (FIM). It is known to emphasize more factors critically, which gives HUIM its intrinsic edge. Due to the flourishing development of the IoT technique, the uncertainty patterns mining is also attractive. Potential high-utility itemset mining (PHUIM) is introduced to reveal valuable patterns in an uncertainty database. Unfortunately, even though the previous methods are all very effective and powerful to mine, the potential high-utility itemsets quickly. These algorithms are not specifically designed for a database with an enormous number of records. In the previous methods, uncertainty transaction datasets would be load in the memory ultimately. Usually, several pre-defined operators would be applied to modify the original dataset to reduce the seeking time for scanning the data. However, it is impracticable to apply the same way in a big-data dataset. In this work, a dataset is assumed to be too big to be loaded directly into memory and be duplicated or modified; then, a MapReduce framework is proposed that can be used to handle these types of situations. One of our main objectives is to attempt to reduce the frequency of dataset scans while still maximizing the parallelization of all processes. Through in-depth experimental results, the proposed Hadoop algorithm is shown to perform strongly to mine all of the potential high-utility itemsets in a big-data dataset and shows excellent performance in a Hadoop computing cluster. data mining hadoop IoT data analytics uncertain utility patterns Computing
42	ACE: Agile,Contingent and Efficient Similarity Joins Using MapReduce Lakshminarayanan, Mahalakshmi January 2013 (has links) No description available. Computer Science Engineering
43	Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation Gadiraju, Krishna Karthik 13 October 2014 (has links) No description available. Computer Science Hive Hadoop benchmarking big data SQL queries
44	Using Hadoop to Cluster Data in Energy System Hou, Jun 03 June 2015 (has links) No description available. Computer Science Hadoop K-means energy data clustering analysis
45	Gaussian Deconvolution and MapReduce Approach for Chipseq Analysis Sugandharaju, Ravi Kumar Chatnahalli 26 September 2011 (has links) No description available. Computer Science dna sequencing chipseq mapreduce hadoop deconvolution nedler mead
46	Using the Architectural Tradeoff Analysis Method to Evaluate the Software Architecture of a Semantic Search Engine: A Case Study Chatra Raveesh, Sandeep January 2013 (has links) No description available. Computer Science Semantic Hadoop Big data Search engine
47	RDMA-based Plugin Design and Profiler for Apache and Enterprise Hadoop Distributed File system Bhat, Adithya January 2015 (has links) No description available. Computer Science HDFS Hadoop RDMA InfiniBand Apache HDP CDH
48	Performance Characterization and Improvements of SQL-On-Hadoop Systems Kulkarni, Kunal Vikas 28 December 2016 (has links) No description available. Computer Science Hadoop SQL Impala Hive Big Data Joins HDFS
49	High performance shared state schedulers Kouzoupis, Antonios January 2016 (has links) Large organizations and research institutes store a huge volume of data nowadays.In order to gain any valuable insights distributed processing frameworks over acluster of computers are needed. Apache Hadoop is the prominent framework fordistributed storage and data processing. At SICS Swedish ICT we are building Hops, a new distribution of Apache Hadoop relying on a distributed, highly available MySQL Cluster NDB to improve performance. Hops-YARN is the resource management framework of Hops which introduces distributed resource management, load balancing the tracking of resources in a cluster. In Hops-YARN we make heavy usage of the back-end database storing all the resource manager metadata and incoming RPCs to provide high fault tolerance and very short recovery time. This project aims in optimizing the mechanisms used for persisting metadata in NDB both in terms of transactional commit time but also in terms of pre-processing them. Under no condition should the in-memory RM state diverge from the state stored in NDB. With these goals in mind several solutions were examined that improved the performance of the system, making Hops-YARN comparable to Apache YARN with the extra benefits of high-fault tolerance and short recovery time. The solutions proposed in this thesis project enhance the pure commit time of a transaction to the MySQL Cluster and the pre-processing and parallelism of our Transaction Manager. The results indicate that the performance of Hops increased dramatically, utilizing more resources on a cluster with thousands of machines. Increasing the cluster utilization by a few percentages can save organizations a big amount of money. / Nu för tiden lagrar stora organisationer och forskningsinstitutioner enorma mängder data.För att kunna utvinna någon värdefull information från dessa data behöver den bearbetasav ett kluster av datorer. När flera datorer gemensamt ska bearbeta data behöver de utgåfrån ett så kallat "distributed processing framework''. I dagsläget är Apache Hadoop detmest använda ramverket för distribuerad lagring och behandling av data. Detta examensarbeteär har genomförts vid SICS Swedish ICT där vi byggt Hops, en ny distribution avApache Hadoop som drivs av ett distribuerat MySQL Cluster NDB som erbjuder en hög tillgänglighet.Hops-YARN är Hops ramverk för resurshantering med distribuerade ResourceManagers som lastbalanserarderas ResourceTrackerService. I detta examensarbete använder vi Hops-Yarn på ett sätt där ``back-end''databasen flitigt används för att hantera ResourceManagerns metadata och inkommande RPC-anrop. Vårkonfiguration erbjuder en hög feltolerans och återställer sig mycket snabbt vidfelberäkningar. Vidare används NDB-klustrets Event API för att ResourceManager ska kunnakommunicera med den distribuerade ResourceTrackers. Detta projekt syftar till att optimera de mekanismer som används för ihållande metadatai NDB både i termer av transaktions begå tid men också i termer av pre-bearbeta dem medan samtidigt garantera enhetlighet i RM: s tillstånd. ResourceManagerns tillståndi RAM-minnet får under inga omständigheteravvika från det tillstånd som finns lagrat i NDB:n. Med dessa mål i åtanke undersöktes fleralösningar som förbättrar prestandan och därmed gör Hops-Yarn jämförbart med Apache YARN.De lösningar som föreslås i denna uppsats förbättrar “pure commit time” när en transaktiongörs i ett MySQL Cluster samt förbehandlingen och parallelismen i vår Transaction Manager.Resultaten tyder på att Hops prestanda ökade dramatiskt vilket ledde till ett effektivarenyttjande av tillgängliga resurser i ett kluster bestående av ett tusental datorer. Närnyttjandet av tillgänliga resurser i ett kluster förbättras med några få procent kanorganisationer spara mycket pengar. Hops Hadoop Big data Yarn schedulers Computer Systems Datorsystem
50	On the Feasibility of MapReduce to Compute Phase Space Properties of Graphical Dynamical Systems: An Empirical Study Hamid, Tania 09 July 2015 (has links) A graph dynamical system (GDS) is a theoretical construct that can be used to simulate and analyze the dynamics of a wide spectrum of real world processes which can be modeled as networked systems. One of our goals is to compute the phase space of a system, and for this, even 30-vertex graphs present a computational challenge. This is because the number of state transitions needed to compute the phase space is exponential in the number of graph vertices. These problems thus produce memory and execution speed challenges. To address this, we devise various MapReduce programming paradigms that can be used to characterize system state transitions, compute phase spaces, functional equivalence classes, dynamic equivalence classes and cycle equivalence classes of dynamical systems. We also evaluate these paradigms and analyze their suitability for modeling different GDSs. / Master of Science Graph Dynamical Systems GDS MapReduce Map Reduce Hadoop

Search results