21

High-performance near-time processing of bulk data

Swientek, Martin January 2015 (has links)
Enterprise systems like customer-billing systems or financial transaction systems are required to process large volumes of data in a fixed period of time. Those systems are increasingly required to also provide near-time processing of data to support new service offerings. Common systems for data processing are optimized either for high maximum throughput or for low latency. This thesis proposes the concept of an adaptive middleware, a new approach to designing systems for bulk data processing. The adaptive middleware is able to shift its processing style fluidly between batch processing and single-event processing. By using message aggregation, message routing and a closed feedback loop to adjust the data granularity at runtime, the system is able to minimize the end-to-end latency for different load scenarios. The relationship between end-to-end latency and throughput of batch and message-based systems is formally analyzed, and a performance evaluation of both processing types has been conducted. Additionally, the impact of message aggregation on throughput and latency is investigated. The proposed middleware concept has been implemented in a research prototype and evaluated. The results of the evaluation show that the concept is viable and is able to optimize the end-to-end latency of a system. The design, implementation and operation of an adaptive system for bulk data processing differ from common approaches to implementing enterprise systems. A conceptual framework has been developed to guide the process of building adaptive software for bulk data processing. It defines the required roles and their skills, the necessary tasks and their relationships, the artifacts that are created and required by different tasks, the tools that are needed to carry out the tasks, and the processes that describe the order of the tasks.
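To make the closed feedback loop concrete, here is a minimal Python sketch. It is not the thesis prototype: `source`, `process_batch`, and the latency target are assumed placeholders, and the doubling/halving policy is just one simple way to move between single-event and batch processing.

```python
import time
from collections import deque

def adaptive_aggregator(source, process_batch, target_latency_s=0.5,
                        min_batch=1, max_batch=1024):
    """Closed feedback loop that tunes message aggregation at runtime.

    `source` yields messages; `process_batch` is a placeholder handler.
    The aggregation size floats between single-event processing (1) and
    bulk batch processing (max_batch) based on observed latency.
    """
    batch_size = min_batch
    buffer = deque()
    for message in source:
        buffer.append((message, time.monotonic()))
        if len(buffer) < batch_size:
            continue
        oldest_arrival = buffer[0][1]
        process_batch([m for m, _ in buffer])
        buffer.clear()
        latency = time.monotonic() - oldest_arrival
        # Feedback: grow batches while latency is comfortably under the
        # target (throughput mode); shrink toward single-event processing
        # when the target is violated (latency mode).
        if latency > target_latency_s:
            batch_size = max(min_batch, batch_size // 2)
        else:
            batch_size = min(max_batch, batch_size * 2)
```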
22

Prediction Of Queue Waiting Times For Metascheduling On Parallel Batch Systems

Rajath Kumar, * 08 1900 (has links) (PDF)
Production parallel systems are space-shared and employ batch queues in which jobs submitted to the system wait before execution. Thus, jobs submitted to parallel batch systems incur queue waiting times in addition to execution times. Predicting these queue waiting times is important for providing overall estimates to users and can also help meta-schedulers make scheduling decisions. In the first part of our research, we have developed an integrated framework, PQStar, for the identification and prediction of jobs with short queue waiting times. Analyses of supercomputer job traces reveal that about 56 to 99% of jobs incur queue waiting times of less than an hour. Hence, identifying these quick starters, or jobs with short queue waiting times, is essential for overall improvement of queue waiting time predictions. An important aspect of our prediction strategy for quick starters is that it considers the processor occupancy state and the queue state at the time of job submission, in addition to job characteristics such as the requested number of processors and the estimated runtime. Our experiments with different production supercomputer job traces show that our prediction strategies can correctly identify about 20% more quick starters on average, provide tighter bounds for these jobs, and achieve about 24% higher overall prediction accuracy on average than the next best existing method. We have also developed a framework for predicting ranges of queue waiting times for other classes of jobs by employing multi-class classification on similar jobs in history. Our hierarchical prediction strategy first predicts the point wait time of a job using a dynamic k-Nearest Neighbor (kNN) method. It then performs multi-class classification using Support Vector Machines (SVMs) among all job classes. The probabilities given by the SVM for the predicted class (obtained from the kNN), along with its neighboring classes, are used to provide a set of wait-time ranges with probabilities. Our experiments with different production supercomputer job traces show that our prediction strategies yield about 8% better accuracy on average in predicting the non-quick starters, compared to the next best existing method. Finally, we have used these predictions and probabilities in a meta-scheduling strategy that distributes jobs to different queues/sites in a multi-queue/grid environment to minimize job wait times. For a given target job, we first identify the queues/sites where the job can be a quick starter, giving a set of candidate queues/sites for scheduling the job. We then compute the expected value of the predicted wait time in each candidate queue/site and schedule the job to the one with the minimum expected value. We have performed experiments with different production supercomputer job traces and with synthetic traces for various system sizes, partitioning schemes and workloads. These experiments show that our scheduling strategy gives much better performance than existing scheduling policies, reducing the overall average queue waiting time of jobs by about 47% on average.
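A minimal sketch of the hierarchical kNN-then-SVM idea, using scikit-learn. The feature set, wait-time class boundaries, and the synthetic history are illustrative assumptions; the thesis' actual features and classes are not given in this abstract.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVC

# Synthetic history; features per job (all illustrative): requested
# processors, estimated runtime, processor occupancy, queue state.
rng = np.random.default_rng(0)
X_hist = rng.random((500, 4))
y_wait = rng.exponential(3600.0, 500)                  # historical waits (s)
bins = [0.0, 3600.0, 4 * 3600.0, 24 * 3600.0, np.inf]  # assumed classes
y_class = np.digitize(y_wait, bins) - 1

knn = KNeighborsRegressor(n_neighbors=5).fit(X_hist, y_wait)
svm = SVC(probability=True).fit(X_hist, y_class)

def predict_wait_ranges(job):
    """Point wait time via kNN, then SVM probabilities for the predicted
    class and its neighbours, yielding wait-time ranges with probabilities."""
    point = float(knn.predict([job])[0])
    point_class = int(np.digitize([point], bins)[0]) - 1
    probs = svm.predict_proba([job])[0]
    classes = list(svm.classes_)
    ranges = []
    for c in (point_class - 1, point_class, point_class + 1):
        if c in classes:
            ranges.append(((bins[c], bins[c + 1]), probs[classes.index(c)]))
    return point, ranges

print(predict_wait_ranges(rng.random(4)))
```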
23

Packet Order Matters!: Improving Application Performance by Deliberately Delaying Packets

Ghasemirahni, Hamid January 2021 (has links)
Data centers increasingly deploy commodity servers with high-speed network interfaces to enable low-latency communication. However, achieving low latency at high data rates depends crucially on how the incoming traffic interacts with the system's caches. When packets that need to be processed in the same way arrive consecutively, i.e., exhibit high temporal and spatial locality, CPU caches deliver great benefits. This licentiate thesis systematically studies the impact of temporal and spatial traffic locality on the performance of commodity servers equipped with high-speed network interfaces. The results show that (i) the performance of a variety of widely deployed applications degrades substantially with even the slightest lack of traffic locality, and (ii) a traffic trace from our organization's link to/from its upstream provider reveals poor traffic locality, as networking protocols, drivers, and the underlying switching/routing fabric spread packets out in time (reducing locality). To address these issues, we built Reframer, a software solution that deliberately delays packets and reorders them to increase traffic locality. Despite introducing µs-scale delays for some packets, Reframer increases the throughput of a network service chain by up to 84% and reduces the flow completion time of a web server by 11% while improving its throughput by 20%.
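The core idea, briefly holding packets and releasing them grouped by flow so that packets needing identical processing hit the CPU caches back-to-back, can be sketched as follows. This is a toy illustration, not Reframer's actual high-speed data-plane implementation; the flow key, hold time, and API are assumptions.

```python
import time
from collections import OrderedDict

class ReorderBuffer:
    """Toy reorder buffer: hold packets briefly, then release them
    grouped by flow so same-flow packets are processed back-to-back."""

    def __init__(self, hold_us=50):
        self.hold_s = hold_us / 1e6     # µs-scale deliberate delay
        self.flows = OrderedDict()      # flow key -> buffered packets
        self.deadline = None

    def push(self, pkt, flow_key):
        if self.deadline is None:
            self.deadline = time.monotonic() + self.hold_s
        self.flows.setdefault(flow_key, []).append(pkt)

    def flush_if_due(self):
        """Once the hold expires, emit packets ordered flow-by-flow."""
        if self.deadline is None or time.monotonic() < self.deadline:
            return []
        out = [p for pkts in self.flows.values() for p in pkts]
        self.flows.clear()
        self.deadline = None
        return out

buf = ReorderBuffer(hold_us=50)
buf.push("pkt-A1", flow_key=("10.0.0.1", 80))
buf.push("pkt-B1", flow_key=("10.0.0.2", 443))
buf.push("pkt-A2", flow_key=("10.0.0.1", 80))
time.sleep(0.001)
print(buf.flush_if_due())   # ['pkt-A1', 'pkt-A2', 'pkt-B1']
```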
24

Investigatory Analysis of Big Data’s Role and Impact on Local Organizations, Institutions, and Businesses’ Decision-Making and Day-to-Day Operations

Markle, Scott Timothy 30 March 2023 (has links)
No description available.
25

Dynamic optimisation and control of batch reactors. Development of a general model for batch reactors, dynamic optimisation of batch reactors under a variety of objectives and constraints and on-line tracking of optimal policies using different types of advanced control strategies.

Aziz, Norashid January 2001 (has links)
The batch reactor is an essential unit operation in almost all batch-processing industries. Different types of reaction schemes (such as series, parallel and complex) and different orders of model complexity (short-cut, detailed, etc.) result in different sets of model equations, and computer coding of all possible sets of model equations is cumbersome and time consuming. In this work, therefore, a general computer program (GBRM - General Batch Reactor Model) is developed to generate all possible sets of equations automatically and as required. GBRM is tested for different types of reaction schemes and for different orders of model complexity, and its flexibility is demonstrated. The GBRM computer program is lodged with Dr. I. M. Mujtaba. One of the challenges in batch reactors is to ensure the desired performance of individual batch reactor operations. Depending on the requirement and the objective of the process, optimisation in batch reactors leads to different types of optimisation problems, such as the maximum conversion, minimum time and maximum profit problems. The reactor temperature, jacket temperature and jacket flow rate are the main control variables governing the process, and these are optimised to ensure maximum benefit. In this work, an extensive study of mainly conventional batch reactor optimisation is carried out using GBRM coupled with an efficient DAE (Differential and Algebraic Equations) solver, the CVP (Control Vector Parameterisation) technique and an SQP (Successive Quadratic Programming) based optimisation technique. Safety, environment and product quality issues are embedded in the optimisation problem formulations as constraints. A new approach for solving optimisation problems with safety constraints is introduced. All types of optimisation problems mentioned above are solved off-line, resulting in optimal operating policies. The off-line optimal operating policies obtained above are then implemented as set points to be tracked on-line, and various types of advanced controllers are designed for this purpose. Both constant and dynamic set-point tracking are considered in designing the controllers. Here, neural networks are used in designing Direct Inverse and Inverse-Model-Based Control (IMBC) strategies. In addition, Generic Model Control (GMC) coupled with an on-line neural network heat release estimator (GMC-NN) is also designed to track the optimal set points. For comparison purposes, a conventional Dual Mode (DM) strategy with PI and PID controllers is also designed. Robustness tests for all types of controllers are carried out to find the best controller. The results demonstrate the robustness of the GMC-NN controller and suggest neural controllers as potentially robust controllers for the future. Finally, an integrated framework (BATCH REACT) for modelling, simulation, optimisation and control of batch reactors is proposed. / Universiti Sains Malaysia
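A minimal illustration of control vector parameterisation (CVP) with an SQP solver, using SciPy's SLSQP on a toy first-order batch reaction A -> B. The kinetics, bounds, and objective are assumptions for illustration; GBRM's general model equations are not reproduced here.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def conversion(temps, t_batch=1.0):
    """CVP: one constant temperature per interval; integrate A -> B
    with assumed first-order Arrhenius kinetics and return conversion."""
    ca = 1.0
    dt = t_batch / len(temps)
    for T in temps:
        k = 1e3 * np.exp(-3000.0 / T)                 # assumed rate law
        ca = solve_ivp(lambda t, c: -k * c, (0.0, dt), [ca]).y[0, -1]
    return 1.0 - ca

# Maximum-conversion problem solved with SQP (SLSQP); the temperature
# bounds stand in for the safety/quality constraints discussed above.
res = minimize(lambda T: -conversion(T), x0=np.full(5, 320.0),
               method="SLSQP", bounds=[(300.0, 400.0)] * 5)
print("optimal profile:", res.x, "conversion:", -res.fun)
```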
26

Basic Mechanisms and Implementation of Functions for the Interactive Use of a Computer Network

Zhiri, Amine 08 December 1973 (has links) (PDF)
No description available.
27

Deterministic Scheduling Of Parallel Discrete And Batch Processors

Venkataramana, M 07 1900 (has links)
Scheduling concerns the allocation of limited resources to tasks over time. In manufacturing systems, scheduling means assigning jobs to the available processors over a period of time. Our research focuses on scheduling in systems of parallel processors, which is challenging from both the theoretical and practical perspectives. Systems of parallel processors are a common occurrence in different types of modern manufacturing systems such as job shops, batch shops and mass production. A variety of important and challenging problems with realistic settings in a system of parallel processors are considered. We consider two types of processors: discrete and batch processors. A processor that produces one job at a time is called a discrete processor. A batch processor is one that can produce several jobs simultaneously by processing jobs in batch form, which is commonly seen in semiconductor manufacturing, heat treatment operations and chemical processing industries. Our aim is to develop efficient solution methodologies (heuristics/metaheuristics) for three different problems in this thesis. The first two problems consider the objective of minimizing total weighted tardiness for discrete and batch processors, where customer delivery time performance is critical. The third problem deals with the objective of minimizing total weighted completion time for batch processors, to reduce work-in-process inventory. Specifically, the first problem deals with the scheduling of parallel identical discrete processors to minimize total weighted tardiness. We develop a metaheuristic based on the Ant Colony Optimization (ACO) approach to solve the problem and compare it with the best available heuristics in the literature, such as the apparent tardiness cost and modified due date rules. Extensive experimentation is conducted to evaluate the performance of the ACO approach on different problem sizes with varied tardiness factors. Our experimentation shows that the proposed ant colony optimization algorithm yields promising results compared to the best of the available heuristics. The second problem concerns the scheduling of jobs on parallel identical batch processors to minimize total weighted tardiness. It is assumed that the jobs are incompatible with respect to job families, meaning that jobs from different families cannot be processed together. We decompose the problem into two stages, batch formation and batch scheduling, as in the literature. Ant colony optimization based heuristics are developed in which ACO is used to solve the batch scheduling problem. Our computational experimentation shows that the proposed five ACO-based heuristics perform better than the best available traditional dispatching rule, the ATC-BATC rule. The third scheduling problem is to minimize the total weighted completion time in a system of parallel identical batch processors. In real-world manufacturing systems, jobs to be scheduled arrive in lots with different job volumes (i.e., numbers of jobs) and priorities. The realistic settings of lots and high batch capacity are considered in this problem. This scheduling problem is formulated as a mixed integer non-linear program. We develop a solution framework based on a decomposition approach for this problem. Two heuristics are proposed based on this decomposition approach, and their performance is evaluated in the cases of two and three batch processors by comparing with the solution of the LINGO solver.
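For reference, here is a minimal sketch of the apparent tardiness cost (ATC) dispatching baseline on parallel identical machines (the rule the ACO metaheuristic is compared against). The look-ahead parameter k and the example data are assumptions; the ACO algorithm itself is omitted for brevity.

```python
import math

def atc_schedule(jobs, m, k=2.0):
    """Apparent Tardiness Cost dispatching on m identical machines.

    jobs: list of (weight, processing_time, due_date). Whenever a machine
    frees up, dispatch the pending job with the highest ATC index.
    """
    free_at = [0.0] * m
    pending = list(range(len(jobs)))
    p_bar = sum(p for _, p, _ in jobs) / len(jobs)   # mean processing time
    schedule = []
    while pending:
        mach = min(range(m), key=lambda i: free_at[i])
        t = free_at[mach]

        def atc_index(j):
            w, p, d = jobs[j]
            slack = max(d - p - t, 0.0)
            return (w / p) * math.exp(-slack / (k * p_bar))

        job = max(pending, key=atc_index)
        pending.remove(job)
        schedule.append((job, mach, t))
        free_at[mach] = t + jobs[job][1]
    return schedule

# (weight, processing time, due date) triples on two machines.
print(atc_schedule([(1, 3, 4), (2, 2, 6), (1, 4, 5)], m=2))
```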
28

Big Data Processing from Large IoT Networks

Benkő, Krisztián January 2019 (has links)
The goal of this diploma thesis is to design and develop a system for collecting, processing and storing data from large IoT networks. The developed system introduces a complex solution able to process data from various IoT networks using the Apache Hadoop ecosystem. The data are processed in real time and stored in a NoSQL database, but they are also stored in the file system for potential later processing. The system is optimized and tested using data from an IQRF network. The data stored in the NoSQL database are visualized, and the system periodically generates derived predictions. Users connect to this system via an information system, which can automatically generate notifications when monitored values are out of range.
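The out-of-range notification step can be sketched as below. The thresholds, sensor names, and `send_notification` are assumptions; the actual system streams data through the Apache Hadoop ecosystem and a NoSQL store, which this toy check does not model.

```python
# Thresholds, sensor names, and send_notification are assumptions; the
# real system streams data through the Apache Hadoop ecosystem.
THRESHOLDS = {"temperature": (-10.0, 40.0), "humidity": (0.0, 95.0)}

def send_notification(sensor, value, low, high):
    print(f"ALERT: {sensor}={value} outside [{low}, {high}]")

def check_reading(reading):
    """reading: dict such as {'sensor': 'temperature', 'value': 42.5}."""
    sensor, value = reading["sensor"], reading["value"]
    low, high = THRESHOLDS.get(sensor, (float("-inf"), float("inf")))
    if not low <= value <= high:
        send_notification(sensor, value, low, high)

check_reading({"sensor": "temperature", "value": 42.5})
```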
