  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

On the parallelization of network diffusion models

Rhomberg, Patrick 01 August 2017 (has links)
In this thesis, we investigate methods by which discrete event network diffusion simulators may execute without the restriction of lockstep or near lockstep synchronicity. We develop a discrete event simulator that allows free clock drift between threads, develop a differential equations model to approximate communication cost of such a simulator, and propose an algorithm by which we leverage information gathered in the natural course of simulation to redistribute agents to parallel threads such that the burden of communication is lowered during future replicates.
2

GeoSparkSim: A Scalable Microscopic Road Network Traffic Simulator Based on Apache Spark

January 2019 (has links)
Researchers and practitioners have widely studied road network traffic data in areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort because participating vehicles must install Global Positioning System (GPS) receivers and administrators must continuously monitor these devices. Several urban traffic simulators attempt to generate such data with different features. However, they suffer from two critical issues: (1) Scalability: most of them offer only single-machine solutions, which are not adequate for producing large-scale data. Some simulators can generate traffic in parallel but do not balance the load well among the machines in a cluster. (2) Granularity: many simulators do not consider microscopic traffic situations such as traffic lights, lane changing, and car following. This paper proposes GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system integrates seamlessly with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method that distributes vehicles among machines so that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand cars over an extensive road network (250 thousand road junctions and 300 thousand road segments). / Dissertation/Thesis / Masters Thesis Computer Engineering 2019
3

Performance and Power Optimization of GPU Architectures for General-purpose Computing

Wang, Yue 18 June 2014 (has links)
Power-performance efficiency has become a central and challenging concern in heterogeneous processing platforms, because power constraints must be met without hindering high performance. In this dissertation, a framework for optimizing the power and performance of GPUs in the context of general-purpose computing on GPUs (GPGPU) is proposed. To optimize the leakage power of caches in GPUs, we dynamically switch the L1 and L2 caches into low-power modes during periods of inactivity. The L1 cache can be put into a low-leakage (sleep) state when a processing unit is stalled because no threads are ready to be scheduled, and the L2 cache can be put into the sleep state during idle periods when there are no memory requests. The sleep mode is state-retentive, which removes the need to flush the caches after they are woken up and thereby avoids any performance degradation. Experimental results indicate that this technique can reduce the leakage power by 52% on average. Further, to improve performance, we redistribute the GPGPU workload across the computing units of the GPU during application execution. The fundamental idea is to monitor the workload on each multi-processing unit and redistribute it by having a portion of its unfinished threads executed in a neighboring multi-processing unit. Experimental results show that this technique improves the performance of the GPGPU workload by 15.7%. Finally, to improve both the performance and the dynamic power of GPUs, we propose two dynamic frequency scaling (DFS) techniques implemented on CPU host threads. The first is motivated by the significance of pipeline stalls during GPGPU execution. It applies a feedback control algorithm, Proportional-Integral-Derivative (PID), to regulate the frequency of the parallel processors and memory channels based on the occupancy of the memory buffering queues.
The second technique aims to maximize the average throughput of all parallel processors under dynamic power constraints. We formalize this target as a linear programming problem and solve it at runtime. According to the simulation results, the first technique achieves more than 22% power savings with a 4% improvement in performance, and the second technique saves 11% of power consumption with a 9% performance improvement. The contributions of this dissertation represent a significant advancement in the quest for improving performance and reducing the energy consumption of GPGPU.
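The PID-based frequency regulation mentioned in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the dissertation: the class and function names, the gains, the target occupancy, the frequency limits, and the policy itself (a memory queue fuller than the target is read as the cores outpacing memory, so the core clock is lowered).

```python
class PID:
    """Textbook discrete PID controller (illustrative, not the author's code)."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint      # target queue occupancy in [0, 1]
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt=1.0):
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first sample, since there is no history yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def next_core_frequency(freq_mhz, occupancy, pid, f_min=300.0, f_max=1000.0):
    """Assumed policy: occupancy above the setpoint lowers the core clock,
    occupancy below it raises the clock; the result is clamped to [f_min, f_max]."""
    adjusted = freq_mhz + pid.update(occupancy)
    return max(f_min, min(f_max, adjusted))
```

A host thread would call `next_core_frequency` once per sampling interval with the latest measured occupancy and apply the returned value through whatever frequency-setting interface the platform provides.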
4

Computational Studies in Multi-Criteria Scheduling and Optimization

Martin, Megan Wydick 11 August 2017 (has links)
Multi-criteria scheduling provides the opportunity to create mathematical optimization models that are applicable to a diverse set of problem domains in the business world. This research addresses two different employee scheduling applications using multi-criteria objectives that present decision makers with trade-offs between global optimality and the level of disruption to current operating resources. Additionally, it investigates a scheduling problem from the product testing domain and proposes a heuristic solution technique for the problem that is shown to produce very high-quality solutions in short amounts of time. Chapter 2 addresses a grant administration workload-to-staff assignment problem that occurs in the Office of Research and Sponsored Programs at land-grant universities. We identify the optimal workload assignment plan which differs considerably due to multiple reassignments from the current state. To achieve the optimal workload reassignment plan we demonstrate a technique to identify the n best reassignments from the current state that provides the greatest progress toward the utopian solution. Solving this problem over several values of n and plotting the results allows the decision maker to visualize the reassignments and the progress achieved toward the utopian balanced workload solution. Chapter 3 identifies a weekly schedule that seeks the most cost-effective set of coach-to-program assignments in a gymnastics facility. We identify the optimal assignment plan using an integer linear programming model. The optimal assignment plan differs greatly from the status quo; therefore, we utilize a similar approach from Chapter 2 and use a multiple objective optimization technique to identify the n best staff reassignments. Again, the decision maker can visualize the trade-off between the number of reassignments and the resulting progress toward the utopian staffing cost solution and make an informed decision about the best number of reassignments. 
Chapter 4 focuses on product test scheduling in the presence of in-process and at-completion inspection constraints. Such testing arises in the context of the manufacture of products that must perform reliably in extreme environmental conditions. Each product receives a certification at the successful completion of a predetermined series of tests. Operational efficiency is enhanced by determining the optimal order and start times of tests so as to minimize the makespan while ensuring that technicians are available when needed to complete in-process and at-completion inspections. We first formulate a mixed-integer linear programming (MILP) model and identify the optimal solution to this problem using IBM ILOG CPLEX Interactive Optimizer 12.7. We also present a genetic algorithm (GA) solution that is implemented and solved in Microsoft Excel. Computational results are presented demonstrating the relative merits of the MILP and GA solution approaches across a number of scenarios. / Ph. D.
5

Performance Analysis of Decentralized Supply Chains: Considerations of Channel Power and Subcontracting

Bichescu, Bogdan Cristian 28 September 2006 (has links)
No description available.
6

[en] WORKLOAD BALANCING STRATEGIES FOR PARALLEL BLAST EVALUATION ON REPLICATED DATABASES AND PRIMARY FRAGMENTS / [pt] ESTRATÉGIAS DE BALANCEAMENTO DE CARGA PARA AVALIAÇÃO PARALELA DO BLAST COM BASES DE DADOS REPLICADAS E FRAGMENTOS PRIMÁRIOS

DANIEL XAVIER DE SOUSA 07 April 2008 (has links)
A fundamental task in computational biology is the search for relevant information within ever larger volumes of data. Among others, it is important to run tools such as BLAST (Basic Local Alignment Search Tool) efficiently; BLAST enables the comparison of biological sequences in order to discover homologies and infer other related information. However, the execution cost of BLAST depends heavily on the database size, which has increased considerably in recent years.
Evaluating BLAST in parallel and distributed environments such as PC clusters has been widely investigated as a way to obtain better performance. This work reports a replicated allocation of the sequence database in which each copy is also physically fragmented, with some fragments designated as primary. In this way, we show that it is possible to execute BLAST with the main advantages of both conventional replicated and conventional fragmented strategies, such as flexibility and I/O parallelism. We propose two dynamic workload balancing strategies associated with this data allocation. We adopt a non-intrusive approach, i.e., the BLAST code remains unchanged. The methods are implemented, and performance tests show that we achieve not only a balanced workload but also efficient processing overall.
7

Linear Static Analysis Of Large Structural Models On Pc Clusters

Ozmen, Semih 01 July 2009 (has links) (PDF)
This research focuses on implementing and improving a parallel solution framework for the linear static analysis of large structural models on PC clusters. The framework consists of two separate programs. The first is responsible for preparing data for the parallel solution, which involves partitioning, workload balancing, and equation numbering. The second is a fully parallel finite element program that uses a substructure-based solution approach with direct solvers. The first step of data preparation is partitioning the structure into substructures. After creating the initial substructures, the estimated imbalance of the substructures is adjusted by iteratively transferring nodes from the slower substructures to the faster ones. Once the final substructures are created, the solution phase is initiated. Each processor assembles its substructure's stiffness matrix and condenses it to the interfaces. The interface equations are then solved in parallel with a block-cyclic dense matrix solver. After computing the interface unknowns, each processor calculates the internal displacements and element stresses or forces. Comparative tests were done to demonstrate the performance of the solution framework.
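The iterative transfer of nodes from slower to faster substructures can be sketched as a simple greedy loop. This is a simplified illustration under stated assumptions, not the thesis's algorithm: the estimated time of a substructure is taken to be its node count divided by a hypothetical processor speed, and mesh connectivity (which nodes are actually adjacent to which substructure) is ignored. The function name and parameters are invented for the example.

```python
def balance(node_counts, speeds, max_iters=10_000):
    """Move one node at a time from the slowest substructure to the fastest.

    Assumed cost model: time of substructure i = node_counts[i] / speeds[i].
    Returns the adjusted node counts; the total number of nodes is preserved.
    """
    counts = list(node_counts)
    for _ in range(max_iters):
        times = [c / s for c, s in zip(counts, speeds)]
        slow = max(range(len(counts)), key=times.__getitem__)
        fast = min(range(len(counts)), key=times.__getitem__)
        # Stop when receiving one more node would make the fastest
        # substructure at least as slow as the current slowest one.
        new_fast_time = (counts[fast] + 1) / speeds[fast]
        if slow == fast or new_fast_time >= times[slow]:
            break
        counts[slow] -= 1
        counts[fast] += 1
    return counts
```

With equal speeds the loop evens out the node counts; with unequal speeds the faster processor ends up with proportionally more nodes.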
8

[en] AN INTEREST MANAGEMENT APPROACH TO DYNAMIC PARTITIONING DISTRIBUTED SIMULATIONS / [pt] UMA ABORDAGEM BASEADA EM GERENCIAMENTO DE INTERESSES PARA O PARTICIONAMENTO DINÂMICO DE SIMULAÇÕES DISTRIBUÍDAS

FELIPE COIMBRA BACELAR 01 February 2017 (has links)
To achieve high scalability in distributed agent-based simulations, it is necessary to avoid communication bottlenecks. Messages are exchanged between machines whenever an agent hosted on one computer needs to interact with elements hosted on another. This work presents an approach that dynamically partitions a distributed simulation, keeping each agent on the same network node as the elements it interacts with most and thereby reducing the communication cost between the computers in the network. To this end, we use the concept of interest management, which aims to provide an agent with only the minimum set of information it needs to interact coherently with the environment. To illustrate the proposed solution, we developed a case study consisting of a distributed simulation of an oil spill at sea.
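The idea of keeping each agent on the node that hosts the elements it interacts with most can be sketched as follows. This is a minimal illustration, not the author's implementation: the `Partitioner` class, its method names, and the migration rule (move only when another node has strictly more recorded interactions than the current one) are assumptions, and node capacity is ignored.

```python
from collections import Counter, defaultdict


class Partitioner:
    """Tracks interactions and migrates agents toward their busiest node."""

    def __init__(self, placement):
        self.placement = dict(placement)            # agent id -> node id
        self.interactions = defaultdict(Counter)    # agent id -> node id -> count

    def record(self, agent, other):
        """Count one interaction of `agent` with an element on `other`'s node."""
        self.interactions[agent][self.placement[other]] += 1

    def rebalance(self):
        """Move each agent to the node it interacted with most; return the moves."""
        moves = {}
        for agent, counts in self.interactions.items():
            here = self.placement[agent]
            best, best_count = counts.most_common(1)[0]
            # Migrate only when another node is strictly more popular.
            if best != here and best_count > counts[here]:
                moves[agent] = best
        self.placement.update(moves)
        self.interactions.clear()   # start a fresh observation window
        return moves
```

A simulation loop would call `record` for every interaction and `rebalance` at the end of each observation window, then ship the migrating agents to their new nodes.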
9

[en] AN EXPERIMENTAL EVALUATION OF CONSISTENT HASHING WITH BOUNDED LOADS IN ONLINE VIDEO DISTRIBUTION / [pt] UMA AVALIAÇÃO EXPERIMENTAL DE HASHING CONSISTENTE COM CARGAS LIMITADAS NA DISTRIBUIÇÃO DE VÍDEOS ONLINE

BERNARDO DE CAMPOS VIDAL CAMILO 14 December 2018 (has links)
Video consumption accounts for a large part of Internet traffic today and is expected to grow further in the coming years. In this work, we investigate ways to improve caching in video content delivery networks (CDNs) to reduce their response time and increase users' quality of experience. From the analysis of different techniques, we conclude that consistent hashing with bounded loads has attractive characteristics for this purpose and fits the video delivery scenario well. To verify its performance, we created an experimentation platform and, using data from a real video CDN, compared it with plain consistent hashing and with the least-connections balancing method, all implemented in an equivalent manner to permit a fair comparison. Finally, we discuss the results of this evaluation, highlighting the benefits and limitations of the technique in the considered context.
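For readers unfamiliar with the technique, consistent hashing with bounded loads can be sketched in a few lines. This is a simplified illustration and not the experimentation platform from the thesis: the class name, the balancing factor `c = 1.25`, the use of SHA-256, and the single ring point per server are assumptions, and requests are never released once assigned.

```python
import bisect
import hashlib
import math


def _hash(value):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class BoundedLoadRing:
    def __init__(self, servers, c=1.25):
        self.c = c                                   # allowed load factor, c >= 1
        self.ring = sorted((_hash(s), s) for s in servers)
        self.points = [h for h, _ in self.ring]
        self.load = {s: 0 for s in servers}

    def assign(self, key):
        """Place `key` on the first server clockwise that is below capacity."""
        total = sum(self.load.values()) + 1          # include this request
        capacity = math.ceil(self.c * total / len(self.load))
        start = bisect.bisect_left(self.points, _hash(key))
        for step in range(len(self.ring)):
            server = self.ring[(start + step) % len(self.ring)][1]
            if self.load[server] < capacity:
                self.load[server] += 1
                return server
        raise RuntimeError("no server below capacity (cannot happen for c >= 1)")
```

Plain consistent hashing corresponds to always taking the first server clockwise; the capacity check is what caps every server at roughly `c` times the average load, at the cost of occasionally sending a key past its preferred server.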
10

[en] A DYNAMIC LOAD BALANCING MECHANISM FOR DATA STREAM PROCESSING ON DDS SYSTEMS / [pt] UM MECANISMO DE BALANCEAMENTO DE CARGA DINÂMICO PARA PROCESSAMENTO DE FLUXO DE DADOS EM SISTEMAS DDS

RAFAEL OLIVEIRA VASCONCELOS 04 November 2014 (has links)
This thesis presents the Data Processing Slice Load Balancing solution, which enables dynamic load balancing of data stream processing on systems based on DDS (Data Distribution Service). A large number of applications require continuous and timely processing of high volumes of data originating from many distributed sources, such as network monitoring, traffic engineering systems, intelligent routing of cars in metropolitan areas, sensor networks, telecommunication systems, financial applications, and meteorology. The key concept of the proposed solution is the Data Processing Slice (DPS), which is the basic unit of data processing load of server nodes in a DDS domain.
The solution consists of a load balancer that monitors the current load of a set of homogeneous data processing nodes and, when a load imbalance is detected, coordinates the actions needed to redistribute some data processing slices among the processing nodes safely. Experiments with large data streams demonstrate the low overhead, good performance, and reliability of the proposed solution.
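The balancer's decision step, detecting an imbalance and choosing which slices to move, can be sketched as below. This is an illustrative simplification, not the thesis's protocol: the function name, the imbalance test (gap between the most and least loaded node compared with a fraction of the mean load), and the greedy slice choice are assumptions, and the coordinated hand-off that makes a move safe in a running DDS system is not modeled.

```python
def rebalance(assignment, slice_load, threshold=0.2):
    """Move slices from the most to the least loaded node until the gap is small.

    assignment: dict mapping node id -> set of slice ids (modified in place).
    slice_load: dict mapping slice id -> measured load of that slice.
    Returns the list of (slice, source, destination) moves performed.
    """
    def node_load(node):
        return sum(slice_load[s] for s in assignment[node])

    moves = []
    while True:
        nodes = sorted(assignment)
        src = max(nodes, key=node_load)
        dst = min(nodes, key=node_load)
        gap = node_load(src) - node_load(dst)
        mean = sum(node_load(n) for n in nodes) / len(nodes)
        if gap <= threshold * mean:
            break                     # balanced enough, nothing to do
        # Only a slice lighter than the gap strictly narrows it.
        candidates = [s for s in sorted(assignment[src]) if 0 < slice_load[s] < gap]
        if not candidates:
            break
        # Prefer the slice whose load is closest to half the gap.
        chosen = min(candidates, key=lambda s: abs(gap - 2 * slice_load[s]))
        assignment[src].remove(chosen)
        assignment[dst].add(chosen)
        moves.append((chosen, src, dst))
    return moves
```

Because slices, rather than individual data items, are the unit of migration, the balancer only has to reason about a small, fixed number of movable pieces.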
