71 |
Pokročilé nástroje pro měření výkonu / Advanced Tools for Performance Measurement. Smrček, Jaromír. January 2008 (has links)
This thesis presents the I/O layer of the Linux kernel and surveys various tools for tuning and optimizing its performance. Several tools are presented, and their usage and outputs are studied. The thesis then focuses on ways of combining such tools into a more applicable methodology for system analysis and monitoring. The practical part consists of applying SystemTap scripts to the blktrace subsystem and creating a fragmentation monitoring tool with graphical output.
|
72 |
Performance analysis and enhancement of QoS framework for fixed WiMAX networks: design, analysis and evaluation of 802.16 Point-to-Multipoint (PMP) Quality of Service Framework based on uplink scheduler and call admission control analysis. Laias, Elmabruk M. January 2009 (has links)
Given current advances in science and technology and the introduction of new approaches across telecommunication technologies and industries, operators' plans have grown increasingly ambitious, with a positive shift in outlook towards the target of "anywhere and anytime access". Recent developments in WiMAX (Worldwide Interoperability for Microwave Access) networks, a sign of the increasing demand for new telecommunication services and capabilities, have led to revolutions in global telecommunication that should be properly understood in both commercial and technical terms in order to exploit the new opportunities. Most experts consider WiMAX technology a preliminary step towards Fourth Generation (4G) networks. It has not only succeeded in combining several of the latest telecommunication techniques into a single practical standard, but has also paved the way for quantitative and qualitative developments in high-speed broadband access. The IEEE 802.16 standard introduces several advantages, one of which is support for Quality of Service (QoS) at the Media Access Control (MAC) level. For this purpose, the standard defines several scheduling classes at the MAC layer to treat service flows differently, depending on their QoS requirements. In this thesis, we propose a new QoS framework for Point-to-Multipoint (PMP) 802.16 systems operating in Time Division Duplexing (TDD) mode over a WirelessMAN-OFDM physical layer. The proposed framework consists of a Call Admission Control (CAC) module, a scheduling scheme for uplink traffic, and a simple frame allocation scheme.
The proposed CAC module interacts with the uplink scheduler and makes its admission decisions based on the scheduler's queue status; the proposed uplink scheduling scheme, in turn, aims to support real-time flows and adapts the frame-by-frame allocations to the current needs of the connections, within the grant boundaries fixed by the CAC module. Extensive OPNET simulations demonstrate the effectiveness of the proposed architecture.
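The abstract does not include pseudocode; a minimal sketch of a queue-aware admission decision of the kind described might look as follows. The function name, the fractional occupancy threshold, and the service-class keys are illustrative assumptions, not taken from the thesis.

```python
# Illustrative sketch only: the threshold value and the per-class queue
# model are assumptions, not the thesis's actual CAC algorithm.

def admit_connection(queue_lengths, capacities, service_class, threshold=0.8):
    """Admit a new uplink flow only if the scheduler queue for its
    service class is below a fractional occupancy threshold."""
    occupancy = queue_lengths[service_class] / capacities[service_class]
    return occupancy < threshold
```

A CAC coupled to scheduler queue state in this way rejects new flows exactly when the scheduler is already struggling to meet its grant deadlines, rather than relying on static bandwidth accounting alone.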
|
73 |
Computing models for networks of tiny objects / Modèles de calcul pour les réseaux d'objets à capacité restreinte. Ouled abdallah, Nesrine. 22 May 2017 (has links)
Dans cette thèse, nous nous intéressons aux modèles de calcul dans les réseaux d'objets à capacité restreinte, tels que les réseaux de capteurs sans fil. Nous nous focalisons sur les protocoles de population proposés par Angluin et al. Dans ce modèle, les objets sont représentés par des agents à états finis, passivement mobiles, communiquant entre paires et formant un réseau asynchrone et anonyme. Nous présentons deux études comparatives qui nous permettent par la suite de proposer une approche établissant le lien des protocoles de population avec deux autres modèles : le modèle des tâches avec les systèmes de réécritures de graphes, et le modèle asynchrone et anonyme d'échange de messages. Nous passons ensuite au problème d'ordonnancement dans les protocoles de population. Nous proposons un nouvel ordonnanceur probabiliste, 1-central, basé sur les rendez-vous randomisés et appelé HS Scheduler. Contrairement aux autres ordonnanceurs, il permet à plus d'une paire de communiquer à la fois. Nous prouvons qu'il est équitable avec probabilité 1. Nous analysons par la suite les temps de stabilisation de certains protocoles s'exécutant sous le Random Scheduler ou le HS Scheduler et sur différentes topologies du graphe d'interaction. Nous prouvons que le HS Scheduler est équivalent en temps au Random Scheduler quand le graphe d'interaction est complet mais qu'il permet une stabilisation plus rapide quand le graphe est aléatoire. Par la suite, nous proposons un autre ordonnanceur qui prend en considération les états des agents et permet d'introduire la terminaison à certains protocoles : le Protocol Aware HS Scheduler. Nous prouvons qu'il est équitable avec probabilité 1. Nous faisons l'analyse des temps de stabilisation de certains protocoles s'exécutant sous cet ordonnanceur en considérant différentes topologies du graphe d'interaction.
Finalement, nous implémentons et simulons sur ViSiDiA l'ensemble des scénarios étudiés et validons nos résultats théoriques. / In this work, we consider computing models for networks of tiny objects such as wireless sensor networks. We focus on the population protocols, a pairwise computational model introduced by Angluin et al. where the tiny objects are represented by anonymous, passively mobile, finite state agents forming asynchronous networks. We establish two comparative studies between the population protocol model (and its extensions) and the two following ones: tasks with graph relabeling systems, and anonymous asynchronous message passing. These studies aim to establish possible mappings between the population protocols and these two models. We then focus on the scheduling of the pairwise interactions in population protocols. We propose the HS Scheduler, a new probabilistic 1-central scheduler based on randomized handshakes. Compared to the existing schedulers, this scheduler allows more than one pair of agents to communicate simultaneously. We prove that this scheduler is fair with probability 1. We thereafter present analyses of the complexity of the stabilization time of some protocols running under the scheduling of the Random Scheduler and the HS Scheduler, and over different topologies of the interaction graph. We prove that these two schedulers are time equivalent with respect to these protocols when the interaction graph is complete; however, computations under the HS Scheduler stabilize faster when the interaction graph is random. We then introduce the Protocol Aware HS Scheduler, a slightly modified version of the HS Scheduler that takes into account the states of the agents and allows termination in some protocols. We also prove that this scheduler is fair with probability 1. We present analyses of the time complexity of some protocols running under the scheduling of the Protocol Aware HS Scheduler and over different structures of the interaction graph. We implement the different scenarios in ViSiDiA, and validate our theoretical results through simulations.
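One round of randomized handshakes of the kind the HS Scheduler is built on can be sketched as follows. This is a simplified illustration (the thesis's actual scheduler and its fairness proof are more involved): a random matching is greedily extracted from the interaction graph, so several disjoint pairs interact in the same round.

```python
import random

def hs_schedule(edges, rng=random):
    """One round of randomized handshakes: visit the edges of the
    interaction graph in random order and keep an edge only if both
    endpoints are still free, yielding a matching of agent pairs
    that can all interact simultaneously."""
    busy = set()
    matching = []
    for u, v in rng.sample(edges, len(edges)):
        if u not in busy and v not in busy:
            matching.append((u, v))
            busy.update((u, v))
    return matching
```

Because every returned pair is vertex-disjoint, all selected interactions can run in parallel, unlike a classical 1-pair-per-step scheduler.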
|
74 |
SLA-Aware Adaptive Data Broadcasting in Wireless Environments. Popescu, Adrian Daniel. 16 February 2010 (has links)
In mobile and wireless networks, broadcasting popular data items enables efficient utilization of the limited wireless bandwidth. However, efficient data scheduling schemes are needed to fully exploit the benefits of data broadcasting, and several broadcast scheduling policies have been proposed towards this goal. These existing schemes have mostly focused on minimizing either response time or drop rate when requests are associated with hard deadlines.
The inherent inaccuracy of hard deadlines in a dynamic mobile environment motivated us to use Service Level Agreements (SLAs), where a user specifies the utility of data as a function of its arrival time. Moreover, SLAs provide the mobile user with a quality-of-service specification already familiar from wired environments. Hence, in this dissertation we propose SAAB, an SLA-aware adaptive data broadcast scheduling policy for maximizing system utility under SLA-based performance measures. To achieve this goal, SAAB considers both the characteristics of the disseminated data objects and the SLAs associated with them. Additionally, SAAB automatically adjusts to the system workload, which enables it to consistently outperform existing broadcast scheduling policies.
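A greedy, utility-driven broadcast pick of the general kind SAAB refines can be sketched as follows. This is an illustrative sketch, not SAAB's actual policy: the data structures and the utility function are assumptions for the example.

```python
def pick_broadcast_item(pending, utility, now):
    """Broadcast the item whose pending requests would gain the most
    total SLA utility if served now.
    pending: {item: [request_arrival_time, ...]}
    utility: maps a request's waiting time to its SLA utility."""
    def total_gain(item):
        return sum(utility(now - arrival) for arrival in pending[item])
    return max(pending, key=total_gain)
```

With a hard deadline, `utility` would be a step function; an SLA instead lets it decay gradually, so a slightly late broadcast still earns partial utility rather than counting as a drop.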
|
76 |
An examination of Linux and Windows CE embedded operating systems. Trivedi, Anish Chandrakant. 04 January 2011 (has links)
The software that operates mobile and embedded devices, the embedded operating system, has evolved from the traditional desktop environment, where processing horsepower and energy are abundant, to the challenging, resource-starved embedded environment. The embedded environment imposes some difficult constraints compared to the typical desktop environment: slower hardware, smaller memory, and limited battery life. Different embedded OSs tackle these constraints in different ways. We survey two of the more popular embedded OSs, Linux and Windows CE. To reveal their strengths and weaknesses, we examine and compare each OS's process management and scheduler, interrupt handling, memory management, synchronization mechanisms and interprocess communication, and power management.
|
77 |
Enhancing GPGPU Performance through Warp Scheduling, Divergence Taming and Runtime Parallelizing Transformations. Anantpur, Jayvant P. January 2017 (links) (PDF)
There has been a tremendous growth in the use of Graphics Processing Units (GPU) for the acceleration of general purpose applications. The growth is primarily due to the huge computing power offered by the GPUs and the emergence of programming languages such as CUDA and OpenCL. A typical GPU consists of several 100s to a few 1000s of Single Instruction Multiple Data (SIMD) cores, organized as 10s of Streaming Multiprocessors (SMs), each having several SIMD cores which operate in a lock-step manner, offering a few TeraFLOPS of performance in a single socket. SMs execute instructions from a group of consecutive threads, called warps. At each cycle, an SM schedules a warp from a group of active warps and can context switch among the active warps to hide various stalls. However, various factors, such as global memory latency, divergence among warps of a thread block (TB), branch divergence among threads of a warp (control divergence), and the number of active warps, can significantly impact the ability of a warp scheduler to hide stalls. This reduces the speedup of applications running on the GPU. Further, applications containing loops with potential cross-iteration dependences do not utilize the available resources (SIMD cores) effectively and hence suffer in terms of performance. In this thesis, we propose several mechanisms which address the above issues and enhance the performance of GPU applications through efficient warp scheduling, taming branch and warp divergence, and runtime parallelization.
First, we propose RLWS, a Reinforcement Learning (RL) based Warp Scheduler which uses unsupervised learning to schedule warps based on the current state of the core and the long-term benefits of scheduling actions. As the design space involving the state variables used by the RL and the RL parameters (such as learning and exploration rates, reward and penalty values, etc.) is large, we use a Genetic Algorithm to identify the useful subset of state variables and RL parameter values. We evaluated the proposed RL based scheduler using the GPGPU-SIM simulator on a large number of applications from the Rodinia, Parboil, CUDA-SDK and GPGPU-SIM benchmark suites. Our RL based implementation achieved an average speedup of 1.06x over the Loose Round Robin (LRR) strategy and 1.07x over the Two-Level (TL) strategy. A salient feature of RLWS is that it is robust, i.e., it performs nearly as well as the best performing warp scheduler, consistently across a wide range of applications. Using the insights obtained from RLWS, we designed PRO, a heuristic warp scheduler which, in addition to hiding the long latencies of certain operations, reduces the waiting time of warps at synchronization points. Evaluation of the proposed algorithm using the GPGPU-SIM simulator on a diverse set of applications showed an average speedup of 1.07x over the LRR warp scheduler and 1.08x over the TL warp scheduler.
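A toy Q-learning scheduling agent in the spirit of RLWS can be sketched as follows. The state abstraction, the action set, and all parameter values here are illustrative assumptions; RLWS itself operates on hardware state inside the SM and tunes its state variables and parameters with a Genetic Algorithm.

```python
import random

class RLWarpScheduler:
    """Toy Q-learning warp picker: the caller abstracts the core state
    (e.g. ready-warp count, pending memory ops) into a hashable key;
    actions are coarse policies such as 'oldest_first'.
    Parameter values are illustrative, not RLWS's tuned ones."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = {}                      # (state, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state, rng=random):
        # Epsilon-greedy: mostly exploit the best-known action.
        if rng.random() < self.epsilon:
            return rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning backup toward reward + discounted best next value.
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)
```

In a simulator loop, the reward could be instructions issued per cycle, so the agent gradually prefers the action that best hides stalls in each observed state.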
In the second part of the thesis, we address problems due to warp and branch divergences. First, many GPU kernels exhibit warp divergence due to various reasons such as different amounts of work, cache misses, and thread divergence. Also, we observed that some kernels contain code which is redundant across TBs, i.e., all TBs execute the code identically and hence compute the same results. To improve the performance of such kernels, we propose a solution based on the concept of virtual TBs and loop-independent code motion. We propose the code transformations necessary to enable one virtual TB to execute the kernel code for multiple real TBs. We evaluated this technique using the GPGPU-SIM simulator on a diverse set of applications and observed an average improvement of 1.08x over the LRR and 1.04x over the Greedy Then Old (GTO) warp scheduling algorithms. Second, branch divergence causes the execution of diverging branches to be serialized, executing only one control flow path at a time. The existing stack-based hardware mechanism for reconverging threads causes duplicate execution of code for unstructured control flow graphs (CFG). We propose a simple and elegant transformation to convert an unstructured CFG to a structured CFG. The transformation eliminates duplicate execution of user code while incurring only a linear increase in the number of basic blocks and also in the number of instructions. We implemented the proposed transformation at the PTX level using the Ocelot compiler infrastructure and demonstrate that the proposed technique is effective in handling the performance problem due to divergence in unstructured CFGs.
Our third proposal is to enable efficient execution of loops with indirect memory accesses that can potentially cause cross iteration dependences. Such dependences are hard to detect using existing compilation techniques. We present an algorithm to compute at run-time, the cross iteration dependences in such loops, using both the CPU and the GPU. It effectively uses the compute capabilities of the GPU to collect the memory accesses performed by the iterations. Using the dependence information, the loop iterations are levelized such that each level contains independent iterations which can be executed in parallel. Experimental evaluation on real hardware (NVIDIA GPUs) reveals that the proposed technique can achieve an average speedup of 6.4x on loops with a reasonable number of cross iteration dependences.
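The levelization step described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions: the thesis collects the memory accesses and computes dependences on the GPU at run time, while this sketch assumes the per-iteration dependences are already available and refer only to earlier iterations.

```python
def levelize(num_iters, deps):
    """Group loop iterations into levels so that each level contains
    only mutually independent iterations, which can then run in parallel.
    deps maps an iteration to the earlier iterations it depends on."""
    level = {}
    for i in range(num_iters):
        # An iteration sits one level above its deepest dependence.
        level[i] = 1 + max((level[p] for p in deps.get(i, ())), default=-1)
    groups = {}
    for i, lv in level.items():
        groups.setdefault(lv, []).append(i)
    return [groups[lv] for lv in sorted(groups)]
```

Levels are then launched one after another: all iterations inside a level execute concurrently on the GPU, while a barrier between levels enforces the cross-iteration dependences.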
|
78 |
Escalonamento de aplicações paralelas: de clusters para grids / Scheduling parallel applications: from clusters to grids. Jacinto, Daniele Santini. 24 August 2007 (links)
Different algorithms provide efficient scheduling of parallel applications on distributed and
heterogeneous computational platforms, such as computational grids. Most scheduling algorithms for such environments require an application model represented by a directed acyclic graph (DAG), selecting tasks for execution according to their processing and communication characteristics. Obtaining DAGs for real applications, however, is not simple: the knowledge required about the application tasks and the communication among them, including existing transmission cycles, makes it hard to elaborate appropriate graphs. In particular, MPI programs, which represent a meaningful portion of existing parallel applications, usually present a cyclic communication model between the master and the processing nodes. This behavior prevents most scheduling algorithms from being employed, as they recursively traverse the graph to prioritize the tasks. In this sense, this work presents a mechanism for the automatic creation of DAGs for real MPI applications originally developed for homogeneous clusters. To do so, applications go through a monitored execution in a cluster, and the collected data are used for the elaboration of an appropriate DAG. Data dependencies are identified, and existing cycles among the tasks are eliminated. The HEFT scheduling algorithm is used to evaluate the application model, and the schedule obtained is then automatically converted into an RSL (Resource Specification Language) file for execution in a grid with Globus. Results from running real applications and simulations show that using the grid can be advantageous. / Algoritmos diferentes possibilitam o escalonamento eficiente de aplicações paralelas em
plataformas computacionais heterogêneas e distribuídas, tais como grids computacionais. Vários
algoritmos de escalonamento para esses ambientes necessitam de um modelo de aplicação
representado por um grafo acíclico direcionado (GAD), selecionando tarefas para execução de
acordo com suas características de comunicação e de processamento.
A obtenção de um GAD para uma aplicação real, contudo, não é uma questão simples.
O conhecimento necessário sobre as tarefas da aplicação e as comunicações entre elas, considerando
ciclos de transmissão, dificulta a elaboração de um grafo apropriado.
Particularmente, programas MPI, os quais representam uma parcela significativa das aplicações
paralelas, apresentam um modelo de comunicação cíclico entre o nó master e os nós
de processamento. Esse comportamento impede a utilização de muitos algoritmos de escalonamento
devido ao fato de eles percorrerem o grafo recursivamente para priorizar as tarefas. Nesse sentido, esse trabalho apresenta um mecanismo para a criação automática de GADs
para aplicações MPI reais originalmente desenvolvidas para clusters homogêneos. Para essa
implementação, aplicações são monitoradas durante a execução em um cluster e os dados coletados
são usados para a elaboração de um GAD apropriado. Dependências de dados são
identificadas e ciclos existentes entre as tarefas são eliminados. O algoritmo de escalonamento HEFT é usado para avaliar o modelo de aplicação e o
escalonamento obtido é então automaticamente convertido em um arquivo RSL (Resource Specification
Language) para execução em um grid com Globus.
Resultados de execuções de aplicações reais e simulações demonstram que o uso de grid
pode ser vantajoso.
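The HEFT algorithm used in this record prioritizes tasks of the DAG by their "upward rank" before mapping them to processors; a minimal sketch of that ranking step, with illustrative task and communication costs (not the thesis's implementation), is:

```python
def upward_rank(dag, cost, comm):
    """HEFT upward rank: rank(t) = cost(t) + max over successors s of
    (comm(t, s) + rank(s)). Tasks are then scheduled in decreasing rank.
    dag maps a task to its successors; cost and comm hold average
    computation and communication costs (illustrative inputs)."""
    memo = {}
    def rank(task):
        if task not in memo:
            memo[task] = cost[task] + max(
                (comm[(task, s)] + rank(s) for s in dag.get(task, ())),
                default=0)
        return memo[task]
    return {t: rank(t) for t in cost}
```

Ranking by the longest remaining path to an exit task is what lets HEFT favor tasks on the critical path, which is why eliminating the cycles of the MPI trace first is essential: the recursion only terminates on an acyclic graph.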
|
79 |
Avaliação comparativa de modulações candidatas às redes 5G baseadas em LTE e escalonamento de recursos considerando fila e qualidade de canal / Comparative evaluation of candidate modulations for LTE-based 5G networks and resource scheduling considering queue and channel quality. Souza, Dalton Foltran de. 04 July 2018 (links)
With the development of the next generation of mobile communication systems (5G), several technologies are being studied with the aim of meeting new requirements in new application scenarios. Among them is the use of new modulations with higher spectral efficiency to replace OFDM, such as F-OFDM and UFMC, as well as the scheduling algorithms in charge of sharing resources among users. In this work, we evaluate the application of F-OFDM and UFMC, candidate 5G modulations, on the LTE downlink compared with OFDM, and evaluate how the Round Robin, QoS Guaranteed and PSO schedulers deal with the additional resources provided by the tested modulations. For that, we compare performance considering parameters such as fairness, latency, throughput and spectral efficiency. The results show that the LTE downlink improved performance in all evaluated parameters with UFMC modulation. In fact, there was a performance improvement for all schedulers evaluated: for example, the PSO-based scheduler improved latency and throughput, the QoS Guaranteed scheduler reached the lowest loss, and the highest fairness was reached by QoS Guaranteed and Round Robin. We also propose a scheduling algorithm that takes into account the queue size in the user buffer and the channel quality to maximize throughput and fairness in the LTE downlink. The metrics evaluated were transmission efficiency, throughput, fairness, delay and losses. The proposed algorithm achieved better results for all evaluated metrics. / Com o desenvolvimento da próxima geração dos sistemas de comunicação móvel sem fio (5G) diversas tecnologias estão sendo estudadas com o objetivo de se atender aos novos requisitos de desempenho em diferentes cenários de aplicação.
Dentre elas, está a utilização de modulações com maior eficiência espectral em substituição à OFDM, como F-OFDM e UFMC, como também os algoritmos de escalonamento que são responsáveis pelo compartilhamento dos recursos aos usuários. Neste trabalho, avaliamos a aplicação no downlink LTE das modulações F-OFDM e UFMC, candidatas ao 5G, comparadas a OFDM e avaliamos os escalonadores Round Robin, QoS Garantido e PSO ao lidar com recursos adicionais disponíveis proporcionados pelas modulações estudadas. Para tal, realizamos a análise de parâmetros de desempenho de tráfego, tais como vazão, índice de justiça, perda de dados e retardo médio. Os resultados mostraram que o enlace de descida do LTE apresentou melhor desempenho para todos os parâmetros analisados com a modulação UFMC. De fato, foram obtidas melhorias de desempenho para todos escalonadores avaliados. Como exemplo, o escalonador baseado em PSO apresentou melhorias no retardo e vazão, enquanto o escalonador QoS Garantido obteve menor taxa de perda de dados, sendo que índices de justiça mais elevados foram obtidos para os escalonadores QoS Garantido e Round Robin. Ainda, propusemos um escalonador que considera o tamanho da fila no buffer e a qualidade de canal visando maximizar a vazão e o índice de justiça no downlink da rede LTE. As métricas avaliadas foram eficiência de transmissão, vazão, índice de justiça, retardo e perda de dados. O algoritmo proposto alcançou melhores resultados em todas as métricas avaliadas em relação aos outros algoritmos considerados.
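A queue- and channel-aware resource-block pick of the kind proposed in this record can be sketched as follows. The combined backlog-times-quality metric shown is an illustrative assumption for the example, not the thesis's exact scheduling formula.

```python
def schedule_rb(users):
    """Assign the next resource block to the user maximizing a combined
    metric of queue backlog and instantaneous channel quality.
    users: {user_id: (queue_len, channel_quality)} -- illustrative model."""
    return max(users, key=lambda uid: users[uid][0] * users[uid][1])
```

Weighting channel quality by backlog avoids a pitfall of pure channel-aware scheduling: a user with an excellent channel but an empty buffer would otherwise win resource blocks it cannot fill, hurting both throughput and fairness.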
|
80 |
Problematika přechodu od jednojádrové k vícejádrové implementaci operačního systému / Issue of Migrating from Single-Core to Multi-Core Implementation of Operating System. Matyáš, Jan. January 2014 (links)
This thesis discusses the changes necessary to run MicroC/OS-II on a multicore processor, specifically the Zynq-7000 All Programmable SoC, which uses two ARM Cortex-A9 cores. Problems that arise during this transition are also discussed.
|