161

Analyzing the memory behavior of parallel scientific applications / Analyse du comportement mémoire d'applications parallèles de calcul scientifique

Beniamine, David 05 December 2016 (has links)
For several decades, processor vendors have built increasingly parallel computers in order to reduce energy consumption. At the same time, the frequency gap between processors and memory has grown significantly. To mitigate this gap, modern processors embed a complex hierarchy of caches. Writing efficient code for such machines is a difficult task, so performance analysis has become an important step in the development of performance-critical applications. Most existing performance analysis tools focus on the processor's point of view: they treat main memory as a monolithic entity and therefore cannot describe how it is accessed. Yet memory is a common bottleneck in high performance computing, and memory access patterns can significantly affect performance. A few tools do analyze memory performance, but they rely on coarse-grained sampling; as a result, they capture only a small part of the execution, miss the global memory behavior, and cannot collect access patterns. In this thesis we propose two different tools to analyze the memory behavior of an application. The first is designed specifically for Non-Uniform Memory Access (NUMA) machines and provides several visualizations of the global sharing pattern of each data structure among threads. The second collects fine-grained memory traces with temporal information; these traces can be visualized either with a generic trace management framework or through programmatic exploration in R. Both tools are evaluated against state-of-the-art memory tracing tools in terms of performance, precision, and completeness.
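As an illustration of the kind of programmatic trace exploration mentioned above, the sketch below aggregates a fine-grained memory trace into per-thread access counts for each data structure. The CSV trace layout, file name, and address ranges are hypothetical, not the format produced by the thesis tools, and Python stands in for R here.

```python
# Hypothetical post-processing sketch for a fine-grained memory trace.
# The CSV layout (timestamp, thread id, virtual address, R/W) and the data
# structure ranges are assumptions, not the thesis tools' actual format.
import csv
from collections import defaultdict

# Address ranges of the data structures of interest (hypothetical).
structures = {
    "matrix_a": (0x7F0000000000, 0x7F0000100000),
    "matrix_b": (0x7F0000100000, 0x7F0000200000),
}

def locate(addr):
    """Map a virtual address to the data structure that contains it, if any."""
    for name, (lo, hi) in structures.items():
        if lo <= addr < hi:
            return name
    return None

# accesses[structure][thread] -> number of accesses
accesses = defaultdict(lambda: defaultdict(int))

with open("memory_trace.csv") as f:
    for timestamp, thread, addr, kind in csv.reader(f):
        name = locate(int(addr, 16))
        if name is not None:
            accesses[name][thread] += 1

for name, per_thread in accesses.items():
    print(name, dict(per_thread))
```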
162

An Analytical Approach to Efficient Circuit Variability Analysis in Scaled CMOS Design

January 2011 (has links)
abstract: Process variations have become increasingly important for scaled technologies starting at 45nm. The increased variations are primarily due to random dopant fluctuations, line-edge roughness, and oxide thickness fluctuation. These variations greatly impact all aspects of circuit performance and pose a grand challenge to future robust IC design. To improve robustness, an efficient methodology is required that considers the effect of variations in the design flow. Analyzing the timing variability of complex circuits with HSPICE simulations is very time consuming. This thesis proposes an analytical model to predict variability in CMOS circuits that is quick and accurate. There are several analytical models to estimate nominal delay performance, but very little work has been done to accurately model delay variability. The proposed model is comprehensive and estimates nominal delay and variability as a function of transistor width, load capacitance, and transition time. First, models are developed for library gates and their accuracy is verified with HSPICE simulations for the 45nm and 32nm technology nodes. The difference between predicted and simulated σ/μ for the library gates is less than 1%. Next, the accuracy of the model for nominal delay is verified for larger circuits, including the ISCAS'85 benchmark circuits. For the 45nm technology, the model's predictions are within 4% of the HSPICE simulated results and take a small fraction of the time. Delay variability is analyzed for various paths, and it is observed that non-critical paths can become critical because of Vth variation. Variability on the shortest paths shows that the rate of hold violations increases dramatically with increasing Vth variation. / Dissertation/Thesis / M.S. Electrical Engineering 2011
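To make the idea of analytical delay variability concrete, the toy sketch below compares a first-order analytical estimate of σ/μ under Vth variation with a Monte Carlo reference on the alpha-power-law delay model. The model form and all numbers are assumptions for illustration, not the thesis' calibrated model or HSPICE data.

```python
# Illustrative sketch (not the thesis model): first-order analytical sigma/mu of
# gate delay under Vth variation, checked against Monte Carlo on the alpha-power
# law delay model d = K * C_load * Vdd / (W * (Vdd - Vth)**alpha).
# All numerical values are assumed placeholders.
import numpy as np

rng = np.random.default_rng(0)
Vdd, Vth0, sigma_vth = 0.9, 0.3, 0.03      # volts
alpha, K, C_load, W = 1.3, 1e-12, 2e-15, 1.0

def delay(vth):
    return K * C_load * Vdd / (W * (Vdd - vth) ** alpha)

# First-order sensitivity prediction: sigma_d / mu_d ~ alpha * sigma_vth / (Vdd - Vth0)
analytic = alpha * sigma_vth / (Vdd - Vth0)

# Monte Carlo reference (the role HSPICE plays in the thesis, at toy scale)
samples = delay(rng.normal(Vth0, sigma_vth, 100_000))
monte_carlo = samples.std() / samples.mean()

print(f"analytical sigma/mu  : {analytic:.4f}")
print(f"Monte Carlo sigma/mu : {monte_carlo:.4f}")
```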
163

Performance Analysis of MIMO Relay Networks with Beamforming

January 2012 (has links)
abstract: This dissertation considers two different kinds of two-hop multiple-input multiple-output (MIMO) relay networks with beamforming (BF). First, "one-way" amplify-and-forward (AF) and decode-and-forward (DF) MIMO BF relay networks are considered, in which the relay amplifies or decodes the received signal from the source and forwards it to the destination, respectively, where all nodes beamform with multiple antennas to obtain gains in performance with reduced power consumption. A direct link from source to destination is included in performance analysis. Novel systematic upper-bounds and lower-bounds to average bit or symbol error rates (BERs or SERs) are proposed. Second, "two-way" AF MIMO BF relay networks are investigated, in which two sources exchange their data through a relay, to improve the spectral efficiency compared with one-way relay networks. Novel unified performance analysis is carried out for five different relaying schemes using two, three, and four time slots in sum-BER, the sum of two BERs at both sources, in two-way relay networks with and without direct links. For both kinds of relay networks, when any node is beamforming simultaneously to two nodes (i.e. from source to relay and destination in one-way relay networks, and from relay to both sources in two-way relay networks), the selection of the BF coefficients at a beamforming node becomes a challenging problem since it has to balance the needs of both receiving nodes. Although this "BF optimization" is performed for BER, SER, and sum-BER in this dissertation, the solution for optimal BF coefficients not only is difficult to implement, it also does not lend itself to performance analysis because the optimal BF coefficients cannot be expressed in closed-form. Therefore, the performance of optimal schemes through bounds, as well as suboptimal ones such as strong-path BF, which beamforms to the stronger path of two links based on their received signal-to-noise ratios (SNRs), is provided for BERs or SERs, for the first time. Since different channel state information (CSI) assumptions at the source, relay, and destination provide different error performance, various CSI assumptions are also considered. / Dissertation/Thesis / Ph.D. Electrical Engineering 2012
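The following toy Monte Carlo illustrates the strong-path beamforming idea mentioned above: a multi-antenna node serving two receivers applies maximum ratio transmission toward whichever link currently has the larger gain. The i.i.d. Rayleigh channel model and all parameters are assumptions, not the dissertation's system model or analysis.

```python
# A minimal Monte Carlo sketch of "strong-path" beamforming at a multi-antenna
# node that must serve two receivers at once: it applies maximum ratio
# transmission toward whichever of the two links currently has the larger gain.
# Channel model (i.i.d. Rayleigh) and parameters are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_tx, trials, tx_snr_db = 4, 200_000, 10.0
tx_snr = 10 ** (tx_snr_db / 10)

snr_a = np.empty(trials)
snr_b = np.empty(trials)
for i in range(trials):
    h_a = (rng.standard_normal(n_tx) + 1j * rng.standard_normal(n_tx)) / np.sqrt(2)
    h_b = (rng.standard_normal(n_tx) + 1j * rng.standard_normal(n_tx)) / np.sqrt(2)
    # Strong-path BF: beamform toward the link with the larger channel norm.
    strong = h_a if np.linalg.norm(h_a) >= np.linalg.norm(h_b) else h_b
    w = strong.conj() / np.linalg.norm(strong)      # unit-norm BF vector
    snr_a[i] = tx_snr * np.abs(h_a @ w) ** 2        # SNR seen by node A
    snr_b[i] = tx_snr * np.abs(h_b @ w) ** 2        # SNR seen by node B

print("mean SNR at A (dB):", 10 * np.log10(snr_a.mean()))
print("mean SNR at B (dB):", 10 * np.log10(snr_b.mean()))
```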
164

ROBIN HOOD : um ambiente para a avaliação de políticas de balanceamento de carga / Robin Hood: an environment to load balancing policies evaluation

Nogueira, Mauro Lucio Baioneta January 1998 (has links)
The importance of distributed systems for the development of high performance computing in the coming decades is beyond dispute. However, there is still much debate about suitable policies for managing the spatially scattered computing resources available in such systems. Load balancing policies try to solve the problem of idle (or, conversely, overloaded) machines in a distributed system: it is not rare for only a few machines in a network to be effectively used while several others are underused or even completely idle. Once remote execution of a task is possible, with the goal of reducing its response time, it remains to decide how to do it; load balancing policies deal with the decisions involved in remote execution. Despite the apparent simplicity of their control decisions and the small number of parameters involved, such policies do not have easily predictable behavior. Under certain conditions they can become excessively unstable, making successive wrong decisions and, as a consequence, degrading system performance considerably; in such cases it would often be better to have no policy at all. This work presents an environment developed to help system designers and performance analysts build, simulate, and understand more clearly the impact of load balancing decisions on system performance.
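A minimal sketch of the kind of policy such an environment evaluates is given below: a sender-initiated threshold policy on a skewed synthetic workload, compared against no balancing at all. The arrival and service probabilities, the threshold, and the single random probe are assumptions, not Robin Hood's actual policies.

```python
# A minimal sketch of a sender-initiated threshold load balancing policy on a
# skewed synthetic workload. All parameters are illustrative assumptions.
import random

N_MACHINES, STEPS, THRESHOLD = 8, 50_000, 3
SERVICE_PROB = 0.1
# Half of the machines receive most of the work (skewed load).
ARRIVAL_PROB = [0.09] * (N_MACHINES // 2) + [0.01] * (N_MACHINES // 2)

def simulate(balance, seed=42):
    random.seed(seed)
    queues = [0] * N_MACHINES
    queue_sum = 0
    for _ in range(STEPS):
        for m in range(N_MACHINES):
            if random.random() < ARRIVAL_PROB[m]:
                target = m
                if balance and queues[m] >= THRESHOLD:
                    probe = random.randrange(N_MACHINES)   # probe one random machine
                    if queues[probe] < THRESHOLD:
                        target = probe                     # execute the task remotely
                queues[target] += 1
            if queues[m] and random.random() < SERVICE_PROB:
                queues[m] -= 1                             # one unit of service
        queue_sum += sum(queues)
    return queue_sum / STEPS                               # mean number of queued tasks

print("no policy        :", round(simulate(False), 2))
print("threshold policy :", round(simulate(True), 2))
```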
165

Análise de desempenho de uma aplicação VoIP em redes veiculares / Performance analysis of a VoIP application in vehicular networks

Vieira, Leandro Kravczuk January 2011 (has links)
VIEIRA, Leandro Kravczuk. Análise de desempenho de uma aplicação VoIP em redes veiculares. 2011. 136 f. Dissertação (Mestrado em ciência da computação) - Universidade Federal do Ceará, Fortaleza-CE, 2011. / Vehicular networks have emerged as a particular case of mobile networks and have become a specific research field within computer networks. They have been the subject of numerous scientific studies in recent years, whose main focus is the development of Intelligent Transport Systems. Furthermore, given that cars are increasingly important in people's lives, embedding smart software in vehicles can substantially improve the quality of life of their users. This fact, together with the significant market demand for more reliability, safety, and entertainment in vehicles, has led to significant development of and support for vehicular networks and their applications. Among these applications is VoIP; however, VoIP applications suffer from delay, packet loss, and jitter, and these technical challenges are further aggravated in wireless networks. One factor that directly influences the use of an application in wireless networks is the routing protocol. Routing is a challenging task due to high node mobility, the instability of wireless links, and the diversity of scenarios. For this reason, several routing protocols have been designed to solve one or more problems specific to each scenario. However, although several solutions have been proposed for the routing problem in vehicular networks, no general solution has been found; that is, no proposed protocol performs well across the various scenarios that exist in vehicular networks. Thus, in this dissertation, we analyze through simulations the impact of vehicle density, transmission range, mobility, and the type of routing protocol on the performance of a VoIP application in urban and highway vehicular network scenarios.
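For reference, the sketch below computes the three VoIP quality indicators cited above (delay, loss, and jitter, the latter as defined in RFC 3550) from per-packet send and receive timestamps. The sample data is made up and the code is not taken from the dissertation's simulations.

```python
# A small sketch of per-flow VoIP metrics: mean one-way delay, packet loss
# ratio, and RFC 3550 interarrival jitter, computed from (sequence, send time,
# receive time) data. The sample data below is made up for illustration.
def voip_metrics(sent, received):
    """sent: {seq: tx_time}; received: {seq: rx_time}; times in seconds."""
    lost = [seq for seq in sent if seq not in received]
    delays = {seq: received[seq] - sent[seq] for seq in received}
    # RFC 3550 interarrival jitter, smoothed with gain 1/16.
    jitter, prev_transit = 0.0, None
    for seq in sorted(received):
        transit = received[seq] - sent[seq]
        if prev_transit is not None:
            jitter += (abs(transit - prev_transit) - jitter) / 16.0
        prev_transit = transit
    return {
        "loss_ratio": len(lost) / len(sent),
        "mean_delay_ms": 1000 * sum(delays.values()) / len(delays),
        "jitter_ms": 1000 * jitter,
    }

# Hypothetical 20 ms VoIP frames; packet 3 is lost, others arrive with varying delay.
sent = {i: i * 0.020 for i in range(10)}
received = {i: t + 0.080 + 0.005 * (i % 3) for i, t in sent.items() if i != 3}
print(voip_metrics(sent, received))
```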
166

FOLE: Um framework conceitual para avaliação de desempenho da elasticidade em ambientes de computação em nuvem / FOLE: A conceptual framework for elasticity performance analysis in cloud computing environments

Coutinho, Emanuel Ferreira January 2014 (has links)
COUTINHO, Emanuel Ferreira. FOLE: Um framework conceitual para avaliação de desempenho da elasticidade em ambientes de computação em nuvem. 2014. 151 f. Tese (Doutorado em ciência da computação) - Universidade Federal do Ceará, Fortaleza-CE, 2014. / Currently, many customers and providers use resources of Cloud Computing environments, such as processing and storage, for their applications and services. Given the ease of use and the pay-per-use model, it is natural that the number of users and their workloads also grow. As a result, providers must expand their resources and maintain the agreed level of quality for customers, or else break the Service Level Agreement (SLA) and incur the resulting penalties. With the increase in computational resource usage, a key feature of Cloud Computing has become quite attractive: elasticity. Elasticity can be defined as how well a computational cloud adapts to variations in its workload by provisioning and deprovisioning resources. Due to the limited availability of information regarding the configuration of experiments, it is generally not trivial to implement elasticity concepts, much less to apply them in cloud environments. Furthermore, the way to measure cloud elasticity is not obvious, there is not yet a standard for this task, and its evaluation can be performed in different ways because of the many technologies and strategies for providing elasticity. A common aspect of elasticity performance analysis is the use of environment resources, such as CPU and memory, which allows elasticity to be assessed indirectly even without a specific metric. In this context, this work proposes FOLE, a conceptual framework for conducting performance analysis of elasticity in Cloud Computing environments in a systematic, flexible, and reproducible way. To support the framework, we propose a set of metrics specific to elasticity and metrics for its indirect measurement. For the measurement of elasticity in Cloud Computing, we propose metrics based on concepts from Physics, such as strain and stress, and from Microeconomics, such as the Price Elasticity of Demand. Additionally, metrics based on the times of resource allocation and deallocation operations, and on the resources used, are proposed to support the measurement of elasticity. For verification and validation of the proposal, we performed two experiments, one in a private cloud and another in a hybrid cloud, using microbenchmarks and a classic scientific application, on an infrastructure designed around concepts from Autonomic Computing. Through these experiments the activities of FOLE were validated, allowing the systematization of an elasticity performance analysis. The results show that it is possible to satisfactorily assess the elasticity of a Cloud Computing environment using specific metrics based on concepts from other fields of knowledge, complemented by metrics related to operation times and resource usage.
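As a concrete illustration of a demand-elasticity-style indicator of the kind mentioned above, the sketch below divides the percentage change in allocated resources by the percentage change in workload over consecutive monitoring windows. The formula and sample series are illustrative assumptions, not FOLE's actual metric definitions.

```python
# An illustrative elasticity indicator in the spirit of the Price Elasticity of
# Demand: percentage change in allocated resources divided by percentage change
# in workload, over consecutive observation windows. Not FOLE's actual metrics.
def elasticity_series(workload, allocated):
    """workload: requests/s per window; allocated: resource units per window."""
    values = []
    for i in range(1, len(workload)):
        d_work = (workload[i] - workload[i - 1]) / workload[i - 1]
        d_alloc = (allocated[i] - allocated[i - 1]) / allocated[i - 1]
        if d_work != 0:
            values.append(d_alloc / d_work)   # ~1.0 means allocation tracks demand
    return values

# Hypothetical monitoring windows: demand doubles, allocation lags then catches up.
workload  = [100, 200, 200, 400, 400, 200]
allocated = [  2,   3,   4,   7,   8,   5]
print([round(e, 2) for e in elasticity_series(workload, allocated)])
```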
167

MUSCLE FATIGUE ANALYSIS IN MINIMALLY INVASIVE SURGERY

Panahi, Ali 01 December 2016 (has links)
Due to its inherent complexity, such as limited work volume and degrees of freedom, minimally invasive surgery (MIS) is ergonomically more challenging to surgeons than traditional open surgery. Specifically, MIS can expose performing surgeons to excessive ergonomic risks, including muscle fatigue, that may lead to critical errors in surgical procedures. Therefore, detecting the vulnerable muscles and time-to-fatigue during MIS is of great importance in order to prevent these errors. In this research, different surgical skill and ergonomic assessment methods are reviewed and their advantages and disadvantages are studied. According to the literature review, which is included in chapter 1, some of these methods are subjective, and those that are objective provide inconsistent results. Muscle fatigue analysis has shown promising results for skill and ergonomic assessment. However, due to data analysis issues, this analysis has only been successful in intense working conditions. The goal of this research is to apply an appropriate data analysis method to the minimally invasive surgical setting, which is considered a low-force muscle activity. Therefore, surface electromyography is used to record the muscle activations of subjects while they perform various real laparoscopic operations and dry lab surgical tasks. The muscle activation data is then reconstructed using Recurrence Quantification Analysis (RQA), which has been proven to be a reliable analysis, to detect possible signs of muscle fatigue in different muscle groups. The results of this data analysis method are validated using a subjective fatigue assessment method. In order to study the effect of muscle fatigue on subject performance, a standard Fundamentals of Laparoscopic Surgery (FLS) task performance analysis is used.
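The sketch below shows, under simplifying assumptions, the core of a Recurrence Quantification Analysis on a one-dimensional signal such as an EMG envelope: time-delay embedding, a thresholded recurrence matrix, and the recurrence rate and determinism measures. The embedding parameters, threshold, and synthetic signal are placeholders, not the study's actual processing pipeline.

```python
# A compact RQA sketch: time-delay embedding, thresholded recurrence matrix,
# then recurrence rate (RR) and determinism (DET). Parameters and the synthetic
# signal are illustrative assumptions.
import numpy as np

def rqa(signal, dim=3, delay=2, threshold=0.2, min_line=2):
    # Time-delay embedding into vectors of length `dim`.
    n = len(signal) - (dim - 1) * delay
    emb = np.column_stack([signal[i * delay: i * delay + n] for i in range(dim)])
    # Recurrence matrix: 1 where embedded states are closer than the threshold.
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    rec = (dist <= threshold).astype(int)
    np.fill_diagonal(rec, 0)                 # exclude trivial self-matches
    rr = rec.mean()
    # DET: fraction of recurrent points lying on diagonal lines >= min_line.
    diag_points = 0
    for k in range(1, n):
        run = 0
        for v in np.append(np.diag(rec, k), 0):   # trailing 0 flushes the last run
            if v:
                run += 1
            else:
                if run >= min_line:
                    diag_points += run
                run = 0
    det = 2 * diag_points / rec.sum() if rec.sum() else 0.0
    return rr, det

t = np.linspace(0, 4 * np.pi, 400)
rms_envelope = np.sin(t) + 0.1 * np.random.default_rng(0).standard_normal(t.size)
print(rqa(rms_envelope))
```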
168

Analytical Frameworks of Cooperative and Cognitive Radio Systems with Practical Considerations

Khan, Fahd Ahmed 08 1900 (has links)
Cooperative and cognitive radio systems have been proposed as a solution to improve the quality-of-service (QoS) and spectrum efficiency of existing communication systems. The objective of this dissertation is to propose and analyze schemes for cooperative and cognitive radio systems considering real-world scenarios and to make these technologies implementable. In most of the research on cooperative relaying, it has been assumed that the communicating nodes have perfect channel state information (CSI). However, in reality this is not the case, and the nodes may only have an estimate of the CSI or partial knowledge of it. Thus, in this dissertation, depending on the amount of CSI available, novel receivers are proposed to improve the performance of amplify-and-forward relaying. Specifically, new coherent receivers are derived which do not perform channel estimation at the destination but instead use the received pilot signals directly for decoding. The derived receivers are based on new metrics that use the distribution of the channels and the noise to achieve improved symbol-error-rate (SER) performance. The SER performance of the derived receivers is further improved by utilizing the decision history in the receivers. In cases where low-complexity receivers are desired, a novel non-coherent receiver which detects the signal without knowledge of the CSI is proposed. In addition, new receivers are proposed for the situation when only partial CSI is available at the destination, i.e., channel knowledge of either the source-relay link or the relay-destination link, but not both. These receivers are termed 'half-coherent receivers' since they have channel state information for only one of the two links in the system. In practical systems, the CSI at the communicating terminals becomes outdated due to the time-varying nature of the channel, which degrades system performance. In this dissertation, the impact of using outdated CSI for relay selection is studied for a network where two sources communicate with each other via fixed-gain amplify-and-forward relays, and for a Rayleigh faded channel, closed-form expressions for the outage probability (OP), moment generating function (MGF), and SER are derived. Relay location is also taken into consideration, and it is shown that the performance can be improved by placing the relay closer to the source whose channel is more outdated. Some practical issues encountered in cognitive radio systems (CRS) are also investigated. The QoS of CRS can be improved through spatial diversity, which can be achieved either by using multiple antennas or by exploiting the independent channels of each user in a multi-user network. In this dissertation, both approaches are examined: in multi-antenna CRS, transmit antenna selection (TAS) is proposed, whereas in multi-user CRS, user selection is proposed to achieve performance gains. TAS reduces the implementation cost and complexity and thus makes CRS more feasible. Additionally, unlike previous works, and in accordance with real-world systems, the transmitter is assumed to have limited peak transmit power. For both schemes, considering practical channel models, closed-form expressions for the OP, SER, and ergodic capacity (EC) are obtained, and the performance in the asymptotic regimes is also studied. Furthermore, the OP performance is analyzed taking into account the interference from the primary network on the cognitive network.
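As a toy illustration of the transmit antenna selection scenario described above, the Monte Carlo sketch below estimates the outage probability of an underlay cognitive link whose per-antenna transmit power is capped by both a peak power and an interference limit at the primary receiver. The channel model, selection rule, and parameters are assumptions, not the dissertation's closed-form analysis.

```python
# Monte Carlo sketch of outage probability for transmit antenna selection (TAS)
# in an underlay cognitive link: each antenna's transmit power is capped by a
# peak power P_pk and an interference limit Q at the primary receiver, and the
# antenna giving the best resulting SNR at the secondary receiver is selected.
import numpy as np

rng = np.random.default_rng(7)
n_tx, trials = 4, 200_000
P_pk, Q, noise, snr_th = 1.0, 0.5, 0.1, 2.0   # linear scale, arbitrary units

# Rayleigh fading power gains: h -> secondary receiver, g -> primary receiver.
h = rng.exponential(1.0, size=(trials, n_tx))
g = rng.exponential(1.0, size=(trials, n_tx))

power = np.minimum(P_pk, Q / g)        # per-antenna allowed transmit power
snr = power * h / noise                # per-antenna SNR at the secondary receiver
best = snr.max(axis=1)                 # TAS keeps the best antenna per realization

print("outage probability:", np.mean(best < snr_th))
```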
169

Performance Analysis of the Modernized GNSS Signal Acquisition / Analyse des Performances de l'Acquisition des Nouveaux Signaux GNSS

Foucras, Myriam 06 February 2015 (has links)
Since the development of GPS, global navigation satellite systems (GNSS) have diversified widely: maintenance, modernization, and the deployment of new systems such as the European Galileo. In addition, the number of applications based on GNSS signals keeps increasing. To meet these new challenges and requirements, GNSS receivers are constantly evolving. A new trend is the development of software receivers, which process the GNSS signal in software, unlike the hardware receivers that equip our vehicles and smartphones, for example. This thesis is part of a joint project between a laboratory and a company consisting of the development of a software receiver tracking GPS L1 C/A and Galileo E1 OS. The more specific aim of the thesis is to study acquisition, the first signal processing stage, which must provide a rough estimate of the incoming signal parameters. This work focuses in particular on low-power signals: an acquisition threshold is set at 27 dB-Hz, representative of urban or degraded environments. It is important to note that one of the constraints is that acquisition of such signals must succeed at least 9 times out of 10, without any external aid or knowledge of almanacs or ephemerides. First, a solid theoretical study of acquisition performance and the sources of degradation is conducted. Among these sources are the bit transitions due to the presence of the navigation message and of the secondary code on the pilot component of the new signals. This highlights the need for an acquisition method that is insensitive to the sign inversions of the navigation message. Second, an innovative method, Double-Block Zero-Padding Transition-Insensitive (DBZPTI), is developed to acquire the Galileo E1 OS signal efficiently. It is part of the global acquisition strategy, whose output must be an estimate of the Doppler frequency and code delay of the incoming signal that is fine and reliable enough for satisfactory signal tracking.
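For context, the sketch below implements the classic FFT-based parallel code-phase search on which transition-insensitive methods such as DBZPTI build; it is not DBZPTI itself. A random ±1 sequence stands in for the real PRN code, and the signal, sampling rate, and Doppler grid are illustrative assumptions.

```python
# Simplified FFT-based parallel code-phase search (NOT DBZPTI itself): for each
# Doppler bin, the carrier is wiped off and a circular correlation with the
# local spreading code is computed in the frequency domain.
import numpy as np

rng = np.random.default_rng(3)
n = 4096                                   # samples per code period (assumed)
fs = 4.096e6                               # sampling frequency (assumed)
code = rng.choice([-1.0, 1.0], n)          # stand-in for the PRN code

# Synthetic received signal: code delayed by 1000 samples, 1500 Hz Doppler, noise.
true_delay, true_doppler = 1000, 1500.0
t = np.arange(n) / fs
rx = (np.roll(code, true_delay) * np.exp(2j * np.pi * true_doppler * t)
      + 2.0 * (rng.standard_normal(n) + 1j * rng.standard_normal(n)))

code_fft = np.conj(np.fft.fft(code))
best = (0.0, None, None)
for doppler in np.arange(-5000, 5001, 250):             # Doppler search bins
    wiped = rx * np.exp(-2j * np.pi * doppler * t)      # carrier wipe-off
    corr = np.abs(np.fft.ifft(np.fft.fft(wiped) * code_fft)) ** 2
    peak = corr.max()
    if peak > best[0]:
        best = (peak, int(corr.argmax()), doppler)

print("estimated code delay:", best[1], "samples; Doppler:", best[2], "Hz")
```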
170

High performance trace replay event simulation of parallel programs behavior / Ferramenta de alto desempenho para análise de comportamento de programas paralelos baseada em rastos de execução

Korndorfer, Jonas Henrique Muller January 2016 (has links)
Modern high performance systems comprise thousands to millions of processing units. The development of a scalable parallel application for such systems depends on an accurate mapping of application processes onto the available resources. The identification of unused resources and potential processing bottlenecks requires good performance analysis. The trace-based observation of a parallel program execution is one of the most helpful techniques for this purpose. Unfortunately, tracing often produces large trace files, easily reaching gigabytes of raw data. Therefore, trace-based performance analysis tools have to process such data into a human-readable form and must be efficient enough to allow a useful analysis. Most existing tools, such as Vampir, Scalasca, and TAU, focus on processing trace formats with a fixed and well-defined semantic; the corresponding file formats are usually designed to handle applications developed with popular libraries like OpenMP, MPI, and CUDA. However, not all parallel applications use such libraries, so these tools are sometimes of no help. Fortunately, other tools take a more dynamic approach by using an open trace file format without a specific semantic; among them are Paraver, Pajé, and PajeNG. Being generic, however, comes at a cost: these tools frequently show low performance when processing large traces. The objective of this work is to present performance optimizations made in the PajeNG tool-set. This comprises the development of a parallelization strategy and a performance analysis to quantify our gains. The original PajeNG works sequentially, processing a single trace file with all data from the observed application, so the scalability of the tool is severely limited by the reading of that file. Our strategy splits the file so that several pieces can be processed in parallel; the method created to split the traces allows each piece to be processed by its own thread. The experiments were executed on non-uniform memory access (NUMA) machines. The performance analysis considers several aspects such as thread locality, number of flows, disk type, and comparisons between the NUMA nodes. The results are very promising, scaling up PajeNG by about eight to eleven times depending on the machine.
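A small Python analogue of the splitting strategy described above is sketched below (PajeNG itself is a C++ tool-set): the trace file is cut into byte ranges aligned on line boundaries and each chunk is handled by a separate worker. The file name and the line-oriented event layout are assumptions.

```python
# Python analogue (not PajeNG's actual implementation) of the parallelization
# strategy: split the trace file into byte ranges aligned on line boundaries
# and process each chunk in a separate worker, here counting events per type.
import os
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

TRACE = "app.trace"   # hypothetical line-oriented trace file

def chunk_ranges(path, n_chunks):
    """Split [0, file size) into n_chunks ranges aligned on newline boundaries."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, n_chunks):
            f.seek(i * size // n_chunks)
            f.readline()                          # advance to the next full line
            cuts.append(max(f.tell(), cuts[-1]))  # keep boundaries monotonic
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))

def process_chunk(byte_range):
    start, end = byte_range
    counts = Counter()
    with open(TRACE, "rb") as f:
        f.seek(start)
        for line in f.read(end - start).splitlines():
            if line:
                counts[line.split()[0]] += 1      # first field = event type (assumed)
    return counts

if __name__ == "__main__":
    total = Counter()
    with ProcessPoolExecutor() as pool:
        for partial in pool.map(process_chunk, chunk_ranges(TRACE, os.cpu_count())):
            total.update(partial)
    print(total.most_common(5))
```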
