  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

The GraphGrind framework : fast graph analytics on large shared-memory systems

Sun, Jiawen January 2018 (has links)
As shared memory systems now support terabyte-sized main memory, they provide an opportunity to perform efficient graph analytics on a single machine. Graph analytics is characterised by frequent synchronisation, which is addressed in part by shared memory systems. However, performance is limited by load imbalance and poor memory locality, which originate in the irregular structure of small-world graphs. This dissertation demonstrates how graph partitioning can be used to optimise (i) load balance, (ii) Non-Uniform Memory Access (NUMA) locality and (iii) temporal locality in shared memory systems. The developed techniques are implemented in GraphGrind, a new shared memory graph analytics framework. First, this dissertation shows that heuristic edge-balanced partitioning results in an imbalance in the number of vertices per partition. Thus, load imbalance exists between partitions, either for loops iterating over vertices or for loops iterating over edges. To address this issue, this dissertation introduces a classification of algorithms to distinguish whether they algorithmically benefit from edge-balanced or vertex-balanced partitioning. This classification supports the adaptation of partitions to the characteristics of graph algorithms. Evaluation in GraphGrind shows that this approach outperforms state-of-the-art graph analytics frameworks for shared memory, including Ligra by 1.46x on average and Polymer by 1.16x on average, using a variety of graph algorithms and datasets. Secondly, this dissertation demonstrates that increasing the number of graph partitions improves temporal locality due to smaller working sets. However, a larger number of partitions results in vertex replication in some graph data structures. This dissertation therefore adopts a graph layout that is immune to vertex replication and designs an automatic graph traversal algorithm that extends previously established graph traversal heuristics to a 3-way graph layout choice. This new algorithm furthermore depends upon the classification of graph algorithms introduced in the first part of the work. These techniques achieve an average speedup of 1.79x over Ligra and 1.42x over Polymer. Finally, this dissertation presents a graph ordering algorithm that challenges the widely accepted heuristic of balancing the number of edges per partition while minimising edge or vertex cut. This algorithm balances the number of edges per partition as well as the number of unique destinations of those edges. It balances both edges and vertices for graphs with a power-law degree distribution. Moreover, this dissertation shows that the performance of graph ordering depends upon the characteristics of graph analytics frameworks, such as NUMA-awareness. This graph ordering algorithm achieves an average speedup of 1.87x over Ligra and 1.51x over Polymer.
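A minimal sketch of the partitioning trade-off this abstract describes may help. The function below is illustrative only, not GraphGrind code: it splits the vertex range of a compressed sparse row (CSR) graph into contiguous partitions that are either vertex-balanced or edge-balanced; the `offsets` array and `parts` parameter are assumptions of the sketch.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch (not GraphGrind code): split the vertex range of a CSR
// graph into `parts` contiguous partitions, either vertex-balanced or
// edge-balanced. `offsets` has n+1 entries; offsets[v+1] - offsets[v] is the
// out-degree of vertex v, and offsets.back() is the total edge count.
std::vector<std::size_t> partition_boundaries(const std::vector<std::size_t>& offsets,
                                              std::size_t parts,
                                              bool edge_balanced) {
    const std::size_t n = offsets.size() - 1;   // number of vertices
    const std::size_t m = offsets.back();       // number of edges
    std::vector<std::size_t> bounds = {0};
    std::size_t v = 0;
    for (std::size_t p = 1; p <= parts; ++p) {
        if (edge_balanced) {
            // Grow the partition until it holds roughly p/parts of all edges.
            const std::size_t target = (m * p) / parts;
            while (v < n && offsets[v + 1] <= target) ++v;
        } else {
            // Vertex-balanced: roughly n/parts vertices per partition.
            v = (n * p) / parts;
        }
        bounds.push_back(v);
    }
    return bounds;   // partition p covers vertices [bounds[p], bounds[p+1])
}
```

On a power-law graph, the edge-balanced boundaries typically leave some partitions with far fewer vertices than others, which is exactly the imbalance that motivates the dissertation's classification of algorithms into those that benefit from edge-balanced versus vertex-balanced partitions.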
22

Estudo e implementação da otimização de Preload de dados usando o processador XScale / Study and implementation of data Preload optimization using XScale

Oliveira, Marcio Rodrigo de 08 October 2005 (has links)
Advisor: Guido Costa Souza Araujo / Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação / Abstract: Nowadays, there is a big market for applications for embedded systems, in products such as cellular phones, palmtops, electronic schedulers, etc. Consumer electronics are designed under stringent design constraints, like reduced cost, low power consumption and high performance. Thus, the code produced by compiling programs to execute on these products must execute quickly and should also save power. To achieve this, code optimizations must be performed at compile time. Data preload consists of moving data from a higher level of the memory hierarchy to a lower level before the data is actually needed, thus reducing the memory latency penalty. This dissertation shows how data preload optimization was implemented in the Xingo compiler for the Pocket PC platform, an XScale-based processor. The XScale architecture has a preload instruction, whose main objective is to prefetch program data into the cache. This optimization inserts (through heuristics) preload instructions into the program's intermediate code, in order to anticipate which data will be used and would otherwise miss in the cache, bringing it into the cache before its use. This strategy minimizes data-cache misses, reducing the time spent on memory accesses while running the program.
Several well-known benchmark programs were used for evaluation, among them DSPstone and MiBench. The results show a considerable performance improvement for almost all tested programs subject to the preload optimization, with many programs achieving improvements larger than 30%. / Master's / Code Optimization / Master of Computer Science
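As a rough, hedged illustration of the transformation this abstract describes (the sketch is not taken from the Xingo compiler), the loop below inserts a prefetch hint a fixed distance ahead of the current access; on XScale-class ARM cores, GCC/Clang lower `__builtin_prefetch` to the PLD (preload) instruction. The function name and the prefetch distance are placeholders.

```cpp
#include <cstddef>

// Hypothetical sketch of the effect of data-preload insertion: prefetch the
// element needed a few iterations ahead so that the access which would
// otherwise miss finds its line already in the data cache. On XScale-class
// ARM cores, GCC/Clang lower __builtin_prefetch to the PLD instruction.
long sum_with_preload(const long* data, std::size_t n) {
    constexpr std::size_t distance = 16;   // prefetch distance; tuned per cache
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + distance < n)
            __builtin_prefetch(&data[i + distance], /*rw=*/0, /*locality=*/1);
        sum += data[i];
    }
    return sum;
}
```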
23

Caching Strategies And Design Issues In CD-ROM Based Multimedia Storage

Shastri, Vijnan 04 1900 (has links) (PDF)
No description available.
24

Genotypic Handedness, Memory, and Cerebral Lateralization

Perotti, Laurence Peter 08 1900 (has links)
The relationship of current manual preference (phenotypic handedness) and family history of handedness (genotypic handedness) to memory for imageable stimuli was studied. The purpose of the study was to test the hypothesis that genotypic handedness was related to lessened cerebral lateralization of Paivio's (1969) dual memory systems. The structure of memory was not at issue; rather, the mediation of storage and retrieval in memory has been explained with reference to verbal or imaginal processes. Verbal mediation theories and imaginal theories were reviewed, together with the data supporting each. Paivio's (1969) dual coding and processing theory was considered a conceptual bridge between the competing positions.
25

Exploring new boundaries in team cognition: Integrating knowledge in distributed teams

Zajac, Stephanie 01 January 2014 (has links)
Distributed teams continue to emerge in response to the complex organizational environments brought about by globalization, technological advancements, and the shift toward a knowledge-based economy. These teams are composed of members who hold the disparate knowledge necessary to take on cognitively demanding tasks. However, knowledge coordination between team members who are not co-located is a significant challenge, often resulting in process loss and decrements to the effectiveness of team-level knowledge structures. The current effort explores the configuration dimension of distributed teams, and specifically how subgroup formation based on geographic location may impact the effectiveness of a team's transactive memory system and subsequent team process. In addition, the role of task cohesion as a buffer against negative intergroup interaction is explored.
26

Modeling and Runtime Systems for Coordinated Power-Performance Management

Li, Bo 28 January 2019 (has links)
Emerging systems in high-performance computing (HPC) must maximize efficiency to meet the Department of Energy's goal of a 20-40 megawatt power budget for one exaflop. To optimize efficiency, these systems provide multiple power-performance control techniques that throttle different system components and the degree of concurrency. In this dissertation, we focus on three throttling techniques: CPU dynamic voltage and frequency scaling (DVFS), dynamic memory throttling (DMT), and dynamic concurrency throttling (DCT). We first conduct an empirical analysis of the performance and energy trade-offs of different architectures under these throttling techniques. We show the impact on performance and energy consumption of Intel x86 systems with Intel Xeon Phi accelerators and an Nvidia general-purpose graphics processing unit (GPGPU), identifying the trade-offs and the potential for improving efficiency. Furthermore, we propose a parallel performance model for coordinating DVFS, DMT, and DCT simultaneously. We present a multivariate linear regression-based approach that approximates the impact of DVFS, DMT, and DCT on performance for performance prediction. Validation using 19 HPC applications/kernels on two architectures (Intel x86 and IBM BG/Q) shows up to 7% and 17% prediction error, respectively. Thereafter, we develop metrics for capturing the performance impact of DVFS, DMT, and DCT. We apply an artificial neural network model to approximate the nonlinear effects on performance and present a corresponding runtime control strategy for power capping. Our validation using 37 HPC applications/kernels shows up to a 20% performance improvement under a given power budget compared with the Intel RAPL-based method. / Ph. D. / System efficiency on high-performance computing (HPC) systems is the key to achieving the power budget goal for exascale supercomputers. Techniques for adjusting the performance of different system components can help accomplish this goal by dynamically controlling system performance according to application behavior. In this dissertation, we focus on three techniques: adjusting CPU performance, adjusting memory performance, and adjusting the number of threads used to run parallel applications. First, we profile the performance and energy consumption of different HPC applications on both Intel systems with accelerators and IBM BG/Q systems. We explore the performance and energy trade-offs of these techniques and provide optimization insights. Furthermore, we propose a parallel performance model that can accurately capture the impact of these techniques on performance in terms of job completion time, and we present an approximation approach for performance prediction. The approximation has up to 7% and 17% prediction error on Intel x86 and IBM BG/Q systems, respectively, across 19 HPC applications. Thereafter, we apply the performance model in a runtime system designed to improve performance under a given power budget. Our runtime strategy achieves up to a 20% performance improvement over the baseline method.
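A hedged sketch of the kind of model this abstract describes: a multivariate linear predictor of execution time over the three knobs (CPU frequency for DVFS, a memory throttling level for DMT, and a thread count for DCT), used to pick the fastest configuration under a power cap. All coefficients, the power model, and the configuration grid are placeholders, not values from the dissertation.

```cpp
#include <limits>
#include <vector>

// Placeholder knob settings for the three throttling techniques; none of the
// numbers below come from the dissertation.
struct Config {
    double freq_ghz;     // DVFS: CPU frequency
    int    mem_throttle; // DMT: memory throttling level (0 = none)
    int    threads;      // DCT: concurrency level
};

// Multivariate linear model of execution time, t = b0 + b1*f + b2*m + b3*c,
// with made-up coefficients standing in for the fitted regression.
double predict_time(const Config& c) {
    const double b0 = 30.0, b1 = -4.0, b2 = 1.2, b3 = -0.2;
    return b0 + b1 * c.freq_ghz + b2 * c.mem_throttle + b3 * c.threads;
}

// Crude placeholder power model, used only to express the power cap.
double predict_power(const Config& c) {
    return 20.0 + 15.0 * c.freq_ghz - 2.0 * c.mem_throttle + 1.5 * c.threads;
}

// Pick the configuration with the lowest predicted time that respects the cap
// (assumes a non-empty grid of candidate configurations).
Config best_under_cap(const std::vector<Config>& grid, double power_cap_w) {
    Config best = grid.front();
    double best_t = std::numeric_limits<double>::infinity();
    for (const Config& c : grid) {
        if (predict_power(c) > power_cap_w) continue;
        const double t = predict_time(c);
        if (t < best_t) { best_t = t; best = c; }
    }
    return best;
}
```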
27

The Crash Consistency, Performance, and Security of Persistent Memory Objects

Greenspan, Derrick Alex 01 January 2024 (has links) (PDF)
Persistent memory (PM) is expected to augment or replace DRAM as main memory. PM combines byte-addressability with non-volatility, providing an opportunity to host byte-addressable data persistently. There are two main approaches for utilizing PM: as memory mapped files or as persistent memory objects (PMOs). Memory mapped files require programmers to reconcile two different semantics (file system and virtual memory) for the same underlying data, and to use complicated transaction semantics to keep data crash consistent. To solve these problems, the first part of this dissertation designs, implements, and evaluates a new PMO abstraction that hosts data in pointer-rich data structures without the backing of a filesystem, and introduces a new primitive, psync, which when invoked renders data crash consistent while concealing the implementation details from the programmer via shadowing. This new approach outperforms a state-of-the-art memory mapped design by 3.2 times, depending on the workload. It also addresses the security of at-rest PMOs by providing encryption and integrity verification of PMOs; performing both on the entire PMO adds an overhead of 3-46%, depending on the level of protection. The second part of this dissertation demonstrates how crash consistency, security, and integrity verification can be preserved while the overall overhead is reduced by decrypting individual memory pages instead of the entire PMO, yielding performance improvements of 2.62 times over the original whole-PMO design, depending on the workload. The final part of this dissertation improves the performance of PMOs even further by mapping userspace pages to volatile memory and copying them into PM, rather than writing directly to PM. Bundling this design with a stream buffer predictor that decrypts pages into DRAM ahead of time improves performance by 1.9 times.
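To make the psync idea concrete, the following is a minimal, hypothetical sketch of shadowing: updates land in a working copy, and a psync-like call publishes them to the durable image in one step. Only the name psync comes from the abstract; the class, its methods, and the DRAM-only two-copy scheme are inventions of this sketch (a real PMO would keep the durable image in persistent memory and issue cache-line flushes and fences).

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch of the shadowing idea behind a psync-like primitive:
// all updates go to a working (shadow) copy, and psync() publishes them to
// the durable image in one step. This sketch only models the two-copy
// structure in DRAM; it is not the dissertation's implementation.
class ShadowedObject {
public:
    explicit ShadowedObject(std::size_t size) : durable_(size, 0), working_(size, 0) {}

    // Mutations touch only the shadow copy (caller keeps offset + len in range).
    void write(std::size_t offset, const void* src, std::size_t len) {
        std::memcpy(working_.data() + offset, src, len);
    }

    // psync: adopt the shadow copy as the durable image. The swap stands in
    // for the pointer flip plus persist barrier a real implementation would use.
    void psync() {
        durable_.swap(working_);
        working_ = durable_;   // the next round of updates starts from the new image
    }

    const std::vector<unsigned char>& durable_image() const { return durable_; }

private:
    std::vector<unsigned char> durable_;  // what survives a crash
    std::vector<unsigned char> working_;  // shadow copy receiving updates
};
```

A caller would write into the object, invoke psync() at each consistency point, and rely on the durable image alone after a crash.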
28

How do teams learn? Shared mental models and transactive memory systems as determinants of team learning and effectiveness

Nandkeolyar, Amit Kumar 01 January 2008 (has links)
Shared mental models (SMM) and Transactive memory systems (TMS) have been advocated as the main team learning mechanisms. Despite multiple appeals for collaboration, research in both these fields has progressed in parallel and little effort has been made to integrate these theories. The purpose of this study was to test the relationship between SMM and TMS in a field setting and examine their influence on various team effectiveness outcomes such as team performance, team learning, team creativity, team members' satisfaction and team viability. Contextual factors relevant to an organizational setting were tested and these included team size, tenure, country of origin, team reward and organizational support. Based on responses from 41 teams from 7 industries across two countries (US and India), results indicate that team size, country of origin and team tenure impact team performance and team learning. In addition, team reward and organizational support predicted team viability and satisfaction. Results indicated that TMS components (specialization, coordination and credibility) were better predictors of team outcomes than the omnibus TMS construct. In particular, TMS credibility predicted team performance and creativity while TMS coordination predicted team viability and satisfaction. SMM was measured in two different ways: an average deviation index and a 6-item scale. Both methods resulted in a conceptually similar interpretation although average deviation indices provided slightly better results in predicting effectiveness outcomes. TMS components moderated the relationship between SMM and team outcomes. Team performance was lowest when both SMM and TMS were low. However, contrary to expectations, high levels of SMM did not always result in effective team outcomes (performance, learning and creativity) especially when teams exhibited high TMS specialization and credibility. An interaction pattern was observed under conditions of low levels of SMM such that high TMS resulted in higher levels of team outcomes. The theoretical and practical implications of these results are discussed.
29

Coordination de systèmes de mémoire : modèles théoriques du comportement animal et humain / Coordination of memory systems: theoretical models of human and animal behavior

Viejo, Guillaume 28 November 2016 (has links)
During this PhD, funded by the B2V Memories Observatory, we performed mathematical modeling of behavior in three distinct tasks (with human subjects, monkeys and rodents), all involving coordination between memory systems. In the first experiment, we reproduced the behavior of human subjects (choices and reaction times) by combining mathematical models of a working memory and a procedural memory. For each subject, we associated their behavior with the best possible model by comparing generic models of coordination of these two memories from the current literature, as well as our own proposal of a dynamic interaction between the memories. In the end, it was our proposal of an interaction, rather than a strict separation, that proved most effective in the majority of cases in explaining the subjects' behavior. In a second experiment, the same coordination models were tested in a monkey task. Considered as a transferability test, this experiment mainly demonstrates the need for coordination of memories to explain the behavior of certain monkeys. In a third experiment, we modeled the behavior of a group of mice learning a motor action sequence in a maze without visual cues. Compared with two other learning strategies (path integration and graph planning), the combination of an episodic memory with a procedural memory proved to be the best model for reproducing the behavior of the mice.
30

Traduzir na contemporaneidade: efeitos da adoção de sistemas de memórias sobre a concepção ética da prática tradutória / Translating in contemporary times: effects of the adoption of memory systems on the ethical conception of translation practice

Stupiello, Érika Nogueira de Andrade [UNESP] 25 March 2010 (has links) (PDF)
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / Transformations in the globalized world have generated growth in the amount of information and urgency in its dissemination beyond borders, promoting a significant increase in the demand for translations performed quickly and according to specific production standards. In order to comply with these requirements and remain competitive, translators are more and more embracing the technological tools currently available, mainly translation memory systems. The application of these tools requires the translator to follow certain rules that guarantee the promised performance, mainly by manipulating the terms and phraseologies used in a translation so as to ensure their reuse in future translations. The growing adoption of tools by the contemporary translator calls for an ethical consideration of the extent of the translator's responsibility for the translated material. In this thesis, the theoretical assumptions underlying the design of these translation technology tools are investigated by analyzing both the contributions they have provided to the translator and some of the issues that arise from the way the profession is conceived as a result of the use of the resources they make available.
To support the proposed analysis, the resources presented as making the translator's work more dynamic were examined, mainly the functions of source-text segmentation, translation alignment and textual matching available in three translation memory systems: Wordfast, Trados and Transit. The study of the designs and resources made available by these tools supported an analysis of the translator's involvement with the translation when he or she is part of a larger process of production and distribution of information to audiences located in the most varied places of the world. From this analysis, a survey was carried out of issues... (Complete abstract available via electronic access.)
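As an illustration of the textual-matching step mentioned in this abstract (not tied to the actual behaviour of Wordfast, Trados, or Transit), the sketch below looks up the stored segment most similar to a new source segment and offers its stored translation for reuse when the similarity clears a threshold; the edit-distance similarity measure and the threshold convention are assumptions of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative sketch of the textual-matching step in a translation memory:
// a new source segment is compared against stored segments and the closest
// fuzzy match above a threshold is offered for reuse.
static std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t subst = prev[j - 1] + (a[i - 1] != b[j - 1]);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        prev.swap(cur);
    }
    return prev[b.size()];
}

// Return the stored (source, target) pair whose source segment is most
// similar to `segment`, provided the similarity clears `threshold`
// (e.g. 0.75 for a "75% match"); otherwise return an empty pair.
std::pair<std::string, std::string>
best_match(const std::vector<std::pair<std::string, std::string>>& memory,
           const std::string& segment, double threshold) {
    std::pair<std::string, std::string> best;
    double best_sim = threshold;
    for (const auto& entry : memory) {
        const std::size_t d = edit_distance(entry.first, segment);
        const std::size_t longest = std::max(entry.first.size(), segment.size());
        const double sim = longest == 0 ? 1.0 : 1.0 - static_cast<double>(d) / longest;
        if (sim >= best_sim) { best_sim = sim; best = entry; }
    }
    return best;
}
```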
