71

Implementation of a source-to-source OpenMP compiler for the C programming language

Γιασλάς, Γεώργιος 16 May 2007 (has links)
The subject of this thesis is the implementation of a translator for the OpenMP parallel programming model. This model is used to develop parallel applications for the shared-memory architecture and emphasizes ease of programming and portability of the applications developed with it. The thesis describes the design and implementation of a compiler for OpenMP parallel applications written in the C programming language, according to version 2.0 of the OpenMP specification. The compiler belongs to the translator class: it translates C source code with OpenMP extensions into equivalent C source code in which the OpenMP extensions have been replaced by equivalent code segments that use thread library calls to implement the parallelism. The translator has been designed to be portable, supporting multiple execution platforms, and extensible, supporting multiple thread libraries during the translation of the OpenMP directives. It is implemented in Java, using the ANTLR language tool for the lexer and the parser. The translator is accompanied by a run-time library, written in C, which contains the functions defined by the OpenMP v2.0 specification as well as other functions needed to execute the translated parallel applications. The library also implements the scheduling policies and allows a new policy to be added easily. The current implementation supports the POSIX and NANOS thread libraries on shared-memory multiprocessors, as well as all the scheduling policies defined in OpenMP v2.0.
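As a concrete picture of the rewriting such a translator performs, the sketch below (not taken from the thesis) shows a C loop parallelized with an OpenMP directive and a hand-written equivalent that drives POSIX threads directly. The fixed team size, the static chunking and the name of the outlined function are assumptions made for the example; a real translator would additionally emit calls into its run-time library for scheduling and the OpenMP API functions.

#include <pthread.h>
#include <stdio.h>

#define N        1000
#define NTHREADS 4            /* assumed fixed team size */

static double a[N];

/* Original OpenMP form accepted by the source-to-source translator:
 *
 *   #pragma omp parallel for
 *   for (int i = 0; i < N; i++)
 *       a[i] = i * 2.0;
 */

/* Loop body outlined into a function: each thread works on a
 * statically scheduled chunk of the iteration space. */
static void *outlined_loop(void *arg)
{
    long tid   = (long)arg;
    int  chunk = (N + NTHREADS - 1) / NTHREADS;
    int  lo    = (int)tid * chunk;
    int  hi    = lo + chunk < N ? lo + chunk : N;

    for (int i = lo; i < hi; i++)
        a[i] = i * 2.0;
    return NULL;
}

int main(void)
{
    pthread_t team[NTHREADS];

    /* Translated "parallel" construct: fork a thread team ... */
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&team[t], NULL, outlined_loop, (void *)t);
    /* ... and join it at the implicit barrier ending the region. */
    for (long t = 0; t < NTHREADS; t++)
        pthread_join(team[t], NULL);

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}

Compiled with gcc -pthread, the pthreads version produces the same result as the original directive-annotated loop, which is exactly the equivalence a source-to-source OpenMP translator has to preserve.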
72

Parallel Electromagnetic Transient Simulation of Large-Scale Power Systems on Massive-threading Hardware

Zhou, Zhiyin Unknown Date
No description available.
73

Deformation and Strength of a Cyclically Bent Threaded Connection

Juchnevičius, Žilvinas 06 February 2012 (has links)
The dissertation examines the low-cycle strength of bent threaded connections by modelling the load distribution among the thread turns. More recent high-cycle and low-cycle durability calculation methodologies quantify in detail the distribution of the axial load among the turns, which allows the influence of design features to be assessed more precisely. The dissertation aims to develop a corresponding method for calculating the load distribution in the thread of cyclically bent threaded connections and to apply it to cyclic strength calculations. The displacement compatibility equation derived for the elements of a bent threaded connection made it possible to build three-section and multi-section models of the bending load distribution in the connection and to integrate them into the normative low-cycle durability calculation chain. The first chapter presents a review of the literature: it surveys the application field and loading conditions of heavily loaded threaded connections and analyses previous research related to the cyclic strength of threaded connections. The second chapter presents the results of an experimental study of the deformation properties of a pair of thread turns, together with an analysis, not previously reported in the literature, of the behaviour of the turn pair under unloading and repeated loading. The third chapter presents the deformation scheme of the threaded connection elements, the differential equations of displacement compatibility, their analytical solutions, and the models of a bent threaded connection: three-section and multi-section elastic models of the bending load... [see full text] / Industry equipment such as pressure vessels, mining equipment, heat exchangers, steam generators and other structures are provided with bolted closures for the purpose of in-service inspection and maintenance of internal components. Threaded connections often experience variable cyclic loads due to temperature, inner pressure and variation in the deformation of connection fittings. Often, studs and screws are affected not only by an axial load but also by bending moments. More sophisticated high-cycle and low-cycle durability calculation methodologies have already been developed for threaded connections experiencing cyclic axial loads, and in these methodologies the distribution of the axial load among the turns is assessed quantitatively. The quantitative data on load distribution in the thread enables a more accurate assessment of the influence of the design particularities (connection length, material, nut and turn form) and of the deformation stages of the connection elements. These durability calculation methodologies are not applied to threaded connections that are cyclically bent, as analytical models of the load distribution among the turns that are suitable for practical application have not been created for bent threaded connections. In this field, no models suited to calculation by the BE method have been created either. As the threaded connection is a complex node consisting of deformed elements, the load distribution among the turns is influenced by the... [to full text]
74

Multithreaded PDE Solvers on Non-Uniform Memory Architectures

Nordén, Markus January 2006 (has links)
A trend in parallel computer architecture is that systems with a large shared memory are becoming more and more popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of parallel PDE solvers on cc-NUMA computers is studied. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically distributed over several multi-processor nodes, the latency for local memory accesses is smaller than for remote accesses. Therefore, the geographical locality of the data becomes important. The focus of the present thesis is to study multithreaded PDE solvers on cc-NUMA systems, in particular their memory access pattern with respect to geographical locality. The questions posed are: (1) How large is the influence on performance of the non-uniformity of the memory system? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose? The main conclusion is that geographical locality is important for performance on cc-NUMA systems. This is shown experimentally for a broad range of PDE solvers as well as theoretically using a model involving characteristics of computer systems and applications. Geographical locality can be achieved through migration directives that are inserted by the programmer or — possibly in the future — automatically by the compiler. On some systems, it can also be accomplished by means of transparent, hardware initiated migration and replication. However, a necessary condition that must be fulfilled if migration is to be effective is that the memory access pattern must not be "speckled", i.e. as few threads as possible shall make accesses to each memory page. We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to get a favourable data distribution.
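One common way to obtain the geographical locality discussed above is to rely on the first-touch page placement policy used by most cc-NUMA operating systems: a page is allocated on the node of the thread that first writes it. The sketch below is a minimal illustration of this well-known technique and is not code from the thesis; the array size and the static schedule are assumptions for the example.

/* Compile with: gcc -fopenmp first_touch.c */
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 22)

int main(void)
{
    double *u = malloc(N * sizeof *u);
    double *v = malloc(N * sizeof *v);
    if (!u || !v) return 1;

    /* Parallel first touch: each page is physically placed on the NUMA
     * node of the thread that touches it first, so the compute loop
     * below (same static schedule) finds its data in local memory. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) {
        u[i] = 1.0;
        v[i] = 2.0;
    }

    double sum = 0.0;
    /* Same schedule: each thread mostly reads pages placed on its node. */
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < N; i++)
        sum += u[i] * v[i];

    printf("dot = %f\n", sum);
    free(u);
    free(v);
    return 0;
}

If the first loop were executed serially, all pages would land on the initializing thread's node and the other threads would pay remote-access latency in the second loop, which is precisely the kind of non-uniformity effect studied in the thesis.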
75

Online thread and data mapping using the memory management unit

Cruz, Eduardo Henrique Molina da January 2016 (has links)
As thread-level parallelism increases in modern architectures due to larger numbers of cores per chip and chips per system, the complexity of their memory hierarchies also increases. Such memory hierarchies include several private or shared cache levels, and Non-Uniform Memory Access nodes with different access times. One important challenge for these architectures is the data movement between cores, caches, and main memory banks, which occurs when a core performs a memory transaction. In this context, the reduction of data movement is an important goal for future architectures to keep performance scaling and to decrease energy consumption. One of the solutions to reduce data movement is to improve memory access locality through sharing-aware thread and data mapping. State-of-the-art mapping mechanisms try to increase locality by keeping threads that share a high volume of data close together in the memory hierarchy (sharing-aware thread mapping), and by mapping data close to where its accessing threads reside (sharing-aware data mapping). Many approaches focus on either thread mapping or data mapping, but perform them separately, losing opportunities to improve performance. Some mechanisms rely on execution traces to perform a static mapping, which has a high overhead and cannot be used if the behavior of the application changes between executions. Other approaches use sampling or indirect information about the memory access pattern, resulting in imprecise memory access information. In this thesis, we propose novel solutions to identify an optimized sharing-aware mapping that make use of the memory management unit of processors to monitor the memory accesses. Our solutions work online, in parallel to the execution of the application, and detect the memory access pattern for both thread and data mappings. With this information, the operating system can perform sharing-aware thread and data mapping during the execution of the application, without any prior knowledge of its behavior. Since they work directly in the memory management unit, our solutions are able to track most memory accesses performed by the parallel application with a very low overhead. They can be implemented in architectures with hardware-managed TLBs with little additional hardware, and some can be implemented in architectures with software-managed TLBs without any hardware changes. Our solutions have a higher accuracy than previous mechanisms because they have access to more accurate information about the memory access behavior. To demonstrate the benefits of our proposed solutions, we evaluate them with a wide variety of applications using a full system simulator, a real machine with software-managed TLBs, and a trace-driven evaluation on two real machines with hardware-managed TLBs. In the experimental evaluation, our proposals were able to reduce execution time by up to 39%. The improvements stem from a substantial reduction in cache misses and inter-chip interconnection traffic.
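Whatever mechanism computes the sharing-aware mapping, the decision ultimately applied by the operating system is an assignment of threads to cores. The sketch below illustrates only that final step on Linux, using pthread_setaffinity_np; it is not the MMU-based detection mechanism proposed in the thesis, and the sharing pattern and core numbering are assumptions about a hypothetical two-node machine.

/* Compile with: gcc -pthread apply_mapping.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical result of a sharing-aware policy: threads 0 and 1 share
 * a lot of data, so they are placed on cores of the same node (0 and 1);
 * threads 2 and 3 go to cores of the other node (4 and 5). */
static const int core_of_thread[4] = { 0, 1, 4, 5 };

static void *worker(void *arg)
{
    long tid = (long)arg;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_of_thread[tid], &set);
    /* Apply the mapping decision for this thread. */
    if (pthread_setaffinity_np(pthread_self(), sizeof set, &set) != 0)
        fprintf(stderr, "could not pin thread %ld\n", tid);
    else
        printf("thread %ld pinned to core %d\n", tid, core_of_thread[tid]);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (long i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return 0;
}

A data mapping decision is applied analogously at page granularity, for example by migrating pages to the memory node of the threads that access them most.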
76

A transparent and energy aware reconfigurable multiprocessor platform for efficient ILP and TLP exploitation

Rutzig, Mateus Beck January 2012 (has links)
As the number of embedded applications increases, the current strategy of several companies is to launch a new platform within short periods, to execute the application set more efficiently and with low energy consumption. However, for each new platform deployment, new tool chains must come along, with additional libraries, debuggers and compilers. This strategy implies high hardware redesign costs, breaks binary compatibility and results in a high overhead in the software development process. Therefore, focusing on area savings, low energy consumption, binary compatibility maintenance and, mainly, software productivity improvement, we propose the exploitation of Custom Reconfigurable Arrays for Multiprocessor Systems (CReAMS). CReAMS is composed of multiple adaptive reconfigurable systems that efficiently exploit Instruction- and Thread-Level Parallelism (ILP and TLP) at the hardware level, in a totally transparent fashion. Conceived as a homogeneous organization, CReAMS shows a 37% reduction in energy-delay product (EDP) compared to an ordinary multiprocessing platform of the same chip area. When processors with different ILP exploitation capabilities are coupled on a single die, making CReAMS a heterogeneous organization, performance improvements of up to 57% and energy savings of up to 36% are shown in comparison with the homogeneous platform. In addition, the efficiency of the adaptability provided by CReAMS is demonstrated in a comparison with a multiprocessing system composed of 4-issue out-of-order SparcV8 processors: a 28% performance improvement is shown under a power budget scenario.
77

Synthesis and skin healing in dogs with nylon thread, barbed thread and surgical staples

Santos, Eduardo Rosa dos January 2018 (has links)
Dermorrhaphy is vital to the success of a surgical procedure, as it is the final surgical step. This study compared skin closure and healing in dogs using three different materials after ovariohysterectomy: nylon thread (GFN group), barbed thread (GFF group) and surgical staples (GGC group). Twenty-seven female dogs, eligible for elective castration and free of comorbidities, were used. The animals were randomly divided into the three groups and submitted to dermorrhaphy with the materials under test. The following were evaluated: the time needed to suture the skin with each material, the complications reported by the owners, and the local temperature of the healing skin. Several clinical healing parameters were also assessed at seven days postoperatively, as well as histological parameters of skin biopsies collected at 14 days. The surgical staples showed the shortest dermorrhaphy time (p<0.001) and the highest occurrence of suture dehiscence. The barbed thread had the lowest score (p=0.006) for clinical alterations at seven days postoperatively and showed no suture dehiscence. However, there was no difference between the groups in the histological evaluation of the scar biopsies at 14 days. The barbed thread provided a secure suture and was easy to handle in canine dermorrhaphy, while the surgical staples proved unreliable due to the high rate of dehiscence.
78

Sewing suspensions: Thread, objects and constructions

García, Nathalia Padilha January 2013 (has links)
The text presented here takes as its starting point my artistic work carried out from 2007 onwards, when some of the characteristics that would drive my current production were already noticeable. Among these characteristics were the use of sewing thread on flat supports and the desire to expand drawing into space, exploring a possible three-dimensionality of this language. The unfolding of these initial desires forms the focus of this dissertation, approached from a poetic and conceptual standpoint.
79

Improving the Performance of Transactional Memory Applications on Multicores: A Machine Learning-based Approach

Castro, Márcio 03 December 2012 (has links)
Multicore processors are now the mainstream approach to delivering higher performance to parallel applications. In order to develop efficient parallel applications for those platforms, developers must take care of several aspects, ranging from the architectural to the application level. In this context, Transactional Memory (TM) appears as a programmer-friendly alternative to traditional lock-based concurrency for those platforms. It allows programmers to write parallel code as transactions, which are guaranteed to execute atomically and in isolation regardless of eventual data races. At runtime, transactions are executed speculatively and conflicts are solved by re-executing the conflicting transactions. Although TM intends to simplify concurrent programming, the best performance can only be obtained if the underlying runtime system matches the application and platform characteristics. The contributions of this thesis concern the analysis and improvement of the performance of TM applications based on Software Transactional Memory (STM) on multicore platforms. Firstly, we show that the TM model makes the performance analysis of TM applications a daunting task. To tackle this problem, we propose a generic and portable tracing mechanism that gathers specific TM events, allowing us to better understand the obtained performance. The traced data can be used, for instance, to discover whether the TM application presents points of contention or whether the contention is spread out over the whole execution. Our tracing mechanism can be used with different TM applications and STM systems without any changes to their original source code. Secondly, we address the performance improvement of TM applications on multicores. We point out that thread mapping is very important for TM applications and can considerably improve the overall performance achieved. To deal with the large diversity of TM applications, STM systems and multicore platforms, we propose an approach based on Machine Learning to automatically predict suitable thread mapping strategies for TM applications. During a prior learning phase, we profile several TM applications running on different STM systems to construct a predictor. We then use the predictor to perform static or dynamic thread mapping in a state-of-the-art STM system, making it transparent to the users. Finally, we perform an experimental evaluation and show that the static approach is fairly accurate and can improve the performance of a set of TM applications by up to 18%. Concerning the dynamic approach, we show that it can detect phase changes during the execution of TM applications composed of diverse workloads, predicting a suitable thread mapping for each phase. On those applications, we achieve performance improvements of up to 31% in comparison to the best static strategy.
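To make the programming model concrete, the sketch below expresses a small critical section as a transaction using GCC's experimental transactional memory extension (-fgnu-tm, backed by libitm). It is only an illustration of the TM interface; it is not one of the STM systems evaluated in the thesis, and the account values and thread count are assumptions.

/* Compile with: gcc -fgnu-tm -pthread tm_sketch.c */
#include <pthread.h>
#include <stdio.h>

static long account[2] = { 100, 100 };

/* A transfer expressed as a transaction: the TM runtime executes it
 * speculatively and re-executes it if a concurrent transaction
 * conflicts on the same accounts. */
static void transfer(int from, int to, long amount)
{
    __transaction_atomic {
        account[from] -= amount;
        account[to]   += amount;
    }
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        transfer(i & 1, (i + 1) & 1, 1);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);

    /* Atomic, isolated transfers keep the total invariant at 200. */
    printf("total = %ld\n", account[0] + account[1]);
    return 0;
}

The programmer only declares what must be atomic; how conflicts are detected and resolved is left to the TM runtime, which is exactly why matching the runtime to the application and platform matters for performance.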
80

Fiandografia: experimentations between reading and writing in educational research

Dalmaso, Alice Copetti 11 April 2016 (has links)
Reading and writing walk together as experimentation in a doctoral research project, from the perspective of taking the reading of a text (or of anything else one encounters) as not only interpretation but also experimentation. One experiments, setting aside the impetus toward conclusions and risking a writing exercise that is not indebted to anyone or anything, to a proposal, or specifically to the ideal of a text or of a piece of research in Education. The readings that cross the notions of reading and writing in this work, by authors such as Deleuze, Guattari, Barthes and Larrosa, along with the concepts of becoming and event, allow reading and writing to be thought of as a (de)formative process. What can reading and writing produce as a movement of learning, thinking and living in research in Education? The writing that moves and is moved, made alongside the experimentation of reading, develops in a fog of intention that persists among the crossings of life: unlearning a little about oneself, destroying things that do not want to be taken as ours but remain embedded in the skin. These (un)constitutions concern just one process: a doctoral student's process, a (de)formative process, any other process; a process of how one experiments with reading and writing in doctoral research in Education, and of what has so far been called, in a crooked and foggy way, (de)formation, experimentation, learning. Fiandografia (thread-writing), the name given to the construction of the relations between reading and writing, composes an open and receptive weave-fabric that gives body to the research as a path, a tracing-weaving-sewing of threads of writing within the research; a thread-writing that persists in the midst of teaching, of shared spaces and times, alongside the discourses that make us ill. Conjugating writings: a cure, an endless thread-writing.
