  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Efficient dispatch policy for SMT processors

Shmachkov, Igor. January 2009
Thesis (M.S.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Computer Science, 2009. / Includes bibliographical references.
72

Vývoj vláknových aplikací v jazyce Java / Development of threads's applications in Java

ATTL, Karel January 2008
This diploma thesis deals with programming multithreaded applications in Java. Java 5 introduced the package java.util.concurrent, which makes developing parallel applications considerably easier and more effective. The work is conceived as an introduction to multithreaded programming in Java and could also be used as educational material. A theoretical introduction to processes and the technological background of multitasking draws an analogy to threads, and also touches on Java technology and how Java works with memory. The rest of the thesis concerns practical work with threads. The topic is covered from the very beginning, that is, creating Thread objects, through advanced topics such as working with the java.util.concurrent package, and also addresses some problems that can appear when writing multithreaded applications.
73

Peer to peer systém pro vzdálené ovládání počítače [Peer-to-peer system for remote computer control]

LEJTNAR, Michal January 2017
This thesis deals with the creation of a decentralized peer-to-peer system designed for remote control of computers. The P2P network and its individual nodes are inspired by the hybrid peer-to-peer network architecture used by the Skype application. The application uses the terminal services available in the Windows operating system for remote control of computers, namely MS Remote Desktop and MS Remote Assistance. The entire application is written in the C# programming language.
74

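Ambiente independente de idioma para suporte a identificação de tuplas duplicadas por meio da similaridade fonética e numérica: otimização de algoritmo baseado em multithreading [Language-independent environment to support the identification of duplicate tuples by means of phonetic and numeric similarity: optimization of an algorithm based on multithreading]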

Andrade, Tiago Luís de. January 2011
Abstract: To ensure greater reliability and consistency of data stored in databases, the data cleaning stage is placed at the start of the Knowledge Discovery in Databases (KDD) process. This stage is highly relevant because it eliminates problems that strongly affect the reliability of the extracted knowledge, such as missing values, null values, duplicate tuples, and out-of-domain values. It is an important step aimed at correcting and adjusting the data for the subsequent stages. Within this perspective, techniques are presented that seek to solve the various problems mentioned. Accordingly, the methodology of this work comprises a characterization of duplicate-tuple detection in databases, a presentation of the main algorithms based on distance metrics and of some tools intended for this activity, and the development of a language-independent algorithm for identifying duplicate records based on phonetic and numeric similarity, implemented with multithreading to improve the algorithm's execution time. The tests performed show that the proposed algorithm obtained better results in identifying duplicate records than existing phonetic algorithms, which ensures better cleaning of the database. / Advisor: Carlos Roberto Valêncio / Co-advisor: Maurizio Babini / Committee: Pedro Luiz Pizzigatti Corrêa / Committee: José Márcio Machado / Degree: Master's
75

Proposta de um processador multithreading com características de previsibilidade / Proposal of predictable multithreading processor

Siqueira, Hadley Magno da Costa 18 August 2015
The design of real-time embedded systems requires precise control of the passage of time in the computation performed by the modules and in the communication between them. Generally, these systems consist of several modules, each designed for a specific task and with restricted communication with the other modules in order to obtain the required timing. This strategy, called federated architecture, is becoming unviable in the face of current demands on the cost, performance, and quality of embedded systems. To address this problem, the use of integrated architectures has been proposed; these consist of one or a few circuits performing multiple tasks in parallel more efficiently and at reduced cost. However, one has to ensure that the integrated architecture has temporal composability, i.e., the ability to design each task temporally isolated from the others in order to maintain the individual characteristics of each task. Precision Timed Machines are an integrated architecture approach that advocates the use of multithreaded processors to ensure temporal composability. This work therefore presents the implementation of a Precision Timed Machine named Hivek-RT. This processor, a VLIW with support for Simultaneous Multithreading, is capable of executing real-time tasks efficiently when compared to a traditional processor. In addition to efficient execution, the proposed architecture facilitates the implementation of real-time tasks from a programming point of view.
76

Neural network computing using on-chip accelerators

Eldridge, Schuyler 05 November 2016
The use of neural networks, machine learning, or artificial intelligence, in its broadest and most controversial sense, has been a tumultuous journey involving three distinct hype cycles and a history dating back to the 1960s. Resurgent, enthusiastic interest in machine learning and its applications bolsters the case for machine learning as a fundamental computational kernel. Furthermore, researchers have demonstrated that machine learning can be utilized as an auxiliary component of applications to enhance or enable new types of computation such as approximate computing or automatic parallelization. In our view, machine learning becomes not the underlying application, but a ubiquitous component of applications. This view necessitates a different approach towards the deployment of machine learning computation that spans not only hardware design of accelerator architectures, but also user and supervisor software to enable the safe, simultaneous use of machine learning accelerator resources. In this dissertation, we propose a multi-transaction model of neural network computation to meet the needs of future machine learning applications. We demonstrate that this model, encompassing a backend accelerator for inference and learning that is decoupled from the hardware and software for managing neural network transactions, can be achieved with low overhead and integrated with a modern RISC-V microprocessor. Our extensions span user and supervisor software and data structures and, coupled with our hardware, enable multiple transactions from different address spaces to execute simultaneously, yet safely. Together, our system demonstrates the utility of a multi-transaction model to improve energy efficiency and overall accelerator throughput for machine learning applications.
77

Vývoj paralelních aplikací s Intel Threading Tools / Parallel Application Development with Intel Threading Tools

Vadkerti, Ladislav Unknown Date
Today's trend in microprocessor design is to increase the number of execution cores within a single chip. Increasing the processor's clock speed has reached its limit because of growing power consumption. This trend brings new opportunities to software developers, as they can take advantage of real multithreading in their applications. However, threading introduces many new problems compared to sequential programming. With proper design, threading can enhance performance by making better use of hardware resources. Improper use of threading, on the other hand, can lead to performance degradation, unpredictable behavior, or error conditions that are difficult to resolve. For this reason Intel developed a suite of tools that can help software developers analyze performance and detect coding errors in thread interactions. This thesis examines the ways in which these tools can be used in multithreaded application development.
78

Performance Considerations for the Deployment of Video Streaming Pipelines Using Containers / Prestationsöverväganden vid distribution av videoströmningsrörledningar med behållare

Winiarski, Michal January 2020
Cloud-based video processing is an area that depends heavily on the hardware's ability to process huge amounts of packets. Nowadays, the industry is drifting away from the commonly used FPGAs in favor of a more flexible software approach. Docker containers have been considered a promising technology for constructing video streaming pipelines, as they provide a fast and easy way to package and distribute software. Recent developments in the Network Function Virtualization (NFV) field have shown that fast packet processing frameworks like the Intel Data Plane Development Kit (DPDK) have the potential to improve the performance of network function chains. This technology could enable software video processing to approach hardware solutions, yet it is still at quite an early stage and raises many questions about usage, deployment, and performance. This thesis shows that it is possible to build packet processing pipelines using DPDK, running dozens of video processing microservices simultaneously on a single machine. The project implementation was evaluated in terms of latency and throughput, and the behaviour of co-running applications on a single CPU core was modelled.
79

Alinhamento de seqüências biológicas em arquiteturas com memória distribuída [Alignment of biological sequences on distributed-memory architectures]

Peranconi, Daniela Saccol 04 March 2005
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The use of clusters of computers for solving problems that require a large quantity of computational resources is becoming an interesting alternative. Clusters are economically feasible and easy to maintain, offering computational power equivalent to that of supercomputers. However, developing applications for this kind of architecture is complex because it involves issues that are not present in sequential programming, such as data communication and the synchronization of concurrent tasks, problems that are usually handled by specialized software packages on supercomputers. In this context, this work presents the development of a mechanism for supporting communication on clusters of computers, focused on exploiting this hardware platform for high performance processing. The mechanism was created as a library of functions written in C and is based on the Active Messages model. Its implementation was performed at the application level, using light multiprogramming techniques as programming resou
80

Exploring coordinated software and hardware support for hardware resource allocation

Figueiredo Boneti, Carlos Santieri de 04 September 2009
Multithreaded processors are now common in the industry as they offer high performance at a low cost. Traditionally, in such processors, the assignment of hardware resources among the multiple threads is done implicitly, by hardware policies. However, a new class of multithreaded hardware allows the allocation of resources to be explicitly controlled or biased by the software. Currently, there is little or no coordination between the allocation of resources done by the hardware and the prioritization of tasks done by the software. This thesis aims to narrow the gap between the software and the hardware, with respect to hardware resource allocation, by proposing a new explicit resource allocation hardware mechanism and novel schedulers that use the currently available hardware resource allocation mechanisms. It approaches the problem in two different types of computing systems. In the high performance computing domain, we characterize the first processor to present a mechanism that allows the software to bias the allocation of hardware resources, the IBM POWER5. In addition, we propose the use of hardware resource allocation as a way to balance high performance computing applications. Finally, we propose two new scheduling mechanisms that are able to transparently and successfully balance applications in real systems using hardware resource allocation. In the soft real-time domain, we propose a hardware extension to the existing explicit resource allocation hardware and, in addition, two software schedulers that use the explicit allocation hardware to improve the schedulability of tasks in a soft real-time system. In this thesis, we demonstrate that system performance improves when the software is made aware of the mechanisms that control the amount of resources given to each running thread. In particular, for the high performance computing domain, we show that it is possible to decrease the execution time of MPI applications by biasing the hardware resource assignment between threads. In addition, we show that it is possible to decrease the number of missed deadlines when scheduling tasks in a soft real-time SMT system.
