21

Estudo da análise de fadiga pelo MEF considerando os efeitos da estampagem / Study of fatigue analysis by FEM considering the metal stamping effects

Aguado, Clodoaldo Garcia 18 August 2018 (has links)
Advisor: Alfredo Rocha de Faria / Dissertation (professional master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica / Abstract: Simulations within the product-development environment must accommodate new variables, particularly process variables, in order to bring the virtual model closer to the real one and thus improve simulation accuracy. The objective of this work was to study a finite element method (FEM) model that includes the thickness variations produced by the stamping process in the fatigue-life analysis of an automotive exhaust-system component. First, a simulation of the stamping process was carried out, identifying the thickness changes across the geometry of the studied component. The result of this simulation was then transferred to the finite element mesh, so that the subsequent structural and fatigue analyses would account for the local thickness reductions and increases. As a basis for comparison, the same analyses were performed for the constant-thickness condition traditionally adopted during the design phase. Using the Wöhler-Goodman-Miner model to calculate accumulated damage and comparing against experimental data acquired on a vehicle simulator, the fatigue calculations showed that both thickness conditions reach infinite life. However, after selecting and analyzing some regions flagged as critical in the preceding stamping and structural simulations, it could be seen that in most regions the accumulated-damage values were lower for the constant-thickness condition, while the variable-thickness condition came closer to the measured result. These results show that thickness variation, as an effect of the stamping process, plays an important role in the life of the component studied, indicating that the use of process data helps bring fatigue-life calculations closer to the real condition / Master's / Manufacturing / Master in Automotive Engineering
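For context, a compact statement of the Wöhler-Goodman-Miner chain referenced above, in its standard textbook form (the thesis's actual S-N constants and measured load spectrum are not reproduced here):

```latex
% Goodman correction: equivalent fully reversed stress amplitude from the
% actual amplitude \sigma_a, mean \sigma_m and ultimate strength \sigma_u
\sigma_{ar} = \frac{\sigma_a}{1 - \sigma_m / \sigma_u}
% Woehler (S-N) curve: allowable cycles at that amplitude,
% with material constants C and k
N_i = C \, \sigma_{ar,i}^{-k}
% Miner's rule: linear damage accumulation over the load spectrum
D = \sum_i \frac{n_i}{N_i}, \qquad D \geq 1 \;\Rightarrow\; \text{failure}
```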
22

Deferred rendering using Compute shaders / Deferred rendering med Compute shaders

Golba, Benjamin January 2010 (has links)
Game developers today put a lot of effort into their games. Consumers are hard to please and demand games that provide both fun and visual quality, so developers aim to make the most of the hardware resources available to them. It is easy to use too many performance-demanding techniques and make a game unplayable; the hard part is making the game look good without sacrificing performance. This means using techniques in a smart way, keeping the graphics as smooth and efficient as possible without compromising visual quality. One such technique is deferred rendering. The latest version of Microsoft's graphics platform, DirectX 11, comes with several new features. One of these is the Compute shader, which makes it easier to execute general computation on the graphics card. Developers do not need DirectX 11 hardware to use this feature: Microsoft has made it available on DirectX 10 cards as well, although there are a few differences between the two versions. The focus of this report is to investigate the possible performance differences between these versions when using deferred rendering. To do so, an application was built supporting both shader model 4 and shader model 5 of the compute shader.
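As a rough illustration of the shading step that deferred rendering moves into a compute shader, here is a minimal CPU-side sketch in Python/NumPy: surface attributes are assumed to already sit in G-buffers, and lighting is evaluated per pixel from them. The buffer names and the simple Lambert-plus-falloff light model are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def deferred_shading_pass(gbuf_albedo, gbuf_normal, gbuf_position, lights):
    """Evaluate lighting per pixel from G-buffer contents.

    On the GPU this loop body becomes one compute-shader thread per pixel;
    here it is vectorized over the whole image for clarity.
    gbuf_* are HxWx3 float arrays; lights is a list of (position, color).
    """
    out = np.zeros_like(gbuf_albedo)
    for light_pos, light_color in lights:
        to_light = light_pos - gbuf_position            # per-pixel vector to light
        dist = np.linalg.norm(to_light, axis=-1, keepdims=True)
        l_dir = to_light / np.maximum(dist, 1e-6)
        # Lambertian diffuse term, attenuated by inverse-square falloff
        ndotl = np.clip(np.sum(gbuf_normal * l_dir, axis=-1, keepdims=True), 0.0, None)
        out += gbuf_albedo * light_color * ndotl / np.maximum(dist**2, 1e-6)
    return np.clip(out, 0.0, 1.0)
```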
23

Real-time generation of kd-trees for ray tracing using DirectX 11

Säll, Martin, Cronqvist, Fredrik January 2017 (has links)
Context. Ray tracing has always been a simple but effective way to create a photorealistic scene, but at a cost that grows as the scene expands. Recent improvements in GPU and CPU hardware have made ray tracing faster, making more complex scenes possible in the same processing time. Despite these hardware improvements, ray tracing still rarely runs at interactive speed. Objectives. The aim of this experiment was to implement a new kd-tree generation algorithm using DirectX 11 compute shaders. Methods. The implementation created during the experiment was tested on two platforms and five scenarios, measuring kd-tree generation time in milliseconds. The results were compared to a sequential implementation running on the CPU. Results. In the end, the implemented kd-tree generation algorithm did not run within our definition of real-time. Comparing generation times shows a speedup for the GPU implementation over our CPU implementation, and linear scaling of generation time as the number of triangles in the scene increases. Conclusions. Noticeable limitations encountered during the experiment were the restricted handling of dynamic structures and sorting of arrays, which forced us to use less memory-efficient solutions.
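For reference, a minimal sequential kd-tree build in Python, in the spirit of the CPU baseline the GPU implementation is compared against. The median split and round-robin axis choice are illustrative assumptions; the thesis's actual split heuristic may differ.

```python
import numpy as np

def build_kdtree(triangle_centroids, indices=None, depth=0, leaf_size=8):
    """Recursively build a kd-tree over triangle centroids (Nx3 array).

    Splits at the median along the axis cycled by depth; returns nested
    (axis, split_value, left, right) tuples with index-array leaves.
    """
    if indices is None:
        indices = np.arange(len(triangle_centroids))
    if len(indices) <= leaf_size:
        return indices                      # leaf: bucket of triangle indices
    axis = depth % 3                        # cycle x, y, z
    order = indices[np.argsort(triangle_centroids[indices, axis])]
    mid = len(order) // 2
    split_value = triangle_centroids[order[mid], axis]
    left = build_kdtree(triangle_centroids, order[:mid], depth + 1, leaf_size)
    right = build_kdtree(triangle_centroids, order[mid:], depth + 1, leaf_size)
    return (axis, split_value, left, right)
```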
24

A framework to migrate and replicate VMware Virtual Machines to Amazon Elastic Compute Cloud : Performance comparison between on premise and the migrated Virtual Machine

Bachu, Rajesh January 2015 (has links)
Context. Cloud computing is the new trend in the IT industry. Traditionally, obtaining servers was quite time-consuming for companies: researching what hardware to buy, getting budget approval, purchasing the hardware, and gaining access to the servers could take weeks or months. To save time and reduce expenses, most companies are moving toward the cloud. One well-known cloud provider is Amazon Elastic Compute Cloud (EC2), which makes it easy for companies to obtain virtual servers (known as compute instances) quickly and inexpensively. Another advantage of Amazon EC2 is its flexibility: companies can import/export the Virtual Machines (VMs) they have built to meet their IT security, configuration, management, and compliance requirements. Objectives. In this thesis, we investigate importing a VM running on VMware into Amazon EC2. In addition, we compare the performance of a VM running on VMware with the same VM image running on Amazon EC2. Methods. A case study was conducted to select a reliable method for migrating VMware VMs to Amazon EC2. In addition, an experiment was conducted to measure the performance of a Virtual Machine running on VMware and compare it with the same Virtual Machine running on EC2. We measure performance in terms of CPU and memory utilization as well as disk read/write speed, using well-known open-source benchmarks from the Phoronix Test Suite (PTS). Results. Importing VM snapshots (VMDK, VHD, and RAW formats) to EC2 was investigated using the three methods provided by AWS. Performance was compared by running each benchmark 25 times on each Virtual Machine. Conclusions. Importing the VM to EC2 succeeded only with the RAW format, and exact replication was not possible because AWS installs some software and drivers while importing the VM. The migrated EC2 VM performs better than the on-premise VMware VM in terms of CPU and memory utilization and disk read/write speed.
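A sketch of the image-import step using the AWS SDK for Python (boto3), which wraps the same VM Import/Export service; the bucket, key, and region are placeholders, and the thesis may have used different tooling.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is a placeholder

# Import a RAW disk image previously uploaded to S3 as an AMI.
# Per the thesis findings, RAW was the only format that imported successfully.
response = ec2.import_image(
    Description="Migrated VMware VM",
    DiskContainers=[{
        "Description": "root volume",
        "Format": "raw",
        "UserBucket": {"S3Bucket": "my-vm-images", "S3Key": "vm-disk.raw"},
    }],
)
task_id = response["ImportTaskId"]

# Poll the import task to check its progress.
status = ec2.describe_import_image_tasks(ImportTaskIds=[task_id])
print(status["ImportImageTasks"][0]["Status"])
```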
25

Screen-Space Subsurface Scattering, A Real-time Implementation Using Direct3D 11.1 Rendering API

Andersen, Dennis January 2015 (has links)
Context. Subsurface scattering is the effect of light scattering within a material. Many materials on earth possess translucent properties, so it is an important factor to consider when trying to render realistic images. Historically the effect was used in offline rendering with ray tracers, but it is now considered a real-time rendering technique, based on approximations of earlier models. Early real-time methods approximate the effect in object texture space, which does not scale well for real-time applications such as games. A relatively new approach applies the effect as a post-processing step using GPGPU capabilities, making it compatible with most modern rendering pipelines. Objectives. The aim of this thesis is to explore a dynamic real-time solution to subsurface scattering that uses a modern rendering API for GPGPU programming and modern data management, combined with previous techniques. Methods. The proposed subsurface-scattering technique is implemented in a small real-time graphics engine using a modern rendering API, and its performance impact is evaluated through several experiments with specific properties. Results. The results obtained suggest that with a flexible representation of materials, execution time is low enough for real-time use. They also show that execution time grows nearly linearly with the number of layers and the strength of the effect. Because the technique operates in screen space, performance scales with the subsurface-scattering screen coverage and the screen resolution. Conclusions. The technique could be used in real-time and could trivially be integrated into most existing rendering pipelines. Further research and testing should be done to determine how the effect scales in a complex 3D game environment.
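A minimal NumPy sketch of the kind of depth-aware separable blur that screen-space subsurface scattering performs as a post-process: the lit-diffuse buffer is smeared along one screen axis, with samples rejected across depth discontinuities to avoid light bleeding between unrelated surfaces. The kernel shape, rejection heuristic, and all names are illustrative assumptions rather than the thesis's implementation.

```python
import numpy as np

def sss_blur_pass(diffuse, depth, kernel, axis, depth_scale=100.0):
    """One direction of a separable screen-space subsurface-scattering blur.

    diffuse: HxWx3 lit-diffuse buffer; depth: HxW linear depth;
    kernel: list of (weight, offset_px) pairs. Samples whose depth differs
    too much from the center pixel fall back to the center color.
    """
    out = np.zeros_like(diffuse)
    for weight, offset in kernel:
        shifted = np.roll(diffuse, offset, axis=axis)
        shifted_depth = np.roll(depth, offset, axis=axis)
        # Blend toward the center color where the depth discontinuity is large
        reject = np.clip(np.abs(depth - shifted_depth) * depth_scale, 0.0, 1.0)
        sample = shifted * (1 - reject[..., None]) + diffuse * reject[..., None]
        out += weight * sample
    return out
```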
26

Enhancing productivity and performance portability of OpenCL applications on heterogeneous systems using runtime optimizations

Lutz, Thibaut January 2015 (has links)
Initially driven by a strong need for increased computational performance in science and engineering, heterogeneous systems have become ubiquitous and are getting increasingly complex. The single-processor era has been replaced by multi-core processors, which have quickly been surrounded by satellite devices aiming to increase the throughput of the entire system. These auxiliary devices, such as Graphics Processing Units, Field Programmable Gate Arrays, or other specialized processors, have very different architectures. This puts an enormous strain on programming models and software developers to take full advantage of the computing power at hand. Because of this diversity, and because the flexibility and portability required to optimize for each target individually are unattainable, heterogeneous systems typically remain vastly under-utilized. In this thesis, we explore two complementary ways to tackle this problem: providing automated, non-intrusive methods in the form of compiler tools, and implementing efficient abstractions that automatically tune parameters for a restricted domain. First, we explore a fully automated compiler-based approach, where a runtime system analyzes the computation flow of an OpenCL application and optimizes it across multiple compute kernels. This method can be deployed on any existing application transparently, replacing the significant software-engineering effort otherwise spent tuning an application for a particular system. We show that this technique achieves speedups of up to 3x over unoptimized code and an average of 1.4x over manually optimized code for highly dynamic applications. Second, a library-based approach is designed to provide a high-level abstraction for complex problems in a specific domain: stencil computation. Using domain-specific techniques, the underlying framework optimizes the code aggressively. We show that even in a restricted domain, automatic tuning mechanisms and robust architectural abstraction are necessary to improve performance. Using the abstraction layer, we demonstrate strong scaling of various applications to multiple GPUs, with speedups of up to 1.9x on two GPUs and 3.6x on four.
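As a toy illustration of the parameter-tuning idea behind the stencil framework described above, here is a minimal exhaustive autotuner in Python. Real systems prune the search space and persist results per device; run_kernel is a hypothetical callback standing in for an actual kernel launch.

```python
import itertools
import time

def autotune(run_kernel, tile_sizes=(4, 8, 16, 32), wg_sizes=(64, 128, 256)):
    """Time a kernel over a small parameter grid and keep the best config.

    run_kernel(tile, wg) must execute the kernel once with those parameters.
    Returns (best_time_seconds, (tile, wg)).
    """
    best = (float("inf"), None)
    for tile, wg in itertools.product(tile_sizes, wg_sizes):
        start = time.perf_counter()
        run_kernel(tile, wg)
        elapsed = time.perf_counter() - start
        if elapsed < best[0]:
            best = (elapsed, (tile, wg))
    return best
```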
27

• Bewertung der Compute-Leistung von Workstations mit SPEC-CPU Benchmarks / Evaluation of the compute performance of workstations using SPEC CPU benchmarks

Mund, Carsten 29 July 1996 (has links)
After an introduction to SPEC and its rating procedures, the methodology of SPEC performance measurement is examined in more detail. The main part covers the execution and evaluation of SPEC benchmarks on five workstations. The results obtained are compared with the officially published SPEC figures and discussed.
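For context, SPEC's overall figure of merit is the geometric mean of per-benchmark SPECratios, where each ratio is the reference machine's time divided by the measured time. A minimal Python sketch, with made-up times:

```python
from math import prod

def spec_rating(reference_times, measured_times):
    """Geometric mean of per-benchmark SPECratios.

    Each SPECratio = reference machine time / measured time,
    so higher is better.
    """
    ratios = [ref / t for ref, t in zip(reference_times, measured_times)]
    return prod(ratios) ** (1.0 / len(ratios))

# Illustrative times (seconds) for three benchmarks:
print(spec_rating([100.0, 200.0, 150.0], [20.0, 50.0, 25.0]))
```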
28

Integer-forcing architectures: cloud-radio access networks, time-variation and interference alignment

El Bakoury, Islam 04 June 2019 (has links)
Next-generation wireless communication systems will need to contend with many active mobile devices, each requiring a very high data rate. To cope with this growing demand, network deployments are becoming denser, leading to higher interference between active users. Conventional architectures aim to mitigate this interference through careful design of signaling and scheduling protocols; unfortunately, these methods become less effective as device density increases. One promising option is to enable cellular basestations (i.e., cell towers) to jointly process their received signals when decoding users' data packets, and to jointly encode the data packets they transmit to the users. This joint processing architecture is often enabled by a cloud radio access network that links the basestations to a central processing unit via dedicated connections. One of the main contributions of this thesis is a novel end-to-end communications architecture for cloud radio access networks, along with a detailed comparison to prior approaches via both theoretical bounds and numerical simulations. Recent work has shown that the following high-level approach has numerous advantages: each basestation quantizes its observed signal and sends it to the central processing unit for decoding; the central processing unit in turn generates the signals for the basestations to transmit and sends them quantized versions. This thesis follows an integer-forcing approach, which exploits the fact that, if codewords are drawn from a linear codebook, their integer-linear combinations are themselves codewords. Overall, this architecture requires integer-forcing channel coding from the users to the central processing unit and back, which handles interference between the users' codewords, as well as integer-forcing source coding from the basestations to the central processing unit and back, which handles correlations between the basestations' analog signals. Prior work on integer-forcing has proposed and analyzed channel-coding strategies as well as a source-coding strategy from the basestations to the central processing unit; this thesis proposes a source-coding strategy for the other direction. Iterative algorithms are developed to optimize the parameters of the proposed architecture, which involve real-valued beamforming and equalization matrices and integer-valued coefficient matrices in a quadratic objective. Beyond the cloud-radio setting, it is argued that the integer-forcing approach is a promising framework for interference alignment between multiple transmitter-receiver pairs. In this scenario, the goal is to align the interfering data streams so that, from the perspective of each receiver, there appears to be only a single interferer. Integer-forcing interference alignment accomplishes this by having each receiver recover two linear combinations that can then be solved for the desired signal and the sum of the interference. Finally, this thesis investigates the impact of channel coherence on the integer-forcing strategy via numerical simulations.
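A toy NumPy illustration of the linear-code property the integer-forcing approach relies on: if the integer coefficient matrix A is invertible modulo the field size p, the individual messages can be recovered from the decoded integer-linear combinations. All values here are illustrative, not from the thesis.

```python
import numpy as np

p = 257  # prime field size; illustrative only

def inv_mod_p(A, p):
    """Invert an integer matrix over Z_p via the adjugate formula."""
    det = int(round(np.linalg.det(A))) % p
    det_inv = pow(det, -1, p)                          # modular inverse of det
    adj = np.round(np.linalg.inv(A) * np.linalg.det(A)).astype(int) % p
    return (det_inv * adj) % p

# Two users' messages over Z_p, and the integer coefficient matrix of the
# linear combinations the central processor decodes (values illustrative).
W = np.array([[3, 141, 59], [26, 5, 35]])              # rows: users' messages
A = np.array([[1, 1], [1, 2]])                         # integer coefficients
combos = (A @ W) % p                                   # decoded combinations
recovered = (inv_mod_p(A, p) @ combos) % p             # solve for messages
assert np.array_equal(recovered, W % p)
```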
29

Fault Insertion and Fault Analysis of Neural Cache Memory

Koneru, Venkata Raja Ramchandar 16 June 2020 (has links)
No description available.
30

Hardware-Aware Distributed Pipelined Neural Network Models Inference

Alshams, Mojtaba 07 1900 (has links)
Neural network models have attracted the attention of the scientific community for their increasing prediction accuracy and good emulation of some human tasks. This has led to extensive architectural enhancements, resulting in models with fast-growing memory and computation requirements. Due to hardware constraints such as memory and computing capabilities, the inference of a large neural network model can be distributed across multiple devices by a partitioning algorithm. The proposed framework finds the optimal model splits and chooses which device computes each split so as to minimize inference time and energy. The framework is based on the PipeEdge algorithm and extends it by not only increasing inference throughput but also simultaneously minimizing inference energy consumption. Another contribution of this thesis is the addition of emerging compute-in-memory (CIM) devices to the system; to the best of my knowledge, no prior work has studied the effect of including CIM devices (specifically, the DNN+NeuroSim simulator) in distributed inference. The proposed framework partitioned VGG8 and ResNet152 on ImageNet and achieved a comparable trade-off between the increase in the slowest pipeline stage and the energy reduction, both when minimizing inference energy (e.g., 19% energy reduction for a 34% time increase) and when CIM devices were added to the system (e.g., 34% energy reduction for a 45% time increase).
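As a rough sketch of the partitioning search described above, here is a brute-force Python version that scores contiguous layer splits across ordered devices by a weighted sum of slowest-stage time and total energy. The cost model and the alpha weighting are assumptions for illustration, not the thesis's actual PipeEdge-based algorithm.

```python
from itertools import combinations

def best_partition(layer_times, layer_energy, num_devices, alpha=0.5):
    """Brute-force search over contiguous layer splits across ordered devices.

    layer_times[d][i] / layer_energy[d][i]: cost of layer i on device d.
    Minimizes alpha * (slowest pipeline stage) + (1 - alpha) * total energy.
    Returns (best_score, stage_boundaries).
    """
    n, k = len(layer_times[0]), num_devices
    best = (float("inf"), None)
    for cuts in combinations(range(1, n), k - 1):      # k-1 split points
        bounds = (0, *cuts, n)
        stage_t, total_e = [], 0.0
        for d in range(k):
            lo, hi = bounds[d], bounds[d + 1]
            stage_t.append(sum(layer_times[d][lo:hi]))
            total_e += sum(layer_energy[d][lo:hi])
        score = alpha * max(stage_t) + (1 - alpha) * total_e
        if score < best[0]:
            best = (score, bounds)
    return best
```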
