Global ETD Search

31	Audiovisual voice activity detection and localization of simultaneous speech sources / Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais Minotto, Vicente Peruffo January 2013 Given the tendency to create interfaces between humans and machines that allow increasingly simple means of interaction, it is natural that research effort goes into techniques that simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is automatically processed by the brain in an effortless and effective way, commonly aided by visual cues such as mouth movement and the location of the speakers. This processing includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications, such as automatic speech recognition and speaker identification. However, VAD and SSL remain challenging problems in realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose approaches for tackling these problems using audiovisual information, for both the single-source and the competing-sources scenarios, exploring distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be exploited through the state-of-the-art Steered Response Power (SRP) method. As an additional result, a very fast GPU version of SRP was developed, so that real-time processing is achieved.
Our experiments show an average accuracy of 95% when performing VAD for up to three simultaneous speakers, and an average error of 10 cm when locating those speakers. Recognition: Patterns Recognition: Voice Computational voice Real time Voice activity detection Sound source localization Multiple speakers Competing sources Multimodal fusion Microphone array Hidden Markov model Support vector machine GPU programming
32	Adaptive rendering of celestial bodies in WebGL Zeitler, Jonas January 2015 This report covers the theory and a comparison of techniques for rendering massive-scale 3D geospatial planet data in a web browser. It also presents implementation details for a few of these techniques in WebGL and JavaScript, using the Three.js [1] 3D library. The thesis project is part of the implementation of Unitea, a web-based education platform for interactive astronomy visualizations. Unitea is a derivative of Uniview, which is a fulldome interactive simulation of the universe. A major part of this thesis is dedicated to the implementation of Hierarchical Level of Detail (HLOD) modules for Three.js, based on the theory presented by T. Ulrich [2] and later generalized by Cozzi and Ring [3]. HLOD techniques are dynamic level-of-detail algorithms that represent the surface of objects as accurately as possible from a certain viewing angle. By using space-partitioning tree structures, view-based error metrics and culling techniques, detailed representations of the objects (in this case planets) can be rendered efficiently in real time. The modules developed provide a general-purpose library for rendering planets (or other spherical objects) with dynamic level of detail in Three.js. The library also features connections to online web map services (WMS) and tile services. WebGL Three.js JavaScript HLOD Geometry Clipmaps Planet rendering Level of Detail Real-time rendering Mobile devices Heightmaps Massive terrain Out-of-core rendering GPU programming Astronomy Visualization Web Map Services Computer Engineering
33	[en] RAY TRACING DYNAMIC SCENES ON THE GPU / [pt] TRAÇADO DE RAIOS DE CENAS DINÂMICAS NA GPU PAULO IVSON NETTO SANTOS 14 September 2017 [pt] The goal of this work is to develop a complete solution for ray tracing dynamic scenes on the GPU. For the algorithm to reach interactive performance, a spatial structure is needed to reduce the number of intersection tests between rays and the triangles of the scene. When there is movement in the scene, this acceleration structure must be updated, either by modifying it partially or by rebuilding it entirely. We adopt the second strategy because it can handle the general case of unstructured motion. Since the structure must be built as efficiently as possible, we chose the Uniform Grid as the focus of our research. Its advantages include a simple construction algorithm and efficient ray traversal. To exploit the parallel processing power of a GPU, the scene data and the acceleration structure must be kept on the graphics card, avoiding costly memory transfers between the GPU and the CPU. In this work we propose a technique for building a uniform grid entirely on the GPU. With our method, the whole structure can be rebuilt in a few milliseconds while the high quality of the resulting grid is preserved. In addition, we propose an implementation of the ray tracing algorithm that takes advantage of the GPU's parallel processing. Our procedure is implemented entirely on the graphics card, where there is direct access to the scene's triangle data as well as to the constructed uniform grid. Using the proposed solution, we obtain interactive rendering rates even for scenes with unstructured motion, including textures, shadows and even reflections. / [en] We present a technique for ray tracing dynamic scenes using the GPU.
In order to achieve interactive rendering rates, it is necessary to use a spatial structure to reduce the number of ray-triangle intersection tests performed. Whenever there is movement in the scene, this structure is entirely rebuilt, which makes it possible to handle general unstructured motion. For this purpose, we have developed an algorithm for reconstructing Uniform Grids entirely inside the graphics hardware. In addition, we present ray-traversal and shading algorithms fully implemented on the GPU, including textures, shadows and reflections. Combining these techniques, we demonstrate interactive ray tracing performance for dynamic scenes, even with unstructured motion and advanced shading effects. [en] GPU PROGRAMMING [en] INTERACTIVE RAY TRACING [en] UNIFORM GRIDS [en] DYNAMIC SCENES [en] GRID REBUILD ON THE GPU
34	Contributions to parallel stochastic simulation: Application of good software engineering practices to the distribution of pseudorandom streams in hybrid Monte-Carlo simulations Passerat-Palmbach, Jonathan 11 October 2013 The race for computing power intensifies every day in the simulation community. A few years ago, scientists started to harness the computing power of Graphics Processing Units (GPUs) to parallelize their simulations. As with any parallel architecture, not only does the implementation of the simulation model have to be ported to the new parallel platform, but all the supporting tools must be reimplemented as well. In the particular case of stochastic simulations, one of the major elements of the implementation is the source of pseudorandom numbers. Employing pseudorandom numbers in parallel applications is not a straightforward task, and it has to be done with caution so as not to introduce biases into the results of the simulation. This problem, called pseudorandom stream distribution, has been studied ever since parallel architectures became available. While the literature is full of solutions for handling pseudorandom stream distribution on CPU-based parallel platforms, the young GPU programming community cannot yet claim the same experience. In this thesis, we study how to correctly distribute pseudorandom streams on GPUs. From the existing solutions, we identified a need for good software engineering solutions coupled with sound theoretical choices in the implementation. We propose a set of guidelines to follow when a PRNG has to be ported to GPU, and put this advice into practice in a software library called ShoveRand. This library is used in a stochastic Polymer Folding model that we have implemented in C++/CUDA. Pseudorandom stream distribution on manycore architectures is also one of our concerns. It resulted in a contribution named TaskLocalRandom, which targets parallel Java applications using pseudorandom numbers and task frameworks.
Finally, we share a reflection on methods for choosing the right parallel platform for a given application. To this end, we propose to automatically build prototypes of the parallel application running on a wide set of architectures. This approach relies on existing software engineering tools from the Java and Scala community, most of them generating OpenCL source code from a high-level abstraction layer. Pseudorandom Number Generation (PRNG) High Performance Computing (HPC) Software Engineering Stochastic Simulation Graphics Processing Units (GPUs) GPU Programming Automatic Parallelization
