901 |
Eismo dalyvių kelyje atpažinimas naudojant dirbtinius neuroninius tinklus ir grafikos procesorių / On-road vehicle recognition using neural networks and graphics processing unit
Kinderis, Povilas 27 June 2014 (has links)
Kasmet daugybė žmonių būna sužalojami autoįvykiuose, iš kurių dalis sužalojimų būna rimti arba pasibaigia mirtimi. Dedama vis daugiau pastangų kuriant įvairias sistemas, kurios padėtų mažinti nelaimių skaičių kelyje. Tokios sistemos gebėtų perspėti vairuotojus apie galimus pavojus, atpažindamos eismo dalyvius ir sekdamos jų padėtį kelyje. Eismo dalyvių kelyje atpažinimas iš vaizdo yra pakankamai sudėtinga, daug skaičiavimų reikalaujanti problema. Šiame darbe šiai problemai spręsti pasitelkti stereo vaizdai, nesugretinamumo žemėlapis bei konvoliuciniai neuroniniai tinklai. Konvoliuciniai neuroniniai tinklai reikalauja daug skaičiavimų, todėl jie optimizuoti pasitelkus grafikos procesorių ir OpenCL. Gautas iki 33,4% spartos pagerėjimas lyginant su centriniu procesoriumi. Stereo vaizdai ir nesugretinamumo žemėlapis leidžia atmesti didelius kadro regionus, kurių nereikia klasifikuoti su konvoliuciniu neuroniniu tinklu. Priklausomai nuo scenos vaizde, reikalingų klasifikavimo operacijų skaičius sumažėja vidutiniškai apie 70-95% ir tai leidžia kadrą apdoroti atitinkamai greičiau. / Many people are injured in car accidents each year; some injuries are serious or end in death. Growing effort is being put into developing systems that could help reduce the number of accidents on the road. Such systems could warn drivers of potential danger by recognizing on-road vehicles and tracking their position on the road. On-road vehicle recognition in images is a complex and computationally very intensive problem. In this work, stereo images, a disparity map and convolutional neural networks are used to solve it. Convolutional neural networks are computationally intensive, so they are optimized using a GPU and OpenCL; a speed improvement of up to 33.4% over the CPU was achieved. The stereo images and the disparity map make it possible to discard large regions of the frame that do not need to be classified with the convolutional neural network. Depending on the scene, the number of required classification operations decreases on average by 70-95%, allowing the frame to be processed correspondingly faster.
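The disparity-based pruning described in this abstract can be sketched as follows. This is a toy illustration only: the threshold, the cell granularity and all names are assumptions, not the thesis's actual pipeline.

```python
# Sketch: use a disparity map to prune frame regions before CNN classification.
# Cells with low disparity are distant background and are skipped, so the
# expensive classifier never sees them.

def candidate_windows(disparity, min_disp):
    """Return (row, col) cells whose disparity suggests a nearby object."""
    keep = []
    for r, row in enumerate(disparity):
        for c, d in enumerate(row):
            if d >= min_disp:
                keep.append((r, c))
    return keep

# Toy 4x4 disparity map: high values = close objects.
disp = [
    [1, 1, 1, 1],
    [1, 9, 8, 1],
    [1, 9, 8, 1],
    [1, 1, 1, 1],
]
kept = candidate_windows(disp, min_disp=5)
total = 16
print(len(kept), 1 - len(kept) / total)  # 4 cells survive; 75% of the frame is discarded
```

Only the surviving cells would then be passed to the convolutional classifier, which is where a reduction in classification operations of the kind reported (70-95%) comes from.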
|
902 |
On continuous maximum flow image segmentation algorithm / Segmentation d'images par l'algorithme des flot maximum continu
Marak, Laszlo 28 March 2012 (has links)
Ces dernières années avec les progrès matériels, les dimensions et le contenu des images acquises se sont complexifiés de manière notable. Egalement, le différentiel de performance entre les architectures classiques mono-processeur et parallèles est passé résolument en faveur de ces dernières. Pourtant, les manières de programmer sont restées largement les mêmes, instituant un manque criant de performance même sur ces architectures. Dans cette thèse, nous explorons en détails un algorithme particulier, les flots maximaux continus. Nous explicitons pourquoi cet algorithme est important et utile, et nous proposons plusieurs implémentations sur diverses architectures, du mono-processeur à l'architecture SMP et NUMA, ainsi que sur les architectures massivement parallèles des GPGPU. Nous explorons aussi des applications et nous évaluons ses performances sur des images de grande taille en science des matériaux et en biologie à l'échelle nano / In recent years, with the advances in computing equipment and image acquisition techniques, the sizes, dimensions and content of acquired images have increased considerably. Unfortunately, as time passes, there is a steadily increasing gap between the classical and parallel programming paradigms and their actual performance on modern computer hardware. In this thesis we consider in depth one particular algorithm, the continuous maximum flow computation. We review in detail why this algorithm is useful and interesting, and we propose efficient and portable implementations on various architectures, from single-processor machines to SMP and NUMA architectures, as well as massively parallel GPGPUs. We also examine how it performs in terms of segmentation quality on some recent problems of materials science and nano-scale biology.
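As a point of reference for the continuous formulation, here is a minimal discrete max-flow (Edmonds-Karp) on a tiny source/sink graph. The thesis's algorithm solves the analogous continuous problem with grid-based updates rather than augmenting paths; the graph and capacities below are illustrative assumptions.

```python
# Minimal discrete max-flow (Edmonds-Karp). Node 0 is the source,
# node 3 the sink; capacity[u][v] is the edge capacity from u to v.
from collections import deque

def max_flow(capacity, s, t):
    """Push flow along shortest augmenting paths until none remain."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:           # BFS in the residual graph
            u = q.popleft()
            for v in range(n):
                if residual[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:                    # no augmenting path left: done
            return flow
        bottleneck = float('inf')              # smallest residual along the path
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                          # push flow, update residuals
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

capacity = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
print(max_flow(capacity, 0, 3))  # → 5 (min cut: the two source edges, 3 + 2)
```

The min-cut dual of this computation is what yields the segmentation boundary; the continuous version replaces graph edges with a flow field over the pixel grid.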
|
903 |
Accelerated sampling of energy landscapes
Mantell, Rosemary Genevieve January 2017 (has links)
In this project, various computational energy landscape methods were accelerated using graphics processing units (GPUs). Basin-hopping global optimisation was treated using a version of the limited-memory BFGS algorithm adapted for CUDA, in combination with GPU-acceleration of the potential calculation. The Lennard-Jones potential was implemented using CUDA, and an interface to the GPU-accelerated AMBER potential was constructed. These results were then extended to form the basis of a GPU-accelerated version of hybrid eigenvector-following. The doubly-nudged elastic band method was also accelerated using an interface to the potential calculation on GPU. Additionally, a local rigid body framework was adapted for GPU hardware. Tests were performed for eight biomolecules represented using the AMBER potential, ranging in size from 81 to 22,811 atoms, and the effects of minimiser history size and local rigidification on the overall efficiency were analysed. Improvements relative to CPU performance of up to two orders of magnitude were obtained for the largest systems. These methods have been successfully applied to both biological systems and atomic clusters. An existing interface between a code for free energy basin-hopping and the SuiteSparse package for sparse Cholesky factorisation was refined, validated and tested. Tests were performed for both Lennard-Jones clusters and selected biomolecules represented using the AMBER potential. Significant acceleration of the vibrational frequency calculations was achieved, with negligible loss of accuracy, relative to the standard diagonalisation procedure. For the larger systems, exploiting sparsity reduces the computational cost by factors of 10 to 30. The acceleration of these computational energy landscape methods opens up the possibility of investigating much larger and more complex systems than previously accessible. A wide array of new applications is now computationally feasible.
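Basin-hopping itself is compact enough to sketch in full. The surface, step sizes and temperature below are toy assumptions: the thesis minimises AMBER and Lennard-Jones potentials with a CUDA-adapted L-BFGS, whereas this sketch uses plain gradient descent on a 1-D double well.

```python
import math
import random

def f(x):
    """Double-well test surface, a stand-in for a real potential."""
    return x**4 - 4 * x**2 + x

def local_min(x, step=0.01, iters=2000):
    """Crude gradient descent; the thesis uses GPU L-BFGS instead."""
    for _ in range(iters):
        grad = 4 * x**3 - 8 * x + 1
        x -= step * grad
    return x

def basin_hop(x0, hops=50, temp=1.0, seed=1):
    """Perturb, re-minimise, and Metropolis-accept on the minima 'staircase'."""
    rng = random.Random(seed)
    x = local_min(x0)
    best = x
    for _ in range(hops):
        trial = local_min(x + rng.uniform(-1.5, 1.5))
        if f(trial) < f(x) or rng.random() < math.exp((f(x) - f(trial)) / temp):
            x = trial
        if f(x) < f(best):
            best = x
    return best

x_best = basin_hop(2.0)  # ends in one of the two wells, often the global one
```

The random perturbation lets the walk escape the basin the starting point rolls into, which is the mechanism the GPU-accelerated minimiser speeds up at each hop.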
|
904 |
EFFECTS OF TOPOGRAPHIC DEPRESSIONS ON OVERLAND FLOW: SPATIAL PATTERNS AND CONNECTIVITY
Feng Yu (5930453) 17 January 2019 (has links)
Topographic depressions are naturally occurring low-land areas surrounded by areas of higher elevation, also known as “pits” or “sinks”, on terrain surfaces. Traditional watershed modeling often neglects the potential effects of depressions by applying removal (mostly filling) procedures to the digital elevation model (DEM) prior to the simulation of physical processes. The assumption is that all depressions are either spurious artifacts of the DEM or of negligible importance for modeling results. However, studies suggest that naturally occurring depressions can change runoff response and connectivity in a watershed depending on storage conditions and their spatial arrangement, e.g., by shifting active contributing areas and soil moisture distributions, and the timing and magnitude of flow discharge at the watershed outlet. In addition, recent advances in remote sensing techniques, such as LiDAR, allow us to examine this modeling assumption because naturally occurring depressions can be represented in high-resolution DEMs. This dissertation provides insights into the effects of depressions on overland flow processes at multiple spatial scales, from internal depression areas to the watershed scale, based on hydrologic connectivity metrics. Connectivity describes flow pathway connectedness and is assessed using geostatistical measures of heterogeneity in overland flow patterns, i.e., the connectivity function and integral connectivity scale lengths. A new algorithm is introduced here to upscale connectivity metrics to large gridded patterns (i.e., with > 1,000,000 cells) using GPU-accelerated computing. This new algorithm is sensitive to changes in connectivity direction and magnitude in spatial patterns and is robust for large DEM grids with depressions. Applying the connectivity metrics to overland flow patterns generated from the original and depression-filled DEMs for a study watershed indicates that depressions typically decrease overland flow connectivity.
A series of macro connectivity stages based on spatial distances is identified, representing changes in the interaction mechanisms between overland flow and depressions, i.e., the relative dominance of fill and spill, and the relative speed of fill and the formation of connected pathways. In addition, to study the role of spatial resolution in such interaction mechanisms at the watershed scale, two revised functional connectivity metrics are introduced, based on depressions that are hydraulically connected to the watershed outlet and on runoff response to rainfall. These two functional connectivity metrics are sensitive to connectivity changes in overland flow patterns caused by depression removal (filling) for DEMs at different grid resolutions. Results show that the two metrics reflect the spatial and statistical characteristics of depressions and their implications for overland flow connectivity, and may also relate to storage and infiltration conditions. Furthermore, grid resolution has a more significant impact on overland flow connectivity than depression removal (filling).
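The cluster-based side of such connectivity metrics can be sketched on a tiny binary wet-cell pattern. The 4-neighbour rule and the index below (the probability that two randomly chosen wet cells lie in the same connected cluster) are simplified assumptions for illustration, not the dissertation's exact connectivity function.

```python
# Connected clusters of a binary overland-flow pattern, plus a simple
# global connectivity index: P(two random wet cells are connected).
from collections import deque

def clusters(wet):
    """Sizes of 4-neighbour connected components, in discovery order."""
    rows, cols = len(wet), len(wet[0])
    label = [[-1] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if wet[r][c] and label[r][c] == -1:
                q = deque([(r, c)])
                label[r][c] = len(sizes)       # id of the cluster being grown
                n = 0
                while q:                       # BFS flood fill
                    y, x = q.popleft()
                    n += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        yy, xx = y + dy, x + dx
                        if (0 <= yy < rows and 0 <= xx < cols
                                and wet[yy][xx] and label[yy][xx] == -1):
                            label[yy][xx] = len(sizes)
                            q.append((yy, xx))
                sizes.append(n)
    return sizes

def connectivity_index(wet):
    """sum(n_i^2) / N^2 over cluster sizes n_i and wet-cell total N."""
    sizes = clusters(wet)
    total = sum(sizes)
    return sum(n * n for n in sizes) / total**2

pattern = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
print(clusters(pattern), connectivity_index(pattern))  # [3, 2] 0.52
```

On a >1,000,000-cell grid this per-cell labelling is the kind of embarrassingly parallel work the GPU upscaling targets.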
|
905 |
Cloud computing v herním průmyslu / Cloud Computing in Gaming Industry
Gleza, Jan January 2012 (has links)
This thesis analyzes the current state of software distribution in the gaming industry and its biggest challenges. It also offers insight into a completely different approach to software distribution: cloud gaming. In the practical part, existing solutions are thoroughly tested and a functionality analysis is performed. The practical part also includes an experiment in building one's own platform using currently existing tools, followed by a discussion of the results.
|
906 |
Proposição de plataforma co-design para processamento de imagens de sensoriamento remoto /
Cardim, Guilherme Pina. January 2019 (has links)
Orientador: Erivaldo Antonio da Silva / Resumo: O processamento digital de imagens (PDI) consiste em uma área de grande interesse científico. Em Cartografia, o PDI é muito utilizado para extração de feições cartográficas de interesse presentes nas imagens de sensoriamento remoto (SR). Dentre as feições cartográficas, a detecção de malhas viárias é de grande interesse científico, pois proporciona a obtenção de informações atualizadas e acuradas para a realização de planejamentos urbanos. Devido à sua importância, a literatura científica possui diversos trabalhos propondo diferentes metodologias de extração de malhas viárias em imagens digitais. Dentre as metodologias, é possível encontrar metodologias propostas baseadas em lógica fuzzy, em detector de bordas e crescimento de regiões, por exemplo. Contudo, os estudos existentes focam na aplicação da metodologia de extração para determinadas áreas ou situações e utilizam recortes da imagem em seus estudos devido à grande quantidade de informações contidas nessas imagens. O avanço tecnológico proporcionou que imagens de SR sejam adquiridas com alta resolução espacial, espectral e temporal. Esse fato produz uma grande quantidade de dados a serem processados durante estudos desenvolvidos nessas imagens, o que acarreta um alto custo computacional e, consequentemente, um alto tempo de processamento. Na tentativa de reduzir o tempo de execução das metodologias de extração, os desenvolvedores dedicam esforços na redução da complexidade dos algoritmos e na utilização de outros recurs... (Resumo completo, clicar acesso eletrônico abaixo) / Resumen: El procesamiento digital de imágenes (PDI) consiste en un área de gran interés científico en diferentes áreas. En Cartografía, el PDI es muy utilizado en estudios de teledetección para extracción de los objetos cartográficos de interés presentes en las imágenes orbitales. 
Entre los objetos cartográficos de interés, la detección de redes viales se ha vuelto de gran interés científico proporcionando la obtención de informaciones actualizadas y precisas para la realización de planificaciones urbanas, por ejemplo. En este sentido, la literatura científica posee diversos trabajos proponiendo diferentes metodologías de extracción de redes viales en imágenes orbitales. Es posible encontrar metodologías propuestas basadas en lógica fuzzy, detector de bordes y crecimiento por región, por ejemplo. Sin embargo, los estudios existentes se centran en la aplicación de la metodología de extracción para determinadas áreas o situaciones y utilizan recortes de la imagen orbitales en sus estudios debido a la gran cantidad de informaciones contenidas en esas imágenes. Además, el avance tecnológico proporcionó que las imágenes de teledetección se adquieran con altas resoluciones espacial, espectral y temporal. Este hecho produce una gran cantidad de datos a ser procesados durante estudios desarrollados en esas imágenes, lo que acarrea en un alto costo computacional y, consecuentemente, un alto tiempo de procesamiento. En el intento de reducir el tiempo de respuesta de las metodologías de extracci... (Resumen completo clicar acceso eletrônico abajo) / Abstract: Digital image processing (DIP) is an area of great scientific interest across many fields. In Cartography, DIP is widely used in remote sensing studies to extract cartographic features of interest present in orbital images. Among these cartographic features, the detection of road networks has become of great scientific interest, since it can provide accurate and updated information for urban planning, for example. In this sense, the scientific literature contains several works proposing different methodologies for extracting road networks from orbital images. It is possible to find proposed methodologies based on fuzzy logic, edge detection and region growing, for example.
However, the existing studies focus on applying the extraction methodology to certain areas or situations, and use crops of the orbital images in their studies due to the large amount of information contained in these images. In addition, technological advances have allowed the acquisition of remote sensing images with high spatial, spectral and temporal resolution. This produces a large amount of data to be processed in studies of these images, which results in a high computational cost and, consequently, a long processing time. In an attempt to reduce the response time of the extraction methodologies, developers dedicate efforts to reducing the complexity of the algorithms and to using available hardware resources, suggesting solutions that include software and hardwar... (Complete abstract click electronic access below) / Doutor
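One of the extraction ingredients named in the abstract, edge detection, can be sketched minimally. The 3x3 Sobel kernels are a standard choice (not necessarily the thesis's detector), and the bright vertical stripe below is a toy stand-in for a road in an orbital image.

```python
# Sobel gradient magnitude, computed for interior pixels only.
def sobel_magnitude(img):
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(gx_k[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(gy_k[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out

# A bright vertical "road" stripe on a dark background.
img = [[0, 0, 9, 0, 0] for _ in range(5)]
mag = sobel_magnitude(img)
print(mag[2])  # strong responses flank the stripe: [0.0, 36.0, 0.0, 36.0, 0.0]
```

Each output pixel depends only on its 3x3 neighbourhood, which is why this stage maps naturally onto the co-design (hardware/software) platforms the thesis proposes.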
|
907 |
The Thermal-Constrained Real-Time Systems Design on Multi-Core Platforms -- An Analytical Approach
SHA, SHI 21 March 2018 (has links)
Over the past decades, the shrinking transistor size, benefited from the advancement of IC technology, enabled more transistors to be integrated into an IC chip, to achieve higher and higher computing performances. However, the semiconductor industry is now reaching a saturation point of Moore’s Law largely due to soaring power consumption and heat dissipation, among other factors. High chip temperature not only significantly increases packing/cooling cost, degrades system performance and reliability, but also increases the energy consumption and even damages the chip permanently. Although designing 2D and even 3D multi-core processors helps to lower the power/thermal barrier for single-core architectures by exploring the thread/process level parallelism, the higher power density and longer heat removal path has made the thermal problem substantially more challenging, surpassing the heat dissipation capability of traditional cooling mechanisms such as cooling fan, heat sink, heat spread, etc., in the design of new generations of computing systems. As a result, dynamic thermal management (DTM), i.e. to control the thermal behavior by dynamically varying computing performance and workload allocation on an IC chip, has been well-recognized as an effective strategy to deal with the thermal challenges.
Different from many existing DTM heuristics that are based on simple intuitions, we seek to address the thermal problems through a rigorous analytical approach, to achieve the high predictability required in real-time system design. In this regard, we have made a number of important contributions. First, we develop a series of lemmas and theorems that are general enough to uncover the fundamental principles and characteristics of the thermal model, peak temperature identification and peak temperature reduction, which are key to thermal-constrained real-time computer system design. Second, we develop a design-time frequency and voltage oscillating approach on multi-core platforms, which can greatly enhance the system throughput and its service capacity. Third, different from the traditional workload balancing approach, we develop a thermal-balancing approach that can substantially improve energy efficiency and task partitioning feasibility, especially when the system utilization is high or the temperature constraint is tight. The significance of our research is that not only do our proposed algorithms for throughput maximization and energy conservation significantly outperform existing work, as demonstrated in our extensive experimental results, but the theoretical results are also very general and can greatly benefit other thermal-related research.
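The intuition behind a frequency/voltage oscillating scheme can be illustrated with a lumped RC thermal model, C*dT/dt = P - (T - T_amb)/R. All constants below (R, C, power levels, switching period) are invented for illustration and are not taken from the dissertation.

```python
# Lumped RC thermal model integrated with forward Euler.
def simulate(power_schedule, R=2.0, C=5.0, t_amb=25.0, dt=0.01):
    """Return the temperature trace for a given per-step power schedule."""
    temps = []
    T = t_amb
    for P in power_schedule:
        T += dt * (P - (T - t_amb) / R) / C
        temps.append(T)
    return temps

steps = 20000                                  # 200 s of simulated time
always_high = [30.0] * steps                   # run flat out the whole time
# Oscillate between a high- and a low-power (frequency/voltage) mode every 2 s.
oscillating = [30.0 if (i // 200) % 2 == 0 else 10.0 for i in range(steps)]

peak_high = max(simulate(always_high))         # settles near T_amb + P*R = 85
peak_osc = max(simulate(oscillating))          # noticeably cooler peak
```

Because the thermal time constant (RC = 10 s here) is longer than the switching period, the chip averages the two power levels and its peak temperature stays well below the constant-high case, which is the headroom an oscillating design exploits for throughput.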
|
908 |
Echantillonage d'importance des sources de lumières réalistes / Importance Sampling of Realistic Light Sources
Lu, Heqi 27 February 2014 (has links)
On peut atteindre des images réalistes par la simulation du transport lumineuse avec des méthodes de Monte-Carlo. La possibilité d’utiliser des sources de lumière réalistes pour synthétiser les images contribue grandement à leur réalisme physique. Parmi les modèles existants, ceux basés sur des cartes d’environnement ou des champs lumineuse sont attrayants en raison de leur capacité à capter fidèlement les effets de champs lointain et de champs proche, aussi bien que leur possibilité d’être acquis directement. Parce que ces sources lumineuses acquises ont des fréquences arbitraires et sont éventuellement de grande dimension (4D), leur utilisation pour un rendu réaliste conduit à des problèmes de performance.Dans ce manuscrit, je me concentre sur la façon d’équilibrer la précision de la représentation et de l’efficacité de la simulation. Mon travail repose sur la génération des échantillons de haute qualité à partir des sources de lumière par des estimateurs de Monte-Carlo non-biaisés. Dans ce manuscrit, nous présentons trois nouvelles méthodes.La première consiste à générer des échantillons de haute qualité de manière efficace à partir de cartes d’environnement dynamiques (i.e. qui changent au cours du temps). Nous y parvenons en adoptant une approche GPU qui génère des échantillons de lumière grâce à une approximation du facteur de forme et qui combine ces échantillons avec ceux issus de la BRDF pour chaque pixel d’une image. Notre méthode est précise et efficace. En effet, avec seulement 256 échantillons par pixel, nous obtenons des résultats de haute qualité en temps réel pour une résolution de 1024 × 768. La seconde est une stratégie d’échantillonnage adaptatif pour des sources représente comme un "light field". Nous générons des échantillons de haute qualité de manière efficace en limitant de manière conservative la zone d’échantillonnage sans réduire la précision. 
Avec une mise en oeuvre sur GPU et sans aucun calcul de visibilité, nous obtenons des résultats de haute qualité avec 200 échantillons pour chaque pixel, en temps réel et pour une résolution de 1024×768. Le rendu est encore être interactif, tant que la visibilité est calculée en utilisant notre nouvelle technique de carte d’ombre (shadow map). Nous proposons également une approche totalement non-biaisée en remplaçant le test de visibilité avec une approche CPU. Parce que l’échantillonnage d’importance à base de lumière n’est pas très efficace lorsque le matériau sous-jacent de la géométrie est spéculaire, nous introduisons une nouvelle technique d’équilibrage pour de l’échantillonnage multiple (Multiple Importance Sampling). Cela nous permet de combiner d’autres techniques d’échantillonnage avec le notre basé sur la lumière. En minimisant la variance selon une approximation de second ordre, nous sommes en mesure de trouver une bonne représentation entre les différentes techniques d’échantillonnage sans aucune connaissance préalable. Notre méthode est pertinence, puisque nous réduisons effectivement en moyenne la variance pour toutes nos scènes de test avec différentes sources de lumière, complexités de visibilité et de matériaux. Notre méthode est aussi efficace par le fait que le surcoût de notre approche «boîte noire» est constant et représente 1% du processus de rendu dans son ensemble. / Realistic images can be rendered by simulating light transport with Monte Carlo techniques. The possibility to use realistic light sources for synthesizing images greatly contributes to their physical realism. Among existing models, the ones based on environment maps and light fields are attractive due to their ability to capture faithfully the far-field and near-field effects as well as their possibility of being acquired directly. 
Since acquired light sources have arbitrary frequencies and are possibly high-dimensional (4D), using them for realistic rendering leads to performance problems. In this thesis, we focus on how to balance the accuracy of the representation and the efficiency of the simulation. Our work relies on generating high-quality samples from the input light sources for unbiased Monte Carlo estimation. We introduce three novel methods. The first generates high-quality samples efficiently from dynamic environment maps that change over time. We achieve this with a GPU approach that generates light samples according to an approximation of the form factor and combines them with samples from BRDF sampling for each pixel of a frame. Our method is accurate and efficient: with only 256 samples per pixel, we achieve high-quality results in real time at 1024 × 768 resolution. The second is an adaptive sampling strategy for light-field light sources (4D): we generate high-quality samples efficiently by conservatively restricting the sampling area without reducing accuracy. With a GPU implementation and without any visibility computations, we achieve high-quality results with 200 samples per pixel in real time at 1024 × 768 resolution. Performance remains interactive as long as visibility is computed using our shadow-map technique. We also provide a fully unbiased approach by replacing the visibility test with an offline CPU approach. Since light-based importance sampling is not very effective when the underlying material is specular, we introduce a new balancing technique for multiple importance sampling, which allows us to combine other sampling techniques with our light-based importance sampling. By minimizing the variance based on a second-order approximation, we are able to find a good balance between the different sampling techniques without any prior knowledge.
Our method is effective, since it actually reduces the variance, on average, for all of our test scenes with different light sources, visibility complexity and materials. It is also efficient: the overhead of our "black-box" approach is constant and represents 1% of the whole rendering process.
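The balance heuristic at the core of multiple importance sampling can be sketched on a 1-D toy integral. The integrand and the two densities below are illustrative assumptions, standing in for BRDF sampling and light-based sampling; they are not the thesis's second-order balancing scheme.

```python
# Balance-heuristic MIS estimate of I = integral of x^2 over [0, 1] = 1/3,
# combining a uniform technique with a linear-density technique.
import random

def f(x):                    # integrand; stand-in for a light-transport term
    return x * x

def p1(x):                   # technique 1: uniform density on [0, 1]
    return 1.0

def p2(x):                   # technique 2: "light-based" density, linear in x
    return 2.0 * x

def mis_estimate(n, seed=7):
    """Draw n samples from each technique, weight with the balance heuristic."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.random()                    # sample from p1
        w = p1(x) / (p1(x) + p2(x))         # balance heuristic weight
        total += w * f(x) / p1(x)
        x = rng.random() ** 0.5             # inverse-CDF sample from p2
        w = p2(x) / (p1(x) + p2(x))
        total += w * f(x) / p2(x)
    return total / n

est = mis_estimate(20000)    # close to 1/3, and unbiased by construction
```

Each weighted term stays bounded even where one density is tiny, which is exactly why combining light-based sampling with BRDF sampling tames the variance on specular materials.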
|
909 |
Adéquation Algorithme Architecture pour la reconstruction 3D en imagerie médicale TEP
Gac, Nicolas 17 July 2008 (PDF)
The steady improvement in the dynamic and temporal resolution of scanners and of reconstruction methods in medical imaging comes with a growing demand for computing power. Software, algorithmic and hardware acceleration are thus called upon to close the technological gap between acquisition systems and reconstruction systems.
In this context, a hardware architecture for 3D backprojection in Positron Emission Tomography (PET) is proposed. To overcome the technological bottleneck posed by the high latency of external SDRAM memories, the best algorithm-architecture matching ("Adéquation Algorithme Architecture") was sought. This architecture was implemented on a SoPC (System on Programmable Chip) and its performance compared with that of a PC, a compute server and a graphics card. Combined with a hardware 3D projection module, this architecture defines a hardware projection/backprojection pair and thus constitutes a complete reconstruction system.
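The backprojection operation at the heart of such an architecture can be sketched in its simplest 2-D, two-angle form. A real PET reconstruction accumulates over many angles in 3D; this toy version with only row and column projections is an assumption for illustration, but its memory-access pattern is the kind of accumulation loop the SDRAM-latency work targets.

```python
# Toy 2-D projection / backprojection with two orthogonal parallel projections.
def project(img):
    """Row sums and column sums: projections at 0 and 90 degrees."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols

def backproject(rows, cols):
    """Smear each projection value back along its line of response."""
    n = len(rows)
    return [[rows[r] + cols[c] for c in range(n)] for r in range(n)]

img = [[0] * 5 for _ in range(5)]
img[2][3] = 1.0                      # a point source
bp = backproject(*project(img))      # the backprojection peaks at (2, 3)
```

Every output pixel reads one bin from each projection, so with many angles the bandwidth to projection memory, not arithmetic, dominates, which is why the thesis's algorithm-architecture matching focuses on SDRAM access.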
|
910 |
Suivi d'objets d'intérêt dans une séquence d'images : des points saillants aux mesures statistiques
Vincent, Garcia 11 December 2008 (PDF)
The problem of tracking objects in a video arises in fields such as computer vision (video surveillance, for example) and television and film post-production (special effects). It comes in two main variants: tracking a region of interest, which denotes a coarse tracking of the object, and spatio-temporal segmentation, which corresponds to a precise tracking of the contours of the object of interest. In both cases, the region or object of interest must first be outlined on the first, and possibly the last, frame of the video sequence. In this thesis we propose a method for each of these types of tracking, as well as a fast implementation, exploiting the Graphics Processing Unit (GPU), of a region-of-interest tracking method developed elsewhere.
The first method relies on the analysis of the temporal trajectories of salient points and performs region-of-interest tracking. Salient points (typically points of high curvature of the iso-intensity lines) are detected in every frame of the sequence. Trajectories are built by linking points in successive frames whose neighbourhoods are coherent. Our first contribution is the analysis of trajectories over a group of frames, which improves the quality of motion estimation. In addition, we use a spatio-temporal weighting of each trajectory that adds a temporal constraint on the motion while taking into account the local geometric deformations of the object that a global motion model ignores.
The second method performs spatio-temporal segmentation. It relies on estimating the motion of the object's contour from the information contained in a ring extending on either side of this contour. This ring captures the contrast between the background and the object in a local context, which is our first contribution for this method. Moreover, matching a portion of the ring against a region of the next frame of the sequence with a statistical similarity measure, namely the entropy of the residual, improves the tracking while making it easier to choose the optimal size of the ring.
Finally, we propose a fast implementation of an existing region-of-interest tracking method. This method relies on a statistical similarity measure, the Kullback-Leibler divergence, which can be estimated in a high-dimensional space through multiple computations of distances to the k-th nearest neighbour in that space. Since these computations are very costly, we propose a parallel GPU implementation (using NVIDIA's CUDA programming interface) of the exhaustive search for the k nearest neighbours. We show that this implementation speeds up object tracking by a factor of up to 15 compared with an implementation of this search that requires prior structuring of the data.
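The exhaustive k-nearest-neighbour search that gets ported to the GPU, and the divergence estimate built on it, can be sketched in 1-D. The estimator form below is the standard k-NN Kullback-Leibler estimator (Wang-Kulkarni-Verdú style), assumed here as a representative variant rather than the thesis's exact formula.

```python
# Brute-force k-NN distances and a k-NN Kullback-Leibler estimate in 1-D.
import math

def kth_nn_dist(q, pts, k):
    """Distance from q to its k-th nearest neighbour in pts.

    This exhaustive search is the kernel a GPU parallelises well
    (one query per thread); here it is plain Python."""
    d = sorted(abs(q - p) for p in pts)
    return d[k - 1]

def kl_knn(xs, ys, k=1):
    """k-NN estimate of D(P||Q) from 1-D samples xs ~ P and ys ~ Q."""
    n, m = len(xs), len(ys)
    acc = 0.0
    for i, x in enumerate(xs):
        rho = kth_nn_dist(x, xs[:i] + xs[i + 1:], k)  # k-th NN in P, self excluded
        nu = kth_nn_dist(x, ys, k)                    # k-th NN in Q
        acc += math.log(nu / rho)
    return acc / n + math.log(m / (n - 1))            # dimension d = 1

pts = [0.0, 1.0, 3.0, 6.0]
print(kth_nn_dist(2.0, pts, 1), kth_nn_dist(2.0, pts, 3))  # → 1.0 2.0
```

The distance computations dominate the cost and are independent across queries, which is exactly the structure that makes the CUDA brute-force search outperform tree-based methods in high dimension.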
|