• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 36
  • 10
  • 9
  • 6
  • 6
  • 3
  • 3
  • 2
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 98
  • 98
  • 76
  • 55
  • 25
  • 24
  • 19
  • 17
  • 15
  • 15
  • 15
  • 11
  • 10
  • 9
  • 8
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Modélisation ultra-rapide des transferts de chaleur par rayonnement et par conduction et exemple d'application / Fast Modeling of Radiation and Conduction Heat Transfer and application example

Ghannam, Boutros 19 October 2012 (has links)
L'apparition de CUDA en 2007 a rendu les GPU hautement programmables permettant ainsi aux applications scientifiques et techniques de profiter de leur capacité de calcul élevée. Des solutions ultra-rapides pour la résolution des transferts de chaleur par rayonnement et par conduction sur GPU sont présentées dans ce travail. Tout d'abord, la méthode MACZM pour le calcul des facteurs de transferts radiatifs directs en 3D et en milieu semi-transparent est représentée et validée. Ensuite, une implémentation efficace de la méthode à la base d'algorithmes de géométrie discrète et d'une parallélisation optimisée sur GPU dans CUDA atteignant 300 à 600 fois d'accélération, est présentée. Ceci est suivi par la formulation du NRPA, une version non-récursive de l'algorithme des revêtements pour le calcul des facteurs d'échange radiatifs totaux. La complexité du NRPA est inférieure à celle du PA et sont exécution sur GPU est jusqu'à 750 fois plus rapide que l'exécution du PA sur CPU. D'autre part, une implémentation efficace de la LOD sur GPU est présentée, consistant d'une alternance optimisée des solveurs et schémas de parallélisation et achevant une accélération GPU de 75 à 250 fois. Finalement, toutes les méthodes sont appliquées ensemble pour la résolution des transferts de chaleur en 3D dans un four de réchauffage sidérurgique de brames d'acier. Dans ce but, MACZM est appliquée avec un maillage multi-grille et le NRPA est appliqué au four en le découpant en zones, permettant d'avoir un temps de calcul très rapide une précision élevée. Ceci rend les méthodes utilisées de très grande importance pour la conception de stratégies de contrôle efficaces et précises. / The release of CUDA by NVIDIA in 2007 has tremendously increased GPU programmability, thus allowing scientific and engineering applications to take advantage of the high GPU compute capability. In this work, we present ultra-fast solutions for radiation and diffusion heat transfer on the GPU. First, the Multiple Absorption Coefficient Zonal Method (MACZM) for computing direct radiative exchange factors in 3D semi-transparent media is reviewed and validated. Then, an efficient implementation for MACZM is presented, based on discrete geometry algorithms, and an optimized GPU CUDA parallelization. The CUDA implementation achieves 300 to 600 times speed-up. The Non-recursive Plating Algorithm (NRPA), a non-recursive version of the plating algorithm for computing total exchange factors is then formulated. Due to low-complexity matrix multiplication algorithms, the NRPA has lower complexity than the PA does and it runs up to 750 times faster on the GPU by comparison to the CPU PA. On the other hand, an efficient GPU implementation for the Locally One Dimensional (LOD) finite difference split method for solving heat diffusion is presented, based on an optimiwed alternation between parallelization schemes and equation solvers, achieving accelerations from 75 to 250 times. Finally, all the methods are applied together for solving 3D heat transfer in a steel reheating furnace. A multi-grid approach is applied for MACZM and a zone-by zone computation for the NRPA. As a result, high precision and very fast computation time are achieved, making the methods of high interest for building precise and efficient control units.
92

Automatic Data Allocation, Buffer Management And Data Movement For Multi-GPU Machines

Ramashekar, Thejas 10 1900 (has links) (PDF)
Multi-GPU machines are being increasingly used in high performance computing. These machines are being used both as standalone work stations to run computations on medium to large data sizes (tens of gigabytes) and as a node in a CPU-Multi GPU cluster handling very large data sizes (hundreds of gigabytes to a few terabytes). Each GPU in such a machine has its own memory and does not share the address space either with the host CPU or other GPUs. Hence, applications utilizing multiple GPUs have to manually allocate and managed at a on each GPU. A significant body of scientific applications that utilize multi-GPU machines contain computations inside affine loop nests, i.e., loop nests that have affine bounds and affine array access functions. These include stencils, linear-algebra kernels, dynamic programming codes and data-mining applications. Data allocation, buffer management, and coherency handling are critical steps that need to be performed to run affine applications on multi-GPU machines. Existing works that propose to automate these steps have limitations and in efficiencies in terms of allocation sizes, exploiting reuse, transfer costs and scalability. An automatic multi-GPU memory manager that can overcome these limitations and enable applications to achieve salable performance is highly desired. One technique that has been used in certain memory management contexts in the literature is that of bounding boxes. The bounding box of an array, for a given tile, is the smallest hyper-rectangle that encapsulates all the array elements accessed by that tile. In this thesis, we exploit the potential of bounding boxes for memory management far beyond their current usage in the literature. In this thesis, we propose a scalable and fully automatic data allocation and buffer management scheme for affine loop nests on multi-GPU machines. We call it the Bounding Box based Memory Manager (BBMM). BBMM is a compiler-assisted runtime memory manager. At compile time, it use static analysis techniques to identify a set of bounding boxes accessed by a computation tile. At run time, it uses the bounding box set operations such as union, intersection, difference, finding subset and superset relation to compute a set of disjoint bounding boxes from the set of bounding boxes identified at compile time. It also exploits the architectural capability provided by GPUs to perform fast transfers of rectangular (strided) regions of memory and hence performs all data transfers in terms of bounding boxes. BBMM uses these techniques to automatically allocate, and manage data required by applications (suitably tiled and parallelized for GPUs). This allows It to (1) allocate only as much data (or close to) as is required by computations running on each GPU, (2) efficiently track buffer allocations and hence, maximize data reuse across tiles and minimize the data transfer overhead, (3) and as a result, enable applications to maximize the utilization of the combined memory on multi-GPU machines. BBMM can work with any choice of parallelizing transformations, computation placement, and scheduling schemes, whether static or dynamic. Experiments run on a system with four GPUs with various scientific programs showed that BBMM is able to reduce data allocations on each GPU by up to 75% compared to current allocation schemes, yield at least 88% of the performance of hand-optimized Open CL codes and allows excellent weak scaling.
93

DSA Image Registration And Respiratory Motion Tracking Using Probabilistic Graphical Models

Sundarapandian, Manivannan January 2016 (has links) (PDF)
This thesis addresses three problems related to image registration, prediction and tracking, applied to Angiography and Oncology. For image analysis, various probabilistic models have been employed to characterize the image deformations, target motions and state estimations. (i) In Digital Subtraction Angiography (DSA), having a high quality visualization of the blood motion in the vessels is essential both in diagnostic and interventional applications. In order to reduce the inherent movement artifacts in DSA, non-rigid image registration is used before subtracting the mask from the contrast image. DSA image registration is a challenging problem, as it requires non-rigid matching across spatially non-uniform control points, at high speed. We model the problem of sub-pixel matching, as a labeling problem on a non-uniform Markov Random Field (MRF). We use quad-trees in a novel way to generate the non uniform grid structure and optimize the registration cost using graph-cuts technique. The MRF formulation produces a smooth displacement field which results in better artifact reduction than with the conventional approach of independently registering the control points. The above approach is further improved using two models. First, we introduce the concept of pivotal and non-pivotal control points. `Pivotal control points' are nodes in the Markov network that are close to the edges in the mask image, while 'non-pivotal control points' are identified in soft tissue regions. This model leads to a novel MRF framework and energy formulation. Next, we propose a Gaussian MRF model and solve the energy minimization problem for sub-pixel DSA registration using Random Walker (RW). An incremental registration approach is developed using quad-tree based MRF structure and RW, wherein the density of control points is hierarchically increased at each level M depending of the features to be used and the required accuracy. A novel numbering scheme of the control points allows us to reuse the computations done at level M in M + 1. Both the models result in an accelerated performance without compromising on the artifact reduction. We have also provided a CUDA based design of the algorithm, and shown performance acceleration on a GPU. We have tested the approach using 25 clinical data sets, and have presented the results of quantitative analysis and clinical assessment. (ii) In External Beam Radiation Therapy (EBRT), in order to monitor the intra fraction motion of thoracic and abdominal tumors, the lung diaphragm apex can be used as an internal marker. However, tracking the position of the apex from image based observations is a challenging problem, as it undergoes both position and shape variation. We propose a novel approach for tracking the ipsilateral hemidiaphragm apex (IHDA) position on CBCT projection images. We model the diaphragm state as a spatiotemporal MRF, and obtain the trace of the apex by solving an energy minimization problem through graph-cuts. We have tested the approach using 15 clinical data sets and found that this approach outperforms the conventional full search method in terms of accuracy. We have provided a GPU based heterogeneous implementation of the algorithm using CUDA to increase the viability of the approach for clinical use. (iii) In an adaptive radiotherapy system, irrespective of the methods used for target observations there is an inherent latency in the beam control as they involve mechanical movement and processing delays. Hence predicting the target position during `beam on target' is essential to increase the control precision. We propose a novel prediction model (called o set sine model) for the breathing pattern. We use IHDA positions (from CBCT images) as measurements and an Unscented Kalman Filter (UKF) for state estimation. The results based on 15 clinical datasets show that, o set sine model outperforms the state of the art LCM model in terms of prediction accuracy.
94

Měření částečných výbojů u vysokonapěťových kabelů / Partial discharge measurement of middle and high voltage cables

Pelikán, Luděk January 2018 (has links)
The main topic of this work is the analysis of the measurement of partial discharge measurements on high voltage cables. The thesis also deals with other measurements, including the measurement of the loss factor and the voltage retention and breakdown test. Part of the text describes the issue of partial discharge, its measurement by galvanic method and methods of elimination of disturbing influences that affect this method in terms of sensitivity and accuracy of the measured values. Thesis also describes the construction of the cables, their marking and the tests carried out on them. There is also a description of the cable terminals used for measuring, especially water terminals, which are designed for high voltages and which are carried out in the practical part of the thesis. The next part deals with problems prior to the commissioning of water terminals and their preparation for certain tests, including description and evaluation of the results of the measurements.
95

Detekce pohyblivého objektu ve videu na CUDA / Moving Object Detection in Video Using CUDA

Čermák, Michal January 2011 (has links)
This thesis deals with model-based approach to 3D tracking from monocular video. The 3D model pose dynamically estimated through minimization of objective function by particle filter. Objective function is based on rendered scene to real video similarity.
96

An Efficient Framework for Compressed Sensing Reconstruction of Highly Accelerated Dynamic Cardiac MRI

Ting, Samuel T. 08 June 2016 (has links)
No description available.
97

Analysis Design and Implementation of Artificial Intelligence Techniques in Edge Computing Environments

Hernández Vicente, Daniel 27 March 2023 (has links)
Tesis por compendio / [ES] Edge Computing es un modelo de computación emergente basado en acercar el procesamiento a los dispositivos de captura de datos en las infraestructuras Internet of things (IoT). Edge computing mejora, entre otras cosas, los tiempos de respuesta, ahorra anchos de banda, incrementa la seguridad de los servicios y oculta las caídas transitorias de la red. Este paradigma actúa en contraposición a la ejecución de servicios en entornos cloud y es muy útil cuando se desea desarrollar soluciones de inteligencia artificial (AI) que aborden problemas en entornos de desastres naturales, como pueden ser inundaciones, incendios u otros eventos derivados del cambio climático. La cobertura de estos escenarios puede resultar especialmente difícil debido a la escasez de infraestructuras disponibles, lo que a menudo impide un análisis de los datos basado en la nube en tiempo real. Por lo tanto, es fundamental habilitar técnicas de IA que no dependan de sistemas de cómputo externos y que puedan ser embebidas en dispositivos de móviles como vehículos aéreos no tripulados (VANT), para que puedan captar y procesar información que permita inferir posibles situaciones de emergencia y determinar así el curso de acción más adecuado de manera autónoma. Históricamente, se hacía frente a este tipo de problemas utilizando los VANT como dispositivos de recogida de datos con el fin de, posteriormente, enviar esta información a la nube donde se dispone de servidores capacitados para analizar esta ingente cantidad de información. Este nuevo enfoque pretende realizar todo el procesamiento y la obtención de resultados en el VANT o en un dispositivo local complementario. Esta aproximación permite eliminar la dependencia de un centro de cómputo remoto que añade complejidad a la infraestructura y que no es una opción en escenarios específicos, donde las conexiones inalámbricas no cumplen los requisitos de transferencia de datos o son entornos en los que la información tiene que obtenerse en ese preciso momento, por requisitos de seguridad o inmediatez. Esta tesis doctoral está compuesta de tres propuestas principales. En primer lugar se plantea un sistema de despegue de enjambres de VANTs basado en el algoritmo de Kuhn Munkres que resuelve el problema de asignación en tiempo polinómico. Nuestra evaluación estudia la complejidad de despegue de grandes enjambres y analiza el coste computacional y de calidad de nuestra propuesta. La segunda propuesta es la definición de una secuencia de procesamiento de imágenes de catástrofes naturales tomadas desde drones basada en Deep learning (DL). El objetivo es reducir el número de imágenes que deben procesar los servicios de emergencias en la catástrofe natural para poder tomar acciones sobre el terreno de una manera más rápida. Por último, se utiliza un conjunto de datos de imágenes obtenidas con VANTs y relativas a diferentes inundaciones, en concreto, de la DANA de 2019, cedidas por el Ayuntamiento de San Javier, ejecutando un modelo DL de segmentación semántica que determina automáticamente las regiones más afectadas por las lluvias (zonas inundadas). Entre los resultados obtenidos se destacan los siguientes: 1- la mejora drástica del rendimiento del despegue vertical coordinado de una red de VANTs. 2- La propuesta de un modelo no supervisado para la vigilancia de zonas desconocidas representa un avance para la exploración autónoma mediante VANTs. Esto permite una visión global de una zona concreta sin realizar un estudio detallado de la misma. 3- Por último, un modelo de segmentación semántica de las zonas inundadas, desplegado para el procesamiento de imágenes en el VANTs, permite la obtención de datos de inundaciones en tiempo real (respetando la privacidad) para una reconstrucción virtual fidedigna del evento. Esta tesis ofrece una propuesta para mejorar el despegue coordinado de drones y dotar de capacidad de procesamiento de algoritmos de deep learning a dispositivos edge, más concretamente UAVs autónomos. / [CA] Edge Computing és un model de computació emergent basat a acostar el processament als dispositius de captura de dades en les infraestructures Internet of things (IoT). Edge computing millora, entre altres coses, els temps de resposta, estalvia amplades de banda, incrementa la seguretat dels serveis i oculta les caigudes transitòries de la xarxa. Aquest paradigma actua en contraposició a l'execució de serveis en entorns cloud i és molt útil quan es desitja desenvolupar solucions d'intel·ligència artificial (AI) que aborden problemes en entorns de desastres naturals, com poden ser inundacions, incendis o altres esdeveniments derivats del canvi climàtic. La cobertura d'aquests escenaris pot resultar especialment difícil a causa de l'escassetat d'infraestructures disponibles, la qual cosa sovint impedeix una anàlisi de les dades basat en el núvol en temps real. Per tant, és fonamental habilitar tècniques de IA que no depenguen de sistemes de còmput externs i que puguen ser embegudes en dispositius de mòbils com a vehicles aeris no tripulats (VANT), perquè puguen captar i processar informació per a inferir possibles situacions d'emergència i determinar així el curs d'acció més adequat de manera autònoma. Històricament, es feia front a aquesta mena de problemes utilitzant els VANT com a dispositius de recollida de dades amb la finalitat de, posteriorment, enviar aquesta informació al núvol on es disposa de servidors capacitats per a analitzar aquesta ingent quantitat d'informació. Aquest nou enfocament pretén realitzar tot el processament i l'obtenció de resultats en el VANT o en un dispositiu local complementari. Aquesta aproximació permet eliminar la dependència d'un centre de còmput remot que afig complexitat a la infraestructura i que no és una opció en escenaris específics, on les connexions sense fils no compleixen els requisits de transferència de dades o són entorns en els quals la informació ha d'obtindre's en aqueix precís moment, per requisits de seguretat o immediatesa. Aquesta tesi doctoral està composta de tres propostes principals. En primer lloc es planteja un sistema d'enlairament d'eixams de VANTs basat en l'algorisme de Kuhn Munkres que resol el problema d'assignació en temps polinòmic. La nostra avaluació estudia la complexitat d'enlairament de grans eixams i analitza el cost computacional i de qualitat de la nostra proposta. La segona proposta és la definició d'una seqüència de processament d'imatges de catàstrofes naturals preses des de drons basada en Deep learning (DL).L'objectiu és reduir el nombre d'imatges que han de processar els serveis d'emergències en la catàstrofe natural per a poder prendre accions sobre el terreny d'una manera més ràpida. Finalment, s'utilitza un conjunt de dades d'imatges obtingudes amb VANTs i relatives a diferents inundacions, en concret, de la DANA de 2019, cedides per l'Ajuntament de San Javier, executant un model DL de segmentació semàntica que determina automàticament les regions més afectades per les pluges (zones inundades). Entre els resultats obtinguts es destaquen els següents: 1- la millora dràstica del rendiment de l'enlairament vertical coordinat d'una xarxa de VANTs. 2- La proposta d'un model no supervisat per a la vigilància de zones desconegudes representa un avanç per a l'exploració autònoma mitjançant VANTs. Això permet una visió global d'una zona concreta sense realitzar un estudi detallat d'aquesta. 3- Finalment, un model de segmentació semàntica de les zones inundades, desplegat per al processament d'imatges en el VANTs, permet l'obtenció de dades d'inundacions en temps real (respectant la privacitat) per a una reconstrucció virtual fidedigna de l'esdeveniment. / [EN] Edge Computing is an emerging computing model based on bringing data processing and storage closer to the location needed to improve response times and save bandwidth. This new paradigm acts as opposed to running services in cloud environments and is very useful in developing artificial intelligence (AI) solutions that address problems in natural disaster environments, such as floods, fires, or other events of an adverse nature. Coverage of these scenarios can be particularly challenging due to the lack of available infrastructure, which often precludes real-time cloud-based data analysis. Therefore, it is critical to enable AI techniques that do not rely on external computing systems and can be embedded in mobile devices such as unmanned aerial vehicles (UAVs) so that they can capture and process information to understand their context and determine the appropriate course of action independently. Historically, this problem was addressed by using UAVs as data collection devices to send this information to the cloud, where servers can process it. This new approach aims to do all the processing and get the results on the UAV or a complementary local device. This approach eliminates the dependency on a remote computing center that adds complexity to the infrastructure and is not an option in specific scenarios where wireless connections do not meet the data transfer requirements. It is also an option in environments where the information has to be obtained at that precise moment due to security or immediacy requirements. This study consists of three main proposals. First, we propose a UAV swarm takeoff system based on the Kuhn Munkres algorithm that solves the assignment problem in polynomial time. Our evaluation studies the takeoff complexity of large swarms and analyzes our proposal's computational and quality cost. The second proposal is the definition of a Deep learning (DL) based image processing sequence for natural disaster images taken from drones to reduce the number of images processed by the first responders in the natural disaster. Finally, a dataset of images obtained with UAVs and related to different floods is used to run a semantic segmentation DL model that automatically determines the regions most affected by the rains (flooded areas). The results are 1- The drastic improvement of the performance of the coordinated vertical take-off of a network of UAVs. 2- The proposal of an unsupervised model for the surveillance of unknown areas represents a breakthrough for autonomous exploration by UAVs. This allows a global view of a specific area without performing a detailed study. 3- Finally, a semantic segmentation model of flooded areas, deployed for image processing in the UAV, allows obtaining real-time flood data (respecting privacy) for a reliable virtual reconstruction of the event. This thesis offers a proposal to improve the coordinated take-off of drones, to provide edge devices with deep learning algorithms processing capacity, more specifically autonomous UAVs, in order to develop services for the surveillance of areas affected by natural disasters such as fire detection, segmentation of flooded areas or detection of people in danger. Thanks to this research, services can be developed that enable the coordination of large arrays of drones and allow image processing without needing additional devices. This flexibility makes our approach a bet for the future and thus provides a development path for anyone interested in deploying an autonomous drone-based surveillance and actuation system. / I would like to acknowledge the project Development of High-Performance IoT Infrastructures against Climate Change based on Artificial Intelligence (GLOBALoT). Funded by Ministerio de Ciencia e Innovación (RTC2019-007159-5), of which this thesis is part. / Hernández Vicente, D. (2023). Analysis Design and Implementation of Artificial Intelligence Techniques in Edge Computing Environments [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/192605 / Compendio
98

Distributed Support Vector Machine With Graphics Processing Units

Zhang, Hang 06 August 2009 (has links)
Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. Sequential Minimal Optimization (SMO) is a decomposition-based algorithm which breaks this large QP problem into a series of smallest possible QP problems. However, it still costs O(n2) computation time. In our SVM implementation, we can do training with huge data sets in a distributed manner (by breaking the dataset into chunks, then using Message Passing Interface (MPI) to distribute each chunk to a different machine and processing SVM training within each chunk). In addition, we moved the kernel calculation part in SVM classification to a graphics processing unit (GPU) which has zero scheduling overhead to create concurrent threads. In this thesis, we will take advantage of this GPU architecture to improve the classification performance of SVM.

Page generated in 0.0996 seconds