  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Prédiction de performance d'algorithmes de traitement d'images sur différentes architectures hardwares / Image processing algorithm performance prediction on different hardware architectures

Soucies, Nicolas 07 May 2015 (has links)
Dans le contexte de la vision par ordinateur, le choix d’une architecture de calcul est devenu de plus en plus complexe pour un spécialiste du traitement d’images. Le nombre d’architectures permettant de résoudre des algorithmes de traitement d’images augmente d’année en année. Ces algorithmes s’intègrent dans des cadres eux-mêmes de plus en plus complexes répondant à de multiples contraintes, que ce soit en termes de capacité de calcul, mais aussi en termes de consommation ou d’encombrement. À ces contraintes s’ajoute le nombre grandissant de types d’architectures de calcul pouvant répondre aux besoins d’une application (CPU, GPU, FPGA). L’enjeu principal de l’étude est la prédiction de la performance d’un système, cette prédiction pouvant être réalisée en phase amont d’un projet de développement dans le domaine de la vision. Dans un cadre de développement, industriel ou de recherche, l’impact en termes de réduction des coûts de développement est d’autant plus important que le choix de l’architecture de calcul est réalisé tôt. De nombreux outils et méthodes d’évaluation de la performance ont été développés, mais ceux-ci se concentrent rarement sur un domaine précis et ne permettent pas d’évaluer la performance sans une étude complète du code ou sans la réalisation de tests sur l’architecture étudiée. Notre but étant de nous affranchir totalement de tout benchmark, nous nous sommes concentrés sur le domaine du traitement d’images pour pouvoir décomposer les algorithmes du domaine en éléments simples, ici nommés briques élémentaires. Dans cette optique, un nouveau paradigme qui repose sur une décomposition de tout algorithme de traitement d’images en ces briques élémentaires a été conçu. Une méthode est proposée pour modéliser ces briques en fonction de paramètres software et hardware. L’étude démontre que la décomposition en briques élémentaires est réalisable et que ces briques élémentaires peuvent être modélisées.
Les premiers tests sur différentes architectures avec des données réelles et des algorithmes comme la convolution et les ondelettes ont permis de valider l'approche. Ce paradigme est un premier pas vers la réalisation d’un outil qui permettra de proposer des architectures pour le traitement d’images et d’aider à l’optimisation d’un programme dans ce domaine. / In computer vision, the choice of a computing architecture is becoming more and more difficult for image processing experts. Indeed, the number of architectures able to run image processing algorithms increases every year, and so does the number of computer vision applications constrained by computing capacity, power consumption, and size. Furthermore, selecting a hardware architecture, such as a CPU, GPU, or FPGA, is an important issue for computer vision applications. The main goal of this study is to predict system performance at the beginning of a computer vision project. Indeed, for a manufacturer or a researcher, the computing architecture should be selected as early as possible to minimize the impact on development. A large variety of methods and tools has been developed to predict the performance of computing systems, but they rarely cover a specific area, and they cannot predict performance without analyzing the code or running benchmarks on the target architecture. In this work, we focus specifically on predicting the performance of computer vision algorithms without any benchmarking, which is made possible by splitting image processing algorithms into primitive blocks. In this context, a new paradigm based on splitting every image processing algorithm into primitive blocks has been developed, together with a method to model the primitive blocks according to software and hardware parameters. The decomposition into primitive blocks and their modeling were demonstrated to be feasible.
Experiments on different architectures, with real data, using algorithms such as convolution and wavelets, validated the proposed paradigm. This approach is a first step towards a tool that helps choose a hardware architecture and optimize image processing algorithms.
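The primitive-block ("brique élémentaire") idea can be illustrated with a toy sketch: each elementary block gets a simple calibrated cost model per architecture, and an algorithm's runtime is predicted by summing its blocks, with no benchmark of the full algorithm. All block names and coefficients below are invented for illustration and are not taken from the thesis.

```python
# Toy sketch of the primitive-block idea: hypothetical calibrated cost
# models of the form runtime_us = a * pixels + b, one per block and per
# architecture. Coefficients are invented for illustration.
COST_MODELS = {
    "cpu": {"convolution3x3": (0.004, 50.0), "wavelet_level": (0.006, 80.0)},
    "gpu": {"convolution3x3": (0.0005, 300.0), "wavelet_level": (0.0008, 350.0)},
}

def predict_runtime_us(architecture, blocks, width, height):
    """Predict total runtime (in microseconds) of an algorithm expressed
    as a sequence of primitive blocks, without running any benchmark."""
    pixels = width * height
    total = 0.0
    for block in blocks:
        a, b = COST_MODELS[architecture][block]
        total += a * pixels + b
    return total

# Example: a convolution followed by two wavelet levels on a 1280x720 image.
algo = ["convolution3x3", "wavelet_level", "wavelet_level"]
cpu_estimate = predict_runtime_us("cpu", algo, 1280, 720)
gpu_estimate = predict_runtime_us("gpu", algo, 1280, 720)
```

In the thesis the models depend on further software and hardware parameters; the point here is only the decompose-model-and-sum structure that avoids benchmarking.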
82

Uppdatering av IT-stöd hos Markbyggarna AB

Zakharina, Tatiana January 2019 (has links)
Markbyggarna AB is a company that mainly performs groundwork and machine services. The company's IT system is designed to support typical office work. The project focused on helping the company make the existing system more efficient and, to a greater extent, make use of the free alternatives available today. The work started by defining the system's critical aspects. For the hardware, requirements were set on CPU, RAM, and hard-drive load. For the software, the functions the system should provide were defined. For the wired and wireless networks, requirements on the internet connection were defined. Information about the performance of the existing system was then collected using various monitoring systems, tools, and interviews. PGRG was used for hardware monitoring. For network monitoring, a custom monitoring setup was built with the command-line tools Iperf and Speedtest-cli and the tool Vistumbler. The monitoring results were compared with the desired properties, and the difference between the two formed the basis for the improvement work, with the client's wishes also taken into account. The practical part of the project thus included strengthening the wireless signal by installing an access point, introducing new applications, setting up remote access to the company's PC from home, and creating a backup system. A number of other security measures were also taken. To evaluate the completed work, the wireless network was monitored again. These changes have made the IT system more secure, with a better-performing network; flexibility and functionality have also increased. In total, the project cost SEK 1,660 for hardware purchases, and most of the company's needs could be covered with free software.
83

Prestanda och precision på en enkortsdator i ett system med realtidskrav / Performance and precision of a single-board computer in a system with real-time requirements

Wikman, Torbjörn, Hassel, Philip January 2014 (has links)
The report investigates how well a certain type of affordable embedded single-board computer holds up against today's more expensive computers, by running various tests on a system with specified requirements. The system uses a Raspberry Pi as the single-board computer, whose task is to steer a camera based on coordinates obtained from a server and to capture and stream a video signal over a network. The investigations measured how much network traffic the single-board computer generated with different video formats and how much CPU utilization was required. Tests were also made to verify the precision of the camera steering. All investigations were experimental: several tests were performed and analyzed. The results show that sufficiently good precision can be obtained from the camera steering unit, for which two different servos were examined. With the MJPEG and H.264 video formats, the single-board computer can transmit a video signal at up to 1280x720 and 15 fps. The system managed to fetch an object from the server and perform calculations on it in 42.3 ms. However, when the entire system was running at once, the Raspberry Pi could not deliver a video signal and fetch coordinates from the server quickly enough. Performance varied with the video format, but no configuration kept the system stable enough to meet the requirements. / Rapportens syfte är att undersöka hur väl en viss typ av billigare enkortsdator kan stå sig mot dagens dyrare datorer i ett datorsystem genom att göra olika undersökningar på ett system med uppsatta krav. Systemet har en Raspberry Pi som enkortsdator och har till uppgift att styra en kamera utifrån koordinater som fås från en server samt fånga och strömma en videosignal ut på ett nätverk.
De undersökningar som gjordes var att kontrollera hur mycket nätverkstrafik som enkortsdatorn sände vid olika format på videosignalen samt hur mycket CPU- utnyttjande som krävdes. Undersökningar gjordes också för att säkerställa precisionen på kamerastyrningen. Alla undersökningar har varit experimentella, där flera olika tester har utförts och analyserats. Resultatet från undersökningarna visar att en tillräckligt god precision kan fås från kamerastyrningen, där två olika servon har undersökts. När videoformaten MJPEG och H.264 används kan enkortsdatorn klara av att sända ut en videosignal upp till 1280x720 med 15 bildrutor per sekund. I systemet som testerna utfördes på klarade enkortsdatorn av att hämta och utföra beräkningar på ett objekt från servern på 42,3 ms. När hela systemet var igång samtidigt klarade dock inte Raspberry Pi av att leverera en videosignal och hämta koordinater från servern tillräckligt snabbt. Beroende på vilket videoformat som användes presterade enkortsdatorn olika bra, men det var ingen inställning som stabilt klarade av att nå kraven.
84

SYSTEMS SUPPORT FOR DATA ANALYTICS BY EXPLOITING MODERN HARDWARE

Hongyu Miao (11751590) 03 December 2021 (has links)
<p>A large volume of data is continuously being generated by data centers, humans, and the Internet of Things (IoT). To extract useful insights, these enormous volumes of data must be processed in time, with high throughput, low latency, and high accuracy. To meet such performance demands, vendors are shipping a large body of new hardware, such as multi-core CPUs, 3D-stacked memory, embedded microcontrollers, and other accelerators.</p><br><p>However, traditional operating systems (OSes) and data analytics frameworks, the key layer that bridges high-level data processing applications and low-level hardware, fail to meet these requirements, because new hardware evolves quickly and data volumes keep exploding. For instance, general-purpose OSes are not aware of the unique characteristics and demands of data processing applications. Data analytics engines for stream processing, e.g., Apache Spark and Beam, add more machines to deal with more data but leave every single machine underutilized, without fully exploiting the underlying hardware features, which leads to poor efficiency. Data analytics frameworks for machine learning inference on IoT devices cannot run neural networks that exceed SRAM size, which disqualifies many important use cases.</p><br><p>To bridge the gap between the performance demands of data analytics and the features of emerging hardware, this thesis explores runtime system designs for high-level data processing applications that exploit low-level modern hardware features. We study two important data analytics applications, real-time stream processing and on-device machine learning inference, on three important hardware platforms across the cloud and the edge: multicore CPUs, a hybrid memory system combining 3D-stacked memory with general DRAM, and embedded microcontrollers with limited resources. 
</p><br><p>In order to speed up and enable the two data analytics applications on the three hardware platforms, this thesis contributes three related research projects. In project StreamBox, we exploit the parallelism and memory hierarchy of modern multicore hardware on single machines for stream processing, achieving scalable and highly efficient performance. In project StreamBox-HBM, we exploit hybrid memories to balance bandwidth and latency, achieving memory scalability and highly efficient performance. StreamBox and StreamBox-HBM both offer orders of magnitude performance improvements over the prior state of the art, opening up new applications with higher data processing needs. In project SwapNN, we investigate a system solution for microcontrollers (MCUs) to execute neural network (NN) inference out-of-core without losing accuracy, enabling new use cases and significantly expanding the scope of NN inference on tiny MCUs. </p><br><p>We report the system designs, system implementations, and experimental results. Based on our experience in building the above systems, we provide general guidance on designing runtime systems across the hardware/software stack for a wider range of new applications on future hardware platforms.</p><div><br></div>
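The single-machine stream processing idea can be sketched, very loosely, as splitting a record stream into independent windows whose operators run on a worker pool. This is a minimal illustration of window-parallel processing under placeholder choices (window size, a sum operator, a thread pool), not the StreamBox design, which schedules fine-grained work across cores and memory domains.

```python
# Minimal sketch: windows of a record stream processed by a worker pool.
# A real engine handles out-of-order data, cache-conscious queues, and
# core pinning; this only shows the shape of window-parallel processing.
from concurrent.futures import ThreadPoolExecutor

def window(records, size):
    """Split a stream of records into fixed-size windows."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def aggregate(win):
    # Per-window operator; a real engine would run a user-defined function.
    return sum(win)

def process_stream(records, size, workers=4):
    wins = window(records, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(aggregate, wins))
```

Because windows are independent, adding workers scales the per-machine throughput instead of adding machines, which is the efficiency argument made above.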
85

Zpracování obrazu s velkými datovými toky - využití CUDA/OpenCL / High data rate image processing using CUDA/OpenCL

Sedláček, Filip January 2018 (has links)
The main objective of this research is to optimize a defect-detection algorithm used in the production of nonwoven textile. The algorithm was developed by CAMEA spol. s.r.o. As a consequence of upgrading the current camera system to a more powerful one, it is necessary to optimize the current algorithm and to choose hardware with an appropriate architecture on which the calculations will be performed. This work describes useful programming techniques of the CUDA architecture and the OpenCL framework in detail. Using these tools, we implement a parallel equivalent of the current algorithm, describe various optimization methods, and design a GUI to test these methods.
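The kind of kernel such a port typically targets is a 2D convolution over the image, the core of many defect-detection filters. A plain-Python reference version is sketched below purely for illustration (the thesis's actual algorithm belongs to CAMEA and is not reproduced here); on a GPU, each output pixel becomes one CUDA/OpenCL work-item.

```python
# Pure-Python reference 2D convolution ("valid" mode, correlation
# convention, as is common in image processing). On a GPU this loop nest
# maps naturally to one thread per output pixel, which is exactly what a
# CUDA/OpenCL port parallelizes.
def convolve2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(acc)
        out.append(row)
    return out
```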
86

Expertní systém / Expert system

Šimková, Jana January 2010 (has links)
The main goal of this work is to become familiar with the NPS32 expert system and to describe the ways of acquiring knowledge. After choosing a suitable domain for the expert system application, the result of the work is a proposed knowledge base for that domain.
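How a rule-based expert system derives conclusions from a knowledge base can be sketched with a toy forward-chaining loop; the rules below are invented examples and have nothing to do with the actual NPS32 knowledge base.

```python
# Toy forward-chaining inference: repeatedly fire every rule whose
# conditions are satisfied until no new fact can be derived. The rules
# are invented for illustration.
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_breath"}, "see_doctor"),
]

def infer(facts, rules=RULES):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts
```

Building a knowledge base for a chosen domain then amounts to writing down such condition/conclusion rules with a domain expert.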
87

Využití GPU pro akceleraci optimalizace systému vodních děl / The GPU Accelerated Optimisation of the Water Management Systems

Marek, Jan January 2014 (has links)
The subject of this thesis is the optimization of the storage function of a water management system. The work builds on the dissertation of Ing. Pavel Menšík, Ph.D., Automatization of the storage function of a water management system. Differential evolution was chosen as the optimization method. A sequential version of the method is implemented first, followed by CPU-accelerated and GPU-accelerated versions.
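Differential evolution itself is compact enough to sketch; below is a generic DE/rand/1/bin loop run on an illustrative objective, not the water-management model from the thesis. The control parameters (F, CR, population size) are common textbook defaults.

```python
# Generic differential evolution (DE/rand/1/bin) sketch. Objective,
# bounds, and control parameters are illustrative defaults, not the
# reservoir model optimized in the thesis.
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [f(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct other individuals.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == jrand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clamp to the search bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection between parent and trial vector.
            ts = f(trial)
            if ts <= scores[i]:
                pop[i], scores[i] = trial, ts
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

Each trial-vector evaluation is independent of the others within a generation, which is what the CPU- and GPU-accelerated versions can parallelize.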
88

Akcelerace částicových rojů PSO pomocí GPU / Particle Swarm Optimization on GPUs

Záň, Drahoslav January 2013 (has links)
This thesis deals with the population-based stochastic optimization technique PSO (Particle Swarm Optimization) and its acceleration. This simple but very effective technique is designed for solving difficult multidimensional problems in a wide range of applications. The aim of this work is to develop a parallel implementation of the algorithm with an emphasis on accelerating the search for a solution. For this purpose, a graphics card (GPU), which provides massive computing performance, was chosen. To evaluate the benefits of the proposed implementation, CPU and GPU implementations were created for solving a problem derived from the well-known NP-hard Knapsack problem. The GPU application achieves an average speedup of 5x and a maximum speedup of almost 10x over the optimized CPU application on which it is based.
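A minimal global-best PSO loop is sketched below to make the technique concrete; the inertia and acceleration coefficients are common textbook values, and the simple continuous objective in the test stands in for the Knapsack-derived problem of the thesis. Since each particle's update and fitness evaluation are independent, the loop over particles is what a GPU implementation runs as one thread per particle.

```python
# Minimal global-best PSO sketch. Parameters (w, c1, c2) are common
# textbook values; the objective optimized is supplied by the caller.
import random

def pso(f, bounds, swarm=30, w=0.7, c1=1.5, c2=1.5, iters=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [f(p) for p in pos]
    g = min(range(swarm), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best
    for _ in range(iters):
        for i in range(swarm):                   # one GPU thread per particle
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                lo, hi = bounds[j]
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```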
89

A comparison of Hybrid and Progressive Web Applications for the Android platform

Eleskovic, Denis January 2021 (has links)
The Hybrid approach has long been the dominant way to develop cross-platform applications targeting both the web and mobile. In recent years, a new combination of technologies has appeared, the Progressive Web Application (PWA), which aims to combine Native capabilities with best practices of the web to deliver a Native-like experience to users without the need for Native wrappers. So far, PWAs have proven to be the inferior choice in terms of performance and platform support. The purpose of this study is to compare the two technologies based on a literature review and to evaluate their current performance in an experiment across three parameters: battery consumption, CPU utilization, and time to first activity. Two applications were developed, one with each technique: the Apache Cordova framework for the Hybrid approach and the React framework to implement the PWA features. The results showed that the Hybrid approach is better in the majority of tests, offering more platform API access and better performance, and being slower only in time to first activity; notably, though, the PWA approach was not far behind. The study concludes that PWAs have developed significantly since previous studies and can almost match Hybrid apps in terms of APIs and performance, but that Hybrid apps are still the preferred choice when performance matters. Further development and wider adoption of the PWA specification could well change how developers approach mobile app development in the future and bring the web closer to the mobile platform.
90

Parallelizing Digital Signal Processing for GPU

Ekstam Ljusegren, Hannes, Jonsson, Hannes January 2020 (has links)
Because of the increasing importance of signal processing in today's society, there is a need to experiment easily with new ways to process signals. Fast digital signal processing is usually done on special-purpose hardware that is difficult to develop for. GPUs offer an alternative for high-performance digital signal processing. The work in this thesis is an analysis and implementation of a GPU version of a digital signal processing chain provided by SAAB. Through an iterative process of development and testing, a final implementation was achieved. Two benchmarks, both comprising 4.2 M test samples, were made to compare the CPU implementation with the GPU implementation. The benchmarks were run on three different platforms: a desktop computer, an NVIDIA Jetson AGX Xavier, and an NVIDIA Jetson TX2. The results show that the parallelized version can reach several orders of magnitude higher throughput than the CPU implementation.
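One representative stage of such a processing chain is a FIR filter; a plain-Python reference is sketched below (the taps in the test are illustrative, and SAAB's actual chain is not public). On a GPU, every output sample is an independent dot product, so one thread can compute one sample, which is where the throughput gain comes from.

```python
# Reference FIR filter, one typical stage of a DSP chain:
#   y[n] = sum_k taps[k] * x[n - k]
# Each y[n] is independent of the others, so a GPU version assigns one
# thread per output sample.
def fir_filter(samples, taps):
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * samples[n - k]
        out.append(acc)
    return out
```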
