11

Parametrizace tváře pomocí videosekvence / Face parameterization using videosequence

Lieskovský, Pavol January 2019 (has links)
This work deals with the problem of parameterizing the face from video of a speaking person, and with estimating Parkinson's disease and the progression of its symptoms from the face parameters. It describes the syntax and function of a program, created as part of this work, that solves the face parameterization problem. The program formats the processed data into a time series of parameters in JSON format. From these data a dataset was created, on which artificial intelligence models were trained to predict Parkinson's disease and the progression of its symptoms. The training process and the models' results are documented in this work.
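
The abstract does not give the JSON schema; purely as an illustration, a per-frame time series of face parameters might be laid out like this (all field names and values are invented, not taken from the thesis):

    {
      "video": "subject_042.mp4",
      "fps": 25,
      "frames": [
        {"t": 0.00, "mouth_opening": 0.31, "eye_aperture": 0.82, "brow_raise": 0.12},
        {"t": 0.04, "mouth_opening": 0.33, "eye_aperture": 0.80, "brow_raise": 0.11}
      ]
    }

A structure of this kind makes it straightforward to resample the parameters or slice them into fixed-length windows before training predictive models.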
12

Simultaneous RF/EO Tracking and Characterization of Dismounts

Blackaby, Jason M. 26 June 2008 (has links)
No description available.
13

VIDEO-BASED STANDOFF HEALTH MEASUREMENTS

Jeehyun Choe (6752669) 13 August 2019 (has links)
We addressed two video-based health measurements. The first is video-based Heart Rate (HR) estimation, known as video-based Photoplethysmography (PPG) or videoplethysmography (VHR). We adapted an existing video-based HR estimation method to produce more robust and accurate results. Specifically, we removed periodic signals originating in the recording environment by identifying (and removing) frequency clusters that are present in both the face region and the background. This adaptive passband filter generated more accurate HR estimates and allowed the other filters to work more effectively. Measuring HR in the presence of motion is one of the most challenging problems in recent VHR studies. We investigated and described motion effects in VHR in terms of the change in angle of the subject's skin surface relative to the light source, and based on this understanding we discussed future work on compensating for motion artifacts. The second measurement addressed in this thesis is Videosomnography (VSG), a range of video-based methods used to record and assess sleep vs. wake states in humans. Traditional behavioral-VSG (B-VSG) labeling requires visual inspection of the video by a trained technician to determine whether a subject is asleep or awake. We proposed an automated VSG sleep detection system (auto-VSG) which employs motion analysis to determine sleep vs. wake states in young children. The analyses revealed that estimates generated by the proposed Long Short-Term Memory (LSTM)-based method with long-term temporal dependency are suitable for automated sleep/wake labeling.
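
A minimal numpy sketch of the shared-frequency-removal idea described above (the green-channel trace extraction, HR band, and threshold are assumptions, not the thesis implementation):

    import numpy as np

    def estimate_hr(face_trace, bg_trace, fps):
        """Estimate heart rate (bpm) from mean-intensity traces of the face
        region and the background, removing frequencies present in both."""
        n = len(face_trace)
        freqs = np.fft.rfftfreq(n, d=1.0 / fps)
        face_spec = np.abs(np.fft.rfft(face_trace - np.mean(face_trace)))
        bg_spec = np.abs(np.fft.rfft(bg_trace - np.mean(bg_trace)))

        # Keep only the plausible HR band (40-180 bpm assumed here).
        band = (freqs >= 40 / 60.0) & (freqs <= 180 / 60.0)
        face_spec[~band] = 0.0

        # Suppress frequency clusters that are also strong in the background,
        # i.e. periodic signals from the recording environment.
        shared = bg_spec > 0.5 * bg_spec.max()
        face_spec[shared] = 0.0

        return 60.0 * freqs[np.argmax(face_spec)]  # dominant frequency, in bpm

Zeroing the frequencies that are strong in the background as well as in the face is what lets the remaining filters operate on a cleaner pulse signal.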
14

Perceptual methods for video coding

Unknown Date (has links)
The main goal of video coding algorithms is to achieve high compression efficiency while keeping the quality of the compressed signal as high as possible. The human visual system is the ultimate receiver of the compressed signal and the final judge of its quality. This dissertation presents work towards an optimal video compression algorithm based on the characteristics of our visual system. By modeling phenomena such as backward temporal masking and motion masking, we developed algorithms that are implemented in state-of-the-art video encoders. The result of using our algorithms is visually lossless compression with improved efficiency, as verified by standard subjective quality and psychophysical tests. Savings in bitrate compared to the High Efficiency Video Coding / H.265 reference implementation are up to 45%. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2014. / FAU Electronic Theses and Dissertations Collection
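
The abstract does not detail the encoder integration; as a loose illustration of the masking idea only, a perceptual encoder might raise the quantization parameter where motion masking hides distortion (all breakpoints and offsets below are invented, not the dissertation's model):

    def masked_qp(base_qp, motion_magnitude):
        """Toy motion-masking model: allow coarser quantization (higher QP)
        in fast-moving regions where the HVS is less sensitive to distortion."""
        if motion_magnitude > 16.0:      # fast motion, strong masking
            return base_qp + 4
        elif motion_magnitude > 4.0:     # moderate motion, some masking
            return base_qp + 2
        return base_qp                   # static content: no masking credit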
15

Memory Architecture Template for Fast Block Matching Algorithms on Field Programmable Gate Arrays

Chandrakar, Shant 01 December 2009 (has links)
Fast Block Matching (FBM) algorithms for video compression are well suited for acceleration using parallel data-path architectures on Field Programmable Gate Arrays (FPGAs). However, designing an efficient on-chip memory subsystem that provides the required throughput to such a parallel data-path architecture is a complex problem. This thesis presents a memory architecture template that can be parameterized for a given FBM algorithm, number of parallel Processing Elements (PEs), and block size. The template can be combined with well-known exploration techniques to design efficient on-chip memory subsystems. Memory subsystems are derived for two existing FBM algorithms and implemented on a Xilinx Virtex 4 family FPGA. Results show that the derived memory subsystem in the best case supports up to 27 more parallel PEs than the three existing subsystems and processes integer pixels of a 1080p video sequence at rates of up to 73 frames per second. Speculative execution of an FBM algorithm with the same number of PEs increases the number of frames processed per second by 49%.
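
For background, the kind of FBM algorithm such a memory subsystem must feed can be sketched in software. The following three-step search (a classic FBM method, not necessarily one of the two used in the thesis) shows the irregular, data-dependent reference-frame accesses that make on-chip memory design hard:

    import numpy as np

    def sad(block, cand):
        # Cast up from uint8 so the subtraction cannot wrap around.
        return int(np.abs(block.astype(np.int32) - cand.astype(np.int32)).sum())

    def three_step_search(cur, ref, by, bx, bs=16):
        """Motion vector for the bs x bs block at (by, bx) of `cur`, found by
        a logarithmic search over `ref` that halves the step each round."""
        block = cur[by:by + bs, bx:bx + bs]
        my = mx = 0
        step = 4
        while step >= 1:
            best_cost, best_my, best_mx = None, my, mx
            for dy in (-step, 0, step):
                for dx in (-step, 0, step):
                    y, x = by + my + dy, bx + mx + dx
                    if y < 0 or x < 0 or y + bs > ref.shape[0] or x + bs > ref.shape[1]:
                        continue
                    cost = sad(block, ref[y:y + bs, x:x + bs])
                    if best_cost is None or cost < best_cost:
                        best_cost, best_my, best_mx = cost, my + dy, mx + dx
            my, mx = best_my, best_mx
            step //= 2
        return my, mx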
16

Video sequence synchronization

Wedge, Daniel John January 2008 (has links)
[Truncated abstract] Video sequence synchronization is necessary for any computer vision application that integrates data from multiple simultaneously recorded video sequences. With the increased availability of video cameras as either dedicated devices or as components within digital cameras or mobile phones, a large volume of video data is available as input for a growing range of computer vision applications that process multiple video sequences. To ensure that the output of these applications is correct, accurate video sequence synchronization is essential. Whilst hardware synchronization methods can embed timestamps into each sequence on-the-fly, they require specialized hardware and it is necessary to set up the camera network in advance. On the other hand, computer vision-based software synchronization algorithms can be used to post-process video sequences recorded by cameras that are not networked, such as common consumer hand-held video cameras or cameras embedded in mobile phones, or to synchronize historical videos for which hardware synchronization was not possible. The current state-of-the-art software algorithms vary in their input and output requirements and camera configuration assumptions. ... Next, I describe an approach that synchronizes two video sequences where an object exhibits ballistic motion. Given the epipolar geometry relating the two cameras and the imaged ballistic trajectory of an object, the algorithm uses a novel iterative approach that exploits object motion to rapidly determine pairs of temporally corresponding frames. This algorithm accurately synchronizes videos recorded at different frame rates and takes only a few iterations to converge to sub-frame accuracy. Whereas the first algorithm integrates tracking data from all frames to synchronize the sequences as a whole, this algorithm recovers the synchronization by locating pairs of temporally corresponding frames in each sequence. Finally, I introduce an algorithm for synchronizing two video sequences recorded by stationary cameras with unknown epipolar geometry. This approach is unique in that it recovers both the frame rate ratio and the frame offset of the two sequences by finding matching space-time interest points that represent events in each sequence; the algorithm does not require object tracking. RANSAC-based approaches that take a set of putatively matching interest points and recover either a homography or a fundamental matrix relating a pair of still images are well known. This algorithm extends these techniques by using space-time interest points in place of spatial features, and uses nested instances of RANSAC to also recover the frame rate ratio and frame offset of a pair of video sequences. In this thesis, it is demonstrated that each of the above algorithms can accurately recover the frame rate ratio and frame offset of a range of real video sequences. Each algorithm makes a contribution to the body of video sequence synchronization literature, and it is shown that the synchronization problem can be solved using a range of approaches.
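
The linear model at the heart of the third algorithm maps time in one sequence to time in the other, t2 = a*t1 + b, with a the frame rate ratio and b the frame offset. A hedged sketch of an outer RANSAC loop over putatively matching event times (the nested geometry estimation described above is omitted; iteration count and tolerance are assumptions):

    import random

    def ransac_time_mapping(pairs, iters=1000, tol=0.5):
        """Fit t2 = a*t1 + b from putative event-time matches (t1, t2),
        where a is the frame rate ratio and b the frame offset (in frames)."""
        best_model, best_inliers = None, 0
        for _ in range(iters):
            # Two matches determine a candidate line.
            (t1a, t2a), (t1b, t2b) = random.sample(pairs, 2)
            if t1a == t1b:
                continue
            a = (t2b - t2a) / (t1b - t1a)
            b = t2a - a * t1a
            # Count matches consistent with the candidate mapping.
            inliers = sum(1 for t1, t2 in pairs if abs(a * t1 + b - t2) < tol)
            if inliers > best_inliers:
                best_model, best_inliers = (a, b), inliers
        return best_model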
17

Architectural Enhancements for Color Image and Video Processing on Embedded Systems

Kim, Jongmyon 21 April 2005 (has links)
As emerging portable multimedia applications demand more and more computational throughput with limited energy consumption, the need for high-efficiency, high-throughput embedded processing is becoming an important challenge in computer architecture. In this regard, this dissertation addresses application-, architecture-, and technology-level issues in existing processing systems to provide efficient processing of multimedia in many, or ideally all, of its forms. In particular, this dissertation explores color imaging in multimedia while focusing on two architectural enhancements for memory- and performance-hungry embedded applications: (1) a pixel-truncation technique and (2) a color-aware instruction set (CAX) for embedded multimedia systems. The pixel-truncation technique differs from previous techniques (e.g., 4:2:2 and 4:2:0 subsampling) used in image and video compression applications (e.g., JPEG and MPEG) in that it reduces the information content in individual pixel word sizes rather than in each dimension. Thus, this technique drastically reduces the bandwidth and memory required to transport and store color images without perceivable distortion in color. At the same time, it maintains the pixel storage format of color image processing, in which each pixel computation is performed simultaneously on 3-D YCbCr components, which are widely used in the image and video processing community. CAX supports parallel operations on two packed 16-bit (6:5:5) YCbCr values in a 32-bit datapath processor, providing greater concurrency and efficiency for processing color image sequences. This dissertation presents the impact of CAX on processing performance and on both area and energy efficiency for color imaging applications in three major processor architectures: dynamically scheduled (superscalar), statically scheduled (very long instruction word, VLIW), and embedded single instruction multiple data (SIMD) array processors. Unlike typical multimedia extensions, CAX obtains substantial performance and code density improvements through direct support for color data processing rather than depending solely on generic subword parallelism. In addition, the ability to reduce the data format size reduces system cost, and the reduction in data bandwidth simplifies system design. In summary, CAX, coupled with the pixel-truncation technique, provides an efficient mechanism that meets the computational requirements and cost goals for future embedded multimedia products.
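
The 6:5:5 pixel truncation is concrete enough to sketch. Assuming 8-bit inputs, with Y keeping 6 bits and Cb and Cr keeping 5 bits each, and Y placed in the high bits of the word (the actual bit layout is an assumption):

    def pack_ycbcr_655(y, cb, cr):
        """Truncate 8-bit Y, Cb, Cr to 6, 5, and 5 bits and pack them
        into one 16-bit word (layout assumed: Y in the high bits)."""
        return ((y >> 2) << 10) | ((cb >> 3) << 5) | (cr >> 3)

    def unpack_ycbcr_655(word):
        """Expand a packed 6:5:5 word back to approximate 8-bit components."""
        y  = ((word >> 10) & 0x3F) << 2
        cb = ((word >> 5)  & 0x1F) << 3
        cr = ( word        & 0x1F) << 3
        return y, cb, cr

Two such words fit in a 32-bit datapath, which is what lets CAX operate on a pair of pixels per instruction.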
18

Εκτίμηση οπτικής ροής χρησιμοποιώντας υπερδειγματοληπτημένες ακολουθίες βίντεο / Optical flow estimation using oversampled video sequences

Κατσένου, Αγγελική 21 May 2008 (has links)
A significant problem in video processing is estimating the motion between consecutive video frames, often referred to as optical flow estimation. Motion estimation is used in a wide range of video applications, for example video compression, 3-D surface structure estimation, super-resolution image synthesis, and motion-based segmentation. Recent advances in sensor technology allow video frames to be captured at high rates. Techniques have been presented in the literature that exploit the more accurate representation of optical flow in the oversampled frame sequence, thereby achieving better motion estimates at the standard sampling rate of 30 frames/s. The computational complexity, and therefore the usefulness, of these techniques in real-time applications depends directly on the complexity of the matching algorithm used for motion estimation. In this thesis, some of the most recent techniques proposed in the literature were studied and implemented, and a matching technique was developed that is more efficient in terms of complexity while sacrificing no accuracy.
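
A rough Python sketch of the oversampling idea (block size, search range, and oversampling factor are assumptions): displacements between consecutive high-rate frames are small, so each search is cheap, and chaining them yields the motion across one standard-rate frame interval:

    import numpy as np

    def small_motion(frame_a, frame_b, by, bx, bs=8, r=1):
        """Find where the bs x bs block at (by, bx) of frame_a moved to in
        frame_b, searching only a tiny +/- r window (viable at high rates)."""
        block = frame_a[by:by + bs, bx:bx + bs].astype(np.int32)
        best, best_cost = (0, 0), None
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + bs > frame_b.shape[0] or x + bs > frame_b.shape[1]:
                    continue
                cost = np.abs(block - frame_b[y:y + bs, x:x + bs].astype(np.int32)).sum()
                if best_cost is None or cost < best_cost:
                    best, best_cost = (dy, dx), cost
        return best

    def displacement_at_standard_rate(frames, by, bx, oversample=8):
        """Chain per-frame displacements across one 30 frames/s period
        (e.g. 8 high-rate frames for a 240 frames/s capture)."""
        vy = vx = 0
        for k in range(1, oversample + 1):
            dy, dx = small_motion(frames[k - 1], frames[k], by + vy, bx + vx)
            vy, vx = vy + dy, vx + dx
        return vy, vx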
19

Design of a Real-time Image-based Distance Sensing System by Stereo Vision on FPGA

2012 August 1900 (has links)
A stereo vision system is a robust method for sensing distance information in a scene. This research explores the stereo vision system from the fundamentals of stereo vision and computer stereo vision algorithms to the final implementation of the system on an FPGA chip. In a stereo vision system, images are captured by a pair of stereo image sensors. The distance information can be derived from the disparities between the stereo image pair, based on the theory of binocular geometry. With the increasing focus on 3D vision, stereo vision is becoming a hot topic in the areas of computer games, robot vision and medical applications. In particular, most stereo vision systems are expected to be used in real-time applications. In this thesis, several stereo correspondence algorithms that determine the disparities between a stereo image pair are examined. The algorithms can be categorized into global and local stereo algorithms depending on the optimization techniques used. The global algorithms examined are the Dynamic Time Warp (DTW) algorithm and the DTW with quantization algorithm, while the local algorithms examined are the window-based Sum of Squared Differences (SSD), Sum of Absolute Differences (SAD) and Census transform correlation algorithms. After comparing them, the window-based SAD correlation algorithm is proposed for implementation on an FPGA platform. The proposed algorithm is implemented on an Altera DE2 board featuring an Altera Cyclone II 2C35 FPGA. The implemented module of the algorithm is simulated using ModelSim-Altera to verify the correctness of its functionality. Together with a pair of stereo image sensors and an LCD monitor, a stereo vision system is built. The entire system achieves a real-time video frame rate of 16.83 frames per second at an image resolution of 640 by 480 and produces disparity maps in which objects are clearly distinguished by their relative distance information.
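
For reference, the window-based SAD correlation chosen for the FPGA can be stated compactly in software (window size and disparity range are assumptions; the wrap-around of np.roll at the image border is a sketch simplification the hardware would not share):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def sad_disparity(left, right, max_disp=64, w=7):
        """Window-based SAD stereo matching on a rectified grayscale pair:
        for each pixel, choose the horizontal disparity d minimizing the
        mean absolute difference over a w x w window."""
        best_cost = np.full(left.shape, np.inf)
        disp = np.zeros(left.shape, dtype=np.uint8)
        for d in range(max_disp):
            shifted = np.roll(right, d, axis=1)          # right[x - d] at column x
            diff = np.abs(left.astype(np.float64) - shifted)
            cost = uniform_filter(diff, size=w)          # box-filtered SAD
            better = cost < best_cost
            best_cost[better] = cost[better]
            disp[better] = d
        return disp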
20

Automatic rush generation with application to theatre performances / Cadrage et montage automatique de films de théâtre par analyse sémantique de vidéo

Gandhi, Vineet 18 December 2014 (has links)
Professional quality videos of live staged performances are created by recording them from different appropriate viewpoints. These are then edited together to portray an eloquent story that draws the intended emotion from the viewers. Creating such competent videos involves combining multiple high quality cameras and skilled camera operators. This thesis aims to make even low budget productions adept and pleasant by producing professional quality videos without a fully and expensively equipped crew of cameramen. A high resolution static camera replaces the multi-camera crew, and efficient camera movements are then simulated by virtually panning, tilting and zooming within the original recordings. We show that multiple virtual cameras can be simulated by choosing different trajectories of cropping windows inside the original recording. One of the key novelties of this work is an optimization framework for computing the virtual camera trajectories using information extracted from the original video with computer vision techniques. The actors present on stage are considered the most important elements of the scene. For the task of localizing and naming actors, we introduce generative models for learning view-independent, person- and costume-specific detectors from a set of labeled examples. We explain how to learn the models from a small number of labeled keyframes or video tracks, and how to detect novel appearances of the actors in a maximum likelihood framework. We demonstrate that such actor-specific models can accurately localize actors despite changes in viewpoint and occlusions, and significantly improve detection recall rates over generic detectors. The dissertation then presents an offline algorithm for tracking objects and actors in long video sequences using these actor-specific models. Detections are first performed to independently select candidate locations of the actor/object in each frame of the video. The candidate detections are then combined into smooth trajectories in an optimization step minimizing a cost function that accounts for false detections and occlusions. Using the actor tracks, we propose a framework for automatically generating multiple clips suitable for video editing by simulating pan-tilt-zoom camera movements within the frame of a single static camera. Our method requires only minimal user input to define the subject matter of each sub-clip. The composition of each sub-clip is automatically computed in a novel L1-norm optimization framework. Our approach encodes several common cinematographic practices into a single convex cost-function minimization problem, resulting in aesthetically pleasing sub-clips that can easily be edited together using off-the-shelf multi-clip video editing software.
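
As an illustration of the L1-norm idea (the thesis's actual cost terms are richer; the penalty weight and single-coordinate formulation below are assumptions), a cvxpy sketch that keeps the crop-window center near the actor track while penalizing frame-to-frame changes in L1 norm, which favors hold-then-pan trajectories over continuous jitter:

    import cvxpy as cp
    import numpy as np

    def smooth_crop_centers(actor_x, lam=5.0):
        """Solve for crop-window x-centers that track actor_x, with an L1
        penalty on first differences favoring piecewise-constant motion."""
        x = cp.Variable(len(actor_x))
        cost = cp.norm1(x - actor_x) + lam * cp.norm1(cp.diff(x))
        cp.Problem(cp.Minimize(cost)).solve()
        return x.value

    # Example: a noisy actor track that holds still, then moves across stage.
    track = np.concatenate([np.full(50, 300.0), np.linspace(300, 600, 50)])
    centers = smooth_crop_centers(track + np.random.randn(100))

Because both terms are convex, the whole trajectory is recovered as the global optimum of a single problem rather than by greedy per-frame smoothing.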
