Global ETD Search

81	Optimized information processing in resource-constrained vision systems. From low-complexity coding to smart sensor networks MORBEE, MARLEEN 14 October 2011 Vision systems have become ubiquitous. They are used for traffic monitoring, elderly care, video conferencing, virtual reality, surveillance, smart rooms, home automation, sports game analysis, industrial safety, medical care, and more. In most vision systems, the data coming from the visual sensor(s) is processed before transmission in order to save communication bandwidth or achieve higher frame rates. The type of data processing needs to be chosen carefully depending on the targeted application, taking into account the available memory, computational power, energy resources, and bandwidth constraints. In this dissertation, we investigate how a vision system should be built under practical constraints. First, this system should be intelligent, such that the right data is extracted from the video source. Second, when processing video data, this intelligent vision system should know its own practical limitations and should try to achieve the best possible output within its capabilities. We study and improve a wide range of vision systems for a variety of applications, each of which comes with different types of constraints. First, we present a modulo-PCM-based coding algorithm for applications that demand very low complexity coding and need to preserve some of the advantageous properties of PCM coding (direct processing, random access, rate scalability). Our modulo-PCM coding scheme combines three well-known, simple source coding strategies: PCM, binning, and interpolative coding. The encoder first analyzes the signal statistics in a very simple way. Then, based on these signal statistics, the encoder simply discards a number of bits of each image sample. 
The modulo-PCM decoder recovers the removed bits of each sample by using its received bits and side information, which is generated by interpolating previously decoded signals. Our algorithm is especially appropriate for image coding. / Morbee, M. (2011). Optimized information processing in resource-constrained vision systems. From low-complexity coding to smart sensor networks [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/12126 Vision systems Distributed video coding Sensor networks Smart cameras Occupancy sensing Resource-constrained systems Computer vision Task assignment Camera selection Image and video processing Image and video compression Wyner-Ziv coding Low-complexity Line sensors Information processing TEORIA DE LA SEÑAL Y COMUNICACIONES
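The modulo-PCM idea in the abstract above can be illustrated with a minimal sketch. It rests on assumptions not stated in the record: the encoder is assumed to keep the `kept_bits` least significant bits of each sample (the discarded bits being the most significant ones), the side information is assumed to be the average of two neighboring samples sent as plain PCM, and all function names are hypothetical. It is not the dissertation's actual codec.

```python
def mpcm_encode(samples, kept_bits):
    """Keep only the `kept_bits` least significant bits of each sample (assumed binning rule)."""
    modulus = 1 << kept_bits
    return [x % modulus for x in samples]


def interpolate_side_info(left, right):
    """Assumed side information: integer average of two already decoded neighbors."""
    return (left + right) // 2


def mpcm_decode(residues, side_info, kept_bits, max_value=255):
    """For each residue, pick the value congruent to it that lies closest to the side information."""
    modulus = 1 << kept_bits
    decoded = []
    for r, y in zip(residues, side_info):
        # Largest value <= y that has the received low-order bits.
        base = y - ((y - r) % modulus)
        # The true sample is assumed to be either `base` or the next value in the same bin.
        candidates = [c for c in (base, base + modulus) if 0 <= c <= max_value]
        decoded.append(min(candidates, key=lambda c: abs(c - y)))
    return decoded


# Hypothetical 1-D signal: even positions are sent as full PCM, odd positions as residues.
signal = [100, 104, 109, 115, 120]
keys = signal[0::2]                      # 100, 109, 120
residues = mpcm_encode(signal[1::2], 4)  # low 4 bits of 104 and 115
side = [interpolate_side_info(a, b) for a, b in zip(keys, keys[1:])]
recovered = mpcm_decode(residues, side, 4)
```

Under these assumptions, decoding is exact whenever the side information is within half a bin width (here 8 gray levels) of the true sample, which is why the number of kept bits would be tied to the signal statistics.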
82	A Novel Access Technology Based on Infrared Thermography for People with Severe Motor Impairments Memarian, Negar 18 February 2011 Many individuals with severe motor impairments are cognitively capable but, because of their physical impairments, unable to express their intention through conventional means of communication. Access technologies are devices that attempt to translate the intention of these individuals into functional activity by harnessing their residual physical or physiological abilities. The primary objective of this thesis was to design and develop a novel non-invasive and non-contact access technology based on infrared thermal imaging. This access technology translates the local temperature change associated with voluntary mouth opening into activation of a binary switch, such as a mouse click or key press. To this end, an algorithm based on motion and temperature analyses and on morphological and anthropometric filters was designed to detect mouth opening activity in thermal video in real time. The secondary objective of this thesis was to introduce a mutual information measure for objective assessment of binary switch users’ performance. A model was suggested in which the combination of cognitive and physical abilities of the human user of a binary access switch constitutes a communication channel. The proposed mutual information measure estimates the rate of information transmission in the ‘human communication channel’ during stimulus-response tasks. Using this measure, in a study with ten able-bodied participants, the infrared thermal switch was validated against a conventional chin switch. Impairments in body functions and structures that may contraindicate the use of the infrared thermal switch were explored in a study with seven clients with severe disabilities. Potential hard and soft technological solutions to mitigate the effect of these impairments on infrared thermal switch use were recommended. 
Finally, the infrared thermal switch was tailored to meet the needs of a young man with severe spastic quadriplegic cerebral palsy who had no other means of physical access. rehabilitation engineering assistive technology infrared thermography image processing infrared thermal imaging access technology human-computer interaction mutual information communication channel contextual factors severe motor impairments people with disability client-centred design video processing access switch access pathway paediatric non-invasive non-contact biomedical engineering algorithm 0541
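As a rough illustration of how a mutual information measure could score a binary switch user, the sketch below computes the mutual information, in bits per trial, between stimulus and response from a 2x2 table of trial counts. The table layout, the function name, and the example counts are assumptions made for illustration; the thesis's actual estimator and its conversion to a transmission rate are not described in this record.

```python
import math


def mutual_information_bits(counts):
    """Mutual information (bits per trial) between stimulus (rows) and response (columns).

    `counts[i][j]` is the number of trials with stimulus i and response j,
    e.g. rows = (activate, rest) and columns = (switch fired, switch idle).
    """
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                joint = n / total
                mi += joint * math.log2(n * total / (row_totals[i] * col_totals[j]))
    return mi


# Hypothetical counts: an error-free user and a user responding at chance.
perfect_user = mutual_information_bits([[50, 0], [0, 50]])
chance_user = mutual_information_bits([[25, 25], [25, 25]])
```

With balanced stimuli, an error-free user transmits one bit per trial and a user at chance transmits none; dividing by the mean trial duration would give a rate in bits per second, again as an assumption about how the measure is applied.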
84	Iterative tensor factorization based on Krylov subspace-type methods with applications to image processing UGWU, UGOCHUKWU OBINNA 06 October 2021 No description available. Applied Mathematics Inverse problems Iterative tensor decomposition Krylov subspaces Image and video processing Truncated iterations Tensor Arnoldi process Tensor Golub-Kahan bidiagonalization Tensor Lanczos process tensor SVD Randomized tensor SVD t-product Invertible linear transform
85	Multimedia Forensics Using Metadata Ziyue Xiang (17989381) 21 February 2024 The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from videos, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques using the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer neural network based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files, with higher localization accuracy than other methods. Third, we present an H.264-based video device matching method. This method can determine whether two video sequences were captured by the same device even if the method has never encountered the device. Our proposed method achieved good performance in a three-fold cross-validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. 
The proposed method is trained using Self-Supervised Learning (SSL), which increases its robustness and makes it capable of handling missing/unseen data. Fifth, we present an efficient approach to compute the spectrogram feature from MP3 compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of MP3 decoding time. The resulting spectrogram features lead to higher synthetic speech detection performance. Audio processing Computer vision Image and video coding Image processing Pattern recognition Video processing Digital forensics Deep learning Deepfake detection Video forensics Audio forensics Video metadata Audio metadata H.264 MP3 MP4 Video manipulation detection Video compression Audio compression Decision tree Dimensionality reduction Spectrogram Graph neural networks Neural networks Transformer neural networks
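To make the notion of an MP4/MOV "metadata tree" concrete, the sketch below walks the nested box structure of an ISO base media file: each box starts with a 4-byte big-endian size and a 4-byte type, a size of 1 signals a 64-bit size field that follows, and a size of 0 means the box runs to the end of its parent. The set of container types, the function names, and the synthetic sample are assumptions for illustration; this is only a container walk, not the feature extraction used in the thesis.

```python
import struct

# Box types assumed here to contain only child boxes (a small, non-exhaustive set).
CONTAINER_TYPES = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"udta"}


def parse_boxes(data, offset=0, end=None):
    """Return a list of (type, size, children) tuples for the boxes in data[offset:end]."""
    if end is None:
        end = len(data)
    boxes = []
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit size stored right after the type field
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the enclosing span
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated box: stop rather than guess
        children = []
        if box_type in CONTAINER_TYPES:
            children = parse_boxes(data, offset + header, offset + size)
        boxes.append((box_type.decode("latin-1"), size, children))
        offset += size
    return boxes


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size header (helper for the synthetic example only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic, hypothetical file: ftyp followed by moov containing mvhd and trak/tkhd.
sample = make_box(b"ftyp", b"isom" + b"\x00" * 4) + make_box(
    b"moov", make_box(b"mvhd", b"\x00" * 4) + make_box(b"trak", make_box(b"tkhd"))
)
tree = parse_boxes(sample)
```

The resulting nested tuples are one possible in-memory form of the tree that a graph-based model could consume; which boxes and fields matter for forensics is left open here.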
86	Video extraction for fast content access to MPEG compressed videos Jiang, Jianmin, Weng, Y. 09 June 2009 As existing video processing technology is primarily developed in the pixel domain yet digital video is stored in compressed format, any application of those techniques to compressed videos requires decompression. For discrete cosine transform (DCT)-based MPEG compressed videos, the computing cost of the standard row-by-row and column-by-column inverse DCT (IDCT) for a block of 8×8 elements is 4096 multiplications and 4032 additions, although a practical implementation requires only 1024 multiplications and 896 additions. In this paper, we propose a new algorithm to extract videos directly from the MPEG compressed domain (DCT domain) without full IDCT, described in three extraction schemes: 1) video extraction in 2×2 blocks with four coefficients; 2) video extraction in 4×4 blocks with four DCT coefficients; and 3) video extraction in 4×4 blocks with nine DCT coefficients. The computing cost is only 8 additions and no multiplications for the first scheme, 2 multiplications and 28 additions for the second scheme, and 47 additions (no multiplications) for the third scheme. Extensive experiments were carried out, and the results reveal that: 1) the extracted video maintains competitive quality in terms of visual perception and inspection and 2) the extracted videos preserve the content well in comparison with fully decompressed ones in terms of histogram measurement. As a result, the proposed algorithm provides useful tools for bridging the gap between the pixel domain and the compressed domain, facilitating content analysis with low latency and high efficiency in applications such as surveillance video, interactive multimedia, and image processing. 
Data compression Video coding Discrete cosine transforms MPEG compressed videos Extraction schemes Visual perception Fast content access Computing cost Discrete cosine transform Histogram measurement Image processing Digital video Video processing technology Visual inspection Video extraction Interactive multimedia
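The two pairs of operation counts quoted for the 8×8 IDCT can be reproduced with a simple count. This assumes plain sum-of-products arithmetic with no fast transform: evaluating every output pixel directly from all 64 coefficients, versus applying sixteen 8-point 1-D transforms (8 rows, then 8 columns). The function names are hypothetical and the mapping of each pair to a specific implementation is a reading of the numbers, not a statement from the paper.

```python
def direct_2d_idct_ops(n=8):
    """Each of the n*n outputs is a sum of n*n products: (multiplications, additions)."""
    outputs = n * n
    return outputs * n * n, outputs * (n * n - 1)


def separable_idct_ops(n=8):
    """2n one-dimensional n-point transforms, each output a sum of n products."""
    transforms = 2 * n
    return transforms * n * n, transforms * n * (n - 1)


direct = direct_2d_idct_ops()       # matches the larger quoted pair
separable = separable_idct_ops()    # matches the smaller quoted pair
```

Against either count, the quoted cost of the proposed schemes (at most 2 multiplications and a few dozen additions) is smaller by well over an order of magnitude, which is the point of extracting reduced-size video directly in the DCT domain.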
87	Crime Detection From Pre-crime Video Analysis Sedat Kilic (18363729) 03 June 2024 This research investigates the detection of pre-crime events, specifically targeting behaviors indicative of shoplifting, through the advanced analysis of CCTV video data. The study introduces an approach that leverages augmented human pose and emotion information within individual frames, combined with the extraction of activity information across subsequent frames, to enhance the identification of potential shoplifting actions before they occur. Using a diverse set of models, including 3D Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and a specially developed transformer architecture, the research systematically explores the impact of integrating additional contextual information into video analysis. By augmenting frame-level video data with detailed pose and emotion insights, and focusing on the temporal dynamics between frames, our methodology aims to capture the nuanced behavioral patterns that precede shoplifting events. The comprehensive experimental evaluation of our models across different configurations reveals a significant improvement in the accuracy of pre-crime detection. 
The findings underscore the crucial role of combining visual features with augmented data and the importance of analyzing activity patterns over time for a deeper understanding of pre-shoplifting behaviors. The study’s contributions include a detailed examination of pre-crime frames, strategic augmentation of video data with added contextual information, the creation of a novel transformer architecture customized for pre-crime analysis, and an extensive evaluation of various computational models to improve predictive accuracy. Computer vision Image and video coding Image processing Pattern recognition Video processing crime detection video analysis augmented information pose estimation emotion estimation optical flow deep learning pre-crime video analysis video understanding anomaly detection contextual information shoplifting prevention crime prevention vision transformer transformer generative AI
88	MEMS-Laser-Display-System / MEMS Laser Display System Specht, Hendrik 19 October 2011 This thesis theoretically analyzes the system aspects of MEMS-scanner-based laser display technology that relate to beam deflection and, building on the results, implements a laser display system as a test platform. Two variants are realized: one based on two 1D scanners and one based on a single 2D scanner. In addition, an image-based multi-parameter test method is developed that is suitable both for testing completed beam deflection units or projection modules and for comprehensive, time-efficient testing of MEMS scanners at wafer level. The two realized laser display variants are characterized with this method. Starting from the properties of the human visual system and the resulting requirements on the image, together with a system-theoretical treatment of the mechanical behavior of MEMS scanners, one focus is the generation of drive signals for resonant operation of the fast axis and quasi-static operation of the slow axis. Besides the purely digital controller and filter design and several linearization measures, this also includes the derivation of FPGA-based video signal processing that converts scan pattern, timing, and resolution, with corresponding synchronization of beam deflection and laser modulation. Based on the resulting insights into the relationship between scanner/system parameters and image parameters, combinations of test images and image processing algorithms are developed and, arranged in a sequence and complemented by a calibration procedure, completed into a test method for MEMS scanners. 
The results of this work arose from industry-commissioned R&D projects and feed into the ongoing continuation of the topic at the commissioning company. laser display micro mirror MEMS MOEMS image quality resolution distortion linearity video processing video signal processing scaling scan pattern line scan raster scan Lissajous system model position control phase locked loop FIR IIR filter resonant drive quasi-static drive test method wafer level test image processing ddc:003 ddc:620 ddc:629
89	MEMS-Laser-Display-System: Analyse, Implementierung und Testverfahrenentwicklung [MEMS laser display system: analysis, implementation, and test method development] Specht, Hendrik 20 May 2011 This thesis theoretically analyzes the system aspects of MEMS-scanner-based laser display technology that relate to beam deflection and, building on the results, implements a laser display system as a test platform. Two variants are realized: one based on two 1D scanners and one based on a single 2D scanner. In addition, an image-based multi-parameter test method is developed that is suitable both for testing completed beam deflection units or projection modules and for comprehensive, time-efficient testing of MEMS scanners at wafer level. The two realized laser display variants are characterized with this method. Starting from the properties of the human visual system and the resulting requirements on the image, together with a system-theoretical treatment of the mechanical behavior of MEMS scanners, one focus is the generation of drive signals for resonant operation of the fast axis and quasi-static operation of the slow axis. Besides the purely digital controller and filter design and several linearization measures, this also includes the derivation of FPGA-based video signal processing that converts scan pattern, timing, and resolution, with corresponding synchronization of beam deflection and laser modulation. Based on the resulting insights into the relationship between scanner/system parameters and image parameters, combinations of test images and image processing algorithms are developed and, arranged in a sequence and complemented by a calibration procedure, completed into a test method for MEMS scanners. 
The results of this work arose from industry-commissioned R&D projects and feed into the ongoing continuation of the topic at the commissioning company. info:eu-repo/classification/ddc/003 ddc:003 info:eu-repo/classification/ddc/620 ddc:620 info:eu-repo/classification/ddc/629 ddc:629
