  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Optimized information processing in resource-constrained vision systems. From low-complexity coding to smart sensor networks

MORBEE, MARLEEN 14 October 2011 (has links)
Vision systems have become ubiquitous. They are used for traffic monitoring, elderly care, video conferencing, virtual reality, surveillance, smart rooms, home automation, sports game analysis, industrial safety, medical care, etc. In most vision systems, the data coming from the visual sensor(s) is processed before transmission in order to save communication bandwidth or achieve higher frame rates. The type of data processing needs to be chosen carefully depending on the targeted application, taking into account the available memory, computational power, energy resources and bandwidth constraints. In this dissertation, we investigate how a vision system should be built under practical constraints. First, this system should be intelligent, such that the right data is extracted from the video source. Second, when processing video data, this intelligent vision system should know its own practical limitations and should try to achieve the best possible output that lies within its capabilities. We study and improve a wide range of vision systems for a variety of applications, which come with different types of constraints. First, we present a modulo-PCM-based coding algorithm for applications that demand very low-complexity coding and need to preserve some of the advantageous properties of PCM coding (direct processing, random access, rate scalability). Our modulo-PCM coding scheme combines three well-known, simple source coding strategies: PCM, binning, and interpolative coding. The encoder first analyzes the signal statistics in a very simple way. Then, based on these signal statistics, the encoder simply discards a number of bits of each image sample. The modulo-PCM decoder recovers the removed bits of each sample by using its received bits and side information, which is generated by interpolating previously decoded signals. Our algorithm is especially appropriate for image coding.
Morbee, M. (2011). Optimized information processing in resource-constrained vision systems. From low-complexity coding to smart sensor networks [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/12126
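The encoder/decoder split described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes 8-bit samples, assumes the discarded bits are the most significant ones (so the encoder sends the sample modulo a power of two), and assumes side information is the average of two neighbouring decoded samples. The function names `mpcm_encode` and `mpcm_decode` are hypothetical.

```python
def mpcm_encode(sample: int, kept_bits: int) -> int:
    """Keep only the `kept_bits` least significant bits (assumed bit choice)."""
    return sample & ((1 << kept_bits) - 1)


def mpcm_decode(received: int, side_info: int, kept_bits: int,
                total_bits: int = 8) -> int:
    """Pick the value congruent to `received` that lies closest to `side_info`."""
    modulus = 1 << kept_bits
    max_value = (1 << total_bits) - 1
    # Largest value <= side_info with the same low-order bits as `received`.
    below = side_info - ((side_info - received) % modulus)
    candidates = [c for c in (below, below + modulus) if 0 <= c <= max_value]
    return min(candidates, key=lambda c: abs(c - side_info))


# Side information interpolated from two neighbouring decoded samples
# (a simple average; an assumption made for this illustration).
left, right = 128, 131
side = (left + right) // 2
code = mpcm_encode(133, kept_bits=5)
decoded = mpcm_decode(code, side, kept_bits=5)
```

In this sketch, decoding is exact only while the side information is within half the modulus of the true sample; a poor interpolation selects the wrong candidate, which is why the number of discarded bits would have to follow the signal statistics.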
82

A Novel Access Technology Based on Infrared Thermography for People with Severe Motor Impairments

Memarian, Negar 18 February 2011 (has links)
Many individuals with severe motor impairments are cognitively capable but, because of their physical impairments, unable to express their intention through conventional means of communication. Access technologies are devices that attempt to translate the intention of these individuals into functional activity by harnessing their residual physical or physiological abilities. The primary objective of this thesis was to design and develop a novel non-invasive and non-contact access technology based on infrared thermal imaging. This access technology translates the local temperature change associated with voluntary mouth opening into activation of a binary switch, such as a mouse click or key press. To this end, an algorithm based on motion and temperature analyses and on morphological and anthropometric filters was designed to detect mouth-opening activity in thermal video in real time. The secondary objective of this thesis was to introduce a mutual information measure for objective assessment of binary switch users' performance. A model was suggested in which the combination of cognitive and physical abilities of the human user of a binary access switch constitutes a communication channel. The proposed mutual information measure estimates the rate of information transmission in the 'human communication channel' during stimulus-response tasks. Using this measure, in a study with ten able-bodied participants, the infrared thermal switch was validated against a conventional chin switch. Impairments in body functions and structures that may contraindicate the use of the infrared thermal switch were explored in a study with seven clients with severe disabilities. Potential hard and soft technological solutions to mitigate the effect of these impairments on infrared thermal switch use were recommended. Finally, the infrared thermal switch was tailored to meet the needs of a young man with severe spastic quadriplegic cerebral palsy who had no other means of physical access.
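The idea of treating the switch user as a communication channel can be sketched with the textbook mutual information formula. This is a minimal illustration, not the estimator proposed in the thesis: it assumes a table of trial counts whose rows are stimuli (for example "activate" / "do not activate") and whose columns are observed switch states, and it assumes a fixed trial duration to turn bits per trial into bits per second. The names `mutual_information_bits` and `information_rate` are hypothetical.

```python
import math


def mutual_information_bits(counts):
    """Mutual information I(X;Y) in bits from a table of joint trial counts."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n == 0:
                continue  # a zero cell contributes nothing to the sum
            # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
            mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


def information_rate(counts, seconds_per_trial):
    """Bits per second, assuming every trial lasts `seconds_per_trial`."""
    return mutual_information_bits(counts) / seconds_per_trial


perfect_user = [[50, 0], [0, 50]]    # every stimulus answered correctly
random_user = [[25, 25], [25, 25]]   # responses unrelated to the stimulus
```

With these illustrative counts, a user who always responds correctly to two equally likely stimuli conveys one bit per trial, while responses unrelated to the stimulus convey none.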
84

Iterative tensor factorization based on Krylov subspace-type methods with applications to image processing

UGWU, UGOCHUKWU OBINNA 06 October 2021 (has links)
No description available.
85

Multimedia Forensics Using Metadata

Ziyue Xiang (17989381) 21 February 2024 (has links)
The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from videos, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques using the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer neural network based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files. The localization accuracy of our proposed method is higher than that of other methods. Third, we present an H.264-based video device matching method. This method can determine whether two video sequences were captured by the same device even if the method has never encountered the device. Our proposed method achieved good performance in a three-fold cross-validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. The proposed method is trained using Self-Supervised Learning (SSL), which increases its robustness and makes it capable of handling missing/unseen data. Fifth, we present an efficient approach to computing the spectrogram feature from MP3 compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of MP3 decoding time. The resulting spectrogram features lead to higher synthetic speech detection performance.
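As background for the container metadata mentioned in this abstract, the following minimal sketch lists the top-level boxes of an MP4/MOV byte string. It is not the thesis's feature extractor. It relies only on the general box layout of the ISO base media file format (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to end of file"); the function name `list_top_level_boxes` is hypothetical.

```python
import struct


def list_top_level_boxes(data: bytes):
    """Return (type, offset, size) for each top-level box in `data`."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # malformed or truncated box: stop rather than guess
        boxes.append((box_type.decode("latin-1"), offset, size))
        offset += size
    return boxes
```

Recursing into container boxes such as `moov` would yield the kind of metadata tree the abstract refers to; how that tree is turned into features is not specified here.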
86

Video extraction for fast content access to MPEG compressed videos

Jiang, Jianmin, Weng, Y. 09 June 2009 (has links)
As existing video processing technology is primarily developed in the pixel domain, yet digital video is stored in compressed format, any application of those techniques to compressed videos would require decompression. For discrete cosine transform (DCT)-based MPEG compressed videos, the computing cost of standard row-by-row and column-by-column inverse DCT (IDCT) transforms for a block of 8 × 8 elements requires 4096 multiplications and 4032 additions, although practical implementation only requires 1024 multiplications and 896 additions. In this paper, we propose a new algorithm to extract videos directly from the MPEG compressed domain (DCT domain) without full IDCT, which is described in three extraction schemes: 1) video extraction in 2 × 2 blocks with four coefficients; 2) video extraction in 4 × 4 blocks with four DCT coefficients; and 3) video extraction in 4 × 4 blocks with nine DCT coefficients. The computing cost incurred is only 8 additions and no multiplications for the first scheme, 2 multiplications and 28 additions for the second scheme, and 47 additions (no multiplications) for the third scheme. Extensive experiments were carried out, and the results reveal that: 1) the extracted video maintains competitive quality in terms of visual perception and inspection and 2) the extracted videos preserve the content well in comparison with fully decompressed ones in terms of histogram measurement. As a result, the proposed algorithm will provide useful tools for bridging the gap between the pixel domain and the compressed domain to facilitate content analysis with low latency and high efficiency in applications such as surveillance video, interactive multimedia, and image processing.
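The operation counts quoted in this abstract can be reproduced with simple arithmetic. The sketch below is an illustration only, under the assumption that the larger pair of figures corresponds to evaluating the full 2-D inverse transform sum for every output sample and the smaller pair to a separable pass of 1-D transforms over rows and then columns; the function names are hypothetical.

```python
def direct_2d_idct_ops(n: int = 8):
    """Each of the n*n outputs is a sum of n*n weighted coefficients."""
    multiplications = (n * n) * (n * n)
    additions = (n * n) * (n * n - 1)
    return multiplications, additions


def separable_idct_ops(n: int = 8):
    """n row transforms plus n column transforms, each a length-n 1-D IDCT."""
    one_d_transforms = 2 * n
    multiplications = one_d_transforms * (n * n)      # n outputs x n products
    additions = one_d_transforms * (n * (n - 1))      # n outputs x (n - 1) sums
    return multiplications, additions
```

For n = 8 these expressions give 4096 and 4032 operations for the direct form and 1024 and 896 for the separable form, matching the figures in the abstract; the extraction schemes described there need only a handful of additions per block by comparison.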
87

Crime Detection From Pre-crime Video Analysis

Sedat Kilic (18363729) 03 June 2024 (has links)
This research investigates the detection of pre-crime events, specifically targeting behaviors indicative of shoplifting, through the advanced analysis of CCTV video data. The study introduces an approach that leverages augmented human pose and emotion information within individual frames, combined with the extraction of activity information across subsequent frames, to enhance the identification of potential shoplifting actions before they occur. Utilizing a diverse set of models including 3D Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and a specially developed transformer architecture, the research systematically explores the impact of integrating additional contextual information into video analysis. By augmenting frame-level video data with detailed pose and emotion insights, and focusing on the temporal dynamics between frames, our methodology aims to capture the nuanced behavioral patterns that precede shoplifting events. The comprehensive experimental evaluation of our models across different configurations reveals a significant improvement in the accuracy of pre-crime detection. The findings underscore the crucial role of combining visual features with augmented data and the importance of analyzing activity patterns over time for a deeper understanding of pre-shoplifting behaviors. The study's contributions are multifaceted, including a detailed examination of pre-crime frames, strategic augmentation of video data with added contextual information, the creation of a novel transformer architecture customized for pre-crime analysis, and an extensive evaluation of various computational models to improve predictive accuracy.
88

MEMS-Laser-Display-System / MEMS Laser Display System

Specht, Hendrik 19 October 2011 (has links) (PDF)
This thesis theoretically analyzes the system aspects related to beam deflection in laser display technology based on MEMS scanners and, from the results, derives the practical implementation of a laser display system as a test platform. Two variants are realized: one approach based on two 1D scanners and another based on a single 2D scanner. In addition, an image-based multi-parameter test method is developed that is suitable both for testing completed beam deflection units or projection modules and for comprehensive, time-efficient testing of MEMS scanners at wafer level. This method is used to characterize the two realized variants of the laser display. Starting from the properties of the human visual system and the resulting requirements on the image, as well as a system-theoretical consideration of the mechanical behavior of MEMS scanners, one focus is the generation of drive signals for resonant operation of the fast axis and quasi-static operation of the slow axis. Besides the purely digital controller and filter design and several linearization measures, this also includes the derivation of FPGA-based video signal processing for converting scan pattern, timing regime, and resolution, with corresponding synchronization of beam deflection and laser modulation. Based on the resulting insights into the relationship between scanner/system parameters and image parameters, combinations of test images and image processing algorithms are developed and, arranged in a sequence, completed with a calibration procedure to form a test method for MEMS scanners. The results of this work were obtained in industry-commissioned R&D projects and feed into the client's ongoing continuation of the topic.
89

MEMS-Laser-Display-System: Analyse, Implementierung und Testverfahrenentwicklung / MEMS Laser Display System: Analysis, Implementation, and Test Method Development

Specht, Hendrik 20 May 2011 (has links)
This thesis theoretically analyzes the system aspects related to beam deflection in laser display technology based on MEMS scanners and, from the results, derives the practical implementation of a laser display system as a test platform. Two variants are realized: one approach based on two 1D scanners and another based on a single 2D scanner. In addition, an image-based multi-parameter test method is developed that is suitable both for testing completed beam deflection units or projection modules and for comprehensive, time-efficient testing of MEMS scanners at wafer level. This method is used to characterize the two realized variants of the laser display. Starting from the properties of the human visual system and the resulting requirements on the image, as well as a system-theoretical consideration of the mechanical behavior of MEMS scanners, one focus is the generation of drive signals for resonant operation of the fast axis and quasi-static operation of the slow axis. Besides the purely digital controller and filter design and several linearization measures, this also includes the derivation of FPGA-based video signal processing for converting scan pattern, timing regime, and resolution, with corresponding synchronization of beam deflection and laser modulation. Based on the resulting insights into the relationship between scanner/system parameters and image parameters, combinations of test images and image processing algorithms are developed and, arranged in a sequence, completed with a calibration procedure to form a test method for MEMS scanners. The results of this work were obtained in industry-commissioned R&D projects and feed into the client's ongoing continuation of the topic.
