• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 5797
  • 1138
  • 723
  • 337
  • 66
  • 4
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 8694
  • 8694
  • 7756
  • 7115
  • 3981
  • 3980
  • 3294
  • 3214
  • 3110
  • 3110
  • 3110
  • 3110
  • 3110
  • 1164
  • 1157
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Motion scalability for video coding with flexible spatio-temporal decompositions

Mrak, Marta January 2007 (has links)
The research presented in this thesis aims to extend the scalability range of the wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since the temporal redundancy regularly comprises the main portion of the global video sequence redundancy, the techniques that can be generally termed motion decorrelation techniques have a central role in the overall compression performance. For this reason the scalable motion modelling and coding are of utmost importance, and specifically, in this thesis possible solutions are identified and analysed. The main contributions of the presented research are grouped into two interrelated and complementary topics. Firstly a flexible motion model with rateoptimised estimation technique is introduced. The proposed motion model is based on tree structures and allows high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding on different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Secondly, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal Subbands and motion search in temporal subbands that finds the optimal motion vectors for layered motion structure. Exhaustive tests on the rate-distortion performance in demanding scalable video coding scenarios show benefits of application of both developed flexible motion model and various solutions for scalable motion coding.
132

Robust signatures for 3D face registration and recognition

Nair, Prathap M. January 2010 (has links)
Biometric authentication through face recognition has been an active area of research for the last few decades, motivated by its application-driven demand. The popularity of face recognition, compared to other biometric methods, is largely due to its minimum requirement of subject co-operation, relative ease of data capture and similarity to the natural way humans distinguish each other. 3D face recognition has recently received particular interest since three-dimensional face scans eliminate or reduce important limitations of 2D face images, such as illumination changes and pose variations. In fact, three-dimensional face scans are usually captured by scanners through the use of a constant structured-light source, making them invariant to environmental changes in illumination. Moreover, a single 3D scan also captures the entire face structure and allows for accurate pose normalisation. However, one of the biggest challenges that still remain in three-dimensional face scans is the sensitivity to large local deformations due to, for example, facial expressions. Due to the nature of the data, deformations bring about large changes in the 3D geometry of the scan. In addition to this, 3D scans are also characterised by noise and artefacts such as spikes and holes, which are uncommon with 2D images and requires a pre-processing stage that is speci c to the scanner used to capture the data. The aim of this thesis is to devise a face signature that is compact in size and overcomes the above mentioned limitations. We investigate the use of facial regions and landmarks towards a robust and compact face signature, and we study, implement and validate a region-based and a landmark-based face signature. Combinations of regions and landmarks are evaluated for their robustness to pose and expressions, while the matching scheme is evaluated for its robustness to noise and data artefacts.
133

Automated camera ranking and selection using video content and scene context

Daniyal, Fahad M. January 2012 (has links)
When observing a scene with multiple cameras, an important problem to solve is to automatically identify “what camera feed should be shown and when?” The answer to this question is of interest for a number of applications and scenarios ranging from sports to surveillance. In this thesis we present a framework for the ranking of each video frame and camera across time and the camera network, respectively. This ranking is then used for automated video production. In the first stage information from each camera view and from the objects in it is extracted and represented in a way that allows for object- and frame-ranking. First objects are detected and ranked within and across camera views. This ranking takes into account both visible and contextual information related to the object. Then content ranking is performed based on the objects in the view and camera-network level information. We propose two novel techniques for content ranking namely: Routing Based Ranking (RBR) and Multivariate Gaussian based Ranking (MVG). In RBR we use a rule based framework where weighted fusion of object and frame level information takes place while in MVG the rank is estimated as a multivariate Gaussian distribution. Through experimental and subjective validation we demonstrate that the proposed content ranking strategies allows the identification of the best-camera at each time. The second part of the thesis focuses on the automatic generation of N-to-1 videos based on the ranked content. We demonstrate that in such production settings it is undesirable to have frequent inter-camera switching. Thus motivating the need for a compromise, between selecting the best camera most of the time and minimising the frequent inter-camera switching, we demonstrate that state-of-the-art techniques for this task are inadequate and fail in dynamic scenes. We propose three novel methods for automated camera selection. The first method (¡go f ) performs a joint optimization of a cost function that depends on both the view quality and inter-camera switching so that a i Abstract ii pleasing best-view video sequence can be composed. The other two methods (¡dbn and ¡util) include the selection decision into the ranking-strategy. In ¡dbn we model the best-camera selection as a state sequence via Directed Acyclic Graphs (DAG) designed as a Dynamic Bayesian Network (DBN), which encodes the contextual knowledge about the camera network and employs the past information to minimize the inter camera switches. In comparison ¡util utilizes the past as well as the future information in a Partially Observable Markov Decision Process (POMDP) where the camera-selection at a certain time is influenced by the past information and its repercussions in the future. The performance of the proposed approach is demonstrated on multiple real and synthetic multi-camera setups. We compare the proposed architectures with various baseline methods with encouraging results. The performance of the proposed approaches is also validated through extensive subjective testing.
134

Automatic transcription of polyphonic music exploiting temporal evolution

Benetos, Emmanouil January 2012 (has links)
Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open. In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modeling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. Proposed systems have been privately as well as publicly evaluated within the Music Information Retrieval Evaluation eXchange (MIREX) framework. Proposed systems have been shown to outperform several state-of-the-art transcription approaches. Developed techniques have also been employed for other tasks related to music technology, such as for key modulation detection, temperament estimation, and automatic piano tutoring. Finally, proposed music transcription models have also been utilized in a wider context, namely for modeling acoustic scenes.
135

Default reasoning using maximum entropy and variable strength defaults

Bourne, Rachel Anne January 1999 (has links)
The thesis presents a computational model for reasoning with partial information which uses default rules or information about what normally happens. The idea is to provide a means of filling the gaps in an incomplete world view with the most plausible assumptions while allowing for the retraction of conclusions should they subsequently turn out to be incorrect. The model can be used both to reason from a given knowledge base of default rules, and to aid in the construction of such knowledge bases by allowing their designer to compare the consequences of his design with his own default assumptions. The conclusions supported by the proposed model are justified by the use of a probabilistic semantics for default rules in conjunction with the application of a rational means of inference from incomplete knowledge the principle of maximum entropy (ME). The thesis develops both the theory and algorithms for the ME approach and argues that it should be considered as a general theory of default reasoning. The argument supporting the thesis has two main threads. Firstly, the ME approach is tested on the benchmark examples required of nonmonotonic behaviour, and it is found to handle them appropriately. Moreover, these patterns of commonsense reasoning emerge as consequences of the chosen semantics rather than being design features. It is argued that this makes the ME approach more objective, and its conclusions more justifiable, than other default systems. Secondly, the ME approach is compared with two existing systems: the lexicographic approach (LEX) and system Z+. It is shown that the former can be equated with ME under suitable conditions making it strictly less expressive, while the latter is too crude to perform the subtle resolution of default conflict which the ME approach allows. Finally, a program called DRS is described which implements all systems discussed in the thesis and provides a tool for testing their behaviours.
136

Automatic object classification for surveillance videos

Fernandez Arguedas, Virginia January 2012 (has links)
The recent popularity of surveillance video systems, specially located in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems consists on automatic object classification, which still remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on the inherent visual features. However, psychological studies have demonstrated that human beings can routinely categorise objects according to their behaviour. The existing gap in the understanding between the features automatically extracted by a computer, such as appearance-based features, and the concepts unconsciously perceived by human beings but unattainable for machines, or the behaviour features, is most commonly known as semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring together machine and human understanding towards object classification. Thus, a Surveillance Media Management is proposed to automatically detect and classify objects by analysing the physical properties inherent in their appearance (machine understanding) and the behaviour patterns which require a higher level of understanding (human understanding). Finally, a probabilistic multimodal fusion algorithm bridges the gap performing an automatic classification considering both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments conducted demonstrated that the combination of machine and human understanding substantially enhanced the object classification performance. Finally, the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems.
137

Automatic annotation of musical audio for interactive applications

Brossier, Paul M. January 2006 (has links)
As machines become more and more portable, and part of our everyday life, it becomes apparent that developing interactive and ubiquitous systems is an important aspect of new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real time context. Amongst these annotation techniques, we concentrate on low and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performances of different algorithms. The first task is to detect onsets and offsets in audio streams within short latencies. The segmentation of audio streams into temporal objects enables various manipulation and analysis of metrical structure. Evaluation of different algorithms and their adaptation to real time are described. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real time and experimented on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments; the estimation of higher level descriptions is approached. Techniques for modelling of note objects and localisation of beats are implemented and discussed. Applications of our framework include live and interactive music installations, and more generally tools for the composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, for our research purposes and in view of its integration within other systems.
138

Investigating the effect of materials processing on ZnO nanorod properties and device performance

Hatch, Sabina January 2013 (has links)
This project explores the effect of materials processing on the optical, morphological, and electrical properties of ZnO nanorods synthesised using the low-temperature (90˚C) aqueous chemical technique. A highly-alkaline (pH 11) growth solution fabricated nanorods that exhibit morphological sensitivity to the anneal atmosphere used. This was attributed to unreacted precursors trapped throughout the nanorod bulk and near the surface. A significant increase in the c-axis peak intensity post-annealing and evidence of nitrogen-doping in all annealed nanorods, confirmed the precursors were present prior to the annealing process. In addition, intense green photoluminescence was observed under UV excitation and was shown to be dependent on the anneal atmosphere. The origin of this emission was related to zinc vacancy defects that were energetically favoured during the oxygen-rich synthesis and anneal conditions according to first principle calculations. To better study the electrical properties of the ZnO nanorods they were incorporated into p-n heterostructures using p-type CuSCN. An alternative spray-coating method was developed for depositing CuSCN that was a significant improvement over previous methods as demonstrated by the high hole mobility 70 cm2/V.s. The current popularity of conductive polymers led to the comparison of hybrid inorganic-organic (ZnO-PEDOT:PSS) and purely inorganic (ZnO-CuSCN) devices. These were tested as UV photodetectors and differences in device structure were shown to have a significant impact on the device response time and responsivity. A rectification ratio of 21500 at ±3 V was achieved for these ZnO-CuSCN devices. The inorganic ZnO-CuSCN device exhibited photovoltaic behaviour at zero-bias, which highlighted it as a suitable choice for self-powered UV photodetection. The effect of processing on the photodetector performance was investigated for two sets of nanorods; pH 6 and pH 11. Consequently, a maximum photocurrent response of 30 μA (for 6 mW cm-2 irradiance) was achieved for nitrogen-annealed pH 11-nanorods with a rise time of 25 ns. The high response was assigned to fewer zinc vacancies acting as electron trap states and the introduction of N-related donor defects.
139

Highly efficient low-level feature extraction for video representation and retrieval

Calie, Janko January 2004 (has links)
Witnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and the robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm that runs in real time maintaining the high precision and recall of the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords.
140

A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases

Zuo, Landong January 2006 (has links)
Data stored in individual autonomous databases often needs to be combined and interrelated. For example, in the Inland Water (IW) environment monitoring domain, the spatial and temporal variation of measurements of different water quality indicators stored in different databases are of interest. Data from multiple data sources is more complex to combine when there is a lack of metadata in a computation forin and when the syntax and semantics of the stored data models are heterogeneous. The main types of information retrieval (IR) requirements are query transparency and data harmonisation for data interoperability and support for multiple user views. A combined Semantic Web based and Agent based distributed system framework has been developed to support the above IR requirements. It has been implemented using the Jena ontology and JADE agent toolkits. The semantic part supports the interoperability of autonomous data sources by merging their intensional data, using a Global-As-View or GAV approach, into a global semantic model, represented in DAML+OIL and in OWL. This is used to mediate between different local database views. The agent part provides the semantic services to import, align and parse semantic metadata instances, to support data mediation and to reason about data mappings during alignment. The framework has applied to support information retrieval, interoperability and multi-lateral viewpoints for four European environmental agency databases. An extended GAV approach has been developed and applied to handle queries that can be reformulated over multiple user views of the stored data. This allows users to retrieve data in a conceptualisation that is better suited to them rather than to have to understand the entire detailed global view conceptualisation. User viewpoints are derived from the global ontology or existing viewpoints of it. This has the advantage that it reduces the number of potential conceptualisations and their associated mappings to be more computationally manageable. Whereas an ad hoc framework based upon conventional distributed programming language and a rule framework could be used to support user views and adaptation to user views, a more formal framework has the benefit in that it can support reasoning about the consistency, equivalence, containment and conflict resolution when traversing data models. A preliminary formulation of the formal model has been undertaken and is based upon extending a Datalog type algebra with hierarchical, attribute and instance value operators. These operators can be applied to support compositional mapping and consistency checking of data views. The multiple viewpoint system was implemented as a Java-based application consisting of two sub-systems, one for viewpoint adaptation and management, the other for query processing and query result adjustment.

Page generated in 0.1221 seconds