  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

An FPGA Implementation of Real-time Electro-optic & IR Image Fusion

Colova, Ibrahim Melih 01 September 2010 (has links) (PDF)
In this thesis, a modified 2D Discrete Cosine Transform (DCT) based electro-optic and IR image fusion algorithm is proposed and implemented on an FPGA platform. The platform is a custom FPGA board built around an ALTERA Stratix III family FPGA. The algorithm is also compared with state-of-the-art image fusion algorithms by means of an image fusion GUI application developed in Matlab®. The proposed algorithm takes corresponding 4x4 pixel blocks of the two images to be fused and transforms them with the 2D DCT. The L2 norm of each block is then calculated and used as the weighting factor for the AC values of the fused image block. The DC value of the fused block is the arithmetic mean of the DC coefficients of the two input blocks. The whole image pair is processed in this way, so that the output image is a composition of the processed 4x4 blocks. The proposed algorithm performs well against the other state-of-the-art image fusion algorithms in both subjective and objective quality evaluations. In hardware, the implementation accepts input video at pixel clocks up to 65 MHz, i.e. 1024x768 resolution at 60 Hz.
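The block-fusion rule described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the described scheme, not the FPGA implementation; where the abstract is silent (e.g. whether the L2 norm is taken over the whole block or only the AC terms), the choices below are assumptions.

```python
import numpy as np

def dct2_matrix(n):
    # Orthonormal DCT-II basis matrix of size n x n.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

def fuse_blocks(a, b):
    # Fuse two 4x4 blocks: average the DC terms, weight the AC terms
    # by each block's L2 norm (whole-block norm assumed here).
    C = dct2_matrix(4)
    A, B = C @ a @ C.T, C @ b @ C.T
    wa, wb = np.linalg.norm(A), np.linalg.norm(B)
    F = (wa * A + wb * B) / (wa + wb + 1e-12)  # guard against all-zero input
    F[0, 0] = 0.5 * (A[0, 0] + B[0, 0])        # DC: arithmetic mean
    return C.T @ F @ C                         # inverse DCT of fused coefficients
```

Applying this to every aligned 4x4 block pair and tiling the results reproduces the composition step the abstract describes.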
22

3-D Face Recognition using the Discrete Cosine Transform (DCT)

Hantehzadeh, Neda 01 January 2009 (has links)
Face recognition can be used in various biometric applications, ranging from identifying criminals entering an airport to identifying an unconscious patient in a hospital. With the introduction of 3-dimensional scanners in the last decade, researchers have begun to develop new methods for 3-D face recognition. This thesis focuses on 3-D face recognition using the one- and two-dimensional Discrete Cosine Transform (DCT). A feature-ranking-based dimensionality reduction strategy is introduced to select the DCT coefficients that yield the best classification accuracies. Two forms of 3-D representation are used: point cloud and depth map images. These representations are extracted from the original VRML files in a face database and are normalized during the extraction process. Classification accuracies exceeding 97% are obtained using the point cloud images in conjunction with the 2-D DCT.
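A sketch of the DCT feature-extraction step, assuming a depth-map input. The thesis ranks coefficients by the classification accuracy they yield; this simplified version ranks them by magnitude, which is an assumption, not the thesis's selection rule.

```python
import numpy as np

def dct2(x):
    # Orthonormal 2-D DCT-II via the separable matrix form.
    def basis(n):
        k = np.arange(n)
        C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
        C[0] *= 1 / np.sqrt(2)
        return C * np.sqrt(2 / n)
    n, m = x.shape
    return basis(n) @ x @ basis(m).T

def dct_features(depth_map, k):
    # Keep the k largest-magnitude DCT coefficients as a feature vector
    # (a stand-in for the thesis's accuracy-based feature ranking).
    coeffs = dct2(np.asarray(depth_map, float)).ravel()
    idx = np.argsort(-np.abs(coeffs))[:k]
    return coeffs[idx]
```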
23

The contour tree image encoding technique and file format

Turner, Martin John January 1994 (has links)
The process of contourization is presented which converts a raster image into a discrete set of plateaux or contours. These contours can be grouped into a hierarchical structure, defining total spatial inclusion, called a contour tree. A contour coder has been developed which fully describes these contours in a compact and efficient manner and is the basis for an image compression method. Simplification of the contour tree has been undertaken by merging contour tree nodes thus lowering the contour tree's entropy. This can be exploited by the contour coder to increase the image compression ratio. By applying general and simple rules derived from physiological experiments on the human vision system, lossy image compression can be achieved which minimises noticeable artifacts in the simplified image. The contour merging technique offers a complementary lossy compression system to the QDCT (Quantised Discrete Cosine Transform). The artifacts introduced by the two methods are very different; QDCT produces a general blurring and adds extra highlights in the form of overshoots, whereas contour merging sharpens edges, reduces highlights and introduces a degree of false contouring. A format based on the contourization technique which caters for most image types is defined, called the contour tree image format. Image operations directly on this compressed format have been studied which for certain manipulations can offer significant operational speed increases over using a standard raster image format. A couple of examples of operations specific to the contour tree format are presented showing some of the features of the new format.
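The first step of contourization — grouping equal-valued pixels into plateaux — can be sketched as a flood fill. This is a minimal stand-alone version; the 4-connectivity rule and the list-of-lists image layout are assumptions, not details from the thesis.

```python
from collections import deque

def plateaus(img):
    # Label each 4-connected region of equal pixel value.
    # Returns the number of plateaus and a per-pixel label map.
    h, w = len(img), len(img[0])
    label = [[-1] * w for _ in range(h)]
    n = 0
    for sy in range(h):
        for sx in range(w):
            if label[sy][sx] != -1:
                continue
            v = img[sy][sx]
            q = deque([(sy, sx)])
            label[sy][sx] = n
            while q:
                y, x = q.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and label[ny][nx] == -1 and img[ny][nx] == v):
                        label[ny][nx] = n
                        q.append((ny, nx))
            n += 1
    return n, label
```

Building the contour tree then amounts to recording which plateaus spatially enclose which others.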
24

Metody pro odstranění šumu z digitálních obrazů / Digital Image Noise Reduction Methods

Čišecký, Roman January 2012 (has links)
The master's thesis is concerned with digital image denoising methods. The theoretical part explains elementary terms related to image processing, image noise, the categorization of noise, and the criteria that determine the quality of a denoising process. Particular denoising methods are also described, along with their advantages and disadvantages. The practical part deals with an implementation of the selected denoising methods in Java, in the environment of the RapidMiner application. In conclusion, the results obtained by the different methods are compared.
25

Design and analysis of Discrete Cosine Transform-based watermarking algorithms for digital images. Development and evaluation of blind Discrete Cosine Transform-based watermarking algorithms for copyright protection of digital images using handwritten signatures and mobile phone numbers.

Al-Gindy, Ahmed M.N. January 2011 (has links)
This thesis deals with the development and evaluation of blind discrete cosine transform-based watermarking algorithms for copyright protection of digital still images using handwritten signatures and mobile phone numbers. The new algorithms take into account the perceptual capacity of each low-frequency coefficient inside the Discrete Cosine Transform (DCT) blocks before embedding the watermark information. They are suitable for grey-scale and colour images. Handwritten signatures are used instead of pseudo-random numbers. The watermark is inserted in the green channel of RGB colour images and the luminance channel of YCrCb images. Mobile phone numbers are used as watermarks for images captured by mobile phone cameras. The information is embedded multiple times, and a shuffling scheme is applied to ensure that no spatial correlation exists between the original host image and the multiple watermark copies. Multiple embedding increases the robustness of the watermark against attacks, since each watermark is individually reconstructed and verified before an averaging process is applied; the averaging reduces the number of errors in the extracted information. The developed watermarking methods are shown to be robust against JPEG compression, removal attack, additive noise, cropping, scaling, small degrees of rotation, affine transformations, contrast enhancement, low-pass and median filtering, and Stirmark attacks. The algorithms have been examined using a library of approximately 40 colour images of size 512×512 with 24 bits per pixel, and their grey-scale versions. Several evaluation techniques were used in the experiments with different watermarking strengths and different signature sizes, including the peak signal-to-noise ratio, normalized correlation, and structural similarity index measurements. The performance of the proposed algorithms has been compared to other algorithms, and better invisibility with stronger robustness has been achieved.
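A minimal single-bit sketch of embedding in a block's DCT domain. The coefficient position (2, 1) and the strength value are illustrative choices; the thesis's perceptual-capacity analysis, shuffling, and multiple-copy averaging are omitted here.

```python
import numpy as np

def dct_mat(n):
    # Orthonormal DCT-II matrix.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] *= 1 / np.sqrt(2)
    return C

def embed_bit(block, bit, strength=8.0):
    # Force one low-frequency AC coefficient positive or negative
    # depending on the bit value (position and strength are assumptions).
    C = dct_mat(8)
    D = C @ block @ C.T
    D[2, 1] = strength if bit else -strength
    return C.T @ D @ C

def extract_bit(block):
    # Blind extraction: only the watermarked block is needed.
    C = dct_mat(8)
    return 1 if (C @ block @ C.T)[2, 1] > 0 else 0
```

Repeating this over many blocks and taking a majority vote per bit is one simple way to realize the multiple-embedding-plus-averaging idea the abstract describes.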
26

Matrix Approximation And Image Compression

Padavana, Isabella R 01 June 2024 (has links) (PDF)
This thesis concerns the mathematics and application of various methods for approximating matrices, with a particular eye towards the role that such methods play in image compression. An image is stored as a matrix of values, each entry recording the intensity of the corresponding pixel, so image compression is essentially equivalent to matrix approximation. First, we look at the singular value decomposition (SVD), one of the central tools for analyzing a matrix. We show that, in a precise sense, the truncated singular value decomposition is the best low-rank approximation of any matrix. However, the SVD has some serious shortcomings as an approximation method in the context of digital images. The second method we consider is the discrete Fourier transform, which does not require the storage of basis vectors (unlike the SVD). We describe the fast Fourier transform, a remarkably efficient method for computing the discrete Fourier transform, and how we can use it to reduce the information in a matrix. Finally, we look at the discrete cosine transform, which reduces the complexity of the calculation further by restricting to a real basis. We also look at how we can apply a filter to adjust the relative importance of the data encoded by the discrete cosine transform prior to compression. In addition, we developed code implementing the ideas explored in the thesis and demonstrating examples.
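The optimal low-rank property mentioned in this abstract (the Eckart-Young theorem) can be demonstrated directly with a truncated SVD. This is a generic NumPy sketch, not the thesis's own code.

```python
import numpy as np

def best_rank_k(A, k):
    # Eckart-Young: truncating the SVD at k terms gives the optimal
    # rank-k approximation in both the Frobenius and spectral norms.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]
```

The Frobenius-norm error of the rank-k approximation equals the root-sum-square of the discarded singular values, which is why a few large singular values suffice for a good compressed image.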
27

Suprasegmental representations for the modeling of fundamental frequency in statistical parametric speech synthesis

Fonseca De Sam Bento Ribeiro, Manuel January 2018 (has links)
Statistical parametric speech synthesis (SPSS) has seen improvements over recent years, especially in terms of intelligibility. Synthetic speech is often clear and understandable, but it can also be bland and monotonous. Proper generation of natural speech prosody is still a largely unsolved problem. This is relevant especially in the context of expressive audiobook speech synthesis, where speech is expected to be fluid and captivating. In general, prosody can be seen as a layer that is superimposed on the segmental (phone) sequence. Listeners can perceive the same melody or rhythm in different utterances, and the same segmental sequence can be uttered with a different prosodic layer to convey a different message. For this reason, prosody is commonly accepted to be inherently suprasegmental. It is governed by longer units within the utterance (e.g. syllables, words, phrases) and beyond the utterance (e.g. discourse). However, common techniques for the modeling of speech prosody - and speech in general - operate mainly on very short intervals, either at the state or frame level, in both hidden Markov model (HMM) and deep neural network (DNN) based speech synthesis. This thesis presents contributions supporting the claim that stronger representations of suprasegmental variation are essential for the natural generation of fundamental frequency for statistical parametric speech synthesis. We conceptualize the problem by dividing it into three sub-problems: (1) representations of acoustic signals, (2) representations of linguistic contexts, and (3) the mapping of one representation to another. The contributions of this thesis provide novel methods and insights relating to these three sub-problems. In terms of sub-problem 1, we propose a multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform, as well as a wavelet-based decomposition strategy that is linguistically and perceptually motivated. 
In terms of sub-problem 2, we investigate additional linguistic features such as text-derived word embeddings and syllable bag-of-phones and we propose a novel method for learning word vector representations based on acoustic counts. Finally, considering sub-problem 3, insights are given regarding hierarchical models such as parallel and cascaded deep neural networks.
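One way a DCT can parameterize an f0 contour, as in the multi-level representation described above, is to keep only the first few coefficients, giving a smooth, fixed-size description of contour shape. This is a generic sketch, not the thesis's actual decomposition (which also involves the continuous wavelet transform).

```python
import numpy as np

def dct_basis(n):
    # Orthonormal DCT-II matrix.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] *= 1 / np.sqrt(2)
    return C

def smooth_f0(contour, k=4):
    # Summarize an f0 contour by its first k DCT coefficients,
    # then reconstruct; k controls how much fine detail survives.
    x = np.asarray(contour, float)
    C = dct_basis(len(x))
    c = C @ x
    c[k:] = 0.0
    return C.T @ c
```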
28

Bayesian Uncertainty Quantification for Large Scale Spatial Inverse Problems

Mondal, Anirban August 2011 (has links)
We considered a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a high-dimensional spatial field. The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources, and provides a quantitative assessment of uncertainty in the inverse solution; it casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loève expansion and the Discrete Cosine Transform were used for dimension reduction of the random spatial field. Furthermore, we used a hierarchical Bayes model to inject multiscale data into the modeling framework. In this Bayesian framework, we have shown that the inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in the total variation norm. The need for multiple evaluations of the forward model on a high-dimensional spatial field (e.g. in the context of MCMC), together with the high dimensionality of the posterior, results in many computational challenges. We developed a two-stage reversible jump MCMC method which can screen out bad proposals in an inexpensive first stage. Channelized spatial fields were represented by facies boundaries and variogram-based spatial fields within each facies. Using a level-set-based approach, the shape of the channel boundaries was updated with dynamic data using a Bayesian hierarchical model in which the number of points representing the channel boundaries is assumed to be unknown. Statistical emulators on a large-scale spatial field were introduced to avoid the expensive likelihood calculation, which contains the forward simulator, at each iteration of the MCMC step. To build the emulator, the original spatial field was represented by a low-dimensional parameterization using the Discrete Cosine Transform (DCT), and the Bayesian approach to multivariate adaptive regression splines (BMARS) was then used to emulate the simulator. Various numerical results were presented by analyzing simulated as well as real data.
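The DCT dimension reduction of a spatial field can be sketched as keeping only the lowest-frequency coefficients. The truncation pattern below (a k×k corner of the coefficient array) is an assumption for illustration; the thesis does not specify this detail here.

```python
import numpy as np

def dct_basis(n):
    # Orthonormal DCT-II matrix.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] *= 1 / np.sqrt(2)
    return C

def reduce_field(field, k):
    # Parameterize an n x m spatial field by its k x k lowest-frequency
    # DCT coefficients: k*k parameters instead of n*m, so each MCMC
    # proposal perturbs a low-dimensional vector.
    n, m = field.shape
    Cn, Cm = dct_basis(n), dct_basis(m)
    D = Cn @ field @ Cm.T
    D[k:, :] = 0.0
    D[:, k:] = 0.0
    return Cn.T @ D @ Cm  # smooth reconstruction from the retained parameters
```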
29

VISUALIZAÇÃO DE DADOS VOLUMÉTRICOS COMPRIMIDOS BASEADO NA TRANSFORMADA DO COSSENO LOCAL / VISUALIZATION OF COMPRESSED VOLUMETRIC DATA BASED ON THE LOCAL COSINE TRANSFORM

Demetrio, Fernando Jorge Cutrim 01 April 2005 (has links)
This work describes the development of an algorithm for visualization of volumetric data compressed with a local cosine transform scheme. The scheme minimizes the blocking artifacts generated by transform-based compression methods and makes it possible to visualize large volume data on computers with limited memory. We also present the results obtained for the visualization process.
30

Sistema de inferência genético-nebuloso para reconhecimento de voz: Uma abordagem em modelos preditivos de baixa ordem utilizando a transformada cosseno discreta / Genetic-fuzzy inference system for speech recognition: an approach to low-order predictive models using the discrete cosine transform

Silva, Washington Luis Santos 20 March 2015 (has links)
This thesis proposes a methodology that uses an intelligent system for voice recognition. It adopts the definition of an intelligent system as one that can adapt its behavior to achieve its goals in a variety of environments, and the definition of Computational Intelligence as the simulation of intelligent behavior in terms of a computational process. In addition to pre-processing the speech signal with mel-cepstral coefficients, the discrete cosine transform (DCT) is used to generate a two-dimensional array that models each pattern to be recognized. A Mamdani fuzzy inference system for speech recognition is optimized by a genetic algorithm to maximize the number of correctly classified patterns with a reduced number of parameters. The experimental speech recognition results achieved with the proposed methodology were compared with Hidden Markov Models (HMM) and with the Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) classifiers. The recognition system used in this thesis was called Intelligent Methodology for Speech Recognition (IMSR).
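How a DCT could turn a variable-length sequence of mel-cepstral frames into a fixed-size two-dimensional pattern array can be sketched as follows. The frame layout (time × coefficient) and the number of retained DCT rows are assumptions for illustration, not the thesis's actual configuration.

```python
import numpy as np

def dct_basis(n):
    # Orthonormal DCT-II matrix.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] *= 1 / np.sqrt(2)
    return C

def pattern_matrix(mel_frames, rows=4):
    # mel_frames: (T, Q) array of mel-cepstral coefficients over T frames.
    # A DCT along the time axis, keeping the first `rows` coefficients,
    # yields a (rows, Q) matrix regardless of the utterance length T.
    F = np.asarray(mel_frames, float)
    C = dct_basis(F.shape[0])
    return (C @ F)[:rows]
```

The fixed shape is what makes the array usable as the input pattern of a fuzzy classifier: utterances of different durations map to matrices of identical size.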
