21. Metody pro odstranění šumu z digitálních obrazů / Digital Image Noise Reduction Methods

Čišecký, Roman. January 2012.
The master's thesis is concerned with digital image denoising methods. The theoretical part explains elementary terms related to image processing and image noise, categorizes types of noise, and defines the quality criteria of the denoising process. Particular denoising methods are also described, with their advantages and disadvantages. The practical part deals with an implementation of the selected denoising methods in Java, within the RapidMiner application environment. In conclusion, the results obtained by the different methods are compared.
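The abstract does not name the selected methods, but a minimal sketch of the transform-domain family such surveys typically cover can make the idea concrete. The function and threshold below are illustrative assumptions, unrelated to the thesis's actual Java/RapidMiner implementation:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_denoise(image: np.ndarray, threshold: float = 50.0) -> np.ndarray:
    """Zero small global 2-D DCT coefficients: small coefficients are
    dominated by broadband noise, large ones carry image structure."""
    coeffs = dctn(image.astype(float), norm="ortho")
    coeffs[np.abs(coeffs) < threshold] = 0.0     # hard threshold
    return idctn(coeffs, norm="ortho")

# Toy usage: a smooth ramp corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(0, 255, 64), np.ones(64))
noisy = clean + rng.normal(0, 20, clean.shape)
denoised = dct_denoise(noisy)
print(f"RMSE noisy {np.sqrt(np.mean((noisy - clean) ** 2)):.1f} -> "
      f"denoised {np.sqrt(np.mean((denoised - clean) ** 2)):.1f}")
```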
22. Design and analysis of Discrete Cosine Transform-based watermarking algorithms for digital images. Development and evaluation of blind Discrete Cosine Transform-based watermarking algorithms for copyright protection of digital images using handwritten signatures and mobile phone numbers.

Al-Gindy, Ahmed M.N. January 2011.
This thesis deals with the development and evaluation of blind discrete cosine transform-based watermarking algorithms for copyright protection of digital still images using handwritten signatures and mobile phone numbers. The new algorithms take into account the perceptual capacity of each low-frequency coefficient inside the Discrete Cosine Transform (DCT) blocks before embedding the watermark information. They are suitable for grey-scale and colour images. Handwritten signatures are used instead of pseudo-random numbers. The watermark is inserted in the green channel of RGB colour images and the luminance channel of YCrCb images. Mobile phone numbers are used as watermarks for images captured by mobile phone cameras. The information is embedded multiple times, and a shuffling scheme is applied to ensure that no spatial correlation exists between the original host image and the multiple watermark copies. Multiple embedding increases the robustness of the watermark against attacks, since each watermark is individually reconstructed and verified before an averaging process is applied; the averaging reduces the number of errors in the extracted information. The developed watermarking methods are shown to be robust against JPEG compression, removal attacks, additive noise, cropping, scaling, small degrees of rotation, affine transformations, contrast enhancement, low-pass and median filtering, and Stirmark attacks. The algorithms have been examined using a library of approximately 40 colour images of size 512 × 512 with 24 bits per pixel, together with their grey-scale versions. Several evaluation techniques were used in the experiments, with different watermarking strengths and different signature sizes, including the peak signal-to-noise ratio, normalized correlation and the structural similarity index. The performance of the proposed algorithms has been compared to other algorithms, achieving better invisibility and stronger robustness.
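As a rough illustration of blind DCT-domain watermarking of 8 × 8 blocks, the sketch below embeds one bit by quantizing a low-frequency coefficient so that its parity encodes the bit. The coefficient position, strength, and quantization rule are generic assumptions, not the thesis's perceptual-capacity scheme:

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_bit(block: np.ndarray, bit: int, strength: float = 12.0) -> np.ndarray:
    """Blind embedding: quantize one low-frequency DCT coefficient so
    that the parity of its quantization level encodes the bit."""
    c = dctn(block.astype(float), norm="ortho")
    q = int(np.round(c[2, 1] / strength))   # illustrative coefficient slot
    if q % 2 != bit:
        q += 1                              # flip parity to match the bit
    c[2, 1] = q * strength
    return idctn(c, norm="ortho")

def extract_bit(block: np.ndarray, strength: float = 12.0) -> int:
    """Blind extraction: no original image needed."""
    c = dctn(block.astype(float), norm="ortho")
    return int(np.round(c[2, 1] / strength)) % 2

rng = np.random.default_rng(1)
block = rng.integers(0, 256, (8, 8)).astype(float)
for bit in (0, 1):
    assert extract_bit(embed_bit(block, bit)) == bit
print("blind embed/extract round-trip OK")
```

Because the detector only re-quantizes the coefficient, no side information travels with the image, which is what makes the scheme blind.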
23. Suprasegmental representations for the modeling of fundamental frequency in statistical parametric speech synthesis

Fonseca De Sam Bento Ribeiro, Manuel. January 2018.
Statistical parametric speech synthesis (SPSS) has seen improvements over recent years, especially in terms of intelligibility. Synthetic speech is often clear and understandable, but it can also be bland and monotonous. Proper generation of natural speech prosody is still a largely unsolved problem. This is relevant especially in the context of expressive audiobook speech synthesis, where speech is expected to be fluid and captivating. In general, prosody can be seen as a layer that is superimposed on the segmental (phone) sequence. Listeners can perceive the same melody or rhythm in different utterances, and the same segmental sequence can be uttered with a different prosodic layer to convey a different message. For this reason, prosody is commonly accepted to be inherently suprasegmental. It is governed by longer units within the utterance (e.g. syllables, words, phrases) and beyond the utterance (e.g. discourse). However, common techniques for the modeling of speech prosody - and speech in general - operate mainly on very short intervals, either at the state or frame level, in both hidden Markov model (HMM) and deep neural network (DNN) based speech synthesis. This thesis presents contributions supporting the claim that stronger representations of suprasegmental variation are essential for the natural generation of fundamental frequency in statistical parametric speech synthesis. We conceptualize the problem by dividing it into three sub-problems: (1) representations of acoustic signals, (2) representations of linguistic contexts, and (3) the mapping of one representation to another. The contributions of this thesis provide novel methods and insights relating to these three sub-problems. For sub-problem 1, we propose a multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform, as well as a wavelet-based decomposition strategy that is linguistically and perceptually motivated. For sub-problem 2, we investigate additional linguistic features, such as text-derived word embeddings and syllable-level bag-of-phones features, and we propose a novel method for learning word vector representations based on acoustic counts. Finally, for sub-problem 3, insights are given regarding hierarchical models such as parallel and cascaded deep neural networks.
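To make the DCT part of sub-problem 1 concrete: a syllable- or phrase-level f0 contour can be compacted into a handful of DCT coefficients and reconstructed as a smooth curve. This is a minimal sketch under assumed parameters (order 5, a toy contour), not the thesis's full multi-level wavelet/DCT system:

```python
import numpy as np
from scipy.fft import dct, idct

def f0_to_dct(contour: np.ndarray, order: int = 5) -> np.ndarray:
    """Summarize a voiced, interpolated f0 contour by its first few
    DCT coefficients -- a compact suprasegmental parameterization."""
    return dct(contour, norm="ortho")[:order]

def dct_to_f0(coeffs: np.ndarray, length: int) -> np.ndarray:
    """Reconstruct a smooth contour of the original length."""
    full = np.zeros(length)
    full[: len(coeffs)] = coeffs
    return idct(full, norm="ortho")

# Toy syllable-sized contour: a rise-fall pitch accent in Hz.
t = np.linspace(0, 1, 30)
f0 = 120 + 40 * np.sin(np.pi * t)
approx = dct_to_f0(f0_to_dct(f0), len(f0))
print(f"max reconstruction error: {np.abs(approx - f0).max():.2f} Hz")
```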
24. Bayesian Uncertainty Quantification for Large Scale Spatial Inverse Problems

Mondal, Anirban. August 2011.
We considered a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a high-dimensional spatial field. The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources, and provides a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loève expansion and the discrete cosine transform (DCT) were used for dimension reduction of the random spatial field. Furthermore, we used a hierarchical Bayes model to inject multiscale data into the modeling framework. In this Bayesian framework, we have shown that the inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in the total variation norm. The need for multiple evaluations of the forward model on a high-dimensional spatial field (e.g. in the context of MCMC), together with the high dimensionality of the posterior, poses many computational challenges. We developed a two-stage reversible-jump MCMC method that screens out bad proposals in an inexpensive first stage. Channelized spatial fields were represented by facies boundaries and variogram-based spatial fields within each facies. Using a level-set-based approach, the shape of the channel boundaries was updated with dynamic data using a Bayesian hierarchical model in which the number of points representing the channel boundaries is assumed to be unknown. Statistical emulators on a large-scale spatial field were introduced to avoid the expensive likelihood calculation, which contains the forward simulator, at each iteration of the MCMC step. To build the emulator, the original spatial field was represented by a low-dimensional parameterization using the DCT, and a Bayesian approach to multivariate adaptive regression splines (BMARS) was then used to emulate the simulator. Various numerical results were presented by analyzing simulated as well as real data.
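A minimal sketch of the DCT-based dimension reduction described above, under assumed sizes (a 64 × 64 field reduced to an 8 × 8 coefficient block); an MCMC sampler would then explore the low-dimensional coefficient space instead of the full grid:

```python
import numpy as np
from scipy.fft import dctn, idctn

def reduce_field(field: np.ndarray, keep: int = 8) -> np.ndarray:
    """Parameterize a 2-D spatial field by its keep x keep lowest-order
    DCT coefficients (the DCT analogue of a truncated Karhunen-Loeve
    expansion)."""
    return dctn(field, norm="ortho")[:keep, :keep]

def restore_field(low: np.ndarray, shape: tuple) -> np.ndarray:
    full = np.zeros(shape)
    full[: low.shape[0], : low.shape[1]] = low
    return idctn(full, norm="ortho")

# Toy smooth permeability-like field on a 64 x 64 grid.
x, y = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
field = np.exp(-((x - 0.3) ** 2 + (y - 0.6) ** 2) / 0.1)
low = reduce_field(field)                # 4096 values -> 64 parameters
approx = restore_field(low, field.shape)
print(f"relative L2 error: "
      f"{np.linalg.norm(approx - field) / np.linalg.norm(field):.3f}")
```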
25. Sistema de inferência genético-nebuloso para reconhecimento de voz: uma abordagem em modelos preditivos de baixa ordem utilizando a transformada cosseno discreta / Genetic-fuzzy inference system for speech recognition: an approach using low-order predictive models and the discrete cosine transform

Silva, Washington Luis Santos. 20 March 2015.
This thesis proposes a methodology that uses an intelligent system for speech recognition. It adopts the definition of an intelligent system as one that can adapt its behavior to achieve its goals in a variety of environments, and the definition of computational intelligence as the simulation of intelligent behavior in terms of a computational process. In addition to pre-processing the speech signal with mel-cepstral coefficients, the discrete cosine transform (DCT) is used to generate a two-dimensional array that models each pattern to be recognized. A Mamdani fuzzy inference system for speech recognition is optimized by a genetic algorithm to maximize the number of correctly classified patterns with a reduced number of parameters. The experimental results achieved in speech recognition with the proposed methodology were compared with hidden Markov models (HMM) and with Gaussian mixture model (GMM) and support vector machine (SVM) classifiers. The recognition system developed in this thesis is called the Intelligent Methodology for Speech Recognition (IMSR).
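One plausible reading of the DCT-based pattern model, sketched under assumptions (DCT along the time axis of a mel-cepstral matrix, truncation order 4); the actual construction in the thesis may differ:

```python
import numpy as np
from scipy.fft import dct

def pattern_matrix(mel_cepstra: np.ndarray, order: int = 4) -> np.ndarray:
    """Compress a variable-length sequence of mel-cepstral frames into
    a fixed-size low-order matrix: a DCT along the time axis captures
    the coefficients' temporal trajectories, and only the first few
    are kept (order and axis choice are assumptions)."""
    time_dct = dct(mel_cepstra, axis=0, norm="ortho")   # DCT over frames
    return time_dct[:order, :order]                     # low-order block

# Toy input: 80 frames x 12 mel-cepstral coefficients for one utterance.
rng = np.random.default_rng(2)
frames = rng.normal(size=(80, 12))
pattern = pattern_matrix(frames)
print(pattern.shape)  # (4, 4) -> fixed-size input for the fuzzy classifier
```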
26. Investigation of New Techniques for Face Detection

Abdallah, Abdallah Sabry. 18 July 2007.
The task of detecting human faces within either a still image or a video frame is one of the most popular object detection problems. For the last twenty years, researchers have shown great interest in this problem because it is an essential pre-processing stage for computing systems that process human faces as input data. Example applications include face recognition systems, vision systems for autonomous robots, human-computer interaction (HCI) systems, surveillance systems, biometric-based authentication systems, video transmission and video compression systems, and content-based image retrieval systems. In this thesis, non-traditional methods are investigated for detecting human faces within color images or video frames. The methods are chosen such that the required computing power and memory consumption are adequate for real-time hardware implementation. First, a standard color image database is introduced in order to accomplish fair evaluation and benchmarking of face detection and skin segmentation approaches. Next, a new pre-processing scheme based on skin segmentation is presented to prepare the input image for feature extraction; it requires relatively little computing power and memory. Then, several feature extraction techniques are evaluated. This thesis introduces feature extraction based on the two-dimensional discrete cosine transform (2D-DCT), the two-dimensional discrete wavelet transform (2D-DWT), geometrical moment invariants, and edge detection. It also constructs hybrid feature vectors by fusing 2D-DCT coefficients with edge information, and 2D-DWT coefficients with geometrical moments. A self-organizing map (SOM) based classifier is used in all experiments to distinguish between facial and non-facial samples. Two strategies are tried for making the final decision from the output of a single SOM or multiple SOMs. Finally, an FPGA-based framework that implements the presented techniques is described, along with a partial implementation. Every presented technique has been evaluated consistently using the same dataset, and the experiments show very promising results. The highest detection rate of 89.2% was obtained when fusing DCT coefficients with edge information to construct the feature vector; the second highest rate of 88.7% was achieved by fusing DWT coefficients with geometrical moments; and the third highest rate of 85.2% was obtained by calculating the moments of edges.
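A minimal sketch of the 2D-DCT feature extraction step, assuming the low-frequency coefficients of a skin-segmented, resized candidate patch are flattened into the vector fed to the SOM (the patch size and number of kept coefficients are assumptions):

```python
import numpy as np
from scipy.fft import dctn

def dct_features(patch: np.ndarray, k: int = 6) -> np.ndarray:
    """Flatten the k x k lowest-frequency 2D-DCT coefficients of a
    candidate patch into a feature vector for the classifier."""
    return dctn(patch.astype(float), norm="ortho")[:k, :k].ravel()

rng = np.random.default_rng(3)
patch = rng.integers(0, 256, (32, 32))   # stand-in for a face candidate
print(dct_features(patch).shape)          # (36,) -> fed to the SOM
```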
27. Video extraction for fast content access to MPEG compressed videos

Jiang, Jianmin; Weng, Y. 09 June 2009.
As existing video processing technology is primarily developed in the pixel domain yet digital video is stored in compressed format, any application of those techniques to compressed videos would require decompression. For discrete cosine transform (DCT)-based MPEG compressed videos, the standard row-by-row and column-by-column inverse DCT (IDCT) of an 8 × 8 block requires 4096 multiplications and 4032 additions, although a practical implementation only requires 1024 multiplications and 896 additions. In this paper, we propose a new algorithm to extract videos directly from the MPEG compressed domain (DCT domain) without a full IDCT, described in three extraction schemes: 1) video extraction in 2 × 2 blocks with four coefficients; 2) video extraction in 4 × 4 blocks with four DCT coefficients; and 3) video extraction in 4 × 4 blocks with nine DCT coefficients. The computing cost is only 8 additions and no multiplications for the first scheme, 2 multiplications and 28 additions for the second scheme, and 47 additions (no multiplications) for the third scheme. Extensive experiments were carried out, and the results reveal that: 1) the extracted video maintains competitive quality in terms of visual perception and inspection, and 2) the extracted videos preserve the content well in comparison with fully decompressed ones in terms of histogram measurement. As a result, the proposed algorithm provides useful tools for bridging the gap between the pixel domain and the compressed domain, facilitating content analysis with low latency and high efficiency in applications such as surveillance video, interactive multimedia, and image processing.
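The first scheme can be sketched as follows: a 2 × 2 thumbnail is reconstructed from the four lowest-frequency coefficients of an 8 × 8 DCT block using separable 2-point inverse-DCT butterflies, which costs exactly 8 additions/subtractions and no multiplications (the constant scale factor is deferred to display time). This is an illustrative reconstruction of the idea, not the paper's exact code:

```python
import numpy as np
from scipy.fft import dctn   # used only to build a test DCT block

def extract_2x2(dct_block: np.ndarray) -> np.ndarray:
    """Approximate a 2x2 thumbnail of an 8x8 DCT block from its four
    lowest-frequency coefficients using separable 2-point inverse-DCT
    butterflies: 8 additions/subtractions, no multiplications."""
    c00, c01 = dct_block[0, 0], dct_block[0, 1]
    c10, c11 = dct_block[1, 0], dct_block[1, 1]
    a, b = c00 + c01, c00 - c01       # horizontal butterflies, rows 0 and 1
    c, d = c10 + c11, c10 - c11
    return np.array([[a + c, b + d],  # vertical butterflies
                     [a - c, b - d]])

rng = np.random.default_rng(4)
pixels = rng.integers(0, 256, (8, 8)).astype(float)
thumb = extract_2x2(dctn(pixels, norm="ortho"))
# The deferred scale factor is 1/4 for the orthonormal-DCT convention.
print(np.round(thumb / 4, 1))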
28. Compression d'images dans les réseaux de capteurs sans fil / Image compression in Wireless Sensor Networks

Makkaoui, Leila. 26 November 2012.
The increasing deployment of wireless camera sensor networks today enables a wide variety of applications with different objectives and constraints. The common problem across all sensor network applications, however, remains the vulnerability of sensor nodes due to their limited material resources, the most restrictive being energy: the wireless technologies available in this type of network are generally short-range and low-power, and the hardware resources (CPU, battery) are also limited. A twofold objective must therefore be met: an energy-efficient solution that also delivers good image quality at the receiver. This thesis concentrates mainly on the study and evaluation of image processing and compression methods at the camera node; we propose a new image compression method that decreases the energy consumption of sensors and thus extends the network lifetime. Experiments on a real camera sensor network platform were carried out to demonstrate the validity of our proposals, measuring aspects such as the amount of memory required by the software implementation of our algorithms, their energy consumption, and their execution time. We also present synthesis results for the proposed compression chain on FPGA and ASIC chips.
29. Sparse Fast Trigonometric Transforms

Bittens, Sina Vanessa. 13 June 2019.
No description available.
30. A Spatially-filtered Finite-difference Time-domain Method with Controllable Stability Beyond the Courant Limit

Chang, Chun. 19 July 2012.
This thesis introduces spatial filtering, a technique to extend the time step size beyond the conventional stability limit of the Finite-Difference Time-Domain (FDTD) method, at the expense of transforming field nodes between the spatial domain and the discrete spatial-frequency domain and removing undesired spatial-frequency components at every FDTD update cycle. The spatially-filtered FDTD method is demonstrated, through theory and numerical examples, to be almost as accurate as and more efficient than the conventional FDTD method. The thesis then combines spatial filtering with an existing subgridding scheme to form the spatially-filtered subgridding scheme, which is more efficient than existing subgridding schemes because it allows the time step size used in the dense mesh to exceed the dense mesh's CFL limit. However, trade-offs between accuracy and efficiency are required in complicated structures.
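A minimal 1-D sketch of the spatially-filtered FDTD idea, on a periodic grid with illustrative parameters (the thesis's grids, dimensions, and settings differ): the Courant number is deliberately set above the conventional limit, and after every update cycle the fields are transformed to the spatial-frequency domain and the unstable components are removed:

```python
import numpy as np

n, steps = 200, 300
courant = 2.0                      # deliberately above the 1-D limit of 1
# At Courant number S, spatial modes with sin(k*dx/2) > 1/S are unstable;
# keep only the rfft bins below that cutoff.
keep = int(n / np.pi * np.arcsin(1.0 / courant))

ez = np.zeros(n)
hy = np.zeros(n)

def spatial_filter(f: np.ndarray) -> np.ndarray:
    """Zero out the spatial-frequency components that would blow up."""
    F = np.fft.rfft(f)
    F[keep:] = 0.0
    return np.fft.irfft(F, n)

for t in range(steps):
    # Normalized free-space FDTD updates on a periodic 1-D grid.
    hy += courant * (np.roll(ez, -1) - ez)
    ez += courant * (hy - np.roll(hy, 1))
    ez[n // 2] += np.exp(-(((t - 30) / 10.0) ** 2))   # soft Gaussian source
    ez, hy = spatial_filter(ez), spatial_filter(hy)

print(f"max |Ez| after {steps} steps: {np.abs(ez).max():.3f}")  # stays bounded
```

Without the per-step filtering, the same loop diverges within a few dozen steps; with it, only the modes that satisfy the dispersion relation at the enlarged time step survive, which is the trade-off the thesis quantifies.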
