31

Transmission of vector quantization over a frequency-selective Rayleigh fading CDMA channel

Nguyen, Son Xuan 19 December 2005 (has links)
Recently, the transmission of vector quantization (VQ) over a code-division multiple access (CDMA) channel has received considerable attention in the research community. The complexity of optimal VQ decoding in CDMA communications is prohibitive for implementation, especially for systems with a medium or large number of users. A suboptimal approach to VQ decoding over a CDMA channel disturbed by additive white Gaussian noise (AWGN) was recently developed. Such a suboptimal decoder is built from a soft-output multiuser detector (MUD), a soft bit estimator, and the optimal soft VQ decoders of the individual users.

Due to its lower complexity and good performance, this decoding scheme is an attractive alternative to the complicated optimal decoder. It is necessary to extend the scheme to a frequency-selective Rayleigh fading CDMA channel, the channel model typically seen in mobile wireless communications. This is precisely the objective of this thesis.

Furthermore, the suboptimal decoders are obtained not only for binary phase shift keying (BPSK) but also for M-ary pulse amplitude modulation (M-PAM). This extension offers a flexible trade-off between the spectral efficiency and performance of the system. In addition, two algorithms based on distance measures and reliability processing are introduced as further alternatives to the suboptimal decoder.

Simulation results indicate that the suboptimal decoders studied in this thesis also perform very well over a frequency-selective Rayleigh fading CDMA channel.
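The soft VQ decoding idea described above can be illustrated with a toy sketch (not the thesis's actual decoder): given per-bit posterior probabilities, as a soft-output detector and soft bit estimator would supply, the decoder outputs the expectation of the codevectors over the codeword posteriors, assuming independent bits. The codebook and probabilities below are made up for illustration.

```python
import numpy as np

def hard_vq_decode(bits, codebook):
    """Hard decision: map a received bit vector to a codevector by index."""
    idx = int("".join(str(b) for b in bits), 2)
    return codebook[idx]

def soft_vq_decode(p_bit_one, codebook):
    """Soft VQ decoding sketch: expected codevector under independent
    per-bit posterior probabilities p(b_k = 1)."""
    n_bits = int(np.log2(len(codebook)))
    out = np.zeros_like(codebook[0], dtype=float)
    for idx, c in enumerate(codebook):
        # bit pattern of this codeword index, MSB first
        bits = [(idx >> (n_bits - 1 - k)) & 1 for k in range(n_bits)]
        p = 1.0
        for b, p1 in zip(bits, p_bit_one):
            p *= p1 if b == 1 else (1.0 - p1)
        out += p * np.asarray(c, dtype=float)
    return out
```

With confident bit posteriors the soft decoder reduces to the hard decision; with uncertain ones it blends codevectors, which is what makes it robust to channel noise.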
32

A Design of Multi-Session, Text Independent, TV-Recorded Audio-Video Database for Speaker Recognition

Wang, Long-Cheng 07 September 2006 (has links)
A four-session, text-independent, TV-recorded audio-video database for speaker recognition is collected in this thesis. The speaker data are used to verify the applicability of a design methodology based on Mel-frequency cepstral coefficients and Gaussian mixture models. Both single-session and multi-session problems are discussed. Experimental results indicate that a 90% correct rate can be achieved for a single-session 3000-speaker corpus, while only a 67% correct rate can be obtained for a two-session 800-speaker dataset. The performance of a multi-session speaker recognition system is greatly reduced by variability in the recording environment, the speakers' mood during recording, and other unknown factors. Increasing system performance under multi-session conditions remains a challenging task for the future, and the establishment of such a multi-session, large-scale speaker database plays an indispensable role in that task.
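As a rough illustration of the MFCC-plus-GMM methodology this abstract refers to, the sketch below scores a sequence of feature frames against per-speaker diagonal-covariance GMMs and picks the best-scoring speaker. The model parameters here are invented for illustration; a real system would train them (typically by EM) on MFCC features extracted from enrollment speech.

```python
import numpy as np

def gmm_loglik(X, weights, means, variances):
    """Average log-likelihood of frames X under a diagonal-covariance GMM."""
    X = np.atleast_2d(X)
    ll = np.zeros((X.shape[0], len(weights)))
    for m, (w, mu, var) in enumerate(zip(weights, means, variances)):
        diff = X - mu
        ll[:, m] = (np.log(w)
                    - 0.5 * np.sum(np.log(2 * np.pi * var))
                    - 0.5 * np.sum(diff ** 2 / var, axis=1))
    # log-sum-exp over mixture components, then average over frames
    mx = ll.max(axis=1, keepdims=True)
    return float(np.mean(mx[:, 0] + np.log(np.exp(ll - mx).sum(axis=1))))

def identify(X, speaker_models):
    """Pick the speaker whose GMM gives the highest average log-likelihood."""
    scores = {name: gmm_loglik(X, *m) for name, m in speaker_models.items()}
    return max(scores, key=scores.get)
```

This maximum-likelihood decision over per-speaker GMMs is the standard closed-set identification rule the abstract's experiments rest on.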
33

A design of text-independent medium-size speaker recognition system

Zheng, Shun-De 13 September 2002 (has links)
This paper presents text-independent speaker identification results for medium-size speaker populations of up to 400 speakers, using TV speech and the TIMIT database. A system based on Gaussian mixture speaker models is used for speaker identification, and experiments are conducted on both the TV and TIMIT databases. The TV-database results show medium-size population performance under TV recording conditions. These are believed to be the first speaker identification experiments on the complete 400-speaker TV database and the largest text-independent speaker identification task reported to date. Identification accuracies are 94.5% on the TV database and 98.5% on the TIMIT database.
34

Two Variants of Self-Organizing Map and Their Applications in Image Quantization and Compression

Wang, Chao-huang 22 July 2009 (has links)
The self-organizing map (SOM) is an unsupervised learning algorithm that has been successfully applied to various applications. One advantage of the SOM is its incremental property, which allows it to handle data on the fly. Over the last several decades, variants of the SOM have been used in many application domains. In this dissertation, two new SOM algorithms are developed for image quantization and compression. The first is a sample-size adaptive SOM algorithm for color quantization of images that adapts to variations in network parameters and training sample size. The sweep size of the neighborhood function is modulated by the size of the training data. In addition, a minimax distortion principle, also modulated by the training sample size, is used to search for the winning neuron. Based on the sample-size adaptive self-organizing map, we use the sampling ratio of the training data, rather than the conventional weight change between adjacent sweeps, as the stopping criterion. As a result, the learning process is significantly sped up. Experimental results show that the proposed sample-size adaptive SOM achieves much better PSNR quality and smaller PSNR variation under various combinations of network parameters and image sizes. The second algorithm is a novel classified SOM method for edge-preserving quantization of images using an adaptive subcodebook and a weighted learning rate. The subcodebook sizes of the two classes are automatically adjusted during training based on modified partial distortions that can be estimated incrementally. The proposed weighted learning rate updates the neurons efficiently regardless of how large the weighting factor is. Experimental results show that the proposed classified SOM method achieves better quality of reconstructed edge blocks and a more spread-out codebook, and incurs significantly less computational cost than competing methods.
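A minimal 1-D SOM training loop of the kind both algorithms build on is sketched below; the dissertation's sample-size-adaptive sweep control, minimax winner search, and classified subcodebooks are not reproduced, and the decay schedules here are arbitrary illustrative choices.

```python
import numpy as np

def som_train(data, n_units, sweeps=10, lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D SOM codebook on data rows; the learning rate and the
    neighborhood width shrink linearly from sweep to sweep."""
    rng = np.random.default_rng(seed)
    # initialize codebook with randomly chosen training vectors
    w = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    sigma0 = sigma0 or n_units / 2.0
    for s in range(sweeps):
        lr = lr0 * (1.0 - s / sweeps)
        sigma = max(sigma0 * (1.0 - s / sweeps), 0.5)
        for x in data:
            win = np.argmin(np.sum((w - x) ** 2, axis=1))  # winning neuron
            d = np.arange(n_units) - win
            h = np.exp(-(d ** 2) / (2 * sigma ** 2))       # neighborhood
            w += lr * h[:, None] * (x - w)                 # pull toward x
    return w
```

For color quantization, `data` would hold image pixels (e.g. RGB triples) and the trained `w` becomes the palette; every update moves neurons toward the sample, so the codebook stays inside the data's range.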
35

Efficient image compression system using a CMOS transform imager

Lee, Jungwon 12 November 2009 (has links)
This research focuses on the implementation of an efficient image compression system, one of the many potential applications of a transform imager system. The study includes implementing the image compression system using a transform imager, developing a novel image compression algorithm for the system, and improving the performance of the image compression system through efficient encoding and decoding algorithms for vector quantization. A transform imaging system is implemented using a transform imager, and the baseline JPEG compression algorithm is implemented and tested to verify the functionality and performance of the system. The computational reduction in digital processing is investigated from two perspectives, algorithmic and implementational. Algorithmically, a novel wavelet-based embedded image compression algorithm using dynamic index reordering vector quantization (DIRVQ) is proposed for the system. DIRVQ enables the proposed algorithm to achieve superior performance over the embedded zero-tree wavelet (EZW) algorithm and the successive approximation vector quantization (SAVQ) algorithm. However, because DIRVQ is computationally intensive, additional focus is placed on its efficient implementation, and a highly efficient implementation is achieved without compromising performance.
36

Telemetry Network Intrusion Detection System

Maharjan, Nadim, Moazzemi, Paria 10 1900 (has links)
ITC/USA 2012 Conference Proceedings / The Forty-Eighth Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2012 / Town and Country Resort & Convention Center, San Diego, California / Telemetry systems are migrating from links to networks. Security solutions that simply encrypt radio links no longer protect the network of Test Articles or the networks that support them. The use of network telemetry is expanding dramatically, and new risks and vulnerabilities pose challenging issues for telemetry networks. Most of these vulnerabilities are silent in nature and cannot be detected with simple tools such as traffic monitoring. An Intrusion Detection System (IDS) is a security mechanism, well suited to telemetry networks, that can help detect abnormal behavior in the network. Our previous research in network intrusion detection systems focused on "Password" attacks and "Syn" attacks. This paper presents a generalized method that can detect both. A K-means clustering algorithm is used for vector quantization of network traffic, which reduces the scope of the problem by reducing the entropy of the network data. A Hidden Markov Model (HMM) is then employed to further characterize and analyze the behavior of the network into states that can be labeled normal, attack, or anomaly. Our experiments show that the IDS can discover and expose telemetry network vulnerabilities using vector quantization and the Hidden Markov Model, providing a more secure telemetry environment. The paper shows how these techniques can be generalized into a network intrusion detection system that can be deployed on telemetry networks.
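The K-means vector-quantization step can be sketched as follows (a generic k-means, not the authors' exact configuration); the resulting label sequence is the discrete observation stream that would then be fed to an HMM for normal/attack/anomaly state labeling. Center initialization here simply spreads picks across the data rows so the sketch is deterministic.

```python
import numpy as np

def kmeans_quantize(X, k, iters=20):
    """Vector-quantize rows of X (e.g. per-window traffic features) into
    k discrete symbols with plain k-means."""
    # deterministic init: pick k rows spread evenly through the data
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        # assign every feature vector to its nearest center
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        # move each center to the mean of its assigned vectors
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers
```

Quantizing continuous traffic features into a small symbol alphabet is what makes a discrete-observation HMM applicable, and it is the entropy-reduction step the abstract describes; the HMM training itself is omitted here.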
37

Αυτόματη αναγνώριση ομιλητή χρησιμοποιώντας μεθόδους ταυτοποίησης κλειστού συνόλου / Automatic speaker recognition using closed-set recognition methods

Κεραμεύς, Ηλίας 03 August 2009 (has links)
The goal of an automatic speaker recognition system is inextricably linked to extracting, characterizing, and recognizing information about a speaker's identity. Speaker recognition refers to either speaker identification or speaker verification. Depending on the form of the decision it returns, an identification system can be characterized as open-set or closed-set. If, given an unknown voice sample, the system responds with a deterministic decision as to whether the sample belongs to a specific speaker or to an unknown one, it is characterized as an open-set identification system. If, on the other hand, the system returns the most likely speaker, among those already enrolled in the database, from whom the voice sample originates, it is characterized as a closed-set system. Closed-set identification can further be characterized as text-dependent or text-independent, depending on whether the system knows the uttered phrase or is able to recognize the speaker from any phrase he or she may utter. This work examines and implements automatic speaker recognition algorithms based on closed-set, text-independent identification systems. Specifically, algorithms based on the idea of vector quantization, on stochastic models, and on neural networks are implemented.
38

Second-order prediction and residue vector quantization for video compression / Prédiction de second ordre et résidu par quantification vectorielle pour la compression vidéo

Huang, Bihong 08 July 2015 (has links)
Video compression has become a mandatory step in a wide range of digital video applications. Since the development of the block-based hybrid coding approach in the H.261/MPEG-2 standard, a new coding standard has been ratified roughly every ten years, each achieving approximately 50% bit-rate reduction compared to its predecessor without sacrificing picture quality. However, due to the ever-increasing bit rates required for the transmission of HD and beyond-HD formats within a limited bandwidth, there is a continuing need for new video compression technologies that provide higher coding efficiency than the current HEVC video coding standard. In this thesis, we propose three approaches to improve the intra coding efficiency of the HEVC standard by exploiting the correlation of the intra prediction residue. A first approach, based on the use of previously decoded residues, shows that even though gains are theoretically possible, the extra signaling cost can negate the benefit of residual prediction. A second approach, based on Mode Dependent Vector Quantization (MDVQ) of the residue prior to the conventional transform and scalar quantization step, provides significant coding gains. We show that this approach is realistic because the dictionaries are independent of the QP and of reasonable size. Finally, a third approach gradually adapts the dictionaries to the intra prediction residue. The adaptivity provides a substantial gain, especially when the video content is atypical, without increasing decoding complexity. In the end, we obtain a gain-complexity trade-off compatible with a submission for standardization.
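The MDVQ step can be pictured as a per-mode nearest-codeword search applied to the residual before the transform and scalar quantization; the sketch below is schematic, and the dictionaries are illustrative stand-ins rather than trained HEVC-mode codebooks.

```python
import numpy as np

def mdvq_quantize(residual, mode, dictionaries):
    """Mode Dependent VQ sketch: select the codebook associated with the
    intra prediction mode, then replace the residual vector by its
    nearest codeword (squared-error criterion)."""
    codebook = dictionaries[mode]
    idx = int(np.argmin(np.sum((codebook - residual) ** 2, axis=1)))
    return idx, codebook[idx]
```

Only the codeword index needs signaling, and because each intra mode shapes its residuals differently, keeping a separate small dictionary per mode is what makes the quantizer both compact and effective.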
39

Speech Recognition Using a Synthesized Codebook

Smith, Lloyd A. (Lloyd Allen) 08 1900 (has links)
Speech sounds generated by a simple waveform synthesizer were used to create a vector quantization codebook for use in speech recognition. Recognition was tested on the TI-20 isolated-word database using a conventional DTW matching algorithm. Input speech was band-limited to 300-3300 Hz, then passed through the Scott Instruments Corp. Coretechs process, implemented on a VET3 speech terminal, to create the speech representation for matching. Synthesized sounds were processed in software by a VET3 signal-processing emulation program. Emulation and recognition were performed on a DEC VAX 11/750. The experiments were organized in two series. A preliminary experiment, using no vector quantization, provided a baseline for comparison. The original codebook contained 109 vectors, all derived from two-formant synthesized sounds. This codebook was decimated over the course of the first series of experiments, based on the number of times each vector was used in quantizing the training data for the previous experiment, in order to determine the smallest subset of vectors suitable for coding the speech database. The second series of experiments altered several test conditions in order to evaluate the applicability of the minimal synthesized codebook to conventional codebook training. The baseline recognition rate was 97%. The recognition rate for synthesized codebooks was approximately 92% for sizes ranging from 109 down to 16 vectors; accuracy for smaller codebooks was slightly below 90%. Error analysis showed that the primary loss in dropping below 16 vectors was in the coding of voiced sounds with high-frequency second formants. The 16-vector synthesized codebook was chosen as the seed for the second series of experiments. After one training iteration, and using a normalized distortion score, trained codebooks performed with an accuracy of 95.1%. When codebooks were trained and tested on different sets of speakers, accuracy was 94.9%, indicating that very little speaker dependence was introduced by the training.
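The conventional DTW matching used above can be sketched in a few lines; this is the textbook dynamic-programming formulation, not the VET3 implementation, and the element distance is a placeholder for a frame-level distortion on quantized speech representations.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Dynamic time warping distance between two sequences: minimum
    cumulative element cost over all monotonic alignments."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            # extend the cheapest of the three allowed predecessor paths
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Isolated-word recognition then classifies an utterance by the reference template with the smallest DTW distance, which is why the warping's tolerance to timing variation matters more than exact frame alignment.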
40

Speaker Identification Based On Discriminative Vector Quantization And Data Fusion

Zhou, Guangyu 01 January 2005 (has links)
Speaker identification (SI) approaches based on discriminative vector quantization (VQ) and data fusion techniques are presented in this dissertation. The SI approaches based on discriminative VQ (DVQ) proposed here are the DVQ for SI (DVQSI) method, the DVQSI with unique speech-feature-vector-space segmentation for each speaker pair (DVQSI-U), and the adaptive DVQSI (ADVQSI) method. The difference between the probability distributions of the speech feature vector sets of various speakers (or speaker groups) is called the interspeaker variation; it measures the template differences between speakers (or speaker groups). All of the DVQ-based techniques presented in this contribution take advantage of the interspeaker variation, which is not exploited in previously proposed techniques that employ traditional VQ for SI (VQSI). All DVQ-based techniques have two modes, a training mode and a testing mode. In the training mode, the speech feature vector space is first divided into a number of subspaces based on the interspeaker variations. Then a discriminative weight is calculated for each subspace of each speaker or speaker pair in the SI group, based on the interspeaker variation. Subspaces with higher interspeaker variation are assigned larger discriminative weights and thus play a more important role in SI than subspaces with lower interspeaker variation. In the testing mode, discriminatively weighted average VQ distortions, instead of equally weighted average VQ distortions, are used to make the SI decision. The DVQ-based techniques lead to higher SI accuracies than VQSI. The DVQSI and DVQSI-U techniques consider the interspeaker variation of each speaker pair in the SI group. In DVQSI, the speech-feature-vector-space segmentation is exactly the same for all speaker pairs, whereas in DVQSI-U each speaker pair is treated individually in the segmentation. In both DVQSI and DVQSI-U, the discriminative weights for each speaker pair are calculated by trial and error. The SI accuracies of DVQSI-U are higher than those of DVQSI, at the price of a much higher computational burden. ADVQSI explores the interspeaker variation between each speaker and all speakers in the SI group. In contrast with DVQSI and DVQSI-U, the feature-space segmentation in ADVQSI is performed per speaker rather than per speaker pair, based on the interspeaker variation between each speaker and all the speakers in the SI group. Adaptive techniques are also used in the computation of the discriminative weights for each speaker in ADVQSI. The SI accuracies of ADVQSI and DVQSI-U are comparable, but the computational complexity of ADVQSI is much lower than that of DVQSI-U. In addition, a novel algorithm that converts the raw distortion outputs of template-based SI classifiers into compatible probability measures is proposed in this dissertation. After this conversion, data fusion techniques at the measurement level can be applied to SI. In the proposed technique, stochastic models of the distortion outputs are estimated, the posterior probabilities that the unknown utterance belongs to each speaker are calculated, and compatible probability measures are assigned based on these posterior probabilities. The proposed technique leads to better SI performance at the measurement level than existing approaches.
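The testing-mode decision rule, a discriminatively weighted average VQ distortion minimized over speakers, might be sketched as below. The `subspace_of` partition and `weights` are hypothetical stand-ins (here a single subspace), whereas the dissertation derives them from the interspeaker variation during training.

```python
import numpy as np

def weighted_vq_score(X, codebook, subspace_of, weights):
    """Discriminatively weighted average VQ distortion: each feature
    vector's nearest-codeword distortion is scaled by the weight of the
    subspace it falls in (subspace_of / weights are illustrative)."""
    total = 0.0
    for x in X:
        d = np.min(np.sum((codebook - x) ** 2, axis=1))  # VQ distortion
        total += weights[subspace_of(x)] * d
    return total / len(X)

def identify_speaker(X, models):
    """Closed-set decision: the speaker with the smallest weighted
    average distortion wins."""
    return min(models, key=lambda s: weighted_vq_score(X, *models[s]))
```

With all weights equal this collapses to conventional VQSI; the discriminative gain comes from up-weighting subspaces where speakers' distributions differ most.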