1

Multiple Global Affine Motion Models Used in Video Coding

Li, Xiaohuan 05 March 2007 (has links)
In low bit rate scenarios, a hybrid video coder (e.g. AVC/H.264) tends to allocate a greater portion of bits to motion vectors while saving bits on residual errors. Based on this observation, a coding scheme is proposed that combines non-normative global motion models with conventional local motion vectors: the motion of a frame is described by affine motion parameter sets obtained through motion segmentation of the luminance channel. The motion segmentation adapts the number of motion objects to the video content. The 6-D affine model sets are derived by linear regression from the scalable block-based motion fields estimated by the existing MPEG encoder. If the number of motion objects exceeds a certain threshold, the global affine models are disabled. Otherwise, the four scaling factors of the affine models are compressed by a vector quantizer designed with a dedicated cache memory for efficient searching and coding. The affine motion information is written into the slice header as a syntax element. The global motion information is used to compensate those macroblocks whose Lagrange cost is minimized by the AFFINE mode. The rate-distortion cost is computed by a modified Lagrange equation that takes into account the perceptual sensitivity of human vision in different areas. Besides increasing coding efficiency, the global affine model offers two features that improve the compressed video quality: i) when a frame contains more than one slice, the global affine motion model enhances the error resilience of the video stream, because the affine motion parameters are duplicated in the headers of the different slices of the same frame; and ii) the global motion model predicts a frame by warping the whole reference frame, which helps to reduce blocking artifacts in the compensated frame.
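To make the regression step concrete, the following Python sketch fits a 6-parameter affine global-motion model to block-based motion vectors by least squares, in the spirit of the scheme described above. It is an illustrative assumption, not the thesis implementation; the block centers, the synthetic motion field, and the frame layout are invented for the example.

```python
# A minimal sketch (not the thesis implementation) of fitting a 6-parameter
# affine global-motion model to block-based motion vectors by linear
# regression. Block centers, motion vectors and the frame partition below
# are illustrative assumptions.
import numpy as np

def fit_affine_motion(centers, motion_vectors):
    """Least-squares fit of mv = A @ [x, y, 1] for one motion object.

    centers:        (N, 2) array of block-center coordinates (x, y)
    motion_vectors: (N, 2) array of estimated motion vectors (mvx, mvy)
    returns:        (2, 3) affine parameter matrix [[a1, a2, a3], [a4, a5, a6]]
    """
    X = np.hstack([centers, np.ones((len(centers), 1))])   # (N, 3) design matrix
    # Solve X @ params ~= motion_vectors in the least-squares sense.
    params, *_ = np.linalg.lstsq(X, motion_vectors, rcond=None)
    return params.T                                         # (2, 3)

def warp_prediction(params, x, y):
    """Affine-predicted motion vector at position (x, y)."""
    a = params
    return (a[0, 0] * x + a[0, 1] * y + a[0, 2],
            a[1, 0] * x + a[1, 1] * y + a[1, 2])

# Toy usage: 16x16 blocks of a 64x64 region undergoing a small zoom plus pan.
xs, ys = np.meshgrid(np.arange(8, 64, 16), np.arange(8, 64, 16))
centers = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
true_mv = 0.02 * centers + np.array([1.5, -0.5])            # synthetic field
A = fit_affine_motion(centers, true_mv)
print(warp_prediction(A, 32, 32))
```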
2

Person Identification Based on Karhunen-Loeve Transform

Chen, Chin-Ta 16 July 2004 (has links)
In this dissertation, person identification systems based on the Karhunen-Loeve transform (KLT) are investigated. Both speaker and face recognition are considered in our design. Among the many system design issues, three important problems are addressed in this thesis: how to improve the correct classification rate, how to reduce the computational cost, and how to increase the robustness of the system. Improving the correct classification rate and reducing the computational cost of the person identification system can be accomplished through appropriate feature design. The KLT and the hard-limited KLT (HLKLT) are proposed here to extract class-related features. Theoretically, the KLT is the optimal transform in the minimum mean square error and maximal energy packing sense. The transformed data are completely uncorrelated, and most of the classification information is concentrated in the first few coordinates. Therefore, a satisfactory correct classification rate can be achieved using only the first few KLT-derived eigenfeatures. In this data transformation, the transformed data are calculated from the inner products of the original samples and the selected eigenvectors, which requires floating-point arithmetic. If this linear transformation can be reduced to integer arithmetic, the time required for both feature training and classification is greatly reduced. The hard-limiting process (HLKLT) extracts the zero-crossing information of the eigenvectors, which is hypothesized to contain important information for classification. This kind of feature greatly simplifies the linear transformation, since the computation involves only integer arithmetic. In this thesis, it is demonstrated that the hard-limited KL transform has a much simpler structure than the KL transform while achieving approximately the same excellent performance for both speaker identification and face recognition. Moreover, a hybrid KLT/GMM speaker identification system is proposed to improve the classification rate and save computational time. The increase in the correct rate comes from the fact that two different sets of speech features are applied in the hybrid system: the KLT features and the MFCC features of the Gaussian mixture speaker model (GMM). Furthermore, this hybrid system performs classification sequentially. In the first stage, the relatively faster KLT features are used as an initial candidate selection tool to discard those speakers with larger separability. In the second stage, the GMM is used as the final recognizer to make the ultimate decision. Therefore, only a small portion of the speakers needs to be discriminated in the time-consuming GMM stage. Our results show that the combination benefits both classification accuracy and computational cost. The hybrid KLT/GMM design is also applied to a robust speaker identification system: under both additive white Gaussian noise (AWGN) and car noise environments, accuracy improvements and computational savings compared to the conventional GMM model are demonstrated. Finally, a genetic algorithm (GA) is proposed to improve the speaker identification performance of the vector quantizer (VQ) by avoiding the local minima typically incurred in the LBG process. The results indicate that this scheme is useful for our recognition application in practice.
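The sketch below illustrates, under stated assumptions, how KLT features and their hard-limited variant might be computed: eigenvectors of the sample covariance provide floating-point projections, while their signs (the zero-crossing information) give an integer-arithmetic approximation. This is not the dissertation's code; the training matrix, the number of retained eigenfeatures, and the random probe vector are placeholders.

```python
# A minimal sketch, not the thesis code, of KLT feature extraction and the
# hard-limited variant: eigenvectors are replaced by their signs so the
# projection needs only additions and subtractions of the input samples.
# The training matrix and the number of retained features are assumptions.
import numpy as np

def klt_basis(training, num_features):
    """Eigenvectors of the sample covariance, ordered by decreasing eigenvalue.

    training: (N, D) matrix of feature vectors (e.g. speech frames or face images)
    returns:  mean (D,), basis (num_features, D)
    """
    mean = training.mean(axis=0)
    cov = np.cov(training - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:num_features]
    return mean, eigvecs[:, order].T

def klt_features(x, mean, basis):
    """Floating-point KLT features: inner products with the eigenvectors."""
    return basis @ (x - mean)

def hlklt_features(x, mean, basis):
    """Hard-limited KLT: keep only the sign (zero crossings) of each
    eigenvector, so the projection reduces to adding/subtracting samples."""
    hard_basis = np.sign(basis).astype(np.int64)    # entries in {-1, 0, +1}
    return hard_basis @ (x - mean)

# Toy usage with random data standing in for speaker/face training vectors.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 64))
mean, basis = klt_basis(train, num_features=8)
probe = rng.normal(size=64)
print(klt_features(probe, mean, basis)[:3])
print(hlklt_features(probe, mean, basis)[:3])
```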
3

ARAVQ som datareducerare för en klassificeringsuppgift inom datautvinning

Ahlén, Niclas January 2004 (has links)
The Adaptive Resource Allocating Vector Quantizer (ARAVQ) is a data reduction technique for mobile robots. The technique has proven successful in simple environments, and it has been speculated that it could serve as a general data mining tool for time series. This report presents experiments in which ARAVQ is used as a data reducer on an artificial and a physiological data set within a data mining context. These data sets differ from earlier robotics environments in that they describe objects with diffuse or overlapping boundaries in the input space. After data reduction, each data set is classified using artificial neural networks. The experimental results indicate that classification with ARAVQ as a data reducer achieves considerably lower performance than when ARAVQ is not used as a data reducer. This is assumed to be partly due to the low generalizability of the solutions created by ARAVQ. The discussion proposes complementing ARAVQ with a neighborhood function, corresponding to the one found in the Self-Organizing Map. With a neighborhood, the relations between the clusters created by ARAVQ are preserved, which is assumed to reduce the consequences of a description ending up in a neighboring cluster.
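A minimal Python sketch of an ARAVQ-style quantizer used as a data reducer is given below. It follows the general idea of allocating a new model vector when a buffer of recent inputs lies far from all existing model vectors, but it is a simplification, not the exact algorithm evaluated in the report; the buffer size, novelty threshold, and learning rate are assumptions.

```python
# A simplified sketch of an ARAVQ-style quantizer used as a data reducer,
# not the exact algorithm from the report: new model vectors are allocated
# when the recent-input buffer lies far from all existing model vectors.
# Buffer size, threshold and learning rate below are assumptions.
import numpy as np

class SimpleARAVQ:
    def __init__(self, buffer_size=5, novelty_threshold=0.3, learning_rate=0.1):
        self.buffer_size = buffer_size
        self.delta = novelty_threshold
        self.alpha = learning_rate
        self.buffer = []            # most recent inputs (sliding window)
        self.models = []            # allocated model vectors

    def step(self, x):
        """Feed one time-series sample; return the index of the winning model vector."""
        self.buffer.append(np.asarray(x, dtype=float))
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)
        mean = np.mean(self.buffer, axis=0)
        spread = np.mean([np.linalg.norm(b - mean) for b in self.buffer])

        if not self.models:
            self.models.append(mean.copy())
            return 0

        dists = [np.linalg.norm(mean - m) for m in self.models]
        winner = int(np.argmin(dists))
        # Allocate a new model vector only for sufficiently novel input regions.
        if dists[winner] - spread > self.delta and len(self.buffer) == self.buffer_size:
            self.models.append(mean.copy())
            return len(self.models) - 1
        # Otherwise nudge the winning model vector toward the buffer mean.
        self.models[winner] += self.alpha * (mean - self.models[winner])
        return winner

# Toy usage: a noisy 1-D signal that switches between two levels.
rng = np.random.default_rng(1)
signal = np.concatenate([np.zeros(50), np.ones(50), np.zeros(50)]) + 0.05 * rng.normal(size=150)
q = SimpleARAVQ()
labels = [q.step([s]) for s in signal]
print(len(q.models), "model vectors allocated")
```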
