241 |
Audio-video based handwritten mathematical content recognition / Vemulapalli, Smita, 12 November 2012 (has links)
Recognizing handwritten mathematical content is a challenging problem, and more so when such content appears in classroom videos. However, given that in such videos the handwritten text and the accompanying audio refer to the same content, a combination of video- and audio-based recognizers has the potential to significantly improve content recognition accuracy. This dissertation, using a combination of video- and audio-based recognizers, focuses on improving the recognition accuracy associated with handwritten mathematical content in such videos.
Our approach makes use of a video recognizer as the primary recognizer and a multi-stage assembly, developed as part of this research, is used to facilitate effective combination with an audio recognizer. Specifically, we address the following challenges related to audio-video based handwritten mathematical content recognition: (1) Video Preprocessing - generates a timestamped sequence of segmented characters from the classroom video in the face of occlusions and shadows caused by the instructor, (2) Ambiguity Detection - determines the subset of input characters that may have been incorrectly recognized by the video based recognizer and forwards this subset for disambiguation, (3) A/V Synchronization - establishes correspondence between the handwritten character and the spoken content, (4) A/V Combination - combines the synchronized outputs from the video and audio based recognizers and generates the final recognized character, and (5) Grammar Assisted A/V Based Mathematical Content Recognition - utilizes a base mathematical speech grammar for both character and structure disambiguation. Experiments conducted using videos recorded in a classroom-like environment demonstrate the significant improvements in recognition accuracy that can be achieved using our techniques.
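As a rough illustration of the A/V Combination stage (not the dissertation's actual multi-stage assembly), the sketch below fuses per-character scores from a video-based and an audio-based recognizer by a weighted log-linear combination; the scores, weights, and symbol set are invented for illustration.

```python
import math

# Hypothetical per-character posteriors from the two recognizers (assumed values).
video_scores = {"x": 0.40, "y": 0.35, "2": 0.25}   # video-based recognizer output
audio_scores = {"x": 0.70, "y": 0.20, "2": 0.10}   # audio-based recognizer output

def combine_av(video, audio, w_video=0.6, w_audio=0.4):
    """Weighted log-linear combination of video and audio character scores."""
    combined = {}
    for symbol in set(video) | set(audio):
        pv = video.get(symbol, 1e-9)   # floor to avoid log(0)
        pa = audio.get(symbol, 1e-9)
        combined[symbol] = w_video * math.log(pv) + w_audio * math.log(pa)
    # Return symbols ranked from most to least likely under the combined score.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

print(combine_av(video_scores, audio_scores)[0][0])  # best joint hypothesis, here "x"
```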
|
242 |
Improving High Quality Concatenative Text-to-Speech Using the Circular Linear Prediction Model / Shukla, Sunil Ravindra, 10 January 2007 (has links)
Current high quality text-to-speech (TTS) systems are based on unit selection from a large database that is both contextually and prosodically rich. These systems, albeit capable of natural voice quality, are computationally expensive and require a very large footprint. Their success is attributed to the dramatic reduction of storage costs in recent times. However, for many TTS applications a smaller footprint is becoming a standard requirement. This thesis presents a new method for representing speech segments that can improve the quality and/or reduce the footprint of current concatenative TTS systems. The circular linear prediction (CLP) model is revisited and combined with the constant pitch transform (CPT) to provide a robust representation of speech signals that allows for limited prosodic movements without a perceivable loss in quality. The CLP model assumes that each frame of voiced speech is an infinitely periodic signal. This assumption allows for LPC modeling using the covariance method with the efficiency of the autocorrelation method. The CPT is combined with this model to provide a database that is uniform in pitch for matching the target prosody during synthesis. With this representation, limited prosody modifications and unit concatenation can be performed without causing audible artifacts. To resolve artifacts caused by pitch modifications at voicing transitions, a method is introduced for reducing peakiness in the LP spectra by constraining the line spectral frequencies. Two experiments have been conducted to demonstrate the capabilities of the CLP/CPT method. The first is a listening test that determines the ability of the model to realize prosody modifications without perceivable degradation: utterances are resynthesized using the CLP/CPT method with emphasized prosodics to increase intelligibility in harsh environments. The second experiment compares the quality of utterances synthesized by unit-selection-based limited-domain TTS against the CLP/CPT method. The results demonstrate that the CLP/CPT representation, applied to current concatenative TTS systems, can reduce the size of the database and increase the prosodic richness without noticeable degradation in voice quality.
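A minimal sketch of the periodicity assumption behind CLP, under the simplifying assumption that one analysis frame spans exactly one pitch period: the autocorrelation is computed circularly and the usual linear-prediction normal equations are solved. The frame data, model order, and helper names are illustrative; this is not the thesis's implementation (no CPT, no LSF constraints).

```python
import numpy as np

def circular_autocorr(frame, max_lag):
    """Autocorrelation of a frame treated as one period of an infinitely periodic signal."""
    n = len(frame)
    return np.array([np.dot(frame, np.roll(frame, -k)) / n for k in range(max_lag + 1)])

def clp_coefficients(frame, order=10):
    """Solve the linear-prediction normal equations built from circular autocorrelation."""
    r = circular_autocorr(frame, order)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * np.eye(order)           # small diagonal loading for numerical safety
    a = np.linalg.solve(R, r[1:order + 1])
    return a                            # predictor: x[n] ~ sum_k a[k] * x[n-1-k]

# Illustrative use on a synthetic "voiced" frame spanning one 80-sample pitch period.
t = np.arange(80)
frame = np.sin(2 * np.pi * t / 80) + 0.3 * np.sin(4 * np.pi * t / 80)
print(clp_coefficients(frame, order=4))
```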
|
243 |
Learning, probabilistic, and asynchronous technologies for an ultra efficient datapath / Marr, Bo, 17 November 2009 (has links)
A novel microarchitecture and circuit design techniques are presented for an asynchronous datapath that not only exhibits an extremely high rate of performance but is also energy efficient. A 0.5 um chip containing test circuits for the asynchronous datapath was fabricated and tested. Results show an adder and multiplier design that, owing to 2-dimensional bit-pipelining techniques, speculative completion, dynamic asynchronous circuits, and bit-level reservation stations and reorder buffers, can commit 16-bit additions and multiplications at 1 giga-operation per second (GOPS). The synchronicity simulator is also presented; it simulates the same architecture at more modern transistor nodes, showing adder and multiplier performance of up to 11.1 GOPS in a commercially available 65 nm process. Compared to other designs and results, these prove to be some of the fastest, if not the fastest, adders and multipliers to date. The chip was also tested at supply voltages below threshold, making it extremely energy efficient. The asynchronous architecture also allows more exotic technologies, which are presented. Learning digital circuits are presented, in which the current supplied to a digital gate can be dynamically updated using floating-gate technology. Probabilistic digital signal processing is also presented, in which the probabilistic operation arises from the statistical delay through the asynchronous circuits. Results show successful image processing with probabilistic operation in the least significant bits of the datapath, resulting in large performance and energy gains.
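As a toy illustration of tolerating probabilistic operation in the least significant bits, the simulation below (an assumption-laden software model, not the fabricated chip's behavior) flips the low result bits of a 16-bit addition with some probability and measures the resulting error on image-like data.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_add(a, b, lsb_bits=2, flip_prob=0.1, width=16):
    """16-bit addition whose lowest `lsb_bits` result bits may flip with probability `flip_prob`."""
    result = (a + b) & (2**width - 1)
    for bit in range(lsb_bits):
        flips = rng.random(result.shape) < flip_prob
        result = np.where(flips, result ^ (1 << bit), result)
    return result

# Brightness shift on a synthetic 8-bit "image" held in a 16-bit datapath.
image = rng.integers(0, 200, size=(64, 64), dtype=np.uint16)
exact = image + 20
approx = noisy_add(image, np.uint16(20))
mse = np.mean((exact.astype(float) - approx.astype(float)) ** 2)
print(f"mean squared error from LSB noise: {mse:.2f}")  # small: errors stay in the low bits
```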
|
244 |
Statistical methods for feature extraction in shape analysis and bioinformatics / Le Faucheur, Xavier Jean Maurice, 05 April 2010 (has links)
The presented research explores two different problems of statistical data analysis.
In the first part of this thesis, a method for 3D shape representation, compression and smoothing is presented. First, a technique for encoding non-spherical surfaces using second generation wavelet decomposition is described. Second, a novel model is proposed for wavelet-based surface enhancement. This part of the work aims to develop an efficient algorithm for removing irrelevant and noise-like variations from 3D shapes. Surfaces are encoded using second generation wavelets, and the proposed methodology consists of separating noise-like wavelet coefficients from those contributing to the relevant part of the signal. The empirical Bayesian models developed in this thesis threshold wavelet coefficients in an adaptive and robust manner. Once thresholding is performed, irrelevant coefficients are removed and the inverse wavelet transform is applied to the remaining clean set of coefficients. Experimental results show the efficiency of the proposed technique for surface smoothing and compression.
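As a one-dimensional stand-in for the surface-denoising idea (the thesis uses second-generation wavelets on surfaces and empirical Bayesian thresholds, neither of which is reproduced here), the sketch below soft-thresholds detail coefficients of a noisy signal with the universal threshold and inverts the transform; PyWavelets and the synthetic signal are assumptions.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def denoise_1d(signal, wavelet="db4", level=4):
    """Soft-threshold detail coefficients and reconstruct (1D stand-in for surface smoothing)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise estimate from the finest detail band; universal threshold as a simple stand-in
    # for the empirical Bayesian thresholding developed in the thesis.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))
    denoised = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[:len(signal)]

x = np.linspace(0, 1, 1024)
clean = np.sin(6 * np.pi * x)
noisy = clean + 0.2 * np.random.default_rng(1).normal(size=x.size)
print(np.mean((denoise_1d(noisy) - clean) ** 2))  # well below the 0.04 noise variance
```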
The second part of this thesis proposes using a non-parametric clustering method for studying RNA (ribonucleic acid) conformations. The local conformation of RNA molecules is an important factor in determining their catalytic and binding properties. RNA conformations can be characterized by a finite set of parameters that define the local arrangement of the molecule in space. Their analysis is particularly difficult due to the large number of degrees of freedom, such as torsion angles and inter-atomic distances among interacting residues. In order to understand and analyze the structural variability of RNA molecules, this work proposes a methodology for detecting repetitive conformational sub-structures along RNA strands. Clusters of similar structures in the conformational space are obtained using a nearest-neighbor search method based on the statistical-mechanical Potts model. The proposed technique is a mostly automatic clustering algorithm and, in contrast to many other clustering techniques, may be applied to problems where there is no prior knowledge of the structure of the data space. First, results are reported both for single-residue conformations, where the parameter set of the data space includes four to seven torsion angles, and for base-pair geometries. For both types of data sets, a very good match is observed between the results of the proposed clustering method and other known classifications, with only a few exceptions. Second, new results are reported for base-stacking geometries. In this case, the proposed classification is validated with respect to specific geometrical constraints, while the content and geometry of the new clusters are fully analyzed.
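A much-simplified stand-in for the clustering step, using threshold-graph clustering with a wrap-around distance on torsion angles rather than the statistical-mechanical Potts model itself; the angle data and distance cutoff are invented for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def circular_distance(a, b):
    """Distance between torsion-angle vectors (degrees), wrapping at 360."""
    d = np.abs(a - b) % 360.0
    return np.linalg.norm(np.minimum(d, 360.0 - d))

def cluster_conformations(angles, cutoff=30.0):
    """Group conformations whose torsion-angle vectors lie within `cutoff` of a neighbour."""
    n = len(angles)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if circular_distance(angles[i], angles[j]) < cutoff:
                adj[i, j] = adj[j, i] = 1
    _, labels = connected_components(csr_matrix(adj), directed=False)
    return labels

# Illustrative use: two tight groups of 4-angle conformations plus one outlier.
angles = np.array([[60, 180, 290, 60], [62, 178, 288, 58], [61, 181, 292, 61],
                   [170, 60, 100, 250], [172, 58, 98, 252], [350, 10, 200, 120]])
print(cluster_conformations(angles))  # e.g. [0 0 0 1 1 2]
```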
|
245 |
Avian musing feature space analysis / Colón, Guillermo J., 24 May 2012 (has links)
The purpose of this study was to analyze the possibility of using known signal processing and machine learning algorithms to correlate environmental data with chicken vocalizations. The specific musing to be analyzed consists not of a single chicken's vocalizations but of those of a whole collective; it therefore becomes a chatter problem. Similar attempts at such a correlation have been made in the past, but with singled-out birds rather than a multitude. This study was performed on broiler chickens (birds used in meat production).

One reason this correlation is useful is for an automated control system. Using the chickens' own vocalizations to determine the temperature, the humidity, and the level of ammonia, among other environmental factors, reduces, and might even remove, the need for sophisticated sensors. Another factor this study aimed to correlate with vocalization was stress in the chickens. This has great implications for animal welfare, to guarantee that the animals are being properly taken care of. It has also been shown that the meat of non-stressed chickens is of much better quality than that of stressed ones.

The audio was filtered and certain features were extracted to predict stress. The features considered were loudness, spectral centroid, spectral sparsity, temporal sparsity, transient index, temporal average, temporal standard deviation, temporal skewness, and temporal kurtosis (a sketch of two of these features appears below).

In the end, of all the features analyzed, kurtosis and loudness proved to be the best features for identifying stressed birds in audio.
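A minimal sketch of how two of the listed features, spectral centroid and temporal kurtosis, might be computed per frame from an audio signal; the frame length, hop size, and synthetic signal are assumptions rather than the study's actual pipeline.

```python
import numpy as np
from scipy.stats import kurtosis

def frame_features(audio, sr, frame_len=1024, hop=512):
    """Per-frame spectral centroid (Hz) and temporal kurtosis of an audio signal."""
    feats = []
    for start in range(0, len(audio) - frame_len + 1, hop):
        frame = audio[start:start + frame_len]
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame_len)))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
        centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
        feats.append((centroid, kurtosis(frame)))
    return np.array(feats)

# Illustrative use on a synthetic one-second "vocalization" at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 3000 * t) * np.exp(-3 * t)
print(frame_features(audio, sr)[:3])
```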
|
246 |
Channel equalization to achieve high bit rates in discrete multitone systems / Ding, Ming, 28 August 2008 (has links)
Abstract not available.
|
247 |
Discrete-time crossing-point estimation for switching power converters / Smecher, Graeme, January 2008 (has links)
In a number of electrical engineering problems, so-called "crossing points" (the instants at which two continuous-time signals cross each other) are of interest. Often, particularly in applications using a Digital Signal Processor (DSP), only periodic samples along with a partial statistical characterization of the signals are available. In this situation, we are faced with the following problem: given limited information about these signals, how can we efficiently and accurately estimate their crossing points?

For example, an audio amplifier typically receives its input from a digital source decoded into regular samples (e.g. from MP3, DVD, or CD audio), or obtained from a continuous-time signal using an analog-to-digital converter (ADC). In a switching amplifier based on Pulse-Width Modulation (PWM) or Click Modulation (CM), a signal derived from the sampled audio is compared against a deterministic reference waveform; the crossing points of these signals control a switching power stage. Crossing-point estimates must be accurate in order to preserve audio quality. They must also be simple to calculate, in order to minimize processing requirements and delays.

We consider estimating the crossing points of a known function and a Gaussian random process, given uniformly-spaced, noisy samples of the random process for which the second-order statistics are assumed to be known. We derive the Maximum A-Posteriori (MAP) estimator, along with a Minimum Mean-Squared Error (MMSE) estimator which we show to be a computationally efficient approximation to the MAP estimator.

We also derive the Cramer-Rao bound (CRB) on estimator variance for the problem, which allows practical estimators to be evaluated against a best-case performance limit. We investigate several comparison estimators chosen from the literature. The structure of the MMSE estimator and the comparison estimators is shown to be very similar, making the difference in computational expense between the techniques largely dependent on the cost of evaluating various (generally non-linear) functions.

Simulations for both Pulse-Width and Click Modulation scenarios show the MMSE estimator performs very near to the Cramer-Rao bound and outperforms the alternative estimators selected from the literature.
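For context, a minimal baseline crossing-point estimator (a simple comparison-style approach, not the MAP or MMSE estimator derived in the thesis): locate the sample interval where the difference between signal and reference changes sign and interpolate linearly. The sawtooth reference, sample rate, and test signal are assumptions for illustration.

```python
import numpy as np

def linear_crossing_points(samples, reference, t):
    """Estimate instants where `samples` crosses `reference` by linear interpolation."""
    diff = samples - reference
    crossings = []
    for k in np.flatnonzero(np.diff(np.signbit(diff))):  # sign change between k and k+1
        frac = diff[k] / (diff[k] - diff[k + 1])          # linear interpolation within the interval
        crossings.append(t[k] + frac * (t[k + 1] - t[k]))
    return np.array(crossings)

# Illustrative PWM-style setup: audio-band sinusoid vs. a sawtooth reference.
fs = 48000.0
t = np.arange(0, 0.002, 1 / fs)
signal = 0.8 * np.sin(2 * np.pi * 1000 * t)
reference = 2 * ((t * 8000) % 1.0) - 1.0   # 8 kHz sawtooth between -1 and 1
print(linear_crossing_points(signal, reference, t)[:5])
```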
|
248 |
Design and evaluation of dynamic feature-based segmentation on music / Befus, Chad R., University of Lethbridge, Faculty of Arts and Science, January 2010 (has links)
Segmentation is an indispensable step in the field of Music Information Retrieval (MIR). Segmentation refers to the splitting of a music piece into significant sections. Classically there has been a great deal of attention focused on various issues of segmentation, such as: perceptual segmentation vs. computational segmentation, segmentation evaluations, segmentation algorithms, etc. In this thesis, we conduct a series of perceptual experiments which challenge several of the traditional assumptions with respect to segmentation. Identifying some deficiencies in the current segmentation evaluation methods, we present a novel standardized evaluation approach which considers segmentation as a supportive step towards feature extraction in the MIR process. Furthermore, we propose a simple but effective segmentation algorithm and evaluate it utilizing our evaluation approach.
|
249 |
Adaptive wireless rate control driven by highly fine-grained channel assessment / Song, Lixing, 03 May 2014 (has links)
Access to the abstract and thesis is permanently restricted to the Ball State community only. Contents: Background: a survey for rate adaptation; ABEP metric and channel assessment; ABEP-based adaptive rate control; performance evaluation. Department of Computer Science.
|
250 |
Design of a reusable distributed arithmetic filter and its application to the affine projection algorithm / Lo, Haw-Jing, 06 April 2009 (has links)
Digital signal processing (DSP) is widely used in many applications spanning the spectrum from audio processing to image and video processing to radar and sonar processing. At the core of digital signal processing applications is the digital filter, which is implemented in one of two ways: as a finite impulse response (FIR) filter or as an infinite impulse response (IIR) filter. The primary difference between FIR and IIR is that for FIR filters the output depends only on the inputs, while for IIR filters the output depends on the inputs and the previous outputs. FIR filters also do not suffer from the stability issues, stemming from feedback of the output to the input, that affect IIR filters.
In this thesis, an architecture for FIR filtering based on distributed arithmetic is presented. The proposed architecture can implement large FIR filters using minimal hardware while completing the FIR filtering operation in a minimal amount of time and with minimal delay compared to typical FIR filter implementations. The proposed architecture is then used to implement the fast affine projection adaptive algorithm, an algorithm that is typically used with large filter sizes. The fast affine projection algorithm has a high computational burden that limits its throughput, which in turn restricts the number of applications. Using the proposed FIR filtering architecture, however, these throughput limitations are removed. The implementation of the fast affine projection adaptive algorithm using distributed arithmetic is unique to this thesis. The constructed adaptive filter shares all the benefits of the proposed FIR filter: low hardware requirements, high speed, and minimal delay.
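A bit-serial software sketch of the distributed-arithmetic idea: the multiply-accumulate across filter taps is replaced by lookups into a table indexed by one bit from each input sample, accumulated with shifts across bit positions (the most significant bit carrying negative weight for two's-complement inputs). The tap values and word length are illustrative assumptions, not the thesis's hardware design or its affine projection implementation.

```python
import numpy as np

def build_da_lut(coeffs):
    """LUT entry for each combination of one bit taken from each of the K inputs."""
    k = len(coeffs)
    return np.array([sum(c for i, c in enumerate(coeffs) if (idx >> i) & 1)
                     for idx in range(2 ** k)])

def da_inner_product(samples, coeffs, lut, width=16):
    """Distributed-arithmetic dot product of `samples` (two's-complement ints) with `coeffs`."""
    acc = 0.0
    for b in range(width):
        idx = sum(((int(x) >> b) & 1) << i for i, x in enumerate(samples))
        term = lut[idx] * (1 << b)
        acc += -term if b == width - 1 else term   # MSB carries negative weight (two's complement)
    return acc

coeffs = [0.25, 0.5, 0.25]                             # illustrative 3-tap FIR
lut = build_da_lut(coeffs)
samples = np.array([1000, -2000, 1500], dtype=np.int16)
print(da_inner_product(samples, coeffs, lut))          # distributed-arithmetic result, -375.0
print(float(np.dot(samples.astype(float), coeffs)))    # direct dot product for comparison
```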
|