131 |
Low-Complexity Receiver Algorithms in Large-Scale Multiuser MIMO Systems and Generalized Spatial Modulation
Datta, Tanumay, January 2013
Multi-antenna wireless systems have become very popular due to their theoretically predicted higher spectral efficiencies and improved performance compared to single-antenna systems. Large-scale multiple-input multiple-output (MIMO) systems refer to wireless systems where communication terminals employ tens to hundreds of antennas to achieve increased spectral efficiencies/sum rates, reliability, and power efficiency. Large-scale multi-antenna systems are attractive for meeting increasing wireless data rate requirements without compromising on bandwidth. This thesis addresses key signal processing issues in large-scale MIMO systems. Specifically, the thesis investigates efficient algorithms for signal detection and channel estimation in large-scale MIMO systems. It also investigates ‘spatial modulation,’ a multi-antenna modulation scheme that can reduce the number of transmit radio frequency (RF) chains without compromising much on spectral efficiency. The work reported in this thesis comprises the following two parts:
1. investigation of low-complexity receiver algorithms based on the Markov chain Monte Carlo (MCMC) technique, tabu search, and belief propagation for large-scale uplink multiuser MIMO systems, and
2. investigation of achievable rates and signal detection in generalized spatial modulation.
1. Receiver algorithms for large-scale multiuser MIMO systems on the uplink
In this part of the thesis, we propose low-complexity algorithms based on MCMC techniques, Gaussian sampling based lattice decoding (GSLD), reactive tabu search (RTS), and factor graph based belief propagation (BP) for signal detection on the uplink in large-scale multiuser MIMO systems. We also propose an efficient channel estimation scheme based on Gaussian sampling.
Markov chain Monte Carlo (MCMC) sampling: We propose a novel MCMC based detection algorithm, which achieves near-optimal performance in large dimensions at low complexities by the joint use of a mixed Gibbs sampling (MGS) strategy and a multiple restart strategy with an efficient restart criterion. The proposed mixed Gibbs sampling distribution is a weighted mixture of the target distribution and the uniform distribution. The presence of the uniform component in the sampling distribution allows the algorithm to exit from local traps quickly and alleviates the stalling problem encountered in conventional Gibbs sampling. We present an analysis for the optimum choice of the mixing ratio. The analysis approach is to define an absorbing Markov chain and use its property regarding the expected number of iterations needed to reach the global minimum for the first time. We also propose an MCMC based algorithm which exploits the sparsity in uplink multiuser MIMO transmissions, where not all users are active simultaneously.
Gaussian sampling based lattice decoding: Next, we investigate the problem of searching for the closest lattice point in large dimensional lattices and its use in signal detection in large-scale MIMO systems. Specifically, we propose a Gaussian sampling based lattice decoding (GSLD) algorithm. The novelty of this algorithm is that, instead of sampling from a discrete distribution as in Gibbs sampling, the algorithm iteratively generates samples from a continuous Gaussian distribution, whose parameters are obtained analytically. This makes the complexity of the proposed algorithm independent of the size of the modulation alphabet. Also, the algorithm is able to achieve near-optimal performance for different antenna and modulation alphabet settings at low complexities.
Random restart reactive tabu search (R3TS): Next, we study receiver algorithms based on the reactive tabu search (RTS) technique in large-scale MIMO systems. We propose a multiple random restarts based reactive tabu search (R3TS) algorithm that achieves near-optimal performance in large-scale MIMO systems. A key feature of the proposed R3TS algorithm is its performance based restart criterion, which gives a very good performance-complexity tradeoff in large-dimension systems.
Lower bound on maximum likelihood (ML) bit error rate (BER) performance: We propose an approach to obtain lower bounds on the ML performance of large-scale MIMO systems using RTS simulation. In the proposed approach, we run the RTS algorithm using the transmitted vector as the initial vector, along with a suitable neighborhood definition, and find a lower bound on the number of errors in the ML solution. We demonstrate that the proposed bound is tight (within about 0.5 dB of the optimal performance in a 16×16 MIMO system) at moderate to high SNRs.
Factor graph using Gaussian approximation of interference (FG-GAI): Multiuser MIMO channels can be represented by graphical models that are fully/densely connected (loopy graphs), where conventional belief propagation yields suboptimal performance and requires high complexity. We propose a solution to this problem that uses a simple, yet effective, Gaussian approximation of interference (GAI) approach that carries out a linear per-symbol complexity message passing on a factor graph (FG) based graphical model. The proposed algorithm achieves near-optimal performance in large dimensions in frequency-flat as well as frequency-selective channels.
Gaussian sampling based channel estimation: Next, we propose a Gaussian sampling based channel estimation technique for large-scale time-division duplex (TDD) MIMO systems. The proposed algorithm refines the initial estimate of the channel by iteratively detecting the data block and using that knowledge to improve the estimated channel knowledge using a Gaussian sampling based technique. We demonstrate that this algorithm achieves near-optimal performance both in terms of mean square error of the channel estimates and BER of detected data in both frequency-flat and frequency-selective channels.
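The mixed Gibbs sampling idea described in this part of the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a real-valued channel, BPSK symbols in {-1, +1}, a target distribution proportional to exp(-||y - Hx||^2 / sigma2), and a default mixing ratio q = 1/n chosen only for illustration. The function names and parameters are hypothetical, and the multiple-restart strategy is omitted.

```python
import math
import random


def ml_cost(H, y, x):
    """Squared Euclidean distance ||y - Hx||^2 (the ML cost)."""
    total = 0.0
    for row, y_i in zip(H, y):
        r = y_i - sum(h * s for h, s in zip(row, x))
        total += r * r
    return total


def mixed_gibbs_detect(H, y, sigma2, q=None, sweeps=50, rng=None):
    """Illustrative mixed Gibbs sampler for BPSK detection.

    Each coordinate update draws from the uniform distribution with
    probability q, and from the Gibbs conditional otherwise.
    The default q = 1/n is an assumption made for this sketch.
    """
    rng = rng or random.Random(0)
    n = len(H[0])
    q = 1.0 / n if q is None else q
    x = [rng.choice((-1, 1)) for _ in range(n)]
    best_x, best_cost = list(x), ml_cost(H, y, x)
    for _ in range(sweeps):
        for i in range(n):
            if rng.random() < q:
                # Uniform component: lets the chain leave local traps.
                x[i] = rng.choice((-1, 1))
            else:
                # Target component: conditional of x[i] given the rest.
                x[i] = 1
                c_plus = ml_cost(H, y, x)
                x[i] = -1
                c_minus = ml_cost(H, y, x)
                # Clamp the exponent to avoid overflow in math.exp.
                z = max(-50.0, min(50.0, (c_plus - c_minus) / sigma2))
                p_plus = 1.0 / (1.0 + math.exp(z))
                x[i] = 1 if rng.random() < p_plus else -1
            cost = ml_cost(H, y, x)
            if cost < best_cost:
                best_x, best_cost = list(x), cost
    return best_x, best_cost
```

The sampler keeps the lowest-cost vector visited so far, since the chain itself is only a search device here; setting q = 0 recovers conventional Gibbs sampling.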
2. Generalized spatial modulation
In the second part of the thesis, we investigate generalized spatial modulation (GSM) in point-to-point MIMO systems. GSM is attractive because of its ability to work with fewer transmit RF chains than traditional spatial multiplexing, without compromising much on spectral efficiency. In this work, we show that, by using an optimum combination of the number of transmit antennas and the number of transmit RF chains, GSM can achieve better throughput and/or BER than spatial multiplexing. We compute tight bounds on the maximum achievable rate in a GSM system, and quantify the percentage savings in the number of transmit RF chains as well as the percentage increase in the rate achieved in GSM compared to spatial multiplexing. We also propose a Gibbs sampling based algorithm suited to detecting GSM signals, which yields impressive BER performance and complexity results.
|
132 |
Grassmannian Fusion Frames for Block Sparse Recovery and Its Application to Burst Error Correction
Mukund Sriram, N, January 2013
Fusion frames and block sparse recovery are of interest in signal processing and communication applications. In these applications it is required that the fusion frame have some desirable properties. One such requirement is that the fusion frame be tight and its subspaces form an optimal packing in a Grassmannian manifold. Such fusion frames are called Grassmannian fusion frames.
Grassmannian frames are known to be optimal dictionaries for sparse recovery as they have minimum coherence. By analogy, Grassmannian fusion frames are potential candidates as optimal dictionaries in block sparse processing. The present work studies fusion frames in finite dimensional vector spaces, assuming a specific structure useful in block sparse signal processing.
The main focus of our work is the design of Grassmannian fusion frames and their implication in block sparse recovery. We will consider burst error correction as an application of block sparsity and fusion frame concepts.
We propose two new algebraic methods for designing Grassmannian fusion frames. The first method involves use of Fourier matrix and difference sets to obtain a partial Fourier matrix which forms a Grassmannian fusion frame. This fusion frame has a specific structure and the parameters of the fusion frame are determined by the type of difference set used.
The second method involves constructing Grassmannian fusion frames from Grassmannian frames which meet the Welch bound. This method uses existing constructions of optimal Grassmannian frames. The method, while fairly general, requires that the dimension of the vector space be divisible by the dimension of the subspaces.
A lower bound, which is an analog of the Welch bound, is derived for the block coherence of dictionaries, along with the conditions to be satisfied to meet the bound. From these results we conclude that the matrices we construct are optimal for block sparse recovery from the block coherence viewpoint.
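For reference, the classical Welch bound that this result parallels can be stated as follows. This is the standard bound for ordinary frames, not the block-coherence bound derived in the thesis; the symbols N (number of unit-norm vectors), d (dimension) and f_i (frame vectors) are notation introduced here.

```latex
% Classical Welch bound on the coherence of N unit-norm vectors
% f_1, ..., f_N in a d-dimensional space, with N >= d:
\[
  \mu \;=\; \max_{i \neq j} \bigl| \langle f_i, f_j \rangle \bigr|
  \;\geq\; \sqrt{\frac{N - d}{d\,(N - 1)}} ,
\]
% with equality if and only if the vectors form an equiangular tight frame.
```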
There is a strong relation between sparse signal processing and error control coding. It is known that burst errors are block sparse in nature. So, here we attempt to solve the burst error correction problem using block sparse signal recovery methods. Using the Grassmannian fusion frames we constructed as optimal dictionaries allows correction of the maximum possible number of errors when used in conjunction with reconstruction algorithms which exploit block sparsity. We also suggest a modification to improve the applicability of the technique and point out its relationship with a method which appeared previously in the literature.
As an application example, we consider the use of the burst error correction technique for impulse noise cancellation in OFDM systems. Impulse noise is bursty in nature and severely degrades OFDM performance. The Grassmannian fusion frames constructed from the Fourier matrix and difference sets are ideal for this application, as they can be easily incorporated into the OFDM system.
|
133 |
Device Applications of Epitaxial III-Nitride Semiconductors
Shetty, Arjun, January 2015
Through the history of mankind, novel materials have played a key role in technological progress. As we approach the limits of scaling it becomes difficult to squeeze out any more extensions to Moore’s law by just reducing device feature sizes. It is important to look for an alternate semiconductor to silicon in order to continue making the progress predicted by Moore’s law. Among the various semiconductor options being explored world-wide, the III-nitride semiconductor material system has certain unique characteristics that make it one of the leading contenders. We explore the III-nitride semiconductor material system for the unique advantages that it offers over the other alternatives available to us.
This thesis studies the device applications of epitaxial III-nitride films and nanostructures grown using plasma assisted molecular beam epitaxy (PAMBE).
The material characterisation of the PAMBE grown epitaxial III-nitrides was carried out using techniques like high resolution X-ray diffraction (HR-XRD), field emission scanning electron microscopy (FESEM), room temperature photoluminescence (PL) and transmission electron microscopy (TEM). The epitaxial III-nitrides were then further processed to fabricate devices like Schottky diodes, photodetectors and surface acoustic wave (SAW) devices. The electrical characterisation of the fabricated devices was carried out using techniques like Hall measurement, IV and CV measurements on a DC probe station and S-parameter measurements on a vector network analyser connected to an RF probe station.
We begin our work on Schottky diodes by explaining the motivation for adding an interfacial layer in a metal-semiconductor Schottky contact and how high-k dielectrics like HfO2 have been relatively unexplored in this application. We report the work carried out on the Pt/n-GaN metal-semiconductor (MS) Schottky diode and the Pt/HfO2/n-GaN metal-insulator-semiconductor (MIS) Schottky diode. We report an improvement in diode parameters such as barrier height (0.52 eV to 0.63 eV), ideality factor (2.1 to 1.3) and rectification ratio (35.9 to 98.9 at 2 V bias) after the introduction of 5 nm of HfO2 as the interfacial layer. Temperature dependent I-V measurements were done to gain a further understanding of the interface. We observe that the barrier height and ideality factor exhibit a temperature dependence. This was attributed to inhomogeneities at the interface and was modelled by assuming a Gaussian distribution of barrier heights.
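The diode parameters quoted above are conventionally defined through the thermionic emission model, sketched below. The abstract does not state which extraction procedure was used, so these standard textbook forms are given only as context; A (contact area), A^* (effective Richardson constant), \bar{\phi}_B (mean barrier height) and \sigma_s (standard deviation of the barrier distribution) are symbols introduced here.

```latex
% Thermionic emission I-V relation with ideality factor n
% and barrier height \phi_B:
\[
  I = I_s \left[ \exp\!\left( \frac{qV}{n k T} \right) - 1 \right],
  \qquad
  I_s = A\,A^{*}\,T^{2} \exp\!\left( -\frac{q\,\phi_B}{k T} \right).
\]
% With a Gaussian distribution of barrier heights (mean \bar{\phi}_B,
% standard deviation \sigma_s), the apparent barrier height becomes
% temperature dependent:
\[
  \phi_{ap}(T) = \bar{\phi}_B - \frac{q\,\sigma_s^{2}}{2 k T}.
\]
```

In this picture a lower temperature gives a lower apparent barrier height, which is one common way to account for the kind of temperature dependence the abstract reports.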
UV and IR photodetectors using III-nitrides are then studied. Our work on UV photodetectors describes the growth of epitaxial GaN films. Au nanoparticles were fabricated on these films using thermal evaporation and annealing. Al nanostructures were fabricated using nanosphere lithography. Plasmonic enhancement using these metallic nanostructures was explored by fabricating metal-semiconductor-metal (MSM) photodetectors. We observed plasmonic enhancement of photocurrent in both cases. To obtain greater improvement, we etched down on the GaN film using reactive ion etching (RIE). This resulted in a further increase in photocurrent along with a reduction in dark current, which was attributed to the creation of new trap states. The IR photodetectors studied in this thesis are InN quantum dots whose density can be controlled by varying the indium flux during growth. We observe that an increase in InN quantum dot density results in an increase in photocurrent and a decrease in dark current in the fabricated IR photodetectors.
We then explore the advantages that InGaN offers as a material that supports surface acoustic waves and fabricate InGaN based surface acoustic wave devices. We describe the growth of epitaxial In0.23Ga0.77N films on a GaN template using molecular beam epitaxy. Material characterisation was carried out using HR-XRD, FESEM, PL and TEM. The composition was determined from HR-XRD and PL measurements, and the two results agreed. This was followed by the fabrication of interdigitated electrodes with a finger spacing of 10 µm. S-parameter results showed a transmission peak at 104 MHz with an insertion loss of 19 dB. To the best of our knowledge, this is the first demonstration of an InGaN based SAW device.
In summary, this thesis demonstrates the practical advantages of epitaxially grown film and nanostructured III-nitride materials such as GaN, InN and InGaN using plasma assisted molecular beam epitaxy for Schottky diodes, UV and IR photodetectors and surface acoustic wave devices.
|
134 |
Fully Integrated CMOS Transmitter and Power Amplifier for Software-Defined Radios and Cognitive Radios
Raja, Immanuel, January 2017
Software Defined Radios (SDRs) and Cognitive Radios (CRs) pave the way for next-generation radio technology. They promise versatility, flexibility and cognition, which can revolutionize communications systems. However, they present greater challenges to the design of radio frequency (RF) front-ends. RF front-ends for the radios in use today are narrow-band in their frequency response and are optimized and tuned to the carrier frequency of interest. SDRs and CRs demand front-ends which are versatile, configurable, tunable and capable of transmitting and receiving signals with different bandwidths and modulation schemes. Integrating power amplifiers (PAs) with transmitters in CMOS has many advantages and challenges. This thesis deals with the design of an RF transmitter front-end for SDRs and CRs in CMOS.
The thesis begins with an introduction to SDRs and the requirements they place on transmitters and the challenges involved in designing them in CMOS. After a brief overview of the existing techniques, the proposed architecture is presented and explained. A digitally intensive transmitter solution is proposed. The transmitter covers a wide frequency range of 750 MHz to 2.5 GHz. The inputs to the proposed transmitter are in-phase and quadrature (I & Q) data bit streams. Multiple stages of up-sampling and filtering are used to remove all spurs in the spectrum such that only the harmonics of the carrier remain.
Differential rail-to-rail quadrature clocks are generated from a continuous wave signal at twice the carrier frequency. The clocks are corrected for their duty cycle and quadrature impairments.
The heart of the transmitter is an integrated reconfigurable CMOS power amplifier (PA). A methodology to design reconfigurable Class E PAs with a series fixed inductor has been presented. A CMOS power amplifier that can span a wide frequency range with sufficient output power and efficiency, supporting varying envelope complex modulation signals, with good linearity has been designed. Digital pre-distortion (DPD) is used to linearize the PA.
The full transmitter and the clock correction blocks have been designed and fabricated in a commercial 130-nm CMOS process and experimentally characterized. The PA delivers a maximum power of 13 dBm with an efficiency of 27% at 1 GHz. While transmitting a 16-QAM signal at 1 GHz, the measured EVM is 4%. It delivers a maximum power of around 11-13 dBm from 750 MHz to 1.5 GHz and up to 6.5 dBm of power up to 2.5 GHz.
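To put the reported figures in linear units, the conversions below may help. They are plain arithmetic on the numbers quoted above. The DC power estimate assumes the 27% figure is a drain efficiency (output power over DC supply power), which the abstract does not specify.

```latex
% dBm to milliwatts: P[mW] = 10^{P[dBm]/10}
\[
  13\ \text{dBm} = 10^{1.3}\ \text{mW} \approx 20\ \text{mW},
  \qquad
  6.5\ \text{dBm} = 10^{0.65}\ \text{mW} \approx 4.5\ \text{mW}.
\]
% Implied DC power at 1 GHz, assuming \eta = P_{out}/P_{DC} = 0.27:
\[
  P_{DC} \approx \frac{20\ \text{mW}}{0.27} \approx 74\ \text{mW}.
\]
% EVM expressed in decibels:
\[
  \text{EVM}_{\text{dB}} = 20 \log_{10}(0.04) \approx -28\ \text{dB}.
\]
```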
Comparing the proposed system with recently published literature, it can be seen that the proposed design is one of the very few transmitters with an integrated matching network that is tunable across the frequency range. The proposed PA produces the highest output power and the highest efficiency among systems with on-chip output networks.
|
135 |
Lattice Codes for Secure Communication and Secret Key Generation
Vatedka, Shashank, January 2017
In this work, we study two problems in information-theoretic security. Firstly, we study a wireless network where two nodes want to securely exchange messages via an honest-but-curious bidirectional relay. There is no direct link between the user nodes, and all communication must take place through the relay. The relay behaves like a passive eavesdropper, but otherwise follows the protocol it is assigned. Our objective is to design a scheme where the user nodes can reliably exchange messages such that the relay gets no information about the individual messages. We first describe a perfectly secure scheme using nested lattices, and show that our scheme achieves secrecy regardless of the distribution of the additive noise, and even if this distribution is unknown to the user nodes. Our scheme is explicit, in the sense that for any pair of nested lattices, we give the distribution used for randomization at the encoders to guarantee security. We then give a strongly secure lattice coding scheme, and we characterize the performance of both these schemes in the presence of Gaussian noise. We then extend our perfectly secure and strongly secure schemes to obtain a protocol that guarantees end-to-end secrecy in a multihop line network. We also briefly study the robustness of our bidirectional relaying schemes to channel imperfections.
In the second problem, we consider the scenario where multiple terminals have access to private correlated Gaussian sources and a public noiseless communication channel. The objective is to generate a group secret key using their sources and public communication in a way that an eavesdropper having access to the public communication can obtain no information about the key. We give a nested lattice-based protocol for generating strongly secure secret keys from independent and identically distributed copies of the correlated random variables. Under certain assumptions on the joint distribution of the sources, we derive achievable secret key rates.
The tools used in designing protocols for both these problems are nested lattice codes, which have been widely used in several problems of communication and security. In this thesis, we also study lattice constructions that permit polynomial-time encoding and decoding. In this regard, we first look at a class of lattices obtained from low-density parity-check (LDPC) codes, called Low-density Construction-A (LDA) lattices. We show that high-dimensional LDA lattices have several “goodness” properties that are desirable in many problems of communication and security. We also present a new class of low-complexity lattice coding schemes that achieve the capacity of the AWGN channel. Codes in this class are obtained by concatenating an inner Construction-A lattice code with an outer Reed-Solomon code or an expander code. We show that this class of codes can achieve the capacity of the AWGN channel with polynomial encoding and decoding complexities. Furthermore, the probability of error decays exponentially in the block length for a fixed transmission rate R that is strictly less than the capacity. To the best of our knowledge, this is the first capacity-achieving coding scheme for the AWGN channel which has an exponentially decaying probability of error and polynomial encoding/decoding complexities.
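For readers unfamiliar with the term, Construction A builds a lattice from a linear code as sketched below. This is the standard textbook definition, included only as background; the symbols p (a prime), n (the block length) and \mathcal{C} (a linear code over the field with p elements) are notation introduced here.

```latex
% Construction A: lift a linear code C over F_p to a lattice in Z^n.
\[
  \Lambda_A(\mathcal{C})
  \;=\; \bigl\{\, x \in \mathbb{Z}^n \;:\; x \bmod p \in \mathcal{C} \,\bigr\}
  \;=\; \mathcal{C} + p\,\mathbb{Z}^n ,
  \qquad \mathcal{C} \subseteq \mathbb{F}_p^{\,n}.
\]
% An LDA lattice is the special case where C is a low-density
% parity-check (LDPC) code.
```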
|
136 |
Communication Structure and Mixing Patterns in Complex Networks
Choudhury, Sudip Hazra, January 2013
Real world systems like biological, social, technological, infrastructural and many others can be modeled as networks. The field of network science aims to study these complex networks and understand their structure and dynamics. A common feature of networks across domains is the distribution of the degree of the nodes according to a power-law (scale invariance). As a consequence of this skewness, the high degree nodes dominate the properties of these networks.
The rich-club phenomenon is observed when the high degree, or rich, nodes of the network prefer to connect amongst themselves. In the first part, the thesis investigates the rich-club phenomenon in higher order neighborhoods of the network by providing an elegant quantification using a geodesic distance based approach. This quantification helps in identifying networks where the trend and intensity of the rich-club phenomenon are significantly different in higher order neighborhoods compared to the immediate neighbors. The thesis also proposes a quantification of the importance of the non-rich nodes in the communication structure of the rich nodes, and broadly classifies networks as core-periphery or cellular. Further, a lack of universality is noticed in the structure of the networks belonging to a particular domain.
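As background for the rich-club discussion, the conventional rich-club coefficient over immediate neighbors can be computed as in the sketch below. This is the standard first-order measure, not the geodesic distance based quantification proposed in the thesis. The function name is hypothetical, and the graph is assumed to be simple and undirected, with each edge listed once and no self-loops.

```python
def rich_club_coefficient(edges, k):
    """Standard rich-club coefficient phi(k) = 2*E_k / (N_k*(N_k - 1)).

    N_k is the number of nodes with degree greater than k and E_k is the
    number of edges among them. Returns None when fewer than two nodes
    qualify, since the coefficient is undefined in that case.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    rich = {node for node, d in degree.items() if d > k}
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2.0 * e_rich / (n_rich * (n_rich - 1))
```

A value close to 1 means the nodes above the degree threshold are almost fully interconnected. In practice the coefficient is usually compared against that of a degree-preserving randomized graph before drawing conclusions.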
It has been observed in the previous literature that the rich-club connectivity dominates assortativity, a measure quantifying the mixing patterns in complex networks. Thus, assortativity is biased. To overcome such drawbacks, the second part of the thesis proposes a novel measure called regularity. Analytical bounds on regularity and formulations of regularity for different network models are provided. Along with this, a measure called local regularity is defined to quantify the mixing patterns of the neighborhood of a node. The analysis of real-world networks based on local regularity and degree distribution shows the presence of both types of networks, uniformly and non-uniformly mixed across different regions. Further, normalized regularity is proposed to quantify the extent of preferential mixing in networks, discounting the effect of the degree distribution.
|
137 |
Timer-Based Selection Schemes for Wireless Networks
Rajendra, Talak Rajat, January 2013
Opportunistic selection is a practically appealing technique that is often used in multi-node wireless systems, for example in scheduling and rate adaptation in cellular systems, opportunistic wireless local area networks, wireless sensor networks, cooperative communications, and vehicular networks. In it, each node maintains a local preference number called a metric, which is a function of its channel gains, and the best node, the one with the highest metric, is selected. Identifying the best node is challenging as the information about a node's metric is available only locally at each node.
In our work, we focus on the popular, simple, and low feedback timer scheme for selection. In it, each node sets a timer as a function of its metric and transmits a packet when the timer expires. The metric-to-timer mapping maps larger metric values to smaller timer values, which ensures that the best node's timer expires first. However, it can fail to select the best node if another node transmits a packet within D s of the transmission by the best node.
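The basic timer scheme described above can be simulated with a short sketch. It is illustrative only: the linear metric-to-timer mapping, the assumption that metrics lie in [0, 1], and the names `timer_select`, `success_probability`, `t_max` and `delta` (standing in for the vulnerability window D) are assumptions of this example, not the optimal mappings derived in the thesis.

```python
import random


def timer_select(metrics, t_max, delta):
    """Return the index of the selected node, or None on a collision.

    Assumed mapping: timer = t_max * (1 - metric), so a larger metric
    gives a smaller timer. Selection fails if the second timer expires
    within delta of the first.
    """
    timers = sorted((t_max * (1.0 - m), i) for i, m in enumerate(metrics))
    if len(timers) > 1 and timers[1][0] - timers[0][0] < delta:
        return None
    return timers[0][1]


def success_probability(n_nodes, t_max, delta, trials=10000, seed=0):
    """Monte Carlo estimate of the probability that the best node is
    selected, for metrics drawn independently and uniformly from [0, 1]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        metrics = [rng.random() for _ in range(n_nodes)]
        # With a monotone decreasing mapping, any successful selection
        # is necessarily the node with the highest metric.
        if timer_select(metrics, t_max, delta) is not None:
            hits += 1
    return hits / trials
```

Increasing `t_max` spreads the timers out and reduces collisions at the cost of a longer selection time, which is the trade-off that makes the choice of mapping non-trivial.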
In this thesis, we make three contributions to the design and understanding of the timer-based selection scheme. Firstly, we introduce feedback overhead-aware contention resolution in the timer-based selection scheme. The outcome is a novel selection scheme that is faster than the splitting scheme and more reliable than the timer-based selection scheme. We analyze and minimize the average time required by the scheme to select the best node.
Secondly, we characterize the optimal metric-to-timer mapping when the number of nodes in the system is not known, as is the case in several practical deployments. When the prior distribution of the nodes is known, we propose an optimal mapping that maximizes the success probability averaged over the distribution on the number of nodes. When even the prior distribution is not known, we propose a robust mapping that maximizes the worst case average success probability over all possible probability distributions on the number of nodes. In both cases, we show that the timers can expire only at 0, D, 2D, ... in the optimal timer mapping. For the known prior case, we develop recursive techniques to effectively compute the optimal timer mapping for binomial and Poisson priors.
Lastly, we consider a discrete rate adaptive system and design an optimal timer scheme to maximize the end-to-end performance measure of system throughput. We derive several novel, insightful results about the optimal mapping that culminate in an iterative algorithm to compute it. We show that the design of the selection scheme is intimately related to the rate adaptation rule and the selection policy used. In all cases, extensive benchmarking with several ad hoc schemes proposed in the literature shows the significant gains that the proposed designs can deliver.
|
138 |
Energy Harvesting Wireless Sensor Networks: Performance Evaluation and Trade-offs
Rao, Shilpa Dinkar, January 2016
Wireless sensor networks (WSNs) have a diverse set of applications such as military surveillance, health and environmental monitoring, and home automation. Sensor nodes are equipped with pre-charged batteries, which drain out when the nodes sense, process, and communicate data. Eventually, the nodes of the WSN die and the network dies.
Energy harvesting (EH) is a green alternative to solve the limited lifetime problem in WSNs. EH nodes recharge their batteries by harvesting ambient energy such as solar, wind, and radio energy. However, due to the randomness in the EH process and the limited amounts of energy that can be harvested, the EH nodes are often intermittently available. Therefore, even though EH nodes live perpetually, they do not cater to the network continuously. We focus on the energy-efficient design of WSNs that incorporate EH, and investigate the new design trade-offs that arise in exploiting the potentially scarce and random energy arrivals and channel fading encountered by the network. To this end, firstly, we compare the performance of conventional, all-EH, and hybrid WSNs, which consist of both conventional and EH nodes. We then study max function computation, which aims at energy-efficient data aggregation, in EH WSNs.
We first argue that the conventional performance criteria used for evaluating WSNs, which are motivated by lifetime, and for evaluating EH networks are at odds with each other and are unsuitable for evaluating hybrid WSNs. We propose two new and insightful performance criteria called the k-outage and n-transmission durations to evaluate and compare different WSNs. These criteria capture the effect of the battery energies of the nodes and the channel fading conditions on the network operations. We prove two computationally-efficient bounds for evaluating these criteria, and show their use in a cost-constrained deployment of a WSN involving EH nodes.
Next, we study the estimation of the maximum of sensor readings in an all-EH WSN. We analyze the mean absolute error (MAE) in estimating the maximum reading when a random subset of the EH nodes periodically transmit their readings to the fusion node. We determine the optimal transmit power and the number of scheduled nodes that minimize the MAE. We weigh the benefits of the availability of channel information at the nodes against the cost of acquiring it. The results are first developed assuming that the readings are transmitted with infinite resolution. The new trade-offs that arise when quantized readings are instead transmitted are then characterized. Our results hold for any distribution of sensor readings, and for any stationary and ergodic EH process.
|
139 |
Joint Estimation of Impairments in MIMO-OFDM Systems
Jose, Renu, January 2014
The integration of Multiple Input Multiple Output (MIMO) and Orthogonal Frequency Division Multiplexing (OFDM) techniques has become a preferred solution for high rate wireless technologies due to its high spectral efficiency, robustness to frequency selective fading, increased diversity gain, and enhanced system capacity. The main drawback of OFDM-based systems is their susceptibility to impairments such as Carrier Frequency Offset (CFO), Sampling Frequency Offset (SFO), Symbol Timing Error (STE), Phase Noise (PHN), and the fading channel. These impairments, if not properly estimated and compensated, degrade the performance of OFDM-based systems.
In this thesis, a system model for MIMO-OFDM that takes into account the effects of all these impairments is formulated. Using this system model, we derive Cramer-Rao Lower Bounds (CRLBs) for the joint estimation of deterministic impairments in a MIMO-OFDM system, which show the coupling effect among different impairments and the significance of joint estimation. Also, Bayesian CRLBs for the joint estimation of random impairments in an OFDM system are derived. Similarly, we derive Hybrid CRLBs for the joint estimation of random and deterministic impairments in an OFDM system, which show the significance of using a Bayesian approach in estimation.
Further, we investigate different algorithms for the joint estimation of all impairments in OFDM-based systems. Maximum Likelihood (ML) algorithms and their low complexity variants, for the joint estimation of CFO, SFO, STE, and channel in a MIMO-OFDM system, are proposed. We propose a low complexity ML algorithm which uses a Compressed Sensing (CS) based channel estimation method in a sparse fading scenario, where the received samples used for estimation are fewer than those required for a Least Squares (LS) or Maximum a posteriori (MAP) based estimation. Also, we propose MAP algorithms for the joint estimation of the random impairments, PHN and channel, utilizing their statistical knowledge, which is known a priori. Joint estimation algorithms for SFO and channel in an OFDM system, using a Bayesian framework, are also proposed in this thesis. The performance of the estimation methods is studied through simulations, and numerical results show that the proposed algorithms perform better than existing algorithms and come closer to the derived CRLBs.
|
140 |
Demodulation of Narrowband Speech Spectrograms
Aragonda, Haricharan, January 2014
Speech is a non-stationary signal and contains modulations in both the spectral and temporal domains. Based on the type of modulations studied, most speech processing algorithms can be classified into short-time analysis algorithms, narrowband analysis algorithms, or joint spectro-temporal analysis algorithms. While traditional methods of speech analysis study the modulation along either time (short-time analysis algorithms) or frequency (narrowband analysis algorithms), a new class of algorithms that work simultaneously along both the temporal and spectral dimensions, called spectro-temporal analysis algorithms, has become prominent over the past decade.
Joint spectro-temporal analysis (also referred to as 2-D speech analysis) has shown promise in applications such as formant estimation, pitch estimation, speech recognition, etc.
Over the past decade, 2-D speech analysis has been independently motivated from several directions. Broadly, these motivations for 2-D speech models can be grouped into speech-production motivated, source-separation/machine-learning motivated and neurophysiology motivated.
In this thesis, we develop a 2-D speech model based on the speech production motivation. The overall organization of the thesis is as follows. We first develop the context of 2-D speech processing in Chapter one. In Chapter two, we develop a 2-D multicomponent AM-FM model for a narrowband spectrogram patch of voiced speech and experiment with the perceptual significance of the number of components needed to represent a spectrogram patch. In Chapter three, we develop a demodulation algorithm called in-phase and quadrature-phase (IQ) demodulation. Compared to the state-of-the-art sinusoidal demodulation, the AM obtained using this method is more robust to carrier estimation errors. The demodulation algorithm was verified on all voiced sentences taken from the TIMIT database. In Chapter four, we develop a demodulation algorithm based on the Riesz transform, a natural extension of the Hilbert transform to higher dimensions. Unlike the sinusoidal and IQ demodulation techniques, Riesz-transform-based demodulation does not require explicit carrier estimation and is also robust to pitch discontinuities in patches. The algorithm was validated on all voiced sentences from the TIMIT database. Both the IQ and Riesz-transform-based methods were found to give more accurate estimates of the 2-D AM (which relates to the vocal tract) and the 2-D carrier (which relates to the source) compared with sinusoidal demodulation. In Chapter five, we show applications of the demodulated AM and carrier to pitch estimation and to the creation of hybrid sounds. The hybrid sounds created were found to have better perceptual quality than their counterparts created using linear prediction analysis. In Chapter six, we summarize the work and present possible directions for future research.
|