
Algorithms and analysis for next generation biosensing and sequencing systems

Recent advancements in massively parallel biosensing and sequencing technologies have
revolutionized the field of molecular biology and paved the way to novel and exciting
innovations in medicine, biology, and environmental monitoring. Among them, biosensor
arrays (e.g., DNA and protein microarrays) have attracted considerable attention. DNA microarrays
are parallel affinity biosensors that can detect the presence and quantify the
amounts of nucleic acid molecules of interest. They rely on chemical attraction between
target nucleic acid sequences and their Watson-Crick complements that serve as probes
and capture the targets. The molecular binding between the probes and targets is a stochastic
process and hence the number of captured targets at any time is a random variable. Detection
in conventional DNA microarrays is based on a single measurement taken in the steady
state of the binding process. Recently developed real-time DNA microarrays, on the other hand,
acquire multiple temporal measurements which allow more precise characterization of the
reaction and enable faster detection based on the early dynamics of the binding process.
In this thesis, I study target estimation and limits of performance of real-time affinity
biosensors. Target estimation is mapped to the problem of estimating parameters of discretely
observed nonlinear diffusion processes. Performance of the estimators is characterized
analytically via the Cramér-Rao lower bound on the mean-square error. The proposed algorithms
are verified on both simulated and experimental data, demonstrating significant gains over
state-of-the-art techniques.
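
As a rough illustration of estimating a target amount from discretely observed binding dynamics, the sketch below simulates a toy diffusion with the Euler-Maruyama scheme and recovers the target count by least squares on the increments. It is not the model or estimator developed in the thesis: the linear drift k * (n_targets - x), the constant noise level sigma, and the names simulate_binding and estimate_targets are all assumptions made for illustration, whereas the thesis treats nonlinear diffusion processes.

```python
import math
import random


def simulate_binding(n_targets, k, sigma, dt, steps, seed=0):
    """Euler-Maruyama path of the toy SDE dX = k*(n_targets - X) dt + sigma dW.

    Hypothetical simplified model: X is the number of captured targets,
    sampled every dt time units, starting from X = 0.
    """
    rng = random.Random(seed)
    x = 0.0
    path = [x]
    for _ in range(steps):
        drift = k * (n_targets - x)
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift * dt + noise
        path.append(x)
    return path


def estimate_targets(path, k, dt):
    """Least-squares estimate of n_targets from the sampled path.

    Assumes k and dt are known. Minimizing the squared residuals of the
    Euler increments, sum_i (dx_i - k*(n - x_i)*dt)^2, over n gives
    n_hat = mean(x_i + dx_i / (k*dt)).
    """
    terms = [a + (b - a) / (k * dt) for a, b in zip(path, path[1:])]
    return sum(terms) / len(terms)


# Early-time measurements only: the path has not yet reached steady state.
path = simulate_binding(n_targets=100.0, k=0.5, sigma=1.0, dt=0.01, steps=200)
print(estimate_targets(path, k=0.5, dt=0.01))
```

In this toy setting the estimate uses only the early part of the trajectory, which mirrors the idea of detecting targets before the binding process reaches steady state.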

In addition to biosensor arrays, in this thesis I present studies of the signal processing
aspects of next-generation sequencing systems. Novel sequencing technologies will
provide significant improvements in many aspects of the human condition, ultimately leading
towards the understanding, diagnosis, treatment and prevention of diseases. Reliable
decision-making in such downstream applications is predicated upon accurate
base-calling, i.e., identification of the order of nucleotides from noisy sequencing data.
Base-calling error rates are nonuniform and typically deteriorate with the length of the
reads. I study performance limits of base-calling, characterizing them by means of an
upper bound on the error rates. Moreover, in the context of shotgun sequencing, I analyze
how accuracy of an assembled sequence depends on coverage, i.e., on the average
number of times each base in a target sequence is represented in different reads.
These analytical results are verified using experimental data.
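
The definition of coverage above can be made concrete with a short calculation. The sketch below computes the average coverage from the number of reads, the read length, and the target length, and then the expected fraction of bases left uncovered under the classical Poisson (Lander-Waterman) assumption of uniformly placed reads. That Poisson model and the function names are assumptions for illustration and are not necessarily the analysis carried out in the thesis.

```python
import math


def coverage(num_reads, read_length, target_length):
    """Average number of reads covering each base: c = N * L / G."""
    return num_reads * read_length / target_length


def fraction_uncovered(c):
    """Expected fraction of bases covered by no read.

    Assumes read start positions are uniform and independent, so the number
    of reads covering a base is approximately Poisson with mean c.
    """
    return math.exp(-c)


c = coverage(num_reads=1000, read_length=100, target_length=10000)
print(c, fraction_uncovered(c))
```

Under this assumption the uncovered fraction falls off exponentially with coverage, which is one reason the accuracy of an assembled sequence is so sensitive to it.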

Among many downstream applications of high-throughput biosensing and sequencing
technologies, reconstruction of gene regulatory networks is of particular importance. In this
thesis, I consider the gene network inference problem and propose a probabilistic graphical
approach for solving it. Specifically, I develop graphical models and design message passing
algorithms which are then verified using experimental data provided by the Dialogue for
Reverse Engineering Assessment and Methods (DREAM) initiative.

Identifier: oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/ETD-UT-2012-08-6001
Date: 19 November 2012
Creators: Shamaiah, Manohar
Source Sets: University of Texas
Language: English
Detected Language: English
Type: thesis
Format: application/pdf
