301 |
STATISTICAL METHODS FOR EXPLORING NEURONAL INTERACTIONS
Zhao, Mengyuan, 01 October 2010 (has links)
Generalized linear models (GLMs) offer a platform for analyzing multi-electrode
recordings of neuronal spiking. We suggest an L1-regularized logistic regression
model to detect short-term interactions under certain experimental setups. We
estimate parameters of this model using a coordinate descent algorithm; we determine
the optimal tuning parameter using BIC, and prove its asymptotic validity. Simulation
studies of the method's performance show that this model can detect excitatory
interactions with high sensitivity and specificity with reasonably large recordings,
even when the magnitude of the interactions is small; similar results hold for
inhibition for sufficiently high baseline firing rates. The method is somewhat robust
to network complexity and partial observation of networks. We apply our method to
multi-electrode recording data from monkey dorsal premotor cortex (PMd). Our results
point to certain features of short-term interactions when a monkey plans a reach.
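To make the estimation step concrete, here is a minimal Python sketch (not the authors' code) of BIC-driven selection of the L1 penalty for a spike-history logistic regression; the simulated binned spikes, the lagged covariates, and the use of scikit-learn's solver in place of the coordinate-descent fit are all illustrative assumptions.

```python
# Minimal sketch (assumed setup): select the L1 penalty for a spike-history
# logistic regression by BIC.  The binned spike data here are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 5000, 20                              # time bins, lagged covariates from other neurons
X = rng.binomial(1, 0.1, size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = 1.5                          # a few excitatory interactions
eta = -3.0 + X @ true_beta
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))  # target neuron's spikes

def neg_loglik(model, X, y):
    pr = model.predict_proba(X)[:, 1]
    return -np.sum(y * np.log(pr) + (1 - y) * np.log(1 - pr))

best = None
for C in np.logspace(-3, 1, 20):             # C = 1/lambda in scikit-learn
    m = LogisticRegression(penalty="l1", C=C, solver="liblinear").fit(X, y)
    k = np.count_nonzero(m.coef_) + 1        # nonzero betas plus intercept
    bic = 2 * neg_loglik(m, X, y) + k * np.log(n)
    if best is None or bic < best[0]:
        best = (bic, C, m)

print("BIC-selected C:", best[1])
print("detected interactions:", np.flatnonzero(best[2].coef_[0]))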
Next, we propose a variable-coefficient GLM to assess the temporal variation
of interactions across trials. We treat the parameters of interest as functions over
trials and fit them with penalized splines. Nuisance parameters assumed constant
across trials are mildly penalized to guarantee a finite maximum of the
log-likelihood. We choose the smoothing tuning parameters by generalized
cross-validation, and provide simultaneous confidence bands and hypothesis tests for
null models. Several modifications are made to achieve efficient computation. Before
applying the model to the real data, we assess its performance through simulations;
we then apply it to a subset of the monkey PMd data.
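A rough sketch of the smoothing idea under strong simplifying assumptions (a single trial-varying coefficient, a Gaussian working model rather than the full point-process GLM, and a truncated-power spline basis); the smoothing parameter is chosen by GCV over a grid.

```python
# Sketch (simplifying assumptions: one coefficient, identity link) of a
# penalized-spline fit over trials with the smoothing parameter chosen by GCV.
import numpy as np

rng = np.random.default_rng(1)
n_trials = 200
t = np.linspace(0.0, 1.0, n_trials)                   # trial index rescaled to [0, 1]
beta_true = 0.5 + 0.4 * np.sin(6 * t)                 # slowly varying "interaction"
y = beta_true + rng.normal(scale=0.3, size=n_trials)  # noisy per-trial estimates

# Truncated-power cubic spline basis; only the knot terms are penalized.
knots = np.linspace(0.05, 0.95, 12)
B = np.column_stack([np.ones(n_trials), t, t**2, t**3] +
                    [np.clip(t - k, 0.0, None) ** 3 for k in knots])
P = np.diag([0.0] * 4 + [1.0] * len(knots))

def gcv(lam):
    S = B @ np.linalg.solve(B.T @ B + lam * P, B.T)   # smoother ("hat") matrix
    edf = np.trace(S)                                 # effective degrees of freedom
    rss = np.sum((y - S @ y) ** 2)
    return rss / (n_trials * (1.0 - edf / n_trials) ** 2)

lams = np.logspace(-8, 2, 40)
lam_best = min(lams, key=gcv)
coef = np.linalg.solve(B.T @ B + lam_best * P, B.T @ y)
print("GCV-chosen lambda:", lam_best)
print("max |beta_hat - beta_true|:", np.abs(B @ coef - beta_true).max())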
Finally, for the logistic and Poisson models, one possible difficulty is that iterative
algorithms for estimation may not converge because of certain data configurations
(called complete and quasi-complete separation in the logistic case). We show that these
configurations are likely to occur because of neuronal refractory periods, and show how
standard software deals with this difficulty. For the Poisson model, we show that such
difficulties may arise because of bursting or the specifics of the binning. We
characterize the nonconvergent configurations for both models, show that they can be
detected by linear programming methods, and propose remedies.
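The linear-programming check can be illustrated with a small sketch: separation exists if and only if some coefficient vector weakly separates the two response classes with at least one strict inequality, which the LP below probes by maximizing bounded slacks. The formulation and the toy refractory-period example are illustrative, not the authors' implementation.

```python
# Sketch of an LP check for (quasi-)complete separation in a logistic model:
# maximize the total slack s_i subject to z_i * x_i'beta >= s_i, 0 <= s_i <= 1.
# A positive optimum means a separating direction exists.
import numpy as np
from scipy.optimize import linprog

def separation_check(X, y):
    X = np.column_stack([np.ones(len(y)), X])         # include an intercept
    z = np.where(y == 1, 1.0, -1.0)
    n, p = X.shape
    # Variables: beta (p, free) followed by slacks s (n, in [0, 1]).
    c = np.concatenate([np.zeros(p), -np.ones(n)])    # maximize sum(s)
    A_ub = np.hstack([-(z[:, None] * X), np.eye(n)])  # s_i - z_i x_i'beta <= 0
    b_ub = np.zeros(n)
    bounds = [(None, None)] * p + [(0.0, 1.0)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return -res.fun > 1e-8                            # True => separation detected

# Toy example: a neuron that never fires within its own refractory period makes
# the "spiked in previous bin" covariate perfectly predictive of y = 0.
prev_spike = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 1])
print("separation detected:", separation_check(prev_spike.reshape(-1, 1), y))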
|
302 |
Reconstructing Images from In Vivo Laser Scanning Microscope Data
Obreja, Mihaela, 30 January 2011 (has links)
Two-photon laser-scanning microscopy can be used for in vivo neuro-imaging of small animals. Because of the very high resolution of the images, any brain motion can cause significant artifacts; the tissue is often displaced by 10 or more pixels from its rest position. Scanning an image of 512 lines takes about 1 s, during which at least three heartbeats and one respiration occur and move the brain. As a result, some tissue locations are scanned several times while others are missed. Consequently, although the images may appear reasonable, they can lead to incorrect conclusions about brain structure or function. Because lines are scanned almost instantaneously (~1 ms), the problem reduces to relocating each line in a three-dimensional stack of images to its correct location. To model the movement process and quantify the effect of the physiological signals, we collected hybrid image data: fixing y and z, the microscope was set to scan in the x direction several thousand times. Classifying these lines with a normalized cross-correlation kernel function, we were able to track the trajectory that a line follows due to brain motion. From this trajectory we can predict the number of replicates needed to reconstruct a reliable image and study how the motion relates to the physiological measurements. To address the motion effects, we describe a semi-hidden Markov model to estimate the sequence of hidden states most likely to have generated the observations. The model assumes that at scanning time the brain is either in a near-to-rest state (S1) or a far-from-rest state (S2). Our algorithm assigns probabilities to each state based on concomitant physiological measurements. Using Viterbi's approach we estimate the most likely path of states and select the lines observed in S1. Because there is no gold standard, we suggest comparing our result with a stack of images collected after the animal is sacrificed. Within inherent experimental and technological limitations, the results of this work offer a description of the brain movement caused by physiology and a solution for reconstructing reliable images from in vivo microscopy.
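As an illustration of the decoding step only (a plain two-state HMM stands in for the semi-hidden Markov model, and the per-line state likelihoods are assumed to have been derived already from the physiological measurements), a small Viterbi sketch in Python:

```python
# Simplified sketch: two-state (S1 = near-to-rest, S2 = far-from-rest) Viterbi
# decoding in log space on simulated per-line state likelihoods.
import numpy as np

def viterbi(log_emission, log_trans, log_init):
    """log_emission: (T, 2) per-line log-likelihood of each state."""
    T = log_emission.shape[0]
    score = np.zeros((T, 2))
    back = np.zeros((T, 2), dtype=int)
    score[0] = log_init + log_emission[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans       # cand[i, j]: state i -> j
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(2)] + log_emission[t]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path                                        # 0 = S1, 1 = S2

rng = np.random.default_rng(2)
# Fake per-line P(state S2 | physiology): lines 40-60 look far-from-rest.
p_s2 = np.clip(rng.normal(0.2, 0.05, 100), 0.01, 0.99)
p_s2[40:60] = np.clip(rng.normal(0.8, 0.05, 20), 0.01, 0.99)
log_emission = np.log(np.column_stack([1 - p_s2, p_s2]))
log_trans = np.log(np.array([[0.95, 0.05], [0.10, 0.90]]))
log_init = np.log(np.array([0.9, 0.1]))

states = viterbi(log_emission, log_trans, log_init)
print("lines kept (decoded S1):", np.sum(states == 0), "of", len(states))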
|
303 |
Functional Connectivity Analysis of FMRI Time-Series Data
Zhou, Dongli, 31 January 2011 (has links)
The term "functional connectivity" is used to denote correlations in activation among
spatially distinct brain regions, either in a resting state or when processing external stimuli. Functional connectivity has been extensively evaluated with several functional
neuroimaging methods, particularly PET and fMRI. Yet these relationships have been quantified using very different measures, and the extent to which they index the same constructs is unclear.
We have implemented a variety of these functional connectivity measures in a new freely-available MATLAB toolbox. These measures are categorized into two groups: whole time-series and trial-based approaches. We evaluate these measures via simulations with different patterns of functional connectivity and provide recommendations for their use. We also apply these measures to a previously published fMRI dataset in which activity in dorsal anterior cingulate cortex (dACC) and dorsolateral prefrontal cortex (DLPFC) was evaluated in 32 healthy subjects during a digit sorting task.
Though all implemented measures demonstrate functional connectivity between dACC and DLPFC activity during event-related tasks, different participants appear to display qualitatively different relationships.
We also propose a new methodology for exploring functional connectivity in slow event-related designs, where stimuli are presented at a sufficient separation to examine the dynamic responses in brain regions. Our methodology simultaneously determines the level of smoothing needed to obtain the underlying noise-free BOLD response and the functional connectivity among several regions. Smoothing is accomplished through an empirical basis obtained via functional principal components analysis. The coefficients of the basis are assumed to be correlated across regions, and the nature and strength of functional connectivity are derived from this correlation matrix. The model is implemented within a Bayesian framework by specifying priors on the parameters and using a Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm. We demonstrate this new approach on a sample of clinically depressed subjects and healthy controls by examining relationships among three brain regions implicated in depression and emotion during emotional information processing. The results show that depressed subjects display decreased coupling between the left amygdala and DLPFC compared to healthy subjects, which may be due to inefficient functioning of the rostral portion of Brodmann's area 24 (BA24) in mediating connectivity.
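A non-Bayesian caricature of the idea on simulated data (the dissertation's model places priors on the parameters and samples them by MCMC): smooth each region's event-related responses with an empirical functional-PCA basis, then read connectivity off the correlation of the leading basis coefficients across regions.

```python
# Caricature sketch: FPCA smoothing per region, then correlation of the leading
# basis coefficients across regions as a stand-in for functional connectivity.
import numpy as np

rng = np.random.default_rng(3)
n_events, n_time = 60, 40
t = np.linspace(0, 20, n_time)
hrf = t * np.exp(-t / 3.0)                              # crude canonical response shape
amp = rng.multivariate_normal([1.0, 1.0], [[0.25, 0.15], [0.15, 0.25]], n_events)
regions = {}
for r in range(2):                                      # two ROIs with correlated amplitudes
    regions[r] = amp[:, r][:, None] * hrf + rng.normal(scale=0.3, size=(n_events, n_time))

def fpca_scores(Y, ref, n_comp=2):
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)   # rows of Vt: empirical basis
    basis = Vt[:n_comp]
    signs = np.sign(basis @ ref)                        # resolve SVD sign ambiguity
    return Yc @ (basis * signs[:, None]).T              # per-event basis coefficients

scores0 = fpca_scores(regions[0], hrf)
scores1 = fpca_scores(regions[1], hrf)
# Correlation of the leading coefficients across regions ~ functional connectivity.
fc = np.corrcoef(scores0[:, 0], scores1[:, 0])[0, 1]
print("estimated coupling between the two regions:", round(float(fc), 2))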
|
304 |
Improved sample size re-estimation in adaptive clinical trials without unblinding
Teel, Chen, 30 June 2011 (has links)
Sample size calculations in clinical trials depend on good estimates of the standard deviation. Because of the uncertainty at the planning phase, adaptive sample size designs have been used to re-estimate the standard deviation from interim data and adjust the sample size as necessary. Our research concentrates on carrying out the sample size re-estimation without unblinding the treatment identities.
Gould and Shih treated the data at the interim as coming from a mixture of two normal distributions with common standard deviation. To adjust the sample size, they used an EM algorithm to obtain the MLE of the standard deviation while keeping treatment identities blinded. However, this approach has been criticized in the literature, and our simulation studies show that Gould and Shih's EM algorithm sometimes returns incorrect boundary modes as estimates of the standard deviation. In our research, we establish a new procedure to re-estimate the sample size without breaking the blind, using additional information about the randomization structure at the interim. We enhance Gould and Shih's EM procedure by utilizing the conditional Bernoulli model to incorporate the information that equal numbers of subjects are observed in each treatment group at the interim stage. Properties of the proposed enhanced EM estimator are investigated in detail.
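For orientation, a sketch of the basic blinded EM step that the enhancement builds on (the plain two-component mixture with equal mixing weights; the conditional Bernoulli refinement is not implemented here):

```python
# Sketch of a basic blinded EM fit: pooled interim data are treated as a 50/50
# mixture of two normals with common sigma and both means unknown.
import numpy as np

def blinded_em(x, n_iter=200):
    mu1, mu2 = x.mean() - x.std(), x.mean() + x.std()   # crude starting values
    sigma = x.std()
    for _ in range(n_iter):
        # E-step: posterior probability each observation came from group 1.
        d1 = np.exp(-0.5 * ((x - mu1) / sigma) ** 2)
        d2 = np.exp(-0.5 * ((x - mu2) / sigma) ** 2)
        w = d1 / (d1 + d2)
        # M-step with equal mixing proportions.
        mu1 = np.sum(w * x) / np.sum(w)
        mu2 = np.sum((1 - w) * x) / np.sum(1 - w)
        sigma = np.sqrt(np.mean(w * (x - mu1) ** 2 + (1 - w) * (x - mu2) ** 2))
    return mu1, mu2, sigma

rng = np.random.default_rng(4)
interim = np.concatenate([rng.normal(0.0, 2.0, 60), rng.normal(1.0, 2.0, 60)])
print("blinded EM estimate of sigma:", round(float(blinded_em(interim)[2]), 2))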
Furthermore, we incorporate into the enhanced EM algorithm the full information from the blocked randomization schedule, namely that the numbers of subjects are equal across treatment groups within each randomization block. With the increased information provided by the block structure, the accuracy of the estimation of the standard deviation improves. More specifically, the estimator has quite a small bias when the block size is small, which is commonly the case in clinical trials. Moreover, for the case of two treatment groups, the preservation of the actual type I error rate when using the standard t-test at the end of the trial is verified through a simulation study over many parameter scenarios. We also compute, analytically and by simulation, the actual power and the expected sample size. Finally, we develop the details of sample size re-estimation for multi-center clinical trials, where the randomization schedule is blocked within center.
|
305 |
Unconventional Approach with the Likelihood of Correlation Matrices
Song, Myung Soon, 30 September 2011 (has links)
Numerical approximation is an important research area for dealing with complicated functional forms. Techniques for developing accurate and efficient calculation of combined likelihood functions in meta-analysis are studied. The first part of the thesis introduces a B-spline approximation for building a parsimonious model in the simplest (two-dimensional) case of correlation structure. Inference about the correlation between vitamin C intake and vitamin C serum level is developed using likelihood intervals and the MLE, along with a comparison with conventional methods. The second part studies a multivariate numerical integration method for developing a better approximation of the likelihood for correlation matrices. Analyses of (1) intercorrelations among Math, Spatial, and Verbal scores in an SAT exam and (2) intercorrelations among Cognitive Anxiety, Somatic Anxiety, and Self Confidence from the Competitive State Anxiety Inventory (CSAI-2) are explored. Algorithms to evaluate the likelihood and to find the MLE are developed. A comparison with two conventional methods (the joint asymptotic weighted average method and the marginal asymptotic weighted average method) is presented.
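A small sketch of the first part's idea on simulated data: the bivariate-normal profile log-likelihood of the correlation is evaluated on a coarse grid, approximated with a cubic B-spline, and the MLE and a likelihood interval are read off the spline. The grid, cutoff, and simulated data are illustrative choices.

```python
# Sketch: B-spline approximation of the profile log-likelihood of a correlation,
# with the MLE and an approximate 95% likelihood interval read from the spline.
import numpy as np
from scipy.interpolate import make_interp_spline

rng = np.random.default_rng(5)
n = 50
xy = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
r = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]               # sample correlation

def profile_loglik(rho):
    # Bivariate-normal profile log-likelihood of rho (variances profiled out).
    return 0.5 * n * np.log(1 - rho ** 2) - n * np.log(1 - rho * r)

coarse = np.linspace(-0.95, 0.95, 25)                   # coarse grid of evaluations
spline = make_interp_spline(coarse, profile_loglik(coarse), k=3)

fine = np.linspace(-0.95, 0.95, 2001)
ll = spline(fine)
mle = fine[np.argmax(ll)]
cutoff = ll.max() - 0.5 * 3.84                          # chi-square(1) 95% cutoff
inside = fine[ll >= cutoff]
print("MLE of rho ~", round(float(mle), 3),
      " interval ~", round(float(inside.min()), 3), "to", round(float(inside.max()), 3))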
|
306 |
VOLATILITY AND JUMPS IN HIGH FREQUENCY FINANCIAL DATA: ESTIMATION AND TESTING
Zhou, Nan, 30 September 2011 (has links)
It is widely accepted in financial econometrics that both microstructure noise
and jumps are significantly present in high frequency data. In some empirical situations,
the noise structure is more complex than the independent and identically distributed (i.i.d.)
assumption allows. Therefore, it is important to study the noise and jumps carefully when using
high frequency financial data. In this dissertation, we develop several methods related to volatility estimation and testing for jumps.
Chapter 1 proposes a new method for volatility estimation in the case where both the noise level and noise dependence are significant. This estimator is a weighted combination
of sub-sampled realized covariances constructed from discretely observed high frequency data. It is proved to be a consistent estimator of quadratic variation under either i.i.d. or dependent noise. It is also shown to have good finite-sample properties compared with existing estimators in the literature.
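To illustrate the flavor of such estimators, the classical two-scale realized variance below stands in for the proposed weighted combination; the simulated one-day price path with i.i.d. noise is purely illustrative.

```python
# Sketch: combine sub-sampled realized variances to offset i.i.d. microstructure
# noise (the classical two-scale estimator), compared with the naive realized variance.
import numpy as np

rng = np.random.default_rng(6)
n = 23400                                      # one trading day of 1-second returns
true_iv = 0.0001                               # integrated variance over the day
efficient = np.cumsum(rng.normal(0, np.sqrt(true_iv / n), n))
noise = rng.normal(0, 0.0005, n)               # i.i.d. microstructure noise
price = efficient + noise                      # observed log-prices

def realized_var(p, step, offset=0):
    r = np.diff(p[offset::step])
    return np.sum(r ** 2)

def tsrv(p, K=300):
    rv_all = realized_var(p, 1)
    rv_sub = np.mean([realized_var(p, K, k) for k in range(K)])
    n_bar = (len(p) - K + 1) / K
    return rv_sub - (n_bar / len(p)) * rv_all  # subtract the noise-induced bias

print("naive RV      :", realized_var(price, 1))   # badly inflated by noise
print("two-scale RV  :", tsrv(price))
print("true int. var.:", true_iv)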
Chapter 2 focuses on testing for jumps based on high frequency data. We generalize
the methods in Ait-Sahalia and Jacod (2009a) and Fan and Fan (2010). The generalized method allows more flexible choices in the construction of test statistics and has smaller
asymptotic variance under both the null and alternative hypotheses. However, none of these methods is effective when the microstructure noise is significant. To reduce the influence of noise, we further design a new statistical test that is robust to i.i.d. microstructure
noise. This new method is compared with the existing tests through Monte Carlo studies.
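The underlying power-variation ratio can be sketched as follows (simulated paths, no formal standardization of the statistic): with p = 4 and k = 2 the ratio tends to 2 for a continuous path and to 1 when jumps dominate.

```python
# Sketch of the power-variation ratio behind Ait-Sahalia-Jacod style jump tests.
import numpy as np

rng = np.random.default_rng(7)

def power_variation(x, p, k):
    return np.sum(np.abs(np.diff(x[::k])) ** p)

def aj_ratio(x, p=4, k=2):
    return power_variation(x, p, k) / power_variation(x, p, 1)

n = 20000
diffusion = np.cumsum(rng.normal(0, 0.01, n))          # continuous path
jumps = diffusion.copy()
jumps[n // 2:] += 0.5                                  # add a single large jump

print("ratio, continuous path:", round(float(aj_ratio(diffusion)), 2))  # ~ 2
print("ratio, path with jump :", round(float(aj_ratio(jumps)), 2))      # ~ 1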
|
307 |
A Statistical Approach to the Inverse Problem in Magnetoencephalography
Yao, Zhigang, 30 September 2011 (has links)
Magnetoencephalography (MEG) is an imaging technique used to measure the magnetic field outside the human head produced by electrical activity inside the brain. The MEG inverse problem, identifying the location of the electric sources from the magnetic signal measurements, is ill-posed; that is, there are infinitely many mathematically correct solutions. Common source localization methods assume the source does not vary with time and do not provide estimates of the variability of the fitted model. We reformulate the MEG inverse problem by considering time-varying sources and model their time evolution using a state space model. Based on this model, we investigate the inverse problem by finding the posterior source distribution given the multiple channels of observations at each time, rather than fitting fixed source estimates. A computational challenge arises because the data likelihood is nonlinear, making Markov chain Monte Carlo (MCMC) methods, including conventional Gibbs sampling, difficult to implement. We propose two new Monte Carlo methods based on sequential importance sampling. Unlike the usual MCMC sampling scheme, our new methods work in this situation without needing to tune a high-dimensional transition kernel, which would be very costly. We have created a set of C programs under Linux and use Parallel Virtual Machine (PVM) software to speed up the computation.
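The sequential-importance-sampling idea can be illustrated on a one-dimensional toy state-space model (a bootstrap particle filter with a made-up nonlinear observation map, not the actual MEG source model):

```python
# Sketch of a bootstrap particle filter: propagate particles through the state
# equation, reweight by the nonlinear observation likelihood, then resample.
import numpy as np

rng = np.random.default_rng(8)
T, n_particles = 100, 2000
state_sd, obs_sd = 0.1, 0.2

def lead_field(x):
    return x + 0.3 * x ** 3                            # made-up nonlinear sensor map

truth = np.cumsum(rng.normal(0, state_sd, T))          # hidden source amplitude
obs = lead_field(truth) + rng.normal(0, obs_sd, T)     # noisy sensor readings

particles = rng.normal(0, 1, n_particles)
estimates = np.zeros(T)
for t in range(T):
    particles = particles + rng.normal(0, state_sd, n_particles)  # propagate state
    w = np.exp(-0.5 * ((obs[t] - lead_field(particles)) / obs_sd) ** 2)
    w /= w.sum()                                       # importance weights
    estimates[t] = np.sum(w * particles)               # filtered posterior mean
    particles = particles[rng.choice(n_particles, n_particles, p=w)]  # resample

print("RMSE of filtered source estimate:",
      round(float(np.sqrt(np.mean((estimates - truth) ** 2))), 3))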
Common methods for estimating the number of sources in MEG data include principal component analysis and factor analysis, both of which make use of the eigenvalue distribution of the data. Other methods involve information criteria and minimum description length. Unfortunately, all of these methods are very sensitive to the signal-to-noise ratio (SNR). First, we consider a wavelet approach, a residual analysis approach, and a Fourier approach to estimate the noise variance. Second, a Neyman-Pearson detection-theory-based eigenthresholding method is used to decide the number of signal sources. We apply our methods to simulated data where we know the truth. A real MEG dataset recorded without a human subject is also tested. Our methods allow us to estimate the noise more accurately and are robust in deciding the number of signal sources.
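A generic eigen-thresholding sketch conveys the spirit of the source-counting step (this is not the Neyman-Pearson detection-theory version developed in the dissertation; the mixing matrix, noise level, and safety margin are all illustrative):

```python
# Generic sketch: estimate the noise level from the eigenvalue bulk of the sensor
# covariance, then count eigenvalues exceeding a random-matrix-style bound.
import numpy as np

rng = np.random.default_rng(9)
n_channels, n_times, n_sources = 100, 2000, 3
mixing = rng.normal(size=(n_channels, n_sources))        # illustrative "lead field"
sources = rng.normal(size=(n_sources, n_times)) * np.array([[5.0], [4.0], [3.0]])
noise_sd = 1.0
data = mixing @ sources + rng.normal(scale=noise_sd, size=(n_channels, n_times))

cov = np.cov(data)                                        # channels x channels
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

sigma2_hat = np.median(eigvals)                           # bulk eigenvalues ~ noise
# 10% buffer above the Marchenko-Pastur upper edge as a crude safety margin.
threshold = 1.1 * sigma2_hat * (1 + np.sqrt(n_channels / n_times)) ** 2
n_hat = int(np.sum(eigvals > threshold))
print("estimated number of sources:", n_hat)              # expected: 3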
|
308 |
Bicriterion clustering and selecting the optimal number of clusters via agreement measure
Liu, Heng, January 2007 (has links)
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2007. / Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4575. Adviser: Douglas Simpson. Includes bibliographical references (leaves 96-100). Available on microfilm from ProQuest Information and Learning.
|
309 |
Nonparametric Bayes Analysis of Social Science Data
Kunihama, Tsuyoshi, January 2015 (has links)
Social science data often contain complex characteristics that standard statistical methods fail to capture. Social surveys assign many questions to respondents, which often consist of mixed-scale variables. Each of the variables can follow a complex distribution outside parametric families, and associations among variables may have more complicated structures than standard linear dependence. Therefore, it is not straightforward to develop a statistical model that approximates these structures well. In addition, many social surveys collect data over time, so dynamic dependence must be incorporated into the models. It is also common to observe massive numbers of missing values in social science data. To address these challenging problems, this thesis develops flexible nonparametric Bayesian methods for the analysis of social science data. Chapter 1 briefly explains the background and motivation of the projects in the following chapters. Chapter 2 develops a nonparametric Bayesian model of temporal dependence in large sparse contingency tables, relying on a probabilistic factorization of the joint pmf. Chapter 3 proposes nonparametric Bayes inference on conditional independence, with conditional mutual information used as a measure of the strength of conditional dependence. Chapter 4 proposes a novel Bayesian density estimation method for social surveys with complex designs in which there is a gap between sample and population; we correct for the bias by adjusting mixture weights in Bayesian mixture models. Chapter 5 develops a nonparametric model for mixed-scale longitudinal surveys, in which various types of variables are induced through latent continuous variables and dynamic latent factors lead to flexibly time-varying associations among variables. / Dissertation
|
310 |
Some Optimal and Sequential Experimental Designs with Potential Applications to Nanostructure Synthesis and Beyond
Zhu, Li, 17 August 2012 (has links)
Design of Experiments (DOE) is an important topic in statistics. Efficient experimentation can help an investigator extract maximum information from a dataset. In recent times, DOE has found new and challenging applications in science, engineering and technology. In this thesis, two different experimental design problems, motivated by the need to model the growth of nanowires, are studied. In the first problem, we consider the determination of an optimal experimental design for estimating the parameters of a complex curve characterizing nanowire growth that is partially exponential and partially linear. A locally D-optimal design for the nonlinear change-point growth model is obtained using a geometric approach. Further, a Bayesian sequential algorithm is proposed for obtaining the D-optimal design. The advantages of the proposed algorithm over traditional approaches adopted in recent nano-experiments are demonstrated using Monte Carlo simulations. The second problem deals with generating space-filling designs in feasible regions of complex response surfaces with unknown constraints. Two different types of sequential design strategies are proposed with the objective of generating a sequence of design points that quickly carves out the (unknown) infeasible regions and places more and more points in the (unknown) feasible region. The generated design is space-filling (in a certain sense) within the feasible region. The first strategy is model independent, whereas the second is model-based. Theoretical properties of the proposed strategies are derived, and simulation studies are conducted to evaluate their performance. The strategies are developed assuming that the response function is deterministic, and extensions are proposed for random response functions. / Statistics
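As a toy illustration of local D-optimality (a simple exponential growth curve stands in for the partially exponential, partially linear change-point model, and the assumed parameter values are arbitrary):

```python
# Toy sketch of a locally D-optimal design search: brute-force over two-point
# designs, maximizing the log determinant of the Fisher information evaluated
# at assumed ("local") parameter values.
import numpy as np
from itertools import combinations

theta1, theta2 = 10.0, 0.5                    # assumed local parameter values

def jacobian(x):
    # Partial derivatives of f(x) = theta1 * (1 - exp(-theta2 * x)).
    e = np.exp(-theta2 * x)
    return np.column_stack([1 - e, theta1 * x * e])

candidates = np.linspace(0.1, 10.0, 21)       # candidate growth times
best_logdet, best_design = -np.inf, None
for design in combinations(candidates, 2):    # as many support points as parameters
    J = jacobian(np.array(design))
    sign, logdet = np.linalg.slogdet(J.T @ J) # D-criterion: log det information
    if sign > 0 and logdet > best_logdet:
        best_logdet, best_design = logdet, design

print("locally D-optimal design points:", [float(v) for v in best_design])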
|