1 |
Using EM Algorithm to identify defective parts per million on shifting production process
Freeman, James Wesley 23 April 2013
The objective of this project is to determine whether using an EM algorithm to fit a Gaussian mixture model provides the accuracy needed to identify the number of defective parts per million when the overall population is made up of multiple independent runs or lots. The alternative is to approximate with the standard software tools and common techniques available to a process, industrial, or quality engineer, which rely on familiar distributions and widely understood statistical process control methods. This paper compares these common methods with an EM algorithm programmed in R, using a dataset of actual length measurements of a manufactured product. / text
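The idea in this abstract can be illustrated with a minimal Python sketch (the thesis itself used R, and none of the names below come from it). It assumes a one-dimensional Gaussian mixture fitted by plain EM with quantile-based starting means, and it estimates defective parts per million as the mixture probability mass outside two hypothetical specification limits, `lower` and `upper`.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_gmm_em(data, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by plain EM.

    Starting values are an assumption of this sketch: equal weights,
    means at evenly spaced sample quantiles, and the overall standard
    deviation for every component.
    """
    n = len(data)
    ordered = sorted(data)
    mean_all = sum(data) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in data) / n)
    weights = [1.0 / k] * k
    means = [ordered[int((g + 0.5) * n / k)] for g in range(k)]
    sds = [max(sd_all, 1e-6)] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted proportions, means and variances.
        for g in range(k):
            n_g = sum(r[g] for r in resp)
            if n_g < 1e-12:
                continue  # leave an empty component unchanged
            weights[g] = n_g / n
            means[g] = sum(r[g] * x for r, x in zip(resp, data)) / n_g
            var = sum(r[g] * (x - means[g]) ** 2
                      for r, x in zip(resp, data)) / n_g
            sds[g] = max(math.sqrt(var), 1e-6)  # floor avoids collapse
    return weights, means, sds


def defective_ppm(weights, means, sds, lower, upper):
    """Mixture probability outside [lower, upper], in parts per million."""
    outside = 0.0
    for w, m, s in zip(weights, means, sds):
        outside += w * (normal_cdf(lower, m, s) + 1.0 - normal_cdf(upper, m, s))
    return 1e6 * outside
```

Treating each production lot as one mixture component is the design choice here: the tail probabilities are summed per component rather than read off a single normal fitted to the pooled data, which is the kind of approximation the abstract sets out to compare against.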
|
2 |
The Application of the Expectation-Maximization Algorithm to the Identification of Biological Models
Chen, Shuo 09 March 2007
With the onset of large-scale gene expression profiling, many researchers have turned their attention toward biological process modeling and system identification. The abundance of data available, while inspiring, is also daunting to interpret. Following the initial work of Rangel et al., we propose a linear model for identifying the biological model behind the data and utilize a modification of the Expectation-Maximization algorithm for training it. With our model, we explore some commonly accepted assumptions concerning sampling, discretization, and state transformations. Also, we illuminate the model complexities and interpretation difficulties caused by unknown state transformations and propose some solutions for resolving these problems. Finally, we elucidate the advantages and limitations of our linear state-space model with simulated data from several nonlinear networks. / Master of Science
|
3 |
Multiple ARX Model Based Identification for Switching/Nonlinear Systems with EM Algorithm
Jin, Xing 06 1900
Two types of switching mechanism are considered in this thesis: one features abrupt switching, while the other shows gradual change in its dynamics. A comparison of the identification results from the proposed method and a benchmark method shows that the proposed robust identification method performs better on data sets contaminated with outliers.
To model switched systems that exhibit gradual or smooth transitions among different local models, a smooth validity function (an exponential function) is introduced, in addition to estimating the local sub-system parameters, to combine all the local models so that the dynamics of the nonlinear process are appropriately approximated throughout the working range of the gradually switched system. Verification results on a simulated numerical example and a CSTR process confirm the effectiveness of the proposed Linear Parameter Varying (LPV) identification algorithm. / Process Control
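A rough Python sketch of how an exponential validity function might blend local models is given below. The abstract does not state the functional form, so the Gaussian-shaped weight, the scheduling variable, and the names `centers`, `width`, and `thetas` are all assumptions made for illustration; parameter estimation by EM is not shown.

```python
import math


def validity_weights(scheduling_value, centers, width):
    """Normalized exponential validity weights, one per local model.

    Assumed form: exp(-(s - c_i)^2 / (2 * width^2)), normalized to sum to 1.
    """
    exponents = [-((scheduling_value - c) ** 2) / (2.0 * width ** 2)
                 for c in centers]
    shift = max(exponents)  # subtract the max so the sum cannot underflow to 0
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def arx_predict(theta, regressors):
    """One-step ARX prediction: inner product of parameters and regressors."""
    return sum(t * r for t, r in zip(theta, regressors))


def blended_prediction(scheduling_value, centers, width, thetas, regressors):
    """Validity-weighted combination of the local ARX predictions."""
    weights = validity_weights(scheduling_value, centers, width)
    return sum(w * arx_predict(theta, regressors)
               for w, theta in zip(weights, thetas))
```

Near a model's center the blend is dominated by that local model, and between centers the prediction moves smoothly from one model to the next, which is the gradual-switching behavior the abstract describes.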
|
4 |
Model-based Pre-processing in Protein Mass Spectrometry
Wagaman, John C. 2009 December 1900
The discovery of proteomic information through the use of mass spectrometry (MS) has been an active area of research in the diagnosis and prognosis of many types of cancer. This process involves feature selection through peak detection but is often complicated by many forms of non-biological bias. The need to extract biologically relevant peak information from MS data has resulted in the development of statistical techniques to aid in spectra pre-processing. Baseline estimation and normalization are important pre-processing steps because the subsequent quantification of peak heights depends on this baseline estimate. This dissertation introduces a mixture model to estimate the baseline and peak heights simultaneously through the expectation-maximization (EM) algorithm and a penalized likelihood approach. Our model-based pre-processing performs well in the presence of raw, unnormalized data, with few subjective inputs. We also propose a model-based normalization solution for use in subsequent classification procedures, where misclassification results compare favorably with existing methods of normalization. The performance of our pre-processing method is evaluated using popular matrix-assisted laser desorption and ionization (MALDI) and surface-enhanced laser desorption and ionization (SELDI) datasets as well as through simulation.
|
5 |
Image Restoration for Multiplicative Noise with Unknown Parameters
Chen, Ren-Chi 28 July 2006
First, we study a Poisson model of a polluted random screen. In this model, the defects on the random screen are assumed to be Poisson-distributed and overlapping, and the transmittance effects of overlapping defects are multiplicative. The autocorrelation function of the screen is obtained from the defects' density, radius, and transmittance. Using the autocorrelation function, we then restore astronomical telescope images, whose signals are generally degraded as they propagate through random scattering in the atmosphere.
To restore the images, we estimate the three key parameters by three methods: the expectation-maximization (EM) method and two Maximum-Entropy (ME) methods based on two different definitions. The restorations are successful and are demonstrated in this thesis.
|
6 |
Comparing Approaches to Initializing the Expectation-Maximization Algorithm
Dicintio, Sabrina 09 October 2012
The expectation-maximization (EM) algorithm is a widely utilized approach to maximum likelihood estimation in the presence of missing data; this thesis focuses on its application within the model-based clustering framework. The performance of the EM algorithm can be highly dependent on how the algorithm is initialized. Several ways of initializing the EM algorithm have been proposed; however, the best method to use for initialization remains a somewhat controversial topic. From an attempt to obtain a superior method of initializing the EM algorithm comes the concept of using multiple existing methods together in what will be called a 'voting' procedure. This procedure will use several common initialization methods to cluster the data; then a final starting ẑ_ig matrix will be obtained in two ways. The hard 'voting' method follows a majority rule, whereas the soft 'voting' method takes an average of the multiple group memberships. The final ẑ_ig matrix obtained from both methods will dictate the starting values of the three parameter estimates for each group g used to initialize the EM algorithm.
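The hard and soft voting rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's procedure: it assumes that the group labels produced by the different initialization methods have already been aligned (label switching is not handled), that ties in the hard vote go to the lowest-numbered group, and the function names are hypothetical.

```python
def vote_counts(label_sets, n_groups):
    """Count, for each observation, how many methods chose each group.

    label_sets: one list of integer labels (0 .. n_groups - 1) per
    initialization method, all of the same length.
    """
    n_obs = len(label_sets[0])
    counts = [[0] * n_groups for _ in range(n_obs)]
    for labels in label_sets:
        for i, lab in enumerate(labels):
            counts[i][lab] += 1
    return counts


def soft_vote(label_sets, n_groups):
    """Soft vote: average the 0/1 memberships across methods."""
    n_methods = len(label_sets)
    return [[c / n_methods for c in row]
            for row in vote_counts(label_sets, n_groups)]


def hard_vote(label_sets, n_groups):
    """Hard vote: majority rule, returned as a 0/1 membership matrix.

    Ties go to the lowest-numbered group (an assumption of this sketch).
    """
    z = []
    for row in vote_counts(label_sets, n_groups):
        winner = row.index(max(row))
        z.append([1.0 if g == winner else 0.0 for g in range(n_groups)])
    return z
```

Either matrix can then be passed to a single M-step to produce starting parameter values, which is the usual way a membership matrix initializes EM for a mixture model.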
|
7 |
Multiple ARX Model Based Identification for Switching/Nonlinear Systems with EM Algorithm
Jin, Xing Unknown Date
No description available.
|
8 |
A new normalized EM algorithm for clustering gene expression data
Nguyen, Phuong Minh, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW January 2008
Microarray data clustering represents a basic exploratory tool to find groups of genes exhibiting similar expression patterns or to detect relevant classes of molecular subtypes. Among the wide range of clustering approaches proposed and applied in the gene expression community to analyze microarray data, mixture model-based clustering has received much attention owing to its sound statistical framework and its flexibility in data modeling. However, clustering algorithms following the model-based framework suffer from two serious drawbacks. The first is that the performance of these algorithms critically depends on the starting values for their iterative clustering procedures. The second is that they cannot work directly with very high dimensional data sets in the sample clustering problem, where the dimension of the data reaches hundreds or thousands. The thesis focuses on these two challenges and makes the following contributions. First, the thesis introduces the statistical model of our proposed normalized Expectation Maximization (EM) algorithm, followed by an analysis of its clustering performance on a number of real microarray data sets. The normalized EM is stable even with random initializations for its EM iterative procedure. The stability of the normalized EM is demonstrated through a performance comparison with other related clustering algorithms. Furthermore, the normalized EM is the first mixture model-based clustering approach capable of working directly with very high dimensional microarray data sets in the sample clustering problem, where the number of genes is much larger than the number of samples. This advantage of the normalized EM is illustrated through a comparison with the unnormalized EM (the conventional EM algorithm for Gaussian mixture model-based clustering).
In addition, for experimental microarray data sets in which class labels of the data points are available, an interesting property of the convergence speed of the normalized EM with respect to the radius of the hypersphere in its corresponding statistical model is uncovered. Second, to support the performance comparison of different clusterings, a new internal index is derived using fundamental concepts from information theory. This index allows the comparison of clustering approaches in which the closeness between data points is evaluated by their cosine similarity. The method for deriving this internal index can be used to design other new indexes for comparing clustering approaches that employ a common similarity measure.
|
9 |
Evolutionary Algorithms for Model-Based Clustering
Kampo, Regina S. January 2021
Cluster analysis is used to detect underlying group structure in data. Model-based clustering is the process of performing cluster analysis by fitting finite mixture models. However, parameter estimation in mixture model-based approaches to clustering is notoriously difficult. To this end, this thesis focuses on the development of evolutionary computation as an alternative technique for parameter estimation in mixture models. An evolutionary algorithm is proposed and illustrated on the well-established Gaussian mixture model with missing values. Next, the family of Gaussian parsimonious clustering models is considered, and an evolutionary algorithm is developed to estimate the parameters. Next, an evolutionary algorithm is developed for latent Gaussian mixture models to facilitate the flexible clustering of high-dimensional data. For all models and families of models considered in this thesis, the proposed algorithms used for model-fitting and parameter estimation are presented, and their performance is illustrated using real and simulated data sets to assess the clustering ability of all models. This thesis concludes with a discussion and suggestions for future work. / Dissertation / Doctor of Philosophy (PhD)
|
10 |
Computation of Weights for Probabilistic Record Linkage Using the EM Algorithm
Bauman, G. John 29 June 2006
Record linkage is the process of combining information about a single individual from two or more records. Probabilistic record linkage gives weights to each field that is compared. The decision of whether the records should be linked is then determined by the sum of the weights, or “Score”, over all fields compared. Using methods similar to the simple versus simple most powerful test, an optimal record linkage decision rule can be established to minimize the number of unlinked records when the probabilities of false positive and false negative errors are specified. Computing the weights needed for probabilistic record linkage normally requires linking a “training” subset of records by hand. This is not practical in many settings, as hand matching requires a considerable time investment. In 1989, Matthew A. Jaro demonstrated how the Expectation-Maximization, or EM, algorithm could be used to compute the needed weights when fields have Binomial matching possibilities. This project applies this method of using the EM algorithm to calculate weights for head-of-household records from the 1910 and 1920 Censuses for Ascension Parish of Louisiana and Church and County Records from Perquimans County, North Carolina. This project also extends Jaro's EM algorithm to a Multinomial framework. The performance of the EM algorithm for calculating weights will be assessed by comparing the computed weights to weights computed by clerical matching. Simulations will also be conducted to investigate the sensitivity of the algorithm to the total number of record pairs, the number of fields with missing entries, the starting values of estimated probabilities, and the convergence epsilon value.
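A compact Python sketch of the binary (agree/disagree) case is given below. It is not taken from the project: it assumes the standard Fellegi-Sunter setup with conditional independence between fields, where m[j] and u[j] are the probabilities that field j agrees for true matches and for non-matches, and it uses the conventional log2 weight form. The function names, starting values, and fixed iteration count are illustrative choices.

```python
import math


def em_match_probabilities(patterns, n_iter=200, p=0.1, m_start=0.9, u_start=0.1):
    """Estimate match proportion p and per-field m/u probabilities by EM.

    patterns: one tuple of 0/1 agreement indicators per record pair.
    Starting values are assumptions of this sketch.
    """
    eps = 1e-6  # keeps probabilities away from 0 and 1
    n_fields = len(patterns[0])
    m = [m_start] * n_fields
    u = [u_start] * n_fields
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a true match.
        g = []
        for gamma in patterns:
            pm, pu = p, 1.0 - p
            for j, agrees in enumerate(gamma):
                pm *= m[j] if agrees else 1.0 - m[j]
                pu *= u[j] if agrees else 1.0 - u[j]
            g.append(pm / (pm + pu))
        # M-step: update p, m and u from the posterior match probabilities.
        total_m = sum(g)
        total_u = len(g) - total_m
        p = min(max(total_m / len(g), eps), 1.0 - eps)
        for j in range(n_fields):
            m_j = sum(gi * gam[j] for gi, gam in zip(g, patterns)) / total_m
            u_j = sum((1.0 - gi) * gam[j] for gi, gam in zip(g, patterns)) / total_u
            m[j] = min(max(m_j, eps), 1.0 - eps)
            u[j] = min(max(u_j, eps), 1.0 - eps)
    return p, m, u


def field_weights(m, u):
    """Agreement and disagreement weights in the usual log2 form."""
    agree = [math.log2(mj / uj) for mj, uj in zip(m, u)]
    disagree = [math.log2((1.0 - mj) / (1.0 - uj)) for mj, uj in zip(m, u)]
    return agree, disagree


def score(gamma, agree, disagree):
    """Sum of field weights for one record pair's agreement pattern."""
    return sum(agree[j] if a else disagree[j] for j, a in enumerate(gamma))
```

A pair would then be linked when its score exceeds a chosen upper threshold; selecting that threshold from the specified error probabilities is not shown here.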
|