21

Deterministic annealing EM algorithm for robust learning of Gaussian mixture models

Wang, Bo Yu January 2011 (has links)
University of Macau / Faculty of Science and Technology / Department of Electrical and Electronics Engineering
22

Time series analysis of Saudi Arabia oil production data

Albarrak, Abdulmajeed Barrak 14 December 2013 (has links)
Saudi Arabia is the largest petroleum producer and exporter in the world, and the Saudi Arabian economy depends heavily on the production and export of oil. This motivates our research on Saudi Arabia's oil production. The prime objective of this research is to find the most appropriate models for analyzing Saudi Arabia oil production data. Initially we consider fitting integrated autoregressive moving average (ARIMA) models to the data, but most of the variables under study exhibit volatility, so we ultimately adopt autoregressive conditional heteroscedastic (ARCH) models; if there is no ARCH effect, the model reduces to an ARIMA model. However, missing values in almost every variable complicate the analysis, since parameter estimation in an ARCH model does not converge when observations are missing. As a remedy we first estimate the missing observations, employing the expectation-maximization (EM) algorithm. Because the data are time series, a simple EM algorithm is not appropriate, and there is also evidence of outliers in the data; we therefore employ a robust, least trimmed squares (LTS) regression based EM algorithm to estimate the missing values. After estimating the missing values we employ the White test to select the most appropriate ARCH models for all sixteen variables under study. A normality test on the resulting residuals is performed for each variable to check the validity of the fitted model. / ARCH/GARCH models, outliers and robustness : tests for normality and estimation of missing values in time series -- Outlier analysis and estimation of missing values by robust EM algorithm for Saudi Arabia oil production data -- Selection of ARCH models for Saudi Arabia oil production data. / Department of Mathematical Sciences
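The LTS-regression-based EM imputation described in this abstract can be illustrated with a rough sketch: alternate between a robust autoregressive fit on the currently completed series and re-prediction of the missing points. The code below is an illustration only, not the thesis implementation; the AR(1) working model, the 75% trimming fraction, the random-restart LTS solver, and all variable names are assumptions made here.

```python
# Illustrative sketch (not the thesis code): EM-style imputation of missing values
# in a time series using a least trimmed squares (LTS) AR(1) regression.
import numpy as np

def lts_ar1_fit(y_lag, y, keep_frac=0.75, n_restarts=20, seed=0):
    """Fit y ~ a + b*y_lag by least trimmed squares using random elemental restarts."""
    rng = np.random.default_rng(seed)
    n = len(y)
    h = int(np.ceil(keep_frac * n))
    X = np.column_stack([np.ones(n), y_lag])
    best, best_cost = np.zeros(2), np.inf
    for _ in range(n_restarts):
        idx = rng.choice(n, size=2, replace=False)              # elemental starting fit
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        for _ in range(10):                                     # concentration steps
            keep = np.argsort((y - X @ beta) ** 2)[:h]          # h smallest squared residuals
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        cost = np.sort((y - X @ beta) ** 2)[:h].sum()
        if cost < best_cost:
            best, best_cost = beta, cost
    return best

def em_impute(series, n_iter=25):
    """Alternate a robust AR(1) refit ("M-like" step) with re-prediction of missing points."""
    y = np.asarray(series, dtype=float)
    miss = np.isnan(y)
    y = np.where(miss, np.nanmean(y), y)                        # crude initial fill
    for _ in range(n_iter):
        a, b = lts_ar1_fit(y[:-1], y[1:])                       # robust fit on completed series
        pred = a + b * y[:-1]                                   # one-step-ahead predictions
        y[1:][miss[1:]] = pred[miss[1:]]                        # update missing entries only
    return y
```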
23

Estimating parameters in Markov models for longitudinal studies with missing data or surrogate outcomes

Yeh, Hung-Wen. Chan, Wenyaw. January 2007 (has links)
Thesis (Ph. D.)--University of Texas Health Science Center at Houston, School of Public Health, 2007. / Includes bibliographical references (leaves 58-59).
24

Object and concept recognition for content-based image retrieval

Li, Yi, January 2005 (has links)
Thesis (Ph. D.)--University of Washington, 2005. / Vita. Includes bibliographical references (p. 82-87).
25

Mass Spectrum Analysis of a Substance Sample Placed into Liquid Solution

Wang, Yunli January 2011 (has links)
Mass spectrometry is an analytical technique commonly used for determining elemental composition in a substance sample. For this purpose, the sample is placed into some liquid solution called liquid matrix. Unfortunately, the spectrum of the sample is not observable separate from that of the solution. Thus, it is desired to distinguish the sample spectrum. The analysis is usually based on the comparison of the mixed spectrum with the one of the sole solution. Introducing the missing information about the origin of observed spectrum peaks, the author obtains a classic set up for the Expectation-Maximization (EM) algorithm. The author proposed a mixture modeling the spectrum of the liquid solution as well as that of the sample. A bell-shaped probability mass function obtained by discretization of the univariate Gaussian probability density function was proposed or serving as a mixture component. The E- and M- steps were derived under the proposed model. The corresponding R program is written and tested on a small but challenging simulation example. Varying the number of mixture components for the liquid matrix and sample, the author found the correct model according to Bayesian Information Criterion. The initialization of the EM algorithm is a difficult standalone problem that was successfully resolved for this case. The author presents the findings and provides results from the simulation example as well as corresponding illustrations supporting the conclusions.
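A rough sketch of the kind of model described above, a mixture of bell-shaped components obtained by discretizing a Gaussian density over integer m/z bins, fitted by EM and compared across component counts with BIC, is given below. It is not the thesis's R implementation: the naive initialization and the M-step, which reuses the continuous-Gaussian weighted updates as an approximation, are assumptions made for illustration.

```python
# Rough sketch (not the thesis code) of EM for a mixture of discretized-Gaussian
# components over integer m/z bins, with a BIC helper for choosing the number of components.
import numpy as np
from scipy.stats import norm

def discretized_gaussian_pmf(bins, mu, sigma):
    """Bell-shaped pmf on integer bins, obtained by renormalizing the Gaussian density."""
    w = norm.pdf(bins, loc=mu, scale=sigma)
    return w / w.sum()

def em_discrete_mixture(bins, counts, K, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    mus = rng.choice(bins, size=K).astype(float)                 # naive initialization
    sigmas = np.full(K, (bins.max() - bins.min()) / (4 * K))
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each bin
        pmf = np.stack([discretized_gaussian_pmf(bins, m, s) for m, s in zip(mus, sigmas)])
        resp = weights[:, None] * pmf                            # shape (K, n_bins)
        resp /= resp.sum(axis=0, keepdims=True) + 1e-300
        # M-step: continuous-Gaussian updates weighted by responsibilities and peak counts
        nk = (resp * counts).sum(axis=1)
        weights = nk / nk.sum()
        mus = (resp * counts * bins).sum(axis=1) / nk
        var = (resp * counts * (bins - mus[:, None]) ** 2).sum(axis=1) / nk
        sigmas = np.sqrt(np.maximum(var, 0.25))                  # floor avoids degenerate widths
    return weights, mus, sigmas

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion used to compare different numbers of components."""
    return -2.0 * loglik + n_params * np.log(n_obs)
```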
26

Application of Inter-Die Rank Statistics in Defect Detection

Bakshi, Vivek 01 March 2012 (has links)
This thesis presents a statistical method to identify test escapes. Testing often acquires parametric measurements as a function of the logical state of a chip. The usual method of classifying chips as pass or fail is to compare each state measurement to a test limit. Subtle manufacturing defects escape the test limits because of process variations in deep sub-micron technologies, which results in a mixture of healthy and faulty parametric test measurements. This thesis identifies chips with subtle defects by using the rank order of the parametric measurements. A hypothesis is developed that a defect is likely to disturb the defect-free ranking, whereas a shift caused by process variations will not affect the rank. The hypothesis does not depend on a priori knowledge of a defect-free ranking of the parametric measurements. This thesis introduces a modified Expectation-Maximization (EM) algorithm to separate the healthy and faulty tau components calculated from the parametric responses of die pairs on a wafer. The modified EM uses generalized beta distributions to model the two components of the tau mixture distribution and estimates the faulty probability of each die on a wafer. The sensitivity of the modified EM is evaluated using Monte Carlo simulations. The modified EM is applied to production Product A, where an average 30% reduction in DPPM (defective parts per million) is observed across all lots.
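As a hedged illustration of the mixture idea in this abstract — separating "healthy" and "faulty" rank-correlation (tau) components with EM — the sketch below fits a two-component beta mixture to tau values rescaled to (0, 1). It is not the thesis's modified EM: the standard (rather than generalized) beta components, the weighted method-of-moments M-step, the median-split initialization, and the 90/10 prior are all assumptions made here.

```python
# Illustrative two-component beta-mixture EM for rank-correlation (tau) values.
import numpy as np
from scipy.stats import beta

def moment_match(x, w):
    """Weighted method-of-moments estimates of beta(a, b) parameters."""
    m = np.average(x, weights=w)
    v = np.average((x - m) ** 2, weights=w)
    v = min(max(v, 1e-8), m * (1 - m) * 0.999)       # keep the common factor positive
    common = m * (1 - m) / v - 1.0
    return m * common, (1 - m) * common

def em_tau_mixture(tau, n_iter=100):
    x = np.clip((np.asarray(tau) + 1.0) / 2.0, 1e-4, 1 - 1e-4)   # map [-1, 1] to (0, 1)
    # crude initialization: split at the median (healthy dies assumed to have high tau)
    params = [moment_match(x, (x >= np.median(x)).astype(float) + 1e-3),
              moment_match(x, (x < np.median(x)).astype(float) + 1e-3)]
    pi = np.array([0.9, 0.1])                         # assumed prior: most dies are healthy
    for _ in range(n_iter):
        dens = np.stack([beta.pdf(x, a, b) for a, b in params])  # shape (2, n)
        resp = pi[:, None] * dens
        resp /= resp.sum(axis=0, keepdims=True) + 1e-300         # E-step: responsibilities
        pi = resp.mean(axis=1)                                   # M-step: weights and shapes
        params = [moment_match(x, resp[k]) for k in range(2)]
    return pi, params, resp[1]                        # resp[1]: estimated faulty probability per die
```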
27

Investigation of probabilistic principal component analysis compared to proper orthogonal decomposition methods for basis extraction and missing data estimation

Lee, Kyunghoon 21 May 2010 (has links)
The identification of flow characteristics and the reduction of high-dimensional simulation data have capitalized on an orthogonal basis achieved by proper orthogonal decomposition (POD), also known as principal component analysis (PCA) or the Karhunen-Loeve transform (KLT). In the realm of aerospace engineering, an orthogonal basis is versatile for diverse applications, especially associated with reduced-order modeling (ROM) as follows: a low-dimensional turbulence model, an unsteady aerodynamic model for aeroelasticity and flow control, and a steady aerodynamic model for airfoil shape design. Provided that a given data set lacks parts of its data, POD is required to adopt a least-squares formulation, leading to gappy POD, using a gappy norm that is a variant of an L2 norm dealing with only known data. Although gappy POD is originally devised to restore marred images, its application has spread to aerospace engineering for the following reason: various engineering problems can be reformulated in forms of missing data estimation to exploit gappy POD. Similar to POD, gappy POD has a broad range of applications such as optimal flow sensor placement, experimental and numerical flow data assimilation, and impaired particle image velocimetry (PIV) data restoration. Apart from POD and gappy POD, both of which are deterministic formulations, probabilistic principal component analysis (PPCA), a probabilistic generalization of PCA, has been used in the pattern recognition field for speech recognition and in the oceanography area for empirical orthogonal functions in the presence of missing data. In formulation, PPCA presumes a linear latent variable model relating an observed variable with a latent variable that is inferred only from an observed variable through a linear mapping called factor-loading. To evaluate the maximum likelihood estimates (MLEs) of PPCA parameters such as a factor-loading, PPCA can invoke an expectation-maximization (EM) algorithm, yielding an EM algorithm for PPCA (EM-PCA). By virtue of the EM algorithm, the EM-PCA is capable of not only extracting a basis but also restoring missing data through iterations whether the given data are intact or not. Therefore, the EM-PCA can potentially substitute for both POD and gappy POD inasmuch as its accuracy and efficiency are comparable to those of POD and gappy POD. In order to examine the benefits of the EM-PCA for aerospace engineering applications, this thesis attempts to qualitatively and quantitatively scrutinize the EM-PCA alongside both POD and gappy POD using high-dimensional simulation data. In pursuing qualitative investigations, the theoretical relationship between POD and PPCA is transparent such that the factor-loading MLE of PPCA, evaluated by the EM-PCA, pertains to an orthogonal basis obtained by POD. By contrast, the analytical connection between gappy POD and the EM-PCA is nebulous because they distinctively approximate missing data due to their antithetical formulation perspectives: gappy POD solves a least-squares problem whereas the EM-PCA relies on the expectation of the observation probability model. To juxtapose both gappy POD and the EM-PCA, this research proposes a unifying least-squares perspective that embraces the two disparate algorithms within a generalized least-squares framework. As a result, the unifying perspective reveals that both methods address similar least-squares problems; however, their formulations contain dissimilar bases and norms. 
Furthermore, this research delves into the ramifications of the different bases and norms that ultimately characterize the traits of both methods. To this end, two hybrid algorithms of gappy POD and the EM-PCA are devised and compared to the original algorithms for a qualitative illustration of the different basis and norm effects. Ultimately, a norm reflecting a curve-fitting method is found to affect estimation error reduction more significantly than a basis for two example test data sets: one missing data at only a single snapshot and the other missing data across all the snapshots. From a numerical performance standpoint, the EM-PCA is computationally less efficient than POD for intact data since it suffers from the slow convergence inherited from the EM algorithm. For incomplete data, this thesis quantitatively finds that the number of data-missing snapshots predetermines whether the EM-PCA or gappy POD outperforms the other, because of the computational cost of the coefficient evaluation resulting from the norm selection. For instance, gappy POD demands computational effort in proportion to the number of data-missing snapshots as a consequence of the gappy norm, whereas the computational cost of the EM-PCA is invariant to the number of data-missing snapshots thanks to the L2 norm. In general, the higher the number of data-missing snapshots, the wider the gap between the computational costs of gappy POD and the EM-PCA. Based on the numerical experiments reported in this thesis, the following criterion is recommended regarding the selection between gappy POD and the EM-PCA for computational efficiency: gappy POD for an incomplete data set containing a few data-missing snapshots and the EM-PCA for an incomplete data set involving many data-missing snapshots. Last, the EM-PCA is applied to two aerospace applications in comparison to gappy POD as a proof of concept: one with an emphasis on basis extraction and the other with a focus on missing data reconstruction for a given incomplete data set with scattered missing data. The first application exploits the EM-PCA to efficiently construct reduced-order models of engine deck responses obtained from the Numerical Propulsion System Simulation (NPSS), some of whose results are absent due to failed analyses caused by numerical instability. Model-prediction tests validate that the engine performance metrics estimated by the reduced-order NPSS model agree closely with those obtained directly from NPSS. Similarly, the second application illustrates that the EM-PCA is significantly more cost effective than gappy POD at repairing spurious PIV measurements obtained from acoustically excited, bluff-body jet flow experiments: the EM-PCA reduces computational cost by factors of 8 to 19 compared to gappy POD while generating the same restoration results as gappy POD. All in all, through comprehensive theoretical and numerical investigation, this research establishes that the EM-PCA is an efficient alternative to gappy POD for incomplete data sets in which missing data are spread over the entire data set.
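For concreteness, a minimal sketch of an EM iteration for PCA that also fills in missing entries, in the spirit of the EM-PCA discussed above, is given below. This is a noise-free, Roweis-style EM-for-PCA loop with an ad hoc fill-in of missing entries from the current low-rank reconstruction, not the PPCA derivation or the implementation used in the thesis; the function name, convergence tolerance, and initialization are assumptions.

```python
# Minimal sketch of EM for PCA (EM-PCA) with fill-in of missing entries.
import numpy as np

def em_pca(X, n_components, n_iter=200, tol=1e-8, seed=0):
    """X: (n_snapshots, n_features) array with np.nan marking missing entries."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(X)
    Xf = np.where(miss, 0.0, X)
    col_mean = Xf.sum(axis=0) / np.maximum((~miss).sum(axis=0), 1)
    Xf[miss] = np.broadcast_to(col_mean, X.shape)[miss]        # initial fill with column means
    W = rng.standard_normal((X.shape[1], n_components))        # initial basis guess
    prev = np.inf
    for _ in range(n_iter):
        Xc = Xf - Xf.mean(axis=0)                              # center current completed data
        # E-step: latent coordinates given the current basis
        Z = np.linalg.solve(W.T @ W, W.T @ Xc.T)               # shape (k, n_snapshots)
        # M-step: basis given the current latent coordinates
        W = Xc.T @ Z.T @ np.linalg.inv(Z @ Z.T)
        # Fill missing entries with the current low-rank reconstruction
        recon = (W @ Z).T + Xf.mean(axis=0)
        Xf[miss] = recon[miss]
        err = np.linalg.norm(Xc.T - W @ Z)
        if abs(prev - err) < tol * max(err, 1.0):
            break
        prev = err
    Q, _ = np.linalg.qr(W)                                     # orthonormalize the basis
    return Q, Xf
```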
28

Distributed estimation in resource-constrained wireless sensor networks

Li, Junlin 13 November 2008 (has links)
Wireless sensor networks (WSN) are an emerging technology with a wide range of applications including environment monitoring, security and surveillance, health care, and smart homes. Given the severe resource constraints of wireless sensor networks, this research addresses the distributed estimation of unknown parameters by studying the interplay among resources, distortion, and lifetime, which are three major concerns for WSN applications. The objective of the proposed research is to design efficient distributed estimation algorithms for resource-constrained wireless sensor networks, where the major challenge is the integrated design of local signal processing operations and strategies for inter-sensor communication and networking so as to achieve a desirable tradeoff among resource efficiency (bandwidth and energy), system performance (estimation distortion and network lifetime), and implementation simplicity. More specifically, we address efficient distributed estimation from the following perspectives: (i) a rate-distortion perspective, where the objective is to study the rate-distortion bound for distributed estimation and to design practical, distributed algorithms suitable for wireless sensor networks that approach the performance bound by optimally allocating the bit rate for each sensor; (ii) an energy-distortion perspective, where the objective is to study the energy-distortion bound for distributed estimation and to design practical, distributed algorithms that approach the performance bound by optimally allocating the bit rate and transmission energy for each sensor; and (iii) a lifetime-distortion perspective, where the objective is to maximize the network lifetime while meeting estimation distortion requirements by jointly optimizing the source coding, source throughput, and multi-hop routing. Energy-efficient cluster-based distributed estimation is also studied, where the objective is to minimize the overall energy cost by appropriately dividing the sensor field into multiple clusters with data aggregation at cluster heads.
29

Enhanced classification approach with semi-supervised learning for reliability-based system design

Patel, Jiten 02 July 2012 (has links)
Traditionally, design engineers have used the Factor of Safety method to ensure that designs do not fail in the field. Access to advanced computational tools and resources has made this process obsolete, and new methods for introducing higher levels of reliability into engineering systems are currently being investigated. However, even though substantial computational resources are available, the resources required by reliability analysis procedures leave much to be desired. Furthermore, regression-based surrogate modeling techniques fail when there is discontinuity in the design space, caused by failure mechanisms, when the design is required to perform under severe externalities. Hence, in this research we propose efficient semi-supervised learning based surrogate modeling techniques that enable accurate estimation of a system's response, even under discontinuity. These methods combine the available labeled and unlabeled data sets and provide better models than using labeled data alone. Labeled data are expensive to obtain since the responses have to be evaluated, whereas unlabeled data are plentiful during reliability estimation, since the PDF information of the uncertain variables is assumed to be known. This superior performance is gained by combining the efficiency of Probabilistic Neural Networks (PNN) for classification with the Expectation-Maximization (EM) algorithm, which treats the unlabeled data as labeled data with hidden labels.
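A hedged sketch of the semi-supervised idea described above is given below: an EM loop that treats the class labels of unlabeled samples as hidden variables. For simplicity it uses Gaussian class-conditional densities in place of the PNN (Parzen-window) classifier used in the thesis, and all function and variable names are illustrative.

```python
# Illustrative semi-supervised EM: labels of unlabeled samples are hidden variables.
import numpy as np

def semi_supervised_em(X_lab, y_lab, X_unlab, n_classes=2, n_iter=50, reg=1e-6):
    """X_lab, X_unlab: (n, d) arrays; y_lab: integer class labels for the labeled rows."""
    X_all = np.vstack([X_lab, X_unlab])
    n_lab = len(X_lab)
    # responsibilities: labeled rows stay one-hot, unlabeled rows start uniform
    R = np.vstack([np.eye(n_classes)[y_lab],
                   np.full((len(X_unlab), n_classes), 1.0 / n_classes)])
    for _ in range(n_iter):
        # M-step: class priors, means, and diagonal variances from soft counts
        nk = R.sum(axis=0)
        priors = nk / nk.sum()
        means = (R.T @ X_all) / nk[:, None]
        variances = (R.T @ X_all ** 2) / nk[:, None] - means ** 2 + reg
        # E-step: recompute class posteriors, then overwrite only the unlabeled rows
        log_p = np.stack([
            np.log(priors[k])
            - 0.5 * (((X_all - means[k]) ** 2 / variances[k])
                     + np.log(2 * np.pi * variances[k])).sum(axis=1)
            for k in range(n_classes)
        ], axis=1)
        post = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        R[n_lab:] = post[n_lab:]                       # labeled responsibilities stay fixed
    return priors, means, variances, R[n_lab:]
```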
30

Diagnóstico de influência em modelos com erros na variável skew-normal/independente / Influence diagnostics in measurement error models with skew-normal/independent distributions

Carvalho, Rignaldo Rodrigues 17 August 2018 (has links)
Advisors: Victor Hugo Lachos Dávila, Filidor Edilfonso Vilca Labra / Dissertation (Master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Abstract: The Barnett measurement model is frequently used to compare several measuring devices. It is common to assume that the random terms have a normal distribution. However, such an assumption makes the inference vulnerable to outlying observations, whereas scale mixtures of skew-normal distributions have been an interesting alternative for producing robust estimates while keeping the elegance and simplicity of maximum likelihood theory. We used results from Lachos et al. (2008) to obtain parameter estimates via maximum likelihood, based on the EM algorithm, which yields closed-form expressions for the equations in the M-step. We then developed the local influence method of Zhu and Lee (2001) to assess the robustness of these parameter estimates under some usual perturbation schemes. Results obtained for one real data set are reported, illustrating the usefulness of the proposed methodology / Master's / Statistical Methods / Master in Statistics
