  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications

Razzaghi, Talayeh 01 January 2014
Analysis and predictive modeling of massive datasets is an important problem that arises in many practical applications. The task becomes even more challenging when data are imperfect or uncertain. Real data are frequently affected by outliers, uncertain labels, and uneven class distributions (imbalanced data). Such uncertainties introduce bias and make predictive modeling more difficult. In the present work, we introduce a cost-sensitive learning (CSL) method to deal with the classification of imperfect data. Most traditional classification approaches perform poorly when data are imperfect. We propose the use of CSL with the Support Vector Machine, a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures for imperfect data and address real problems in quality control and business analytics.
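
As a rough illustration of the cost-sensitive idea, the sketch below computes a class-weighted hinge loss for a linear SVM-style classifier, so that margin violations on the minority class cost more. The function name, the cost parameters `c_pos` and `c_neg`, and the toy data are assumptions made for this example; the abstract does not give the thesis's actual formulation.

```python
def weighted_hinge_loss(w, b, X, y, c_pos=1.0, c_neg=1.0):
    """Sum of class-weighted hinge losses for a linear classifier.

    y holds labels +1 / -1; c_pos and c_neg are hypothetical
    misclassification costs for the positive and negative class.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        margin_violation = max(0.0, 1.0 - y_i * score)
        cost = c_pos if y_i == 1 else c_neg
        total += cost * margin_violation
    return total


# Toy imbalanced data: one positive (minority) point, three negatives.
X = [[2.0], [-1.0], [-2.0], [-3.0]]
y = [1, -1, -1, -1]
w, b = [0.25], 0.0
plain = weighted_hinge_loss(w, b, X, y)              # equal costs
costly = weighted_hinge_loss(w, b, X, y, c_pos=3.0)  # minority errors cost 3x
```

Raising `c_pos` increases the penalty contributed by the single positive point, which is what pushes a trained classifier toward the minority class.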
162

A Robust Dynamic State and Parameter Estimation Framework for Smart Grid Monitoring and Control

Zhao, Junbo 30 May 2018
The enhancement of the reliability, security, and resiliency of electric power systems depends on the availability of fast, accurate, and robust dynamic state estimators. These estimators should be robust to gross errors on the measurements and the model parameter values while providing good state estimates even in the presence of large dynamical system model uncertainties and non-Gaussian thick-tailed process and observation noises. It turns out that the current Kalman filter-based dynamic state estimators given in the literature suffer from several important shortcomings, precluding them from being adopted by power utilities for practical applications. To be specific, they cannot handle (i) dynamic model uncertainty and parameter errors; (ii) non-Gaussian process and observation noise of the system nonlinear dynamic models; (iii) three types of outliers; and (iv) all types of cyber attacks. The three types of outliers, including observation, innovation, and structural outliers are caused by either an unreliable dynamical model or real-time synchrophasor measurements with data quality issues, which are commonly seen in the power system. To address these challenges, we have pioneered a general theoretical framework that advances both robust statistics and robust control theory for robust dynamic state and parameter estimation of a cyber-physical system. Specifically, the generalized maximum-likelihood-type (GM)-estimator, the unscented Kalman filter (UKF), and the H-infinity filter are integrated into a unified framework to yield various centralized and decentralized robust dynamic state estimators. These new estimators include the GM-iterated extended Kalman filter (GM-IEKF), the GM-UKF, the H-infinity UKF and the robust H-infinity UKF. The GM-IEKF is able to handle observation and innovation outliers but its statistical efficiency is low in the presence of non-Gaussian system process and measurement noise. 
The GM-UKF addresses this issue and achieves a high statistical efficiency under a broad range of non-Gaussian process and observation noise while maintaining robustness to observation and innovation outliers. A reformulation of the GM-UKF with multiple hypothesis testing further enables it to handle structural outliers. However, the GM-UKF may yield biased state estimates in the presence of large system uncertainties. To address this, the H-infinity UKF, which relies on robust control theory, is proposed. It is shown that the H-infinity UKF is able to bound the system uncertainties but lacks robustness to outliers and non-Gaussian noise. Finally, the robust H-infinity filter framework is proposed; it leverages the H-infinity criterion to bound system uncertainties while relying on the robustness of the GM-estimator to filter out non-Gaussian noise and suppress outliers. Furthermore, these new robust estimators are applied to system bus frequency monitoring and control and to synchronous generator model parameter calibration. Case studies on several IEEE standard systems show the efficiency and robustness of the proposed estimators. / Ph. D.
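
The idea of suppressing observation outliers inside a Kalman-type update can be illustrated on a much simpler problem than the one treated in the thesis. The sketch below is a scalar, linear measurement update in which the normalized innovation is downweighted with a Huber weight; it is not the GM-UKF, and the function names, the tuning constant `c = 1.5`, and the toy numbers are assumptions for illustration only.

```python
import math


def huber_weight(r, c=1.5):
    """Huber weight: 1 inside [-c, c], c/|r| outside."""
    return 1.0 if abs(r) <= c else c / abs(r)


def robust_update(x_pred, p_pred, z, r_var, c=1.5):
    """One scalar measurement update with a Huber-weighted innovation.

    x_pred, p_pred: predicted state and its variance; z: measurement;
    r_var: measurement noise variance. Inflating r_var by 1/weight is
    one way to downweight an outlying observation.
    """
    innovation = z - x_pred
    s = p_pred + r_var                       # innovation variance
    weight = huber_weight(innovation / math.sqrt(s), c)
    r_eff = r_var / weight                   # inflated noise for outliers
    gain = p_pred / (p_pred + r_eff)
    x_new = x_pred + gain * innovation
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


clean = robust_update(0.0, 1.0, 1.0, 1.0)      # weight 1: ordinary Kalman step
outlier = robust_update(0.0, 1.0, 100.0, 1.0)  # gross error is strongly damped
```

With these numbers an unweighted update would move the estimate halfway toward the measurement of 100, whereas the weighted update stays close to the prediction.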
163

Shluková analýza rozsáhlých souborů dat: nové postupy založené na metodě k-průměrů / Cluster analysis of large data sets: new procedures based on the method k-means

Žambochová, Marta January 2005
Cluster analysis has become one of the main tools for extracting knowledge from data, a task known as data mining. In this area, the data processed are often large both in the number of objects and in the number of variables that characterize them. Many clustering methods have been developed. One of the most widely used is the k-means method, which is suitable for clustering data sets containing a large number of objects. It starts from an initial assignment of objects to clusters and then redistributes the objects among the clusters step by step so as to improve an optimization function. The aim of this Ph.D. thesis was to compare selected variants of existing k-means methods, characterize their strengths and weaknesses in detail, propose new alternatives, and compare them experimentally with existing approaches. These objectives were met. The work focuses on modifications of the k-means method for clustering large numbers of objects, specifically the BIRCH k-means, filtering, k-means++ and two-phase algorithms. I examined the time complexity of the algorithms, the effect of the initial assignment and of outliers, and the validity of the resulting clusters. Two real data files and several generated data sets were used. The common and distinguishing features of the methods under investigation are summarized at the end of the work. The main aim and contribution of the work is to devise my own modifications that address the bottlenecks of the basic procedure and of the existing variants, and to implement and verify them. Some modifications accelerate processing. Applying the main ideas of the k-means++ algorithm to other variants of the k-means method improved the clustering results.
The most significant of the proposed changes is a modification of the filtering algorithm that adds an entirely new feature: the detection of outliers. An accompanying CD is enclosed. It includes the source code of the programs, written in the MATLAB development environment. The programs were created specifically for this work and are intended for experimental use. The CD also contains the data files used in the experiments.
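
For readers unfamiliar with k-means++, the sketch below shows its seeding step: each new center is drawn with probability proportional to the squared distance from a point to its nearest already-chosen center. This is a generic Python illustration of the published k-means++ initialization, not the thesis's MATLAB code; the function name, the fixed seed, and the toy points are assumptions.

```python
import random


def kmeanspp_init(points, k, seed=0):
    """Pick k initial centers with k-means++ seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
              for p in points]
        # Sample the next center with probability proportional to d2;
        # already-chosen points have weight 0 and cannot be drawn again.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
centers = kmeanspp_init(points, k=3)
```

Because distant points receive large weights, the seeding tends to spread the initial centers across the data, but for the same reason isolated outliers are also likely to be picked as centers.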
164

Confiabilidade de rede GPS de referência cadastral municipal - estudo de caso : rede do município de Vitória (ES) / Reliability of network GPS of municipal cadastral reference - study of case : network of the municipal district of Vitória (ES)

Amorim, Geraldo Passos 25 March 2004
This work studies the theory of quality analysis of GPS networks, based on the network reliability theory proposed by Baarda in 1968. The statistical hypotheses for outlier detection form the basis of the study, because they underpin the tests for detecting outliers, for locating and eliminating gross errors, and for analysing the reliability of the network. Reliability, which expresses the controllability of the network and depends on the redundancy number, is studied in two aspects: internal reliability and external reliability. The cadastral reference network of the municipality of Vitória (ES), chosen for the case study, was established by GPS in 2001. Its basic design was the installation of 37 pairs of intervisible vertices, favouring public, freely accessible places such as sidewalks and median strips. The network was adjusted in 2001 by the Vitória City Hall (Prefeitura Municipal de Vitória), and the adjusted vertex coordinates have been used since then to support all topographic and cadastral surveys carried out in the municipality. The 2001 adjustment was a simple adjustment that did not take into account statistical tests for outlier detection or the location and elimination of gross errors. The practical part of this research comprised the measurement of 21 new vectors (baselines) to form a control network, as established by NBR-14166, the adjustment of this control network (15 vertices), and the adjustment of the main network (78 vertices) constrained to the previously adjusted control network. The main difference between the 2001 adjustment made by the City Hall and the 2004 adjustment made for this research is that the new adjustment applied statistical tests based on Baarda's reliability theory. The comparison of the two adjustments of the Vitória cadastral network showed no significant differences between the adjusted coordinates.
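
Two quantities from Baarda's reliability theory can be sketched for the simplest case of uncorrelated observations: the normalized residual used in data snooping, and the minimal detectable bias that expresses internal reliability. The constants below are the values commonly quoted for a significance level of 0.1% and a test power of 80%; the function names and the example numbers are assumptions and are not taken from the Vitória network.

```python
import math

W_CRIT = 3.29    # two-sided critical value for alpha_0 = 0.001
DELTA_0 = 4.13   # non-centrality parameter for alpha_0 = 0.001, power 0.80


def normalized_residual(v, sigma, r):
    """Baarda's w-statistic for an uncorrelated observation.

    v: least-squares residual, sigma: a priori standard deviation,
    r: redundancy number of the observation (0 < r <= 1).
    """
    return v / (sigma * math.sqrt(r))


def minimal_detectable_bias(sigma, r, delta_0=DELTA_0):
    """Internal reliability: smallest gross error the test can detect."""
    return sigma * delta_0 / math.sqrt(r)


# Hypothetical baseline component: sigma = 5 mm, redundancy number 0.25.
w = normalized_residual(v=0.012, sigma=0.005, r=0.25)
flagged = abs(w) > W_CRIT          # observation suspected of a gross error
mdb = minimal_detectable_bias(sigma=0.005, r=0.25)
```

A small redundancy number inflates the minimal detectable bias, which is why reliability depends on how well each observation is controlled by the others.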
165

Extensões dos modelos de regressão quantílica bayesianos / Extensions of bayesian quantile regression models

Santos, Bruno Ramos dos 29 April 2016
This thesis proposes extensions of Bayesian quantile regression models for proportion data with zero inflation and for data censored at zero. First, an analysis of influential observations is suggested, based on the location-scale mixture representation of the asymmetric Laplace distribution, in which the posterior distributions of the latent variables are compared in order to identify possible outlying observations. Next, a two-part model is proposed to analyze proportion data with zero or one inflation, studying the conditional quantiles and the probability that the response variable equals zero. In addition, Bayesian quantile regression models are proposed for continuous data with a discrete component at zero, where part of these observations are assumed to be censored. These models may be considered more complete for the analysis of this type of data, since the censoring probability is assessed for each quantile of interest. Finally, these models are applied with spatial correlation to study data from the 2014 presidential election in Brazil. In this case, the quantile regression models are able to incorporate the spatial information through the asymmetric Laplace process. An R package was developed for all the proposed models and is illustrated in the appendix.
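
Quantile regression rests on the check loss, and maximizing an asymmetric Laplace likelihood with fixed scale is equivalent to minimizing that loss. The sketch below shows the loss and the simplest case, a model with only an intercept, where the minimizer is a sample quantile. It is written in Python rather than R, and the function names and toy data are assumptions; it does not reproduce the thesis's package.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Return the data value that minimizes the total check loss."""
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))


y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, 0.5)   # 3.0, unaffected by the outlier 100
upper_fit = best_constant(y, 0.9)    # 100.0, the upper tail of the sample
```

The fit at `tau = 0.5` ignores the magnitude of the extreme value, which is one reason quantile-based models are attractive when outlying observations are present.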
166

從貝氏觀點診斷離群值及具有影響力之觀察值 / Some diagnostics for outliers and influential observations from Bayesian point of view

謝季英, Shieh, Jih Ing Unknown Date
In linear regression analysis, unsuitable data often lead the researcher to choose an inappropriate model; to avoid this, diagnostics should be carried out before the data are analyzed. This thesis proposes several diagnostic methods for outliers and influential observations from a Bayesian point of view. We first derive the marginal posterior distribution of the mean-shift parameter $a = (a_1, \ldots, a_k)'$ and use the posterior mean of $a'a/k$ to detect spurious data points. Second, under each individual model, we use the posterior total variance and generalized variance of $\beta$ as diagnostic criteria for outliers and potentially influential observations. Finally, we use (i) the posterior distribution of $\beta$, (ii) the posterior distribution of $\sigma^2$, and (iii) the joint posterior distribution of $(\beta, \sigma^2)$ to derive the corresponding symmetric mean square differences, which serve as diagnostic criteria.
167

季節性時間序列之預測─類神經網路模式之探討 / Forecasting Seasonal Time Series : A Neural Network Approach

賴家瑞, Lia, Chia Jui Unknown Date
We investigate the effectiveness of neural networks for forecasting seasonal time series. With a properly constructed training set, a trained network can serve as a forecasting tool for seasonal time series. A shifting-learning method is also proposed to improve forecasting accuracy. The quarterly imports of goods and services of Taiwan from the first quarter of 1968 to the fourth quarter of 1990 serve as the empirical study. The series is contaminated with outliers, which makes forecasting more difficult. Empirical results show that the model-free neural network approach forecasts better than the classical Box-Jenkins approach, even though the series is contaminated with outliers.
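
One common way to construct a training set for a forecasting network is a sliding window over the series, sketched below. The abstract does not say how the thesis builds its training set, so the window length of four quarters, the function name, and the toy series are all assumptions.

```python
def make_training_set(series, n_lags=4):
    """Build (inputs, target) pairs from a seasonal series.

    Each input holds the previous n_lags values (one full year for
    quarterly data when n_lags=4); the target is the next value.
    """
    pairs = []
    for t in range(n_lags, len(series)):
        pairs.append((series[t - n_lags:t], series[t]))
    return pairs


quarterly = [10, 14, 12, 18, 11, 15, 13, 19, 12, 16]
pairs = make_training_set(quarterly)
# pairs[0] is ([10, 14, 12, 18], 11): one year of inputs, next quarter as target
```

Choosing the window to span a full seasonal cycle lets the network see the value from the same quarter of the previous year among its inputs.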
168

Phenology in Germany in the 20th century : methods, analyses and models

Schaber, Jörg January 2002
The length of the vegetation period (VP) plays a central role in the interannual variation of carbon fixation by terrestrial ecosystems. Analysis of observational data has indicated that the VP has lengthened in recent decades in the northern latitudes, mainly due to an advancement of bud burst (BB). This phenomenon has been widely discussed in the context of global warming because phenology is correlated with temperature.

Analyzing the patterns of spring phenology over the last century in Southern Germany provided two main findings:
- The strong advancement of spring phases, especially in the decade before 1999, is not a singular event in the 20th century. Similar trends were observed in earlier decades. Distinct periods of differing trend behavior could be distinguished for important spring phases.
- Marked differences in trend behavior between early and late spring phases were detected. Early spring phases advanced throughout, with strong negative trends between 1931 and 1948, moderate negative trends between 1948 and 1984, and strong negative trends again between 1984 and 1999. Late spring phases behaved differently: negative trends between 1931 and 1948 were followed by marked positive trends between 1948 and 1984 and then by strong negative trends between 1984 and 1999.

This marked difference in trend development between early and late spring phases was also found across Germany for the two periods 1951 to 1984 and 1984 to 1999.

The dominating influence of temperature on spring phenology and its modifying effect on autumn phenology were confirmed in this thesis. However:
- the temperature functions determining spring phenology were not significantly correlated with a global annual CO2 signal, which was taken as a proxy for a global warming pattern;
- an index of large-scale regional circulation patterns (the NAO index) explained only a small part of the observed phenological variability in spring.

The different trend behavior of early and late spring phases is explained by the differing development of mean March and April temperatures. Mean March temperatures increased on average over the 20th century, with increasing variability in the last 50 years. April temperatures, however, decreased between the end of the 1940s and the mid-1980s and then warmed markedly after the mid-1980s.

It is concluded that the advancement of spring phenology in recent decades is part of multi-decadal fluctuations over the 20th century that vary with the species and the relevant seasonal temperatures. Because of these fluctuations, no correlation with an observed global warming signal could be found.

On average, all investigated spring phases advanced by between 5 and 20 days between 1951 and 1999 in all Natural Regions of Germany. The marked difference in advancement between late and early spring phases is due to their differing behavior before and after the mid-1980s, as described above. Leaf coloring (LC) was delayed between 1951 and 1984 for all tree species; after 1984, however, LC advanced. The length of the VP increased between 1951 and 1999 for all considered tree species by an average of ten days throughout Germany.

It is predominantly the change in spring phases that contributes to a change in the potentially absorbed radiation. In addition, the late spring species benefit relatively more from an advanced BB, because per day of advancement they gain longer and warmer days than the early species do. To assess the relative change in potentially absorbed radiation among species, changes in both spring and autumn phenology have to be considered, as well as where in the year these changes are located.

To detect the marked difference between early and late spring phenology, a new method for constructing time series was developed. This method allowed the derivation of reliable time series spanning over 100 years and the construction of locally combined time series, which increased the data available for model development.

Apart from the recording errors that were analyzed, microclimatic site influences, genetic variation, and the observers were identified as sources of uncertainty in phenological observational data. It was concluded that 99% of all phenological observations at a given site vary within approximately 24 days around the parametric mean. This supports the proposed 30-day rule for detecting outliers.

New phenology models that predict local BB from daily temperature time series were developed. These models are based on simple interactions between inhibitory and promotory agents that are assumed to control the developmental status of a plant. In general, the new models fitted and predicted the observations better than classical models. The other main modeling results were:
- The bias of the classical models, i.e. overestimation of early observations and underestimation of late observations, could be reduced but not completely removed.
- The model structures favored for each species indicated that photoperiod plays a more dominant role for late spring phases than for early spring phases.
- Chilling plays only a subordinate role for spring BB compared with the temperatures directly preceding BB.
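
The 30-day rule mentioned above can be sketched as a simple screening step. The abstract does not state which reference value the rule uses; the example below assumes the site median of the observed days of year, and the function name and sample data are likewise hypothetical.

```python
from statistics import median


def flag_outliers(days_of_year, max_dev=30):
    """Flag observations more than max_dev days from the median."""
    center = median(days_of_year)
    return [abs(d - center) > max_dev for d in days_of_year]


bud_burst = [110, 115, 108, 112, 160, 109]   # day of year, one phase, one site
flags = flag_outliers(bud_burst)             # only the value 160 is flagged
```

The median is used here instead of the mean so that the reference value is not itself pulled toward the suspect observation.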
169

Robust estimation for spatial models and the skill test for disease diagnosis

Lin, Shu-Chuan 25 August 2008
This thesis focuses on (1) the statistical methodologies for the estimation of spatial data with outliers and (2) classification accuracy of disease diagnosis. Chapter I, Robust Estimation for Spatial Markov Random Field Models: Markov Random Field (MRF) models are useful in analyzing spatial lattice data collected from semiconductor device fabrication and printed circuit board manufacturing processes or agricultural field trials. When outliers are present in the data, classical parameter estimation techniques (e.g., least squares) can be inefficient and potentially mislead the analyst. This chapter extends the MRF model to accommodate outliers and proposes robust parameter estimation methods such as the robust M- and RA-estimates. Asymptotic distributions of the estimates with differentiable and non-differentiable robustifying function are derived. Extensive simulation studies explore robustness properties of the proposed methods in situations with various amounts of outliers in different patterns. Also provided are studies of analysis of grid data with and without the edge information. Three data sets taken from the literature illustrate advantages of the methods. Chapter II, Extending the Skill Test for Disease Diagnosis: For diagnostic tests, we present an extension to the skill plot introduced by Mozer and Briggs (2003). The method is motivated by diagnostic measures for osteoporosis in a study. By restricting the area under the ROC curve (AUC) according to the skill statistic, we have an improved diagnostic test for practical applications by considering the misclassification costs. We also construct relationships, using the Koziol-Green model and mean-shift model, between the diseased group and the healthy group for improving the skill statistic. Asymptotic properties of the skill statistic are provided. Simulation studies compare the theoretical results and the estimates under various disease rates and misclassification costs. 
We apply the proposed method in classification of osteoporosis data.
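
The area under the ROC curve referred to in Chapter II can be computed directly as the share of (diseased, healthy) pairs that the diagnostic score ranks correctly. The sketch below shows only this plain AUC, not the restricted AUC or the skill statistic developed in the thesis; the function name and the scores are made up for illustration.

```python
def auc(diseased, healthy):
    """AUC as the share of (diseased, healthy) pairs ranked correctly.

    Higher scores are assumed to indicate disease; ties count one half.
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# 8 of the 9 pairs are ranked correctly, so the AUC is 8/9.
area = auc(diseased=[0.9, 0.8, 0.4], healthy=[0.5, 0.3, 0.2])
```

The plain AUC weights all operating points equally, which is the limitation that motivates restricting it when misclassification costs differ.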
170

Analyse des modèles résines pour la correction des effets de proximité en lithographie optique / Resist modeling analysis for optical proximity correction effect in optical lithography

Top, Mame Kouna 12 January 2011
Progress in microelectronics responds to the need to reduce production costs and to open new markets. This progress has been possible largely thanks to advances in optical projection lithography, the printing process principally used in integrated circuit (IC) manufacturing. The miniaturization of integrated circuits has been possible only by pushing the limits of lithographic printing. However, reducing transistor widths and the spacing between them increases the sensitivity of pattern transfer to so-called optical proximity effects at the most advanced technology nodes (45 and 32 nm transistor gate size). Correcting these effects is indispensable in photolithographic processes for advanced technology nodes. Optical proximity correction (OPC) techniques ensure the fidelity of the patterns on the wafer by applying corrections to the mask. The accuracy of the corrections depends on the quality of the OPC models used, so the quality of these models is essential. This thesis analyses and evaluates the OPC resist models, which simulate the behavior of the resist after exposure. Data modeling and statistical analysis were used to study these increasingly empirical resist models. Besides improving the reliability of the model calibration data, the thesis also studies the use of dedicated model-building platforms in an industrial setting and the methodology for creating and validating OPC models. This thesis presents the results of the analysis of OPC resist models and proposes a new methodology for creating, analysing and validating them.
