1 |
Incremental nonparametric discriminant analysis based active learning and its applications
Dhoble, Kshitij, January 2010
Learning is an innate, general cognitive ability that has endowed living beings, and humans in particular, with intelligence. It consists of acquiring new knowledge and skills that enable them to adapt and survive. As technology advances, ever larger amounts of information accumulate, and the sheer volume makes manual analysis impractical. Analysing massive data therefore requires machines (such as computers) that can learn and evolve in order to discover new knowledge from the analysed data. Most traditional machine learning algorithms function optimally on parametric (static) data. However, datasets acquired in real practice are often vast, inaccurate, inconsistent, non-parametric and highly volatile. The optimized performance of such algorithms can therefore only be transitory, which calls for a learning algorithm that constantly evolves and adapts to the data it processes. Given this need, we look for inspiration in humans' innate cognitive learning ability. Active learning is one such biologically inspired model, designed to mimic humans' dynamic, evolving, adaptive and intelligent cognitive learning ability. It is a class of learning algorithms that aim to create an accurate classifier by iteratively selecting important unlabeled data points through adaptive querying and training the classifier on those points that are potentially useful for the targeted learning task (Tong & Koller, 2002). Traditional active learning techniques are implemented under supervised or semi-supervised learning settings (Pang et al., 2009).
Our proposed model performs active learning in an unsupervised setting by introducing a discriminative selective sampling criterion, which reduces the computational cost by substantially decreasing the number of irrelevant instances the classifier must learn. Methods based on passive learning (which assumes that the entire training dataset is truly informative and is presented in advance) prove inadequate in real-world applications (Pang et al., 2009). To overcome this limitation, we have developed Active Mode Incremental Nonparametric Discriminant Analysis (aIncNDA), which performs adaptive discriminant selection of instances for incremental NDA learning. NDA is a discriminant analysis method that we incorporate in our selective sampling technique to reduce the effect of outliers (anomalous observations in a dataset). It works efficiently on datasets containing anomalies, thereby minimizing the computational cost (Raducanu & Vitrià, 2008). This thesis presents research on discrimination-based active learning in which NDA is extended for fast discrimination analysis and data sampling. In addition to NDA, a base classifier (such as a Support Vector Machine (SVM) or k-Nearest Neighbor (k-NN)) is applied to discover and merge knowledge from the newly acquired data. The performance of the proposed method is evaluated on benchmark University of California, Irvine (UCI) datasets and on face image and object image category datasets. The assessment on the UCI datasets showed that aIncNDA performs on par with, and in many cases better than, incremental NDA while using fewer instances. Additionally, aIncNDA performs efficiently under different levels of redundancy and achieves better discrimination performance more often than passive incremental NDA.
In an application to face image and object image recognition and retrieval, the proposed multi-example active learning system dynamically and incrementally learns from newly obtained images, gradually reducing its retrieval (classification) error rate through iterative refinement. The empirical results show that the proposed active learning model can be used for classification with increased efficiency. Furthermore, because network data is large, streaming, and constantly changing, we believe that our method can find practical application in the field of Internet security.
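The selective sampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's aIncNDA criterion, which the abstract does not specify. As a stand-in, it uses a simple nearest-neighbour margin test: a streamed point is kept only when its two closest classes are almost equally far away. The names `is_informative`, `select_incrementally`, `margin`, and `oracle` are all hypothetical and introduced only for illustration.

```python
import math


def nearest_class_distances(x, labeled):
    """Distance from x to the nearest labeled point of each class, ascending."""
    best = {}
    for point, label in labeled:
        d = math.dist(x, point)
        if label not in best or d < best[label]:
            best[label] = d
    return sorted(best.values())


def is_informative(x, labeled, margin=0.5):
    """Assumed criterion (not from the thesis): keep x when the gap between
    its two nearest classes is smaller than `margin`."""
    dists = nearest_class_distances(x, labeled)
    if len(dists) < 2:
        return True  # fewer than two classes seen so far: everything is useful
    return (dists[1] - dists[0]) < margin


def select_incrementally(labeled, stream, oracle, margin=0.5):
    """Process the stream one point at a time; query the oracle for a label
    only when the point passes the selection test, then add it to the set."""
    labeled = list(labeled)
    n_queries = 0
    for x in stream:
        if is_informative(x, labeled, margin):
            labeled.append((x, oracle(x)))
            n_queries += 1
    return labeled, n_queries
```

In this sketch, points that lie far from the boundary between classes are skipped, so the labeled set (and the work done by any base classifier trained on it) grows only with the points the test considers useful.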
|
3 |
Essays on Nonparametric Methods in Econometrics
Yanagi, Takahide, 25 May 2015
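Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Economics / Kō No. 19164 / Keihaku No. 518 / 新制||経||274 (University Library) / 32156 / Division of Economics, Graduate School of Economics, Kyoto University / (Chief examiner) Professor Yoshihiko Nishiyama, Associate Professor Ryo Okui, Associate Professor Ken Yamada / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Economics / Kyoto University / DFAM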
|
4 |
Confronting Theory with Evidence: Methods & Applications
Thomas, Stephanie, January 2016
Empirical economics frequently involves testing whether a theoretical proposition is evident in a data set. This thesis explores methods for confronting such theoretical propositions with evidence. Chapter 1 develops a methodological framework for assessing whether binary ('Yes'/'No') observations exhibit a discrete change, confronting a theoretical model with data from an experiment investigating the effect of introducing a private finance option into a public system of finance. Chapter 2 expands the framework to identify two discrete changes and applies the method to the evaluation of adherence to clinical practice guidelines. The framework combines existing analytical techniques and provides results that are robust and visually intuitive. The overall result is a methodology for evaluating guideline adherence that leverages existing patient care records and is generalizable across clinical contexts. An application to field data on supplemental oxygen administration decisions by volunteer medical first responders illustrates the method.
Chapter 3 compares the results of two mechanisms used to control industrial emissions. Cap and Trade imposes an absolute cap on emissions, and any emission capacity a firm does not use can be sold to other firms via tradable permits. In Intensity Target systems, firms earn (owe) tradable credits for emissions below (above) a baseline implied by a relative Intensity Target. Cap and Trade is commonly believed to be superior to Intensity Targets because the relative Intensity Target subsidizes emissions. Chapter 3 reports on an experiment designed to test theoretical predictions in a long-run laboratory environment in which firms make emission abatement technology and output production decisions when demand for output is uncertain, and banking of tradable permits may or may not be permitted. Particular focus is placed on testing whether the flexibility inherent in Intensity Targets can make them superior to Cap and Trade when demand is stochastic. / Thesis / Doctor of Philosophy (PhD)
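The difference between the two mechanisms can be made concrete with a small arithmetic sketch in Python. It assumes the common reading that the baseline implied by an Intensity Target equals the target emissions-per-unit-of-output multiplied by the firm's output. The abstract does not state this formula, and the function names and numbers below are hypothetical.

```python
def cap_and_trade_surplus(cap, emissions):
    """Permits a firm can sell (positive) or must buy (negative)
    under an absolute emissions cap."""
    return cap - emissions


def intensity_target_credits(target_intensity, output, emissions):
    """Credits earned (positive) or owed (negative) under an Intensity Target.
    Assumption: baseline = target_intensity * output."""
    baseline = target_intensity * output
    return baseline - emissions


# A firm emitting 0.4 units per unit of output, at two demand levels.
for output in (100, 120):
    emissions = 0.4 * output
    print(
        output,
        cap_and_trade_surplus(cap=50, emissions=emissions),
        intensity_target_credits(0.5, output, emissions),
    )
```

Under these assumptions the Intensity Target baseline grows with output while the absolute cap stays fixed, which is one way to read both the flexibility and the implicit subsidy to emissions described in the abstract.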
|
5 |
Bayesian nonparametric analysis of longitudinal data with non-ignorable non-monotone missingness
Cao, Yu, 01 January 2019
In longitudinal studies, outcomes are measured repeatedly over time, but in practice clinical studies are full of missing data points of both monotone and non-monotone nature. Often this missingness is related to the unobserved data, so that it is non-ignorable. In this context, the pattern-mixture model (PMM) is a popular tool for analyzing the joint distribution of outcome and missingness patterns. The unobserved outcomes are then imputed using the distribution of observed outcomes, conditioned on missing patterns. However, existing methods suffer from model identification issues if data are sparse in specific missing patterns, which is very likely with a small sample size or a large number of repetitions. We extend the existing methods using latent class analysis (LCA) and a shared-parameter PMM. The LCA groups patterns of missingness with similar features, and the shared-parameter PMM allows a subset of parameters to differ among latent classes when fitting a model, thus restoring model identifiability. A novel imputation method is also developed using the distribution of observed data conditioned on latent classes. We develop this model for continuous response data and extend it to handle ordinal rating scale data. Our model performs better than existing methods for data with small sample size. The method is applied to two datasets from a phase II clinical trial that studies the quality of life of patients with prostate cancer receiving radiation therapy, and another to study the relationship between perceived neighborhood condition in adolescence and drinking habits in adulthood.
|
6 |
Die statistische Auswertung von ordinalen Daten bei zwei Zeitpunkten und zwei Stichproben / The Statistical Analysis of Ordinal Data at two Timepoints and two Groups
Siemer, Alexander, 03 April 2002
No description available.
|