1 |
Color Image Based Face Recognition. Ganapathi, Tejaswini, 24 February 2009 (has links)
Traditional appearance-based face recognition (FR) systems use grayscale images, but attention has recently turned to color images. Color inputs have a larger dimensionality, which increases computational cost and makes the small sample size (SSS) problem in supervised FR systems more challenging. It is therefore important to determine the scenarios in which color information helps an FR system.
In this thesis, the inclusion of chromatic information in FR systems is shown to be particularly advantageous under poor illumination conditions. In supervised systems, a color input of optimal dimensionality improves FR performance under SSS conditions. A fusion of decisions from the individual spectral planes also helps in the SSS scenario. Finally, chromatic information is integrated into a supervised ensemble learner to address pose and illumination variations. This framework significantly boosts FR performance across a range of learning scenarios.
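As a rough illustration of decision-level fusion across spectral planes, the sketch below trains an independent eigenface-style classifier on each RGB plane and combines the per-class scores with a sum rule. It is a minimal sketch, not the thesis's pipeline: the PCA dimensionality, the 1-NN classifier, and the sum rule are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def fuse_spectral_planes(X_train, y_train, X_test, n_components=20):
    """Train one classifier per color plane and fuse their class scores.

    X_train, X_test: arrays of shape (n_samples, h*w, 3), flattened RGB
    face images. A sketch only; the thesis's fusion rule may differ.
    """
    classes = np.unique(y_train)
    fused = None
    for plane in range(3):  # R, G, B spectral planes
        pca = PCA(n_components=n_components)
        Z_train = pca.fit_transform(X_train[:, :, plane])
        Z_test = pca.transform(X_test[:, :, plane])
        clf = KNeighborsClassifier(n_neighbors=1).fit(Z_train, y_train)
        proba = clf.predict_proba(Z_test)          # per-plane class scores
        fused = proba if fused is None else fused + proba  # sum rule
    return classes[np.argmax(fused, axis=1)]       # fused decision
```

A product rule would replace the running sum with an elementwise product of the per-plane score matrices; the sum rule is generally more robust when individual planes are noisy.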
|
2 |
A New Reclassification Method for Highly Uncertain Microarray Data in Allergy Gene Prediction. Paul, Jasmin, 11 April 2012 (has links)
The analysis of microarray data is challenging because of the large dimensionality and small sample size involved. Although a few methods address the small sample size problem, they are not sufficiently successful on microarray data from extremely small (fewer than about 20) sample sizes. We propose a method that incorporates information from diverse sources into the analysis of microarray data so as to improve the predictability of significant genes. A transformed data set, including statistical parameters, literature-mining and gene-ontology data, is evaluated. We performed classification experiments to identify potential allergy-related genes, using feature selection to identify the effect of individual features on classifier behaviour.
An exploratory and domain-knowledge analysis was performed on noisy real-life allergy data, and a subset of genes was selected as the positive and negative classes. A new set of transformed variables, based on the mean and standard deviation of the data distribution and on the other data sources, was identified. Significant allergy- and immune-related genes were selected from the microarray data. Experiments showed that the classification predictability of significant genes can be improved, and important features from the transformed variable set were identified.
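A minimal sketch of the kind of statistics-based transformed variables described above, on synthetic data: each gene is mapped to mean- and standard-deviation-derived features, and a classifier's predictability is estimated by cross-validation. The feature set, classifier, and data shapes are assumptions; the thesis additionally folds in literature-mining and gene-ontology features, which are omitted here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def transform_expression(expr):
    """Build per-gene statistical features from raw expression values.

    expr: (n_genes, n_arrays) microarray matrix. The three features are
    illustrative; the thesis adds literature-mining and GO features.
    """
    mu = expr.mean(axis=1)
    sd = expr.std(axis=1, ddof=1)
    return np.column_stack([mu, sd, mu / (sd + 1e-9)])  # mean, spread, stability

# y: 1 for known allergy-related genes, 0 for negative-class genes
rng = np.random.default_rng(0)
expr = rng.normal(size=(40, 15))          # ~15 arrays: a very small sample
y = rng.integers(0, 2, size=40)
X = transform_expression(expr)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # predictability estimate
```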
|
3 |
A forecasting approach to estimating cartel damages: The importance of considering estimation uncertainty. Prohorenko, Didrik, January 2020 (has links)
In this study, I consider the performance of simple forecast models frequently applied in counterfactual analysis when the information at hand is limited. Furthermore, I discuss the robustness of the standard t-test commonly used to statistically detect cartels. I empirically verify that the standard t-statistic is affected by parameter estimation uncertainty when one of the time series in a two-sided t-test has been estimated, and I compare the results with those from a recently proposed corrected t-test in which this uncertainty is accounted for. The results show that a simple OLS model can be used to detect a cartel and to compute a counterfactual price when data are limited, at least as long as the price overcharge inflicted by the cartel members is relatively large. Yet the level of accuracy may vary, and once the data used for estimating the model become too limited, the model predictions tend to be inaccurate.
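The sketch below illustrates the basic forecasting setup on simulated data: an OLS model is fitted to the pre-cartel window, the counterfactual "but-for" price is forecast over the cartel window, and a standard two-sided t-test compares actual and counterfactual prices. The data-generating process and the 10% overcharge are assumptions, and only the naive test is shown, not the corrected test discussed in the study.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical data: cost drives price; a 10% overcharge during the cartel.
cost = rng.uniform(50, 100, size=120)
price = 5 + 1.2 * cost + rng.normal(0, 2, size=120)
price[80:] *= 1.10                      # cartel period: last 40 observations

X = sm.add_constant(cost)
ols = sm.OLS(price[:80], X[:80]).fit()  # fit on the pre-cartel window only
counterfactual = ols.predict(X[80:])    # "but-for" price forecast

# Naive two-sided t-test; it ignores that `counterfactual` is itself
# estimated, which is exactly the uncertainty the corrected test handles.
t, p = stats.ttest_ind(price[80:], counterfactual)
overcharge = np.mean(price[80:] - counterfactual)
print(f"t = {t:.2f}, p = {p:.4f}, estimated overcharge = {overcharge:.2f}")
```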
|
4 |
Application of Statistical Methods in Risk and Reliability. Heard, Astrid, 01 January 2005 (has links)
The dissertation considers the construction of confidence intervals for a cumulative distribution function F(z) and its inverse at fixed points z and u, on the basis of an i.i.d. sample whose size is relatively small. The sample is modeled by the flexible generalized gamma distribution with all three parameters unknown. This approach can be viewed as an alternative to nonparametric techniques, which do not specify the distribution of the data and lead to less efficient procedures. The confidence intervals are constructed by objective Bayesian methods using the Jeffreys noninformative prior. Performance of the resulting intervals is studied via Monte Carlo simulations and compared to that of nonparametric confidence intervals based on a binomial proportion. In addition, techniques for change-point detection are analyzed and evaluated via Monte Carlo simulations, and the effect of a change point on the interval estimators is studied both analytically and via simulation.
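For context, the nonparametric baseline can be sketched as follows: the empirical CDF at z is a binomial proportion, so a binomial-proportion interval gives the nonparametric CI, while a maximum-likelihood generalized gamma fit gives a parametric point estimate. This is a sketch only; the objective-Bayes intervals with the Jeffreys prior used in the dissertation are not reproduced here, and the parameter values are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = stats.gengamma.rvs(a=2.0, c=1.5, scale=3.0, size=25, random_state=rng)
z = 4.0

# Nonparametric CI for F(z): the empirical CDF at z is a binomial proportion.
k, n = int(np.sum(x <= z)), len(x)
ci = stats.binomtest(k, n).proportion_ci(confidence_level=0.95)
print(f"empirical F(z) = {k/n:.3f}, 95% CI = ({ci.low:.3f}, {ci.high:.3f})")

# Parametric point estimate from a fitted generalized gamma (maximum
# likelihood, not the dissertation's objective-Bayes/Jeffreys interval).
a, c, loc, scale = stats.gengamma.fit(x, floc=0)
print(f"fitted F(z) = {stats.gengamma.cdf(z, a, c, loc=loc, scale=scale):.3f}")
```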
|
5 |
A simulation study of the error induced in one-sided reliability confidence bounds for the Weibull distribution using a small sample size with heavily censored data. Hartley, Michael A., 12 1900 (has links)
Approved for public release; distribution is unlimited. / Budget limitations have reduced the number of military components available for testing, and time constraints have reduced the time available for actual testing, resulting in many items still operating at the end of test cycles. These two factors produce small test populations (small sample sizes) with heavily censored data. The "normal approximation" assumed for estimates based on these small sample sizes reduces the accuracy of the confidence bounds of the probability plots and the associated quantities. This creates a problem in acquisition analysis, because the confidence in the probability estimates influences the number of spare parts required to support a mission or deployment, or determines the length of the warranty ensuring proper operation of systems. This thesis develops a method that simulates small samples with censored data and examines the error of the Fisher Matrix (FM) and Likelihood Ratio Bounds (LRB) confidence methods for two test populations (sizes 10 and 20) with three, five, seven and nine observed failures for the Weibull distribution. The thesis includes a Monte Carlo simulation code written in S-Plus that users can modify to meet their particular needs for any sampling and censoring scheme. To illustrate the approach, the thesis includes a catalog of corrected confidence bounds for the Weibull distribution, which acquisition analysts can use to adjust their confidence bounds and obtain a more accurate representation for warranty and reliability work. / Civilian, Department of the Air Force
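The thesis's simulation code is written in S-Plus; the Python sketch below conveys the core of one Monte Carlo trial under assumed Type II censoring (the test stops after the r-th failure). Fitting only the observed failures by maximum likelihood is a deliberate simplification for illustration, not the FM or LRB machinery of the thesis.

```python
import numpy as np
from scipy import stats

def censored_weibull_trial(n=10, r=5, shape=2.0, scale=100.0, rng=None):
    """One Monte Carlo trial: draw n Weibull lifetimes, observe only the
    first r failures (Type II censoring), and fit by maximum likelihood on
    the observed failures. A simplification: the thesis works with the full
    censored likelihood and FM/LRB confidence bounds."""
    rng = rng or np.random.default_rng()
    lifetimes = np.sort(scale * rng.weibull(shape, size=n))
    observed = lifetimes[:r]                       # heavily censored data
    c, loc, sc = stats.weibull_min.fit(observed, floc=0)
    return c, sc

rng = np.random.default_rng(3)
estimates = np.array([censored_weibull_trial(rng=rng) for _ in range(1000)])
print("shape bias:", estimates[:, 0].mean() - 2.0)   # small-sample error
print("scale bias:", estimates[:, 1].mean() - 100.0)
```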
|
6 |
A decompositional investigation of 3D face recognition. Cook, James Allen, January 2007 (has links)
Automated Face Recognition is the process of determining a subject's identity from digital imagery of their face without user intervention. The term in fact encompasses two distinct tasks: Face Verification is the process of verifying a subject's claimed identity, while Face Identification involves selecting the most likely identity from a database of subjects. This dissertation focuses on the task of Face Verification, which has a myriad of applications in security ranging from border control to personal banking.

Recently the use of 3D facial imagery has found favour in the research community due to its inherent robustness to the pose and illumination variations that plague the 2D modality. The field of 3D face recognition has, however, yet to fully mature, and many unanswered research questions particular to the modality remain. The relative expense and specialty of 3D acquisition devices also mean that the availability of databases of 3D face imagery lags significantly behind that of standard 2D face images. Human recognition of faces is rooted in an inherently 2D visual system, and much is known regarding the use of 2D image information in the recognition of individuals; the corresponding knowledge of how discriminative information is distributed in the 3D modality is much less well defined.

This dissertation addresses these issues through the use of decompositional techniques. Decomposition alleviates the problems associated with dimensionality explosion and the Small Sample Size (SSS) problem, and spatial decomposition has been widely used in face recognition; the application of decomposition in the frequency domain, however, has not received the same attention in the literature. Decomposition techniques allow a mapping of the regions (both spatial and frequency) that contain the discriminative information enabling recognition. These techniques are covered in significant detail, both in terms of practical issues in the respective domains and in terms of the underlying distributions they expose. Significant discussion is given to the manner in which the inherent information of the human face is manifested in the 2D and 3D domains and how the two modalities inter-relate.

The investigation is then extended to cover the manner in which the presented decompositions can be recombined into a single decision. Two new methods for learning the weighting functions for both the sum and product rules are presented, along with extensive testing against established methods. Knowledge acquired from these examinations is then used to create a combined technique termed Log-Gabor Templates. The proposed technique utilises both the spatial and frequency domains to extract performance superior to either in isolation. Experimentation demonstrates that the spatial and frequency domain decompositions are complementary and can be combined for improved performance and robustness.
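As a sketch of the frequency-domain decomposition underlying such templates, the code below constructs a standard radial log-Gabor transfer function and applies a single band to an image. The parameter values and single-band usage are illustrative assumptions, not the thesis's template scheme.

```python
import numpy as np

def log_gabor_radial(size, wavelength, sigma_ratio=0.55):
    """Radial log-Gabor transfer function on a size x size frequency grid.

    A standard construction, shown here to illustrate frequency-domain
    decomposition; the thesis's Log-Gabor Templates are not reproduced.
    """
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size),
                         indexing="ij")
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                      # avoid log(0) at the DC term
    f0 = 1.0 / wavelength                   # filter center frequency
    lg = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    lg[0, 0] = 0.0                          # log-Gabor has no DC component
    return lg

# Apply one frequency band to an image via the FFT.
img = np.random.rand(64, 64)                # stand-in for a face image
response = np.fft.ifft2(np.fft.fft2(img) * log_gabor_radial(64, wavelength=8))
magnitude = np.abs(response)                # band-limited feature map
```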
|
7 |
Data-Driven Success in Infrastructure Megaprojects: Leveraging Machine Learning and Expert Insights for Enhanced Prediction and Efficiency. Nordmark, David E.G., January 2023 (has links)
This Master's thesis uses random forest and leave-one-out cross-validation to predict the success of infrastructure megaprojects, with the goal of enhancing the efficiency of the design and engineering phase of the infrastructure and construction industries. Because of the small sample size of megaprojects and limited data sharing, the lack of data poses significant challenges for applying artificial intelligence to the evaluation and prediction of megaprojects. This thesis explores how megaprojects can benefit from data collection and machine learning despite small sample sizes. The research focuses on analyzing data from thirteen megaprojects and identifying the most influential data for machine learning analysis. The results show that incorporating expert data, representing critical success factors for megaprojects, significantly enhanced the accuracy of the predictive model. The superior performance of expert data over economic, experience and documentation data demonstrates the significance of domain expertise. The results also demonstrate the significance of the planning phase, as revealed by feature selection techniques and feature importance scores: a small, devoted and highly experienced team of project planners proved to be a crucial factor for project success. The thesis concludes that for companies to maximize the utility of machine learning, they must identify their critical success factors and collect the corresponding data. / This Master's thesis investigates the following research question: how can machine learning and insightful data analysis be used to increase efficiency in the planning and design phase of the infrastructure sector? The challenge is addressed by analyzing data from real megaprojects and applying advanced machine learning algorithms to predict project success and uncover its success factors. The research focuses on megaprojects in particular because of their complex nature, unique characteristics and enormous impact on society. Such projects are rarely carried out, which makes it difficult to obtain large amounts of real data. Nevertheless, AI has the potential to be an invaluable tool for understanding and managing the complexity of megaprojects, enabling decisions that are data-driven and better informed. The thesis addresses the central problem of the scarcity of megaproject data; this scarcity also motivates the research and makes it relevant to other fields characterized by small data samples. The results show that the evaluation of megaprojects can be improved through judicious use of specific data attributes, and the thesis encourages companies to begin collecting key data so that they can turn artificial intelligence and machine learning to their advantage.
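A minimal sketch of the modeling setup described above, with stand-in data: a random forest is scored by leave-one-out cross-validation (one fold per project, as suits a thirteen-project sample) and its feature importances are inspected. The feature count, labels and hyperparameters are assumptions for illustration, not the thesis's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Hypothetical stand-in for the thirteen-project dataset: a handful of
# expert-rated success factors per project and a binary success label.
rng = np.random.default_rng(4)
X = rng.normal(size=(13, 6))   # 13 megaprojects, 6 expert-data features
y = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1])  # success labels

clf = RandomForestClassifier(n_estimators=500, random_state=0)
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())  # one fold per project
print(f"LOOCV accuracy: {scores.mean():.2f}")

clf.fit(X, y)
print("feature importances:", np.round(clf.feature_importances_, 3))
```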
|