1

Variable selection and neural networks for high-dimensional data analysis: application in infrared spectroscopy and chemometrics

Benoudjit, Nabil 24 November 2003
This thesis focuses on the application of chemometrics in analytical chemistry. Chemometrics (or multivariate analysis) consists in finding a relationship between two groups of variables, often called dependent and independent variables. In infrared spectroscopy, for instance, chemometrics consists in predicting a quantitative variable, such as the concentration of a component present in the studied product, from spectral data measured at many wavelengths or wavenumbers (several hundred, even several thousand). Obtaining this quantitative variable directly is delicate, since it requires a chemical analysis and a qualified operator. In this research we propose a chemometric methodology for handling chemical (spectrophotometric) data, which are often high-dimensional. We first propose a new incremental (step-by-step) method for selecting spectral variables, based on the combination of three principles: linear or non-linear regression, an incremental procedure for variable selection, and the use of a validation set. This procedure makes it possible, on the one hand, to benefit from the advantages of non-linear methods for predicting chemical data (the relationship between dependent and independent variables is often non-linear) and, on the other hand, to avoid overfitting, one of the most crucial problems encountered with non-linear models. Secondly, we propose to improve this method through a judicious choice of the first selected variable, which has a very important influence on the final prediction performance. The idea is to use a measure of the mutual information between the independent and dependent variables to select the first variable; the incremental (step-by-step) method is then used to select the subsequent variables.
The variable selected by mutual information can be interpreted from the spectrochemical point of view and does not depend on the data distribution in the training and validation sets. By contrast, traditional linear chemometric methods such as PCR or PLSR produce new variables that have no spectrochemical interpretation. Four real-life datasets (wine, orange juice, milk powder and apples) are presented to show the efficiency and advantages of both proposed procedures compared with the traditional linear chemometric methods in common use, such as MLR, PCR and PLSR.
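
The procedure described in this abstract (first variable chosen by mutual information, further variables added one at a time while the validation error keeps improving) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names are hypothetical, mutual information is estimated with a simple histogram on the training data, and ordinary least squares stands in for whichever linear or non-linear regressor is actually used.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of two 1-D arrays.
    Assumption: a crude binned estimator, chosen here only for brevity."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

def validation_rmse(X_train, y_train, X_val, y_val, columns):
    """Fit least squares (with intercept) on the chosen columns of the training
    set and return the root-mean-square error on the validation set."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, columns]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, columns]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residual = A_val @ coef - y_val
    return float(np.sqrt(np.mean(residual ** 2)))

def forward_select(X_train, y_train, X_val, y_val, max_vars=5):
    """Incremental selection: the first variable maximises mutual information
    with y; each later variable is the one that most reduces validation RMSE.
    Selection stops when no candidate improves the validation error."""
    n_features = X_train.shape[1]
    mi = [mutual_information(X_train[:, j], y_train) for j in range(n_features)]
    selected = [int(np.argmax(mi))]
    best = validation_rmse(X_train, y_train, X_val, y_val, selected)
    while len(selected) < max_vars:
        candidates = [j for j in range(n_features) if j not in selected]
        if not candidates:
            break
        scores = {j: validation_rmse(X_train, y_train, X_val, y_val, selected + [j])
                  for j in candidates}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:   # no improvement on the validation set: stop
            break
        selected.append(j_best)
        best = scores[j_best]
    return selected, best
```

Because the selected indices refer to original spectral variables, the result stays interpretable in terms of wavelengths, unlike the latent components built by PCR or PLSR.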
2

A Logistic Regression Analysis of Utah Colleges Exit Poll Response Rates Using SAS Software

Stevenson, Clint W. 27 October 2006
In this study I examine voter response at the interview level using a dataset of 7562 voter contacts (including responses and nonresponses) in the 2004 Utah Colleges Exit Poll. In 2004, 4908 of the 7562 voters approached responded to the exit poll, for an overall response rate of 65 percent. Logistic regression is used to estimate the factors that contribute to the success or failure of each interview attempt. The logistic regression model uses interviewer characteristics, voter characteristics (for both respondents and nonrespondents), and exogenous factors as independent variables. Voter characteristics such as race, gender, and age are strongly associated with response. An interviewer's prior retail sales experience is associated with whether a voter decides to respond to the questionnaire. The only exogenous factor associated with voter response is whether the interview occurred in the morning or the afternoon.
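
As a rough illustration of the analysis described above, the sketch below recomputes the overall response rate from the two counts given in the abstract and fits a logistic regression by Newton-Raphson. It is written in Python with NumPy rather than the SAS software used in the thesis, the function names are hypothetical, and any data passed to `fit_logistic` in an example would be synthetic, not the exit poll data.

```python
import numpy as np

def response_rate(responses, contacts):
    """Share of approached voters who responded."""
    return responses / contacts

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.
    X must already contain an intercept column of ones; y holds 0/1 outcomes
    (1 = the voter responded). Returns the coefficient vector (log-odds scale)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # predicted response probability
        w = p * (1.0 - p)                       # Bernoulli variance weights
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Counts taken from the abstract: 4908 responses out of 7562 contacts.
overall = response_rate(4908, 7562)
print(f"overall response rate: {overall:.1%}")
```

With a single 0/1 predictor, such as a hypothetical afternoon indicator, the fitted slope equals the log odds ratio of response between the two groups, which is how an association like the morning/afternoon effect would show up in the model.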
