  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Missing imputation methods explored in big data analytics

Brydon, Humphrey Charles January 2018
Philosophiae Doctor - PhD (Statistics and Population Studies) / The aim of this study is to look at the methods and processes involved in imputing missing data and, more specifically, complete missing blocks of data. A further aim of this study is to look at the effect that the imputed data has on the accuracy of various predictive models constructed on the imputed data and hence determine if the imputation method involved is suitable. The identification of the missingness mechanism present in the data should be the first process to follow in order to identify a possible imputation method. The identification of a suitable imputation method is easier if the mechanism can be identified as one of the following: missing completely at random (MCAR), missing at random (MAR) or not missing at random (NMAR). Predictive models constructed on data sets that employed a hot-deck imputation method are shown to be less accurate. The data sets which employed either a single or multiple Markov Chain Monte Carlo (MCMC) or the Fully Conditional Specification (FCS) imputation methods are shown to result in predictive models that are more accurate. The addition of an iterative bagging technique in the modelling procedure is shown to produce highly accurate prediction estimates. The bagging technique is applied to variants of the neural network, a decision tree and a multiple linear regression (MLR) modelling procedure. A stochastic gradient boosted decision tree (SGBT) is also constructed as a comparison to the bagged decision tree. Final models are constructed from 200 iterations of the various modelling procedures using a 60% sampling ratio in the bagging procedure. It is further shown that the addition of the bagging technique in the MLR modelling procedure can produce an MLR model that is more accurate than that of the other more advanced modelling procedures under certain conditions.
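The bagging procedure described above (200 iterations, 60% sampling ratio) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `fit_line` and `bagged_predict` are hypothetical, a single-predictor least-squares line stands in for the MLR model, and sampling without replacement is an assumption, since the abstract does not say how the 60% samples were drawn.

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (one predictor)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct x values
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, iterations=200, ratio=0.6, seed=0):
    """Average the predictions of `iterations` lines, each fitted on a random
    subsample containing `ratio` of the rows.

    Assumption: rows are drawn without replacement; the abstract only states
    the sampling ratio and the number of iterations.
    """
    rng = random.Random(seed)
    n = len(xs)
    k = max(2, round(ratio * n))
    predictions = []
    for _ in range(iterations):
        idx = rng.sample(range(n), k)
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return fmean(predictions)
```

For example, `bagged_predict(xs, ys, 12.0)` returns the averaged prediction at x = 12 for paired lists `xs` and `ys`; the defaults mirror the 200 iterations and 60% ratio quoted in the abstract.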
The evaluation of the predictive models constructed on imputed data is shown to vary based on the type of fit statistic used. It is shown that the average squared error reports little difference in the accuracy levels when compared to the results of the Mean Absolute Prediction Error (MAPE). The MAPE fit statistic is able to magnify the difference in the prediction errors reported. The Normalized Mean Bias Error (NMBE) results show that all predictive models constructed produced estimates that were an over-prediction, although these did vary depending on the data set and modelling procedure used. The Nash Sutcliffe efficiency (NSE) was used as a comparison statistic to compare the accuracy of the predictive models in the context of imputed data. The NSE statistic showed that the estimates of the models constructed on the imputed data sets employing a multiple imputation method were highly accurate. The NSE statistic results reported that the estimates from the predictive models constructed on the hot-deck imputed data were inaccurate and that a mean substitution of the fully observed data would have been a better method of imputation. The conclusion reached in this study shows that the choice of imputation method as well as that of the predictive model is dependent on the data used. Four unique combinations of imputation methods and modelling procedures were concluded for the data considered in this study.
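As a rough illustration of the fit statistics compared in this abstract, the sketch below computes the average squared error, the Nash Sutcliffe efficiency (NSE) and the Normalized Mean Bias Error (NMBE) in plain Python. The function names are hypothetical. The NSE follows its standard definition; the NMBE sign convention (positive means over-prediction) and its normalisation by the sum of observed values are assumptions, since the abstract does not give formulas. An NSE below zero means the predictions do worse than simply using the mean of the observed values, which matches the abstract's remark about the hot-deck results.

```python
from statistics import fmean


def average_squared_error(observed, predicted):
    """Mean of the squared differences between observed and predicted values."""
    return fmean((o - p) ** 2 for o, p in zip(observed, predicted))


def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - SS_residual / SS_total.

    1 is a perfect fit, 0 is no better than predicting the observed mean,
    and negative values are worse than predicting the observed mean.
    """
    mean_obs = fmean(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / total


def normalized_mean_bias_error(observed, predicted):
    """Assumed convention: sum(predicted - observed) / sum(observed),
    so a positive value indicates over-prediction on average."""
    return sum(p - o for o, p in zip(observed, predicted)) / sum(observed)
```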
92

Bayesian Nonresponse Models for the Analysis of Data from Small Areas: An Application to BMD and Age in NHANES III

Liu, Ning 28 April 2003
We analyze data on bone mineral density (BMD) and age for white females age 20+ in the third National Health and Nutrition Examination Survey. For the sample the age of each individual is known, but some individuals did not have their BMD measured, mainly because they did not show up in the mobile examination centers. We have data from 35 counties, the small areas. We use two types of models to analyze the data. In the ignorable nonresponse model, BMD does not depend on whether an individual responds or not. In the nonignorable nonresponse model, BMD is related to whether he/she responds. We incorporate this relationship in our model by using a Bayesian approach. We further divide these two types of models into continuous and categorical data models. Our nonignorable nonresponse models have one important feature: they are "close" to the ignorable nonresponse model, thereby reducing the effects of the untestable assumptions so common in nonresponse models. In the continuous data models, because the ages of all nonrespondents are known and there is a relation between BMD and age, age is used as a covariate. In the categorical data models BMD has three levels (normal, osteopenia, osteoporosis) and age has two levels (younger than 50 years, at least 50 years). Thus, age is a supplemental margin for the $2 \times 3$ categorical table. Our research on the categorical models is much deeper than on the continuous models. Our models are hierarchical, a feature that allows a "borrowing of strength" across the counties. Individual inference for most of the counties is unreliable because there is large variation. This "borrowing of strength" is therefore necessary because it permits a substantial reduction in variation. The joint posterior density of the parameters for each model is complex. Thus, we fit each model using Markov chain Monte Carlo methods to obtain samples from the posterior density.
These samples are used to make inference about BMD and age, and the relation between BMD and age. For the continuous data models, we show that there is an important relation between BMD and age by using a deviance measure, and we show that the nonignorable nonresponse models are to be preferred. For the categorical data models, we are able to estimate the proportion of individuals in each BMD and age cell of the categorical table, and we can assess the relation between BMD and age using the Bayes factor. A sensitivity analysis shows that there are differences, typically small, in inference that permits different levels of association between BMD and age. A simulation study shows that there is not much difference in inference between the ignorable nonresponse models and the nonignorable nonresponse models. As expected, BMD depends on age and this inference can be obtained for some small counties. For the data we use, there are virtually no young individuals with osteoporosis. The nonignorable nonresponse models generalize the ignorable nonresponse models, and therefore, allow broader inference.
93

A fundamental residue pitch perception bias for tone language speakers

Petitti, Elizabeth Marie 08 April 2016
A complex tone composed of only higher-order harmonics typically elicits a pitch percept equivalent to the tone's missing fundamental frequency (f0). When judging the direction of residue pitch change between two such tones, however, listeners may have completely opposite perceptual experiences depending on whether they are biased to perceive changes based on the overall spectrum or the missing f0 (harmonic spacing). Individual differences in residue pitch change judgments are reliable and have been associated with musical experience and functional neuroanatomy. Tone languages put greater pitch processing demands on their speakers than non-tone languages, and we investigated whether these lifelong differences in linguistic pitch processing affect listeners' bias for residue pitch. We asked native tone language speakers and native English speakers to perform a pitch judgment task for two tones with missing fundamental frequencies. Given tone pairs with ambiguous pitch changes, listeners were asked to judge the direction of pitch change, where the direction of their response indicated whether they attended to the overall spectrum (exhibiting a spectral bias) or the missing f0 (exhibiting a fundamental bias). We found that tone language speakers are significantly more likely to perceive pitch changes based on the missing f0 than English speakers. These results suggest that tone-language speakers' privileged experience with linguistic pitch fundamentally tunes their basic auditory processing.
94

Missing-ness, history and apartheid-era disappearances: The figuring of Siphiwo Mthimkulu, Tobekile ‘Topsy’ Madaka and Sizwe Kondile as missing dead persons

Moosage, Riedwaan January 2018
Philosophiae Doctor - PhD / The argument of this dissertation calls for an abiding by missing-ness as it relates to apartheid-era disappearances. I am concerned with the ways in which the category missing is articulated in histories of apartheid-era disappearances through histories seeking to account for apartheid and how that category is enabled and /or constrained through mediating practices, processes and discourses such as that of forensics and history itself. My deployment of a notion of missing-ness therefore is put to work in underscoring notions of history and its relation to a category of missing persons in South Africa as they emerge and are figured through various discursive strategies constituted by and through apartheid’s violence and iterations thereof. I focus specifically on the enforced disappearances of Siphiwo Mthimkulu, Tobekile ‘Topsy’ Madaka and Sizwe Kondile and the vicarious ways in which they have been produced and (re)figured in a postapartheid present. Mthimkulu and Madaka were abducted, tortured, interrogated, killed and their bodies disposed through burning by apartheid’s security police in 1982. In 2007 South Africa’s Missing Persons Task Team exhumed commingled burnt human fragments at a farm, Post Chalmers. After two years of forensic examinations, those remains were identified as most likely those of Mthimkulu and Madaka. Their commingled remains were reburied in 2009 during an official government sanctioned Provincial re-burial. Kondile was similarly abducted in 1981 and after being imprisoned, tortured, interrogated and killed, his physical remains were burnt. The MPTT has been unsuccessful in locating and thus exhuming his remains for re-burial. Sizwe Kondile remains missing. Missing-ness as I evoke it serves to signal the lack and excess as potentiality and instability of histories accounting for the condition and symptom of being missing. 
The productivity of deploying missing-ness and an abidance to it in the ways I argue is precisely in not explicitly naming it, but rather by holding onto its elusiveness by marking the contours of discourses on absence-presence, those which it simultaneously touches upon and is constitutive of. Articulating it thus is to affirm missing-ness as a question that I argue, be put to work and abided by.
95

Baseline free approach for the semiparametric transformation models with missing covariates.

January 2003
Leung Man-Kit. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaves 37-41). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Basic concepts of survival data / Chapter 1.2 --- Missing Completely at Random (MCAR) / Chapter 1.3 --- Missing at Random (MAR) / Chapter 2 --- The maximization of the marginal likelihood / Chapter 2.1 --- Survival function / Chapter 2.2 --- Missing covariate pattern / Chapter 2.3 --- Set of survival time with rank restrictions / Chapter 2.4 --- Marginal likelihood / Chapter 2.5 --- Score function / Chapter 3 --- The MCMC stochastic approximation approach / Chapter 4 --- Simulation Studies / Chapter 4.1 --- MCAR : Simulation 1 / Chapter 4.2 --- MCAR : Simulation 2 / Chapter 4.3 --- MAR : Simulation 3 / Chapter 4.4 --- MAR : Simulation 4 / Chapter 5 --- Example / Chapter 6 --- Discussion / Appendix / Bibliography
96

Behavior Contracting with Dependent Runaway Youth

Colon, Jessica 24 June 2008
The number of dependent youth reported as runaways to the Florida Department of Law Enforcement has become an increasing concern to the Department of Children and Families (Child Welfare League of America, 2005). Youth under state supervision, who are reported as runaways, most often leave from foster care settings, although some youth are also reported as runaways from the homes of relatives, non-relatives, and biological parents (CWLA, 2005). Community based care (CBC) agencies responsible for the supervision of dependent children in the State of Florida have struggled to develop an effective means of addressing the problem of running away and have subsequently been unable to decrease the number of dependent youth reported as runaways each year (CWLA, 2005). The current study evaluated a behavioral approach through a multiple baseline design to address the runaway behavior of dependent youth. Behavior contracts were used with three runaway youth placed in foster care which showed an initial increase in the number of days spent in an approved placement for all three participants. While the increase in the number of days spent in an approved placement did not maintain for one participant, a decrease in runaway behavior was demonstrated and maintained for the other two participants.
97

Techniques to handle missing values in a factor analysis

Turville, Christopher, University of Western Sydney, Faculty of Informatics, Science and Technology January 2000
A factor analysis typically involves a large collection of data, and it is common for some of the data to be unrecorded. This study investigates the ability of several techniques to handle missing values in a factor analysis, including complete cases only, all available cases, imputing means, an iterative component method, singular value decomposition and the EM algorithm. A data set that is representative of that used for a factor analysis is simulated. Some of these data are then randomly removed to represent missing values, and the performance of the techniques is investigated over a wide range of conditions. Several criteria are used to investigate the abilities of the techniques to handle missing values in a factor analysis. Overall, there is no one technique that performs best for all of the conditions studied. The EM algorithm is generally the most effective technique except when there are ill-conditioned matrices present or when computing time is of concern. Some theoretical concerns are introduced regarding the effects that changes in the correlation matrix will have on the loadings of a factor analysis. A complicated expression is derived that shows that the change in factor loadings as a result of change in the elements of a correlation matrix involves components of eigenvectors and eigenvalues. / Doctor of Philosophy (PhD)
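Two of the simpler techniques named in this abstract, complete cases only and imputing means, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's code: the function names are hypothetical, missing values are represented as `None`, and every column is assumed to have at least one observed value.

```python
from statistics import fmean


def complete_cases(rows):
    """Keep only the rows in which no value is missing (None)."""
    return [row for row in rows if all(value is not None for value in row)]


def impute_means(rows):
    """Replace each missing value with the mean of the observed values
    in the same column."""
    columns = list(zip(*rows))
    means = [fmean(v for v in column if v is not None) for column in columns]
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]
```

Complete-case analysis discards every row with a gap, while mean imputation keeps all rows but shrinks the variance of the imputed columns, which is one reason the two can lead to different correlation matrices and hence different factor loadings.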
98

Cosmological models with quintessence : dynamical properties and observational constraints

Ng, Shao-Chin Cindy. January 2001
Bibliography: leaves 100-106. Studies different models of "quintessence", in particular, a quintessence arising from an ultra-light pseudo Nambu-Goldstone boson. Overviews dynamical properties for these models using phase-space analyses to study attractor and tracker solutions. Studies high-redshift type Ia supernovae constraints on these models. Studies the impact of a simple phenomenological model for supernovae luminosity evolution on the PNGB models and the potentials of a future supernovae data set to discriminate the PNGB models over the other quintessence models. Studies gravitational lensing statistics of high luminosity quasars upon the quintessence models.
99

Cosmological models with quintessence : dynamical properties and observational constraints / Shao-Chin Cindy Ng.

Ng, Shao-Chin Cindy January 2001
Bibliography: leaves 100-106. / v, 107 leaves : ill. ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / Studies different models of "quintessence", in particular, a quintessence arising from an ultra-light pseudo Nambu-Goldstone boson. Overviews dynamical properties for these models using phase-space analyses to study attractor and tracker solutions. Studies high-redshift type Ia supernovae constraints on these models. Studies the impact of a simple phenomenological model for supernovae luminosity evolution on the PNGB models and the potentials of a future supernovae data set to discriminate the PNGB models over the other quintessence models. Studies gravitational lensing statistics of high luminosity quasars upon the quintessence models. / Thesis (Ph.D.)--University of Adelaide, Dept. of Physics and Mathematical Physics, 2001
100

Neural network imputation : a new fashion or a good tool

Amer, Safaa R. 07 June 2004
Most statistical surveys and data collection studies encounter missing data. A common solution to this problem is to discard observations with missing data while reporting the percentage of missing observations in different output tables. Imputation is a tool used to fill in the missing values. This dissertation introduces the missing data problem as well as traditional imputation methods (e.g. hot deck, mean imputation, regression, Markov Chain Monte Carlo, Expectation-Maximization, etc.). The use of artificial neural networks (ANN), a data mining technique, is proposed as an effective imputation procedure. During ANN imputation, computational effort is minimized while accounting for sample design and imputation uncertainty. The mechanism and use of ANN in imputation for complex survey designs are investigated. Imputation methods are not all equally good, and none are universally good. However, simulation results and applications in this dissertation show that regression, Markov chain Monte Carlo, and ANN yield comparable results. Artificial neural networks could be considered as implicit models that take into account the sample design without making strong parametric assumptions. Artificial neural networks make few assumptions about the data, are asymptotically good and are robust to multicollinearity and outliers. Overall, ANN could be time- and resource-efficient for an experienced user compared to other conventional imputation techniques. / Graduation date: 2005
