151 |
Modeling proportions to assess the soil nematode community structure in a two-year alfalfa crop. Zbylut, Joanna. January 1900 (has links)
Master of Science / Department of Statistics / Leigh Murray / The southern root-knot nematode (SRKN) and the weedy perennials yellow nutsedge (YNS) and purple nutsedge (PNS) are simultaneously occurring pests in the irrigated agricultural soils of southern New Mexico. Previous research has thoroughly characterized SRKN, YNS, and PNS as a mutually beneficial pest complex and has revealed their enhanced population growth and survival when they occur together. The density of nutsedge in a field could therefore be used as a predictor of SRKN juveniles in the soil. In addition to SRKN, the most harmful of the plant-parasitic nematodes in southern New Mexico, other species or categories of nematodes can be identified and counted. Some are not as damaging to the plant as SRKN, and some may be essential for soil health. The nematode species were grouped into categories according to trophic level (what nematodes eat) and herbivore feeding behavior (how herbivore nematodes eat). Three ratios of counts were then calculated at the trophic level and at the feeding-behavior level to investigate the soil nematode community structure. These proportions were modeled as functions of the weed hosts YNS and PNS by generalized linear regression models using the logit link function and three probability distributions: the Binomial, the Zero-Inflated Binomial (ZIB), and the Binomial Hurdle (BH). The latter two were used to account for potentially high proportions of zeros in the data. The SAS NLMIXED procedure was used to fit models for each of the six sampling dates (May, July, and September) over the two years of the alfalfa study. Overall, the Binomial pmf generally provided the best fit, indicating lower zero-inflation than expected. The importance of the YNS and PNS predictors varied over time and across the different ratios.
Specific results illustrate the differences in estimated probabilities among the Binomial, ZIB, and BH distributions as YNS counts increase, for two selected ratios.
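The three candidate distributions can be written down directly as probability mass functions. The following is an illustrative Python sketch, not the thesis's SAS NLMIXED code; the coefficients b0 and b1 and the YNS count are hypothetical stand-ins for a fitted logit-link model.

```python
# Sketch of the three pmfs compared in the study: Binomial, zero-inflated
# binomial (ZIB), and binomial hurdle (BH), with a logit link on the
# success probability. Parameter values below are made up for illustration.
from math import comb, exp

def logit_inv(eta):
    """Inverse logit: maps a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + exp(-eta))

def binom_pmf(y, n, p):
    return comb(n, y) * p**y * (1 - p)**(n - y)

def zib_pmf(y, n, p, pi):
    """ZIB: extra point mass pi at zero, otherwise an ordinary binomial."""
    if y == 0:
        return pi + (1 - pi) * (1 - p)**n
    return (1 - pi) * binom_pmf(y, n, p)

def bh_pmf(y, n, p, pi0):
    """BH: zeros occur with probability pi0; positive counts come from a
    zero-truncated binomial."""
    if y == 0:
        return pi0
    return (1 - pi0) * binom_pmf(y, n, p) / (1 - (1 - p)**n)

# Hypothetical logit-link regression on a YNS count covariate:
b0, b1, yns = -1.0, 0.05, 10
p = logit_inv(b0 + b1 * yns)
```

Both zero-modified pmfs sum to one by construction, and the ZIB places strictly more mass at zero than the plain binomial with the same p.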
|
152 |
Statistical inference for varying coefficient models. Chen, Yixin. January 1900 (has links)
Doctor of Philosophy / Department of Statistics / Weixin Yao / This dissertation contains two projects that are related to varying coefficient models.
The traditional least squares based kernel estimates of the varying coefficient model will lose some efficiency when the error distribution is not normal. In the first project, we propose a novel adaptive estimation method that can adapt to different error distributions and provide an efficient EM algorithm to implement the proposed estimation. The asymptotic properties of the resulting estimator are established. Both simulation studies and real data examples are used to illustrate the finite sample performance of the new estimation procedure. The numerical results show that the gain of the adaptive procedure over least squares estimation can be quite substantial for non-Gaussian errors.
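The abstract does not spell out the algorithm, so the following is only a loose illustration of the E-step/M-step alternation that underlies EM-based procedures of this kind: a generic EM fit of a two-component normal mixture. It is not the dissertation's kernel-based adaptive estimator, and all data below are invented.

```python
# Generic EM for a two-component Gaussian mixture (illustration of the
# E/M alternation only; not the adaptive varying-coefficient estimator).
from math import exp, pi as PI, sqrt

def normal_pdf(x, mu, s2):
    return exp(-(x - mu) ** 2 / (2 * s2)) / sqrt(2 * PI * s2)

def em_two_normals(xs, iters=50):
    # Crude initialization from the data range.
    w, mu1, mu2, s1, s2 = 0.5, min(xs), max(xs), 1.0, 1.0
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point.
        r = []
        for x in xs:
            a = w * normal_pdf(x, mu1, s1)
            b = (1 - w) * normal_pdf(x, mu2, s2)
            r.append(a / (a + b))
        # M-step: weighted updates of mixing weight, means, variances.
        n1 = sum(r)
        n2 = len(xs) - n1
        w = n1 / len(xs)
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / n2
        s1 = sum(ri * (x - mu1) ** 2 for ri, x in zip(r, xs)) / n1 + 1e-6
        s2 = sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, xs)) / n2 + 1e-6
    return w, mu1, mu2

# Two well-separated toy clusters of "residuals":
xs = [-2.1, -1.9, -2.0, -2.2, 2.0, 1.8, 2.1, 2.2, 1.9]
w, mu1, mu2 = em_two_normals(xs)
```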
In the second project, we propose a unified inference for sparse and dense longitudinal data in time-varying coefficient models. The time-varying coefficient model is a special case of the varying coefficient model and is very useful in longitudinal/panel data analysis. A mixed-effects time-varying coefficient model is considered to account for the within-subject correlation of longitudinal data. We show that when the kernel smoothing method is used to estimate the smooth functions in the time-varying coefficient model for sparse or dense longitudinal data, the asymptotic results for these two situations are essentially different. Therefore, a subjective choice between the sparse and dense cases may lead to wrong conclusions for statistical inference. To solve this problem, we establish a unified self-normalized central limit theorem, based on which a unified inference is proposed without deciding whether the data are sparse or dense. The effectiveness of the proposed unified inference is demonstrated through a simulation study and a real data application.
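For intuition about kernel smoothing of a time-varying coefficient, here is a minimal local least-squares sketch with an Epanechnikov kernel. It uses hypothetical noise-free data and is neither the paper's estimator nor its self-normalized inference procedure.

```python
# Local least-squares estimate of beta(t0) in y_i = beta(t_i) * x_i + error:
# minimize sum_i K((t_i - t0) / h) * (y_i - b * x_i)^2 over b.
# Toy, noise-free data for illustration only.

def epanechnikov(u):
    """Epanechnikov kernel with support (-1, 1)."""
    return 0.75 * (1 - u * u) if abs(u) < 1 else 0.0

def local_beta(t0, t, x, y, h):
    """Closed-form kernel-weighted least-squares solution at time t0."""
    num = den = 0.0
    for ti, xi, yi in zip(t, x, y):
        w = epanechnikov((ti - t0) / h)
        num += w * xi * yi
        den += w * xi * xi
    return num / den

# If beta(t) = 2*t exactly (no noise), the local fit at t0 = 0.5 recovers
# beta(0.5) = 1.0 up to smoothing bias (zero here, by symmetry).
t = [i / 100 for i in range(101)]
x = [1.0] * 101
y = [2 * ti for ti in t]
est = local_beta(0.5, t, x, y, h=0.1)
```

The bandwidth h controls the bias-variance trade-off that makes the sparse and dense asymptotics differ in the first place.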
|
153 |
Technology mediated communication in intimate relationships. Norton, Aaron Michael. January 1900 (has links)
Doctor of Philosophy / Department of Family Studies and Human Services / Joyce Baptist / Little research has been conducted to understand how the technology revolution has changed and impacted couple relationships. This study examined the impact of technology on couples in committed relationships through the lens of the couple and technology framework. Specifically, this study used data from 2,826 European couples to examine associations between online boundary crossing, online intrusion, relationship satisfaction, and partner responsiveness. The results suggest that when participants reported that their partner checked up on their online activities more frequently, this was linked with lower scores on relationship satisfaction and partner responsiveness. Also, decreased scores for relationship satisfaction and partner responsiveness were associated with increased acceptance of a partner using the Internet to talk with someone attractive about everyday life or pop culture, personal information, and relationship troubles or concerns. Lastly, the results suggest that men, but not women, who reported greater acceptability of online boundary crossing were more likely to have partners who reported lower relationship satisfaction. Implications for clinicians, relationship educators, and researchers are discussed.
|
154 |
Statistical inference for rankings in the presence of panel segmentation. Xie, Lin. January 1900 (has links)
Doctor of Philosophy / Department of Statistics / Paul Nelson / Panels of judges are often used to estimate consumer preferences for m items such as food products. Judges can either evaluate each item on several ordinal scales and indirectly produce an overall ranking, or directly report a ranking of the items. A complete ranking orders all the items from best to worst. A partial ranking, as we use the term, only reports rankings of the best q out of m items. Direct ranking, the subject of this report, does not require the widespread but questionable practice of treating ordinal measurements as though they were on ratio or interval scales. Here, we develop and study segmentation models in which the panel may consist of relatively homogeneous subgroups, the segments. Judges within a subgroup will tend to agree among themselves and differ from judges in the other subgroups. We develop and study the statistical analysis of mixture models where it is not known to which segment a judge belongs or, in some cases, how many segments there are. Viewing segment membership indicator variables as latent data, an E-M algorithm was used to find the maximum likelihood estimators of the parameters specifying a mixture of Mallows' (1957) distance models for complete and partial rankings. A simulation study was conducted to evaluate the behavior of the E-M algorithm in terms of such issues as the fraction of data sets for which the algorithm fails to converge, the sensitivity of the convergence rate to initial values, and the performance of the maximum likelihood estimators in terms of bias and mean square error, where applicable.
A Bayesian approach was developed and credible set estimators were constructed. Simulation was used to evaluate the performance of these credible sets as confidence sets.
A method for predicting segment membership from covariates measured on a judge was derived using a logistic model applied to a mixture of Mallows probability distance models. The effects of covariates on segment membership were assessed.
Likelihood sets for parameters specifying mixtures of Mallows distance models were constructed and explored.
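The two building blocks named above, the pairwise-disagreement (Kendall) distance between rankings and a Mallows distance model with P(r) proportional to exp(-theta * d(r, center)), can be sketched for complete rankings over a toy item set; this is an illustration, not the thesis's mixture or E-M code.

```python
# Kendall distance between complete rankings and the resulting Mallows
# distance model, enumerated exactly over a tiny item set.
from itertools import permutations
from math import exp

def kendall_distance(a, b):
    """Number of item pairs ordered differently by rankings a and b
    (tuples of items listed best to worst)."""
    pos = {item: i for i, item in enumerate(b)}
    d = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            if pos[a[i]] > pos[a[j]]:
                d += 1
    return d

def mallows_pmf(r, center, theta, items):
    """Mallows model: P(r) = exp(-theta * d(r, center)) / psi(theta)."""
    psi = sum(exp(-theta * kendall_distance(s, center))
              for s in permutations(items))
    return exp(-theta * kendall_distance(r, center)) / psi

items = ("A", "B", "C")
center = ("A", "B", "C")
```

Larger theta concentrates the distribution more tightly around the central ranking; a segmentation model mixes several such components with different centers.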
|
155 |
Strategies to Sustain Small Businesses Beyond 5 Years. Wani, Kayaso Cosmas. 01 January 2018 (has links)
According to the U.S. Small Business Administration, the failure rates for small businesses in 2014 were as high as 50% to 80% within the first 5 years of establishment. The purpose of this multiple case study was to explore the strategies that small business owners have used to sustain their businesses beyond 5 years. Guided by entrepreneurship theory as the conceptual framework and a purposive sampling method, this qualitative case study used semistructured interviews with 3 successful small ethnic grocery business owners in Anchorage, AK, to better understand small business strategies for survival. Member checking and triangulation with field notes, interview data, business websites, customer comments, and government documents helped ensure theoretical saturation and trustworthiness of interpretations. Using pre-coded themes for the data analysis, the 8 themes from this study were entrepreneur characteristics, education and management skills, financial planning, marketing strategies and competitive advantages, social networks and human relationships, technology and innovation, government support and social responsibility, and motivational influence. Two key results indicated that the strategies needed by small business owners were entrepreneurial management skills and government support for small businesses. These findings may contribute to positive social change by improving small business owner competence and sustainability, raising business incomes, and providing a better quality of life to employees and their communities, benefiting the entire U.S. economy.
|
156 |
Model Selection via Minimum Description Length. Li, Li. 10 January 2012 (has links)
The minimum description length (MDL) principle originated in the data compression literature and has been used to derive statistical model selection procedures. Most existing methods utilizing the MDL principle focus on models for independent data, particularly in the context of linear regression. The data considered in this thesis are in the form of repeated measurements, and the exploration of the MDL principle begins with classical linear mixed-effects models. We distinguish two kinds of research focus: one concerns the population parameters and the other concerns the cluster/subject parameters. When the research interest is at the population level, we propose a class of MDL procedures which incorporate the dependence structure within an individual or cluster with data-adaptive penalties and enjoy the advantages of Bayesian information criteria. When the number of covariates is large, the penalty term is adjusted by a data-adaptive structure to diminish the under-selection issue in BIC and to mimic the behaviour of AIC. Theoretical justifications are provided from both data compression and statistical perspectives. Extensions to categorical responses modelled by generalized estimating equations and functional data modelled by functional principal components are illustrated. When the interest is at the cluster level, we use the group LASSO to set up a class of candidate models. We then derive an MDL criterion for this LASSO technique in a group manner, selecting the final model via the tuning parameters. Extensive numerical experiments demonstrate the usefulness of the proposed MDL procedures at both the population level and the cluster level.
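As a rough illustration of a two-part description length of the BIC form, negative log-likelihood plus (k/2) log n, here is a toy comparison on independent Gaussian data. The thesis's data-adaptive penalties for mixed-effects models are more elaborate; this only shows the basic code-length trade-off.

```python
# Two-part description length for Gaussian regression:
#   DL(model) = -log-likelihood at the MLE + (k/2) * log(n),
# where k is the number of mean parameters. Toy data, illustration only.
from math import log, pi

def gaussian_neg_loglik(resid):
    """Profile negative log-likelihood with sigma^2 at its MLE."""
    n = len(resid)
    sigma2 = sum(e * e for e in resid) / n
    return 0.5 * n * (log(2 * pi * sigma2) + 1)

def description_length(resid, k):
    n = len(resid)
    return gaussian_neg_loglik(resid) + 0.5 * k * log(n)

# Data with a real slope plus tiny alternating "noise":
x = list(range(20))
y = [2.0 * xi + ((-1) ** xi) * 0.1 for xi in x]

# Candidate 1: intercept only (k = 1).
ybar = sum(y) / len(y)
resid0 = [yi - ybar for yi in y]

# Candidate 2: ordinary least squares intercept + slope (k = 2).
xbar = sum(x) / len(x)
bhat = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        / sum((xi - xbar) ** 2 for xi in x))
ahat = ybar - bhat * xbar
resid1 = [yi - (ahat + bhat * xi) for xi, yi in zip(x, y)]

dl0 = description_length(resid0, 1)
dl1 = description_length(resid1, 2)   # shorter code: slope model wins
```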
|
158 |
Statistical Methods for Dating Collections of Historical Documents. Tilahun, Gelila. 31 August 2011 (has links)
The problem in this thesis was originally motivated by problems presented by the Documents of Early England Data Set (DEEDS). The central problem with these medieval documents is the lack of methods for assigning accurate dates to those documents which bear no date.
With the problems of the DEEDS documents in mind, we present two methods to impute missing features of texts.
In the first method, we suggest a new class of metrics for measuring distances between texts. We then show how to combine the distances between the texts using statistical smoothing. This method can be adapted to settings where the features of the texts are ordered or unordered categorical variables (as, for example, in authorship attribution problems).
In the second method, we estimate the probability of occurrence of words in texts using nonparametric regression techniques, applying local polynomial fitting with kernel weights to generalized linear models. We combine the estimated probabilities of occurrence of the words of a text to estimate the probability of occurrence of the text as a function of its feature -- the feature in this case being the date on which the text was written. The application and results of our methods to the DEEDS documents are presented.
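As a toy illustration of a text-distance approach to date imputation, here is a character-shingle Jaccard distance with a 1-nearest-neighbour rule. This is not the thesis's metric class or smoothing method, and the strings below are invented, not actual DEEDS text.

```python
# Character k-gram (shingle) Jaccard distance between texts, and a
# 1-nearest-neighbour date imputation. Illustrative sketch only.

def shingles(text, k=3):
    """Set of overlapping character k-grams of the lowercased text."""
    t = text.lower()
    return {t[i:i + k] for i in range(len(t) - k + 1)}

def jaccard_distance(a, b, k=3):
    """1 minus the Jaccard similarity of the two shingle sets; in [0, 1]."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)

def impute_date(undated, dated):
    """Assign the date of the closest dated text."""
    return min(dated, key=lambda td: jaccard_distance(undated, td[0]))[1]

# Invented (text, year) pairs standing in for dated charters:
dated = [("carta regis henrici de terris", 1150),
         ("sciant presentes et futuri quod ego", 1230)]
year = impute_date("sciant presentes quod ego dedi", dated)
```

A smoothed version would weight all dated neighbours by a kernel of their distance rather than taking the single nearest.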
|
159 |
Effects of vitamin D supplementation and floor space on pig performance. Flohr, Joshua Richard. January 1900 (has links)
Doctor of Philosophy / Animal Sciences and Industry / Michael D. Tokach / Three experiments using 2,385 pre-weaned pigs, growing pigs, and sows were performed, in addition to a meta-analysis and an industry survey. Experiment 1 tested the effects of sow vitamin D supplementation from vitamin D₃ (low, medium, or high) or 25OHD₃ (the same IU equivalency as the medium level of vitamin D₃) on maternal performance, neonatal pig bone and muscle characteristics, subsequent pre-weaned pig performance, and serum 25OHD₃, with only serum 25OHD₃ being affected. In the second experiment, a subsample of pigs weaned from the maternal portion of the study were used in a split-plot design and fed 2 different forms of vitamin D in the nursery, and growth performance was evaluated until the pigs reached market weight. Overall, the nursery vitamin D treatments did not impact growth; however, pigs from sows fed the medium level of vitamin D₃ performed better after weaning than pigs from sows fed the low or high level of vitamin D₃, and serum 25OHD₃ was altered by maternal and nursery vitamin D supplementation. In the third experiment, finishing pigs were initially provided 2 different floor space allowances (0.64 or 0.91 m²), and pigs initially provided 0.64 m² were subject to 1 of 3 marketing strategies that removed the heaviest pigs from the pen to provide additional floor space to the pigs remaining in the pen. Overall, pigs initially provided more floor space had improved ADG and ADFI, but increasing the number of marketing events increased ADG of the pigs remaining in the pen following marketing events. The meta-analysis suggested that a multi-term empirical model, using random effects to account for known error and weighted observations to account for heterogeneous experimental designs and replication, provided the best fit to the database.
Also, the meta-analysis concluded that floor space allowance influences ADG, ADFI, and G:F, and that the BW of the pig can alter the floor space response. Finally, the vitamin and trace mineral survey suggested that a wide range of supplementation practices is used in the swine industry, but most production systems supplement micronutrients above the basal requirement estimates of the animals.
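The abstract does not specify the meta-analytic model, but one standard way to pool study-level effects with random effects and inverse-variance weights is the DerSimonian-Laird estimator, sketched here with hypothetical ADG effect sizes; this is only an illustration of the weighting idea, not the dissertation's model.

```python
# DerSimonian-Laird random-effects pooling: estimate the between-study
# variance tau^2 from the heterogeneity statistic Q, then re-weight each
# study by 1 / (v_i + tau^2). Effect sizes below are made up.

def dersimonian_laird(effects, variances):
    k = len(effects)
    w = [1.0 / v for v in variances]              # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)            # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)

# Hypothetical study-level ADG effects (kg/d) of extra floor space:
effects = [0.05, 0.08, 0.02, 0.06]
variances = [0.001, 0.002, 0.0015, 0.001]
pooled = dersimonian_laird(effects, variances)
```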
|
160 |
Methods for handling missing data due to a limit of detection in longitudinal lognormal data. Dick, Nicole Marie. January 1900 (has links)
Master of Science / Department of Statistics / Suzanne Dubnicka / In animal science, challenge model studies often produce longitudinal data, and the lognormal distribution is often useful in modeling the data at each time point. Escherichia coli O157 (E. coli O157) studies measure and record the concentration of colonies of the bacteria. At times the concentration of colonies present is too low, falling below a limit of detection, and in these cases a zero is recorded for the concentration. Researchers then employ a method of enrichment to determine whether E. coli O157 was truly absent: the enrichment process searches for bacteria colony concentrations a second time to confirm or refute the previous measurement. If enrichment comes back without evidence of any bacteria colonies, a zero remains as the observed concentration; if enrichment detects bacteria colonies, a minimum value is imputed for the concentration. At the conclusion of the study the data are log10-transformed. One problem with the transformation is that the log of zero is mathematically undefined, so any observed concentrations still recorded as zero after enrichment cannot be log-transformed; current practice carries the zero value from the lognormal data to the normal data. The purpose of this report is to evaluate methods for handling missing data due to a limit of detection and to provide results for various analyses of the longitudinal data. Multiple methods of imputing a value for the missing data are compared, and each method is analyzed by fitting three different models using SAS. To determine which method most accurately explains the data, a simulation study was conducted.
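Two common substitution rules for left-censored values, LOD/2 and LOD/sqrt(2), illustrate the kind of imputation being compared before the log10 transformation. This is a generic sketch with invented concentrations; the report's specific methods and models are not reproduced here.

```python
# Substitute a fixed fraction of the limit of detection (LOD) for values
# recorded as below the LOD, so that every value can be log10-transformed.
from math import log10, sqrt

def impute_below_lod(values, lod, rule="half"):
    """Replace values below the LOD with LOD/2 (rule="half") or
    LOD/sqrt(2) (rule="sqrt2")."""
    sub = lod / 2 if rule == "half" else lod / sqrt(2)
    return [v if v >= lod else sub for v in values]

# Hypothetical concentrations; zeros denote samples below a LOD of 1.0.
data = [0.0, 12.0, 0.0, 150.0]
imputed = impute_below_lod(data, lod=1.0)
logged = [log10(v) for v in imputed]        # now defined for every value
imputed_sqrt = impute_below_lod(data, lod=1.0, rule="sqrt2")
```

After substitution the log10 transform is defined everywhere, which is exactly the failure mode of carrying zeros forward that the report describes.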
|