  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
411

Comparisons of Classification Methods in Efficiency and Robustness

Wang, Rui 31 August 2012 (has links)
No description available.
412

EFFECTS OF COVARIATES ON THE PERFORMANCE OF CERVICAL CANCER SCREENING TESTS: LOGISTIC REGRESSION AND LATENT CLASS MODELS

Raifu, Amidu O. 10 1900 (has links)
In diagnostic accuracy studies, sensitivity and specificity are the most common measures used to assess the performance of diagnostic or screening tests. These measures can be estimated using empirical or model-based methods. The primary objective of this thesis is to use both the empirical and the model-based (logistic regression) approach to assess the effects of covariates on the performance of the visual inspection with acetic acid (VIA) and visual inspection with Lugol's iodine (VILI) tests, using data from women screened for cervical cancer in Kinshasa, Democratic Republic of Congo. The secondary objectives are: first, to adjust for the false negative and false positive error rates of the two tests through latent class models (LCM); and second, to evaluate the effects of covariates on the agreement between the measurements of the two tests taken by the nurse and the physician through the Kappa statistic.

No particular pattern was observed in the trend of the empirically estimated sensitivity and specificity of the VIA and VILI tests measured by the nurse and by the physician across age and parity categories. From the logistic regression models, age, parity, and their respective quadratic terms have significant effects on the probability of the VIA and VILI tests detecting cervical cancer. However, marital status, smoking, and Hybrid Capture 2 (HPV DNA) have no significant effect on the probability of the VIA and VILI tests measured by the nurse detecting cervical cancer, whereas HPV DNA does have a significant effect for the tests measured by the physician. The trend of the estimated sensitivity of the VIA and VILI tests measured by the nurse does not differ across age groups, but the specificity does vary. The trends of both the sensitivity and specificity of the VIA and VILI tests are significantly different across parity groups. The reverse is the case for the sensitivity and specificity of the VIA and VILI tests measured by the physician across age and parity groups. The false negative and false positive error rates in the sensitivity and specificity of the VIA and VILI tests measured by the nurse are higher than those of the physician. Based on the Kappa statistic, there is almost perfect agreement between the ratings by the nurse and the physician for the dichotomized VIA and VILI test outcomes.

In conclusion, age, parity, and the quadratic term of age have significant effects on the performance of the VIA and VILI test outcomes measured by the nurse. For the VIA and VILI test outcomes measured by the physician, age, parity, HPV DNA, and the quadratic term of age show significant effects. / Master of Science (MSc)
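The thesis data and code are not part of this record; the sketch below only illustrates, on invented data, the estimation strategies the abstract names: empirical sensitivity and specificity, covariate-adjusted (logistic regression) estimates, and nurse-physician agreement via the Kappa statistic. The data frame `screen` and all column names are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import cohen_kappa_score

# Hypothetical screening data: one row per woman. 'disease' is the reference
# result; 'via_nurse'/'via_physician' are dichotomized VIA readings (1 = positive).
rng = np.random.default_rng(0)
screen = pd.DataFrame({
    "disease":       rng.binomial(1, 0.15, 500),
    "via_nurse":     rng.binomial(1, 0.5, 500),
    "via_physician": rng.binomial(1, 0.5, 500),
    "age":           rng.uniform(25, 65, 500),
    "parity":        rng.poisson(3, 500),
})

# Empirical sensitivity and specificity of the nurse-read VIA test.
sens = screen.loc[screen.disease == 1, "via_nurse"].mean()
spec = 1 - screen.loc[screen.disease == 0, "via_nurse"].mean()

# Model-based approach: logistic regression of the test result among diseased
# women on age, parity, and their quadratic terms (covariate-specific sensitivity).
sens_model = smf.logit(
    "via_nurse ~ age + I(age**2) + parity + I(parity**2)",
    data=screen[screen.disease == 1],
).fit()

# Agreement between nurse and physician ratings (Cohen's kappa).
kappa = cohen_kappa_score(screen["via_nurse"], screen["via_physician"])
print(sens, spec, kappa)
print(sens_model.summary())
```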
413

Predicting the occurrence of major adverse cardiac events within 30 days after a patient’s vascular surgery: An individual patient-data meta-analysis

Vanniyasingam, Thuvaraha 04 1900 (has links)
Background: Major adverse cardiac events (MACE), a composite endpoint of cardiac death and nonfatal myocardial infarction (MI), are severe harmful outcomes that commonly arise after elective vascular surgeries. As current pre-operative risk prediction models are not as effective in predicting post-operative outcomes, this thesis discusses the key results of an individual patient data meta-analysis based on data from six cohort studies of patients undergoing vascular surgery.

Objectives: The purpose of this thesis is to determine optimal thresholds of continuous covariates and to create a prediction model for major adverse cardiac events (MACE) within 30 days after a vascular surgery. The goals include exploring the minimum p-value method to dichotomize continuous variables; employing logistic regression analysis to determine a prediction model for MACE; evaluating its validity against other samples; and assessing its sensitivity to clustering effects. The secondary objectives are to determine individual models for predicting all-cause mortality, cardiac death, and nonfatal MI within 30 days of a vascular surgery, using the final covariates assessed for MACE.

Methods: Both B-type natriuretic peptide (BNP) and its N-terminal fragment (NTproBNP) are independently associated with cardiovascular complications after noncardiac surgeries, which are particularly frequent after noncardiac vascular surgeries. In a previous study, these covariates were dichotomized using the receiver operating characteristic (ROC) curve approach and a simple logistic regression (SLR) model was created for MACE [1]. The first part of this thesis applies the minimum p-value method to determine a threshold for each natriuretic peptide (NP), BNP and NTproBNP. SLR is then used to model the prediction of MACE within 30 days after a patient's vascular surgery. Comparisons were made with the ROC curve approach to determine the optimal thresholds and create a prediction model. The validity of this model was tested using bootstrap samples, and its robustness was assessed using a mixed effects logistic regression (MELR) model and a generalized estimating equation (GEE). Finally, MELR was performed on each of the secondary outcomes.

Results: A variable, ROC_thrshld, was created to represent the cutpoints of Rodseth's ROC curve approach, which identified 116 pg/mL and 277.5 pg/mL as the optimal thresholds for BNP and NTproBNP, respectively [1]. The minimum p-value method dichotomized these NP concentrations at 115.57 pg/mL for BNP and 241.7 pg/mL for NTproBNP.

Discussion: One key limitation of this thesis is the small sample size available for NTproBNP. Also, determining only one cutpoint for each NP concentration may not be sufficient, since dichotomizing continuous factors can lead to loss of information along with other issues. Further research should explore other possible cutpoints and perform reclassification to observe improvements in risk stratification. After validating our final model against other samples, we can conclude that MINP_thrshld, the type of surgery, and diabetes are significant covariates for the prediction of MACE. Since only a blood test is required to measure NP concentration levels and the status of the other two factors is easily obtained, minimal effort is needed to calculate the points and risk estimates for each patient.
Further research should also be performed on the secondary outcomes to examine other factors that may be useful in prediction.

Conclusions: The minimum p-value method produced similar results to the ROC curve method in dichotomizing the NP concentration levels. The cutpoints for BNP and NTproBNP were 115.57 pg/mL and 241.7 pg/mL, respectively. Further research needs to be performed to determine the optimality of the final prediction model of MACE, with covariates MINP_thrshld, type of surgery, and diabetes mellitus. / Master of Science (MSc)
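As a rough illustration of the minimum p-value method described above (not the thesis's implementation), the sketch below scans candidate cutpoints for a simulated BNP-like biomarker and picks the one most strongly associated with a binary MACE indicator; all data and variable names are assumptions.

```python
import numpy as np
from scipy import stats

def min_p_cutpoint(biomarker, outcome, n_grid=100):
    """Minimum p-value method: scan candidate cutpoints for a continuous
    biomarker and return the one whose dichotomization is most strongly
    associated with the binary outcome (smallest chi-square p-value)."""
    candidates = np.quantile(biomarker, np.linspace(0.1, 0.9, n_grid))
    best_cut, best_p = None, 1.0
    for c in candidates:
        table = np.array([
            [np.sum((biomarker > c) & (outcome == 1)),
             np.sum((biomarker > c) & (outcome == 0))],
            [np.sum((biomarker <= c) & (outcome == 1)),
             np.sum((biomarker <= c) & (outcome == 0))],
        ])
        if (table.sum(axis=1) == 0).any():
            continue
        _, p, _, _ = stats.chi2_contingency(table)
        if p < best_p:
            best_cut, best_p = c, p
    return best_cut, best_p

# Hypothetical example: BNP concentrations and a 30-day MACE indicator.
rng = np.random.default_rng(1)
bnp = rng.lognormal(mean=4.5, sigma=0.8, size=400)
mace = rng.binomial(1, 1 / (1 + np.exp(-(bnp - 120) / 60)))
print(min_p_cutpoint(bnp, mace))
```

In practice the minimum p-value from such a scan is optimistically small because many cutpoints are tested, so a multiplicity correction is usually applied before interpreting it.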
414

Statistical Methods for Data Integration and Disease Classification

Islam, Mohammad 11 1900 (has links)
Classifying individuals into binary disease categories can be challenging due to complex relationships across different exposures of interest. In this thesis, we investigate three different approaches for disease classification using multiple biomarkers. First, we combine information from literature reviews and the INTERHEART data set to identify thresholds of ApoB, ApoA1, and the ratio of these two biomarkers for classifying individuals at risk of developing myocardial infarction. We develop a Bayesian estimation procedure for this purpose that utilizes the conditional probability distribution of these biomarkers. This method is more flexible than the standard logistic regression approach and allows us to identify a precise threshold for these biomarkers. Second, we consider the problem of disease classification using two dependent biomarkers. Independently identified thresholds usually lead to conflicting classifications for some individuals. We develop and describe a method of determining the joint threshold of two dependent biomarkers for disease classification, based on the joint probability distribution function constructed through copulas. This method allows researchers to uniquely classify individuals at risk of developing the disease. Third, we consider the problem of classifying an outcome using gene and miRNA expression data sets. Linear principal component analysis (PCA) is a widely used approach to reduce the dimension of such data sets and subsequently use them for classification, but many authors suggest using kernel PCA for this purpose. Using real and simulated data sets, we compare these two approaches and assess the performance of the resulting components for genetic data integration and outcome classification. We conclude that reducing dimensions using linear PCA followed by a logistic regression model for classification is acceptable for this purpose. We also observe that integrating information from multiple data sets using either of these approaches leads to better performance of outcome classification. / Thesis / Doctor of Philosophy (PhD)
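The following sketch (not from the thesis) illustrates the third comparison on assumed, simulated data: gene and miRNA blocks are concatenated, reduced with either linear PCA or kernel PCA, and fed to a logistic regression classifier; all dimensions and names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical high-dimensional expression data: 200 samples, gene and miRNA blocks.
rng = np.random.default_rng(2)
gene = rng.normal(size=(200, 1000))
mirna = rng.normal(size=(200, 300))
y = rng.binomial(1, 0.5, size=200)

# Integrate the two data sets by concatenating features before dimension reduction.
X = np.hstack([gene, mirna])

linear = make_pipeline(StandardScaler(), PCA(n_components=10),
                       LogisticRegression(max_iter=1000))
kernel = make_pipeline(StandardScaler(), KernelPCA(n_components=10, kernel="rbf"),
                       LogisticRegression(max_iter=1000))

# Compare the two dimension-reduction strategies on classification accuracy.
print("linear PCA :", cross_val_score(linear, X, y, cv=5).mean())
print("kernel PCA :", cross_val_score(kernel, X, y, cv=5).mean())
```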
415

Online Credit Recovery in a Large School Division in Virginia: Examining Factors for Participation and On-Time Graduation

Szybisty, Christopher Conrad 28 May 2024 (has links)
Under the pressure of federal accountability for high schools in the United States to improve and maintain high rates of on-time graduation, online credit recovery has become an increasingly popular intervention to help students earn credit in a course that they have previously failed. While some studies have connected online credit recovery with positive outcomes for participants, others have found negative outcomes and poor learning experiences. Set in a large school division in Virginia, the purpose of this study was to (a) identify explanatory student factors associated with participation in online credit recovery and (b) compare the likelihood of on-time graduation of participants with that of nonparticipants. Limited to the graduation cohorts of 2019 and 2020, the sample from the participating school division contained 10,010 students. In the sample, 27% of students were eligible to participate in online credit recovery, but only 2.3% of students participated. Binary logistic regression models were designed to identify factors associated with participation and with the likelihood of on-time graduation. Covariates considered for inclusion in the models were gender, race and ethnicity, status as an English learner, status as a student with a disability, status as homeless, status as economically disadvantaged, high school grade point average, and school. Both models failed to meet goodness-of-fit standards and were therefore rejected as adequate fits to the data. No student factors were found to explain participation, and differences in the likelihood of on-time graduation were not identified. These findings indicated that there did not appear to be systematic participation patterns given the studied factors, reinforced by the finding that participation was distributed relatively uniformly among the schools. The lack of a significant difference in the likelihood of on-time graduation highlights flexibility for schools in choosing their recovery interventions. State agencies may also consider collecting and publicly reporting data about student participation in online credit recovery. Opportunities for future studies include replication in other settings, particularly districts of different sizes and regions, and qualitative inquiry into decisions made by school and district leaders related to credit recovery. / Doctor of Education / Under the pressure of federal accountability for on-time graduation rates, high schools have increasingly used online credit recovery to help at-risk students. Some studies have identified positive outcomes for students in online credit recovery; however, others have found negative outcomes and poor learning experiences. Set in a large school division in Virginia, the purpose of this study was to identify factors associated with participation in online credit recovery and to compare the likelihood of on-time graduation of participants and non-participants. Limited to the graduation cohorts of 2019 and 2020, there were 10,010 students in the sample from the participating school division, of whom 2.3% participated. Logistic regression models were created, and covariates considered for inclusion were gender, race and ethnicity, status as an English learner, status as a student with a disability, status as homeless, status as economically disadvantaged, grade point average, and school.
Both models failed to fit the data well; no associated factors were found, and graduation rates were not found to be significantly different. There did not appear to have been systematic participation, and schools appear to have flexibility in offering recovery interventions. State agencies may also consider collecting and publicly reporting data about student participation in online credit recovery. Opportunities for future studies include replication in other settings and qualitative inquiry into decisions related to credit recovery.
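As a hedged illustration of the modeling step described above (fit a binary logistic regression for participation, then check whether it fits the data), the sketch below uses simulated data and a simple Hosmer-Lemeshow-style check; the covariate names, participation rate, and the choice of this particular goodness-of-fit test are assumptions, not details taken from the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical student-level data loosely mirroring the covariates described above.
rng = np.random.default_rng(3)
n = 2000
df = pd.DataFrame({
    "participated": rng.binomial(1, 0.023, n),   # ~2.3% participation
    "gpa":          rng.normal(2.5, 0.7, n),
    "econ_disadv":  rng.binomial(1, 0.4, n),
    "eng_learner":  rng.binomial(1, 0.15, n),
})

model = smf.logit("participated ~ gpa + econ_disadv + eng_learner", data=df).fit()

def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow goodness-of-fit test: compare observed and expected
    event counts within quantile groups of predicted risk."""
    dec = pd.qcut(p, groups, labels=False, duplicates="drop")
    obs = pd.Series(y).groupby(dec).sum()
    exp = pd.Series(p).groupby(dec).sum()
    n_g = pd.Series(y).groupby(dec).count()
    chi2 = (((obs - exp) ** 2) / (exp * (1 - exp / n_g))).sum()
    return chi2, stats.chi2.sf(chi2, df=len(obs) - 2)

# A small p-value here would indicate lack of fit, analogous to the rejected models.
print(hosmer_lemeshow(df["participated"].to_numpy(), model.predict(df)))
```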
416

Advances in Applied Econometrics: Binary Discrete Choice Models, Artificial Neural Networks, and Asymmetries in the FAST Multistage Demand System

Bergtold, Jason Scott 27 April 2004 (has links)
The dissertation examines advancements in the methods and techniques used in the field of econometrics. These advancements include: (i) a re-examination of the underlying statistical foundations of statistical models with binary dependent variables; (ii) the use of feed-forward backpropagation artificial neural networks for modeling dichotomous choice processes; and (iii) the estimation of unconditional demand elasticities using the flexible multistage demand system with asymmetric partitions and fixed effects across time. The first paper re-examines the underlying statistical foundations of statistical models with binary dependent variables using the probabilistic reduction approach. This re-examination leads to the development of the Bernoulli Regression Model, a family of statistical models arising from conditional Bernoulli distributions. The paper provides guidelines for specifying and estimating a Bernoulli Regression Model, as well as methods for generating and simulating conditional binary choice processes. Finally, the Multinomial Regression Model is presented as a direct extension. The second paper empirically compares the out-of-sample predictive capabilities of artificial neural networks to binary logit and probit models. To facilitate this comparison, the statistical foundations of dichotomous choice models and feed-forward backpropagation artificial neural networks (FFBANNs) are re-evaluated. Using contingent valuation survey data, the paper shows that FFBANNs provide an alternative to binary logit and probit models with linear index functions. Direct comparisons between the models showed that the FFBANNs performed marginally better than the logit and probit models on a number of within-sample and out-of-sample performance measures, but in the majority of cases these differences were not statistically significant. In addition, guidelines for modeling contingent valuation survey data and techniques for estimating median willingness-to-pay (WTP) measures using FFBANNs are examined. The third paper estimates a set of unconditional price and expenditure elasticities for 49 different processed food categories using scanner data and the flexible and symmetric translog (FAST) multistage demand system. Due to the use of panel data and the presence of heterogeneity across time, temporal fixed effects were incorporated into the model. Overall, the estimated price elasticities are larger, in absolute terms, than previous estimates. The use of disaggregated product groupings, scanner data, and the estimation of unconditional elasticities likely accounts for these differences. / Ph. D.
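A minimal sketch of the second paper's comparison, using assumed simulated choice data rather than the contingent valuation survey: a one-hidden-layer feed-forward network (standing in for an FFBANN) versus a linear-index logit model, compared out of sample.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical dichotomous-choice data (e.g., a yes/no valuation response)
# generated from a mildly nonlinear index function.
rng = np.random.default_rng(4)
X = rng.normal(size=(1500, 4))
index = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.4 * X[:, 2] ** 2
y = rng.binomial(1, 1 / (1 + np.exp(-index)))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)           # linear index
ffbann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                       random_state=0).fit(X_tr, y_tr)              # one hidden layer

# Out-of-sample comparison, analogous to the paper's performance measures.
print("logit  AUC:", roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1]))
print("FFBANN AUC:", roc_auc_score(y_te, ffbann.predict_proba(X_te)[:, 1]))
```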
417

The influence of probability of detection when modeling species occurrence using GIS and survey data

Williams, Alison Kay 12 April 2004 (has links)
I compared the performance of habitat models created from data of differing reliability. Because reliability depends on the probability of detecting the species, I conducted an experiment to estimate detectability for a salamander species. Based on these estimates, I investigated the sensitivity of habitat models to varying detectability. Models were created using a database of amphibian and reptile observations at Fort A.P. Hill, Virginia, USA. Performance was compared among modeling methods, taxa, life histories, and sample sizes. Model performance was poor for all methods and species, except for the carpenter frog (Rana virgatipes). Discriminant function analysis and ecological niche factor analysis (ENFA) predicted presence better than logistic regression and Bayesian logistic regression models. Database collections of observations have limited value as input for modeling because of the lack of absence data. Without knowledge of detectability, it is unknown whether non-detection represents absence. To estimate detectability, I experimented with red-backed salamanders (Plethodon cinereus) using daytime cover-object searches and nighttime visual surveys. Salamanders were maintained in enclosures (n = 124) assigned to four treatments: daytime/low density, daytime/high density, nighttime/low density, and nighttime/high density. Multiple observations of each enclosure were made. Detectability was higher using daytime cover-object searches (64%) than nighttime visual surveys (20%). Detection was also higher in high-density (49%) than in low-density enclosures (35%). Because of this variation in detectability, I tested model sensitivity to the probability of detection. A simulated distribution was created using functions relating habitat suitability to environmental variables from a landscape. Surveys were replicated by randomly selecting locations (n = 50, 100, 200, or 500) and determining whether the species was observed, based on the probability of detection (p = 40%, 60%, 80%, or 100%). Bayesian logistic regression and ENFA models were created for each sample. When detection was 80-100%, Bayesian predictions were more correlated with the known suitability and identified presence more accurately than ENFA. Probability of detection varied among sampling methods and effort. Models created from presence/absence data were sensitive to the probability of detection in the input data. This stresses the importance of quantifying detectability and of using presence-only modeling methods when detectability is low. When planning sampling as an input for suitability modeling, it is important to choose sampling methods that ensure detection of 80% or higher. / Ph. D.
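The sketch below mimics, with entirely simulated data, the sensitivity analysis described above: occupancy is generated from a known suitability function, surveys record a presence only with probability p, and a presence/absence model is fit to the imperfect observations; the specific functions and variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical simulation: true habitat suitability drives occupancy, but surveys
# record a presence only with probability p_detect, mimicking imperfect detection.
rng = np.random.default_rng(5)

def simulate_survey(n_sites, p_detect):
    env = rng.normal(size=(n_sites, 3))                       # environmental variables
    suitability = 1 / (1 + np.exp(-(1.5 * env[:, 0] - env[:, 1])))
    occupied = rng.binomial(1, suitability)                   # true presence/absence
    observed = occupied * rng.binomial(1, p_detect, n_sites)  # non-detection looks like absence
    return env, suitability, observed

for p_detect in (0.4, 0.6, 0.8, 1.0):
    env, suitability, observed = simulate_survey(n_sites=500, p_detect=p_detect)
    model = LogisticRegression().fit(env, observed)
    pred = model.predict_proba(env)[:, 1]
    r = np.corrcoef(pred, suitability)[0, 1]                  # agreement with known suitability
    print(f"p_detect={p_detect:.0%}  corr(pred, true suitability)={r:.2f}")
```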
418

Factors Influencing the Timing of FAFSA Application and the Impact of Late Filing on Student Finances

Daku, Feride 06 December 2017 (has links)
A college degree provides benefits to individuals and society, but education is an expensive endeavor. College costs are high and continue to rise, while median family income shows only modest increases. By lowering the cost of attendance, financial aid makes it possible for many students, especially those from low- and middle-income families, to attend college. The FAFSA is the main instrument used in distributing financial assistance, although completing the form is not an easy task. Each year, many students do not file the FAFSA or file it too late, missing valuable financial resources. The focus of this research was on students who file the FAFSA late. The purpose of the study was two-fold: to explore the relationship between the timing of FAFSA filing and the characteristics of financial aid applicants, and to assess the impact of late filing on student finances. Logistic regression analysis was used to examine how much of the variation in the timing of FAFSA filing could be explained by student characteristics. The findings indicate that late FAFSA filers tend to be in-state, male students from single households with weak high school academic performance. Focusing on the low-income group, the study found the odds of filing late were nearly 2.8 times higher for in-state students than for out-of-state students. Being male increased the chances of late filing; the odds of filing late for low-income male students were 1.53 times higher than for low-income females. The impact of late FAFSA filing on student finances was assessed through linear regression analyses. The results show that late filers received less grant aid but larger loan amounts. Compared to on-time filers, late FAFSA filers received, on average, $2,815 less in grant aid and $662 more in loans. The current study sheds light on several key factors that make students more likely to miss FAFSA deadlines. In addition, it demonstrates that late filing has major financial consequences for students and their families. The findings can be used by high school guidance offices, college administrators, state and federal governments, and higher education leaders concerned with improving college affordability. / Ph. D.
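As an illustration of how reported odds ratios such as 2.8 for in-state students arise from a logistic regression, the sketch below fits a logit model on simulated applicant data and exponentiates the coefficients; the variables and effect sizes are hypothetical, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical applicant data; 'late' = 1 if the FAFSA was filed after the deadline.
rng = np.random.default_rng(6)
n = 3000
fafsa = pd.DataFrame({
    "late":     rng.binomial(1, 0.3, n),
    "in_state": rng.binomial(1, 0.7, n),
    "male":     rng.binomial(1, 0.5, n),
    "hs_gpa":   rng.normal(3.0, 0.5, n),
})

fit = smf.logit("late ~ in_state + male + hs_gpa", data=fafsa).fit()

# Exponentiated coefficients are odds ratios: a value of 2.8 on 'in_state' would
# read as "the odds of filing late were 2.8 times higher for in-state students".
odds_ratios = np.exp(fit.params)
conf_int = np.exp(fit.conf_int())
print(pd.concat([odds_ratios.rename("OR"), conf_int], axis=1))
```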
419

Trust-Based Service Management for Service-Oriented Mobile Ad Hoc Networks and Its Application to Service Composition and Task Assignment with Multi-Objective Optimization Goals

Wang, Yating 11 May 2016 (has links)
With the proliferation of fairly powerful mobile devices and ubiquitous wireless technology, traditional mobile ad hoc networks (MANETs) are migrating into a new era of service-oriented MANETs, wherein a node can provide services to and receive services from other nodes it encounters and interacts with. This dissertation research concerns trust management and its applications for service-oriented MANETs, answering the challenges of MANET environments: no centralized authority, dynamically changing topology, limited bandwidth and battery power, limited observations, unreliable communication, and the presence of malicious nodes that act to break system functionality as well as selfish nodes that act to maximize their own gain. We propose a context-aware trust management model called CATrust for service-oriented ad hoc networks. The novelty of our design lies in the use of logit regression to dynamically estimate the trustworthiness of a service provider based on its service behavior patterns in a context environment, treating channel conditions, node status, service payoff, and social disposition as 'context' information. We develop a recommendation filtering mechanism to effectively screen out false recommendations even in extremely hostile environments in which the majority of recommenders are malicious. We demonstrate desirable convergence, accuracy, and resiliency properties of CATrust. We also demonstrate that CATrust outperforms contemporary peer-to-peer and Internet of Things trust models in terms of service trust prediction accuracy against collusion recommendation attacks. We validate the design of trust-based service management based on CATrust with a node-to-service composition and binding MANET application and a node-to-task assignment MANET application with multi-objective optimization (MOO) requirements. For either application, we propose a trust-based algorithm that effectively filters out malicious nodes exhibiting various attack behaviors by penalizing them with trust loss, which ultimately leads to high user satisfaction. Our trust-based algorithm is efficient, with polynomial runtime complexity, while achieving a close-to-optimal solution. We demonstrate that our trust-based algorithm built on CATrust outperforms a non-trust-based counterpart using blacklisting techniques and trust-based counterparts built on contemporary peer-to-peer trust protocols. We also develop a dynamic table-lookup method to apply the best trust model parameter settings upon detection of rapid MANET environment changes to maximize MOO performance. / Ph. D.
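The dissertation's actual CATrust formulation is not reproduced in this record; the sketch below only conveys the general idea of logit-regression-based trust estimation, treating trust as the predicted probability of satisfactory service given context features. All features, weights, and data are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical interaction log for one service provider: each row is an encounter,
# with context features and whether the delivered service was satisfactory.
rng = np.random.default_rng(7)
n = 400
context = np.column_stack([
    rng.uniform(0, 1, n),      # channel quality
    rng.uniform(0, 1, n),      # provider energy / node status
    rng.uniform(0, 1, n),      # offered payoff
])
true_w = np.array([2.0, 1.0, 1.5])
satisfactory = rng.binomial(1, 1 / (1 + np.exp(-(context @ true_w - 2.0))))

# Trust is modeled as the probability of satisfactory service in a given context,
# estimated with logit regression over the observed service behavior.
trust_model = LogisticRegression().fit(context, satisfactory)

new_context = np.array([[0.9, 0.8, 0.6]])   # conditions of an upcoming request
print("predicted trust:", trust_model.predict_proba(new_context)[0, 1])
```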
420

Not All Biomass is Created Equal: An Assessment of Social and Biophysical Factors Constraining Wood Availability in Virginia

Braff, Pamela Hope 19 May 2014 (has links)
Most estimates of wood supply do not reflect the true availability of wood resources. The availability of wood resources ultimately depends on collective wood harvesting decisions across the landscape. Both social and biophysical constraints impact harvesting decisions and thus the availability of wood resources. While most constraints do not completely inhibit harvesting, they may significantly reduce the probability of harvest. Realistic assessments of wood availability and distribution are needed for effective forest management and planning. This study focuses on predicting the probability of harvest at forested FIA plot locations in Virginia. Classification and regression trees, conditional inference trees, random forest, balanced random forest, conditional random forest, and logistic regression models were built to predict harvest as a function of social and biophysical availability constraints. All of the models were evaluated and compared to identify important variables constraining harvest, predict future harvests, and estimate the available wood supply. Variables related to population and resource quality appear to be the best predictors of future harvest. The balanced random forest and logistic regression models are recommended for predicting future harvests. The balanced random forest model is the best predictor, while the logistic regression model can be most easily shared and replicated. Both models were applied to predict harvest at recently measured FIA plots. Based on the probability of harvest, we estimate that between 2012 and 2017, 10-21 percent of total wood volume on timberland will be available for harvesting. / Master of Science
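A rough sketch of the recommended model comparison, on simulated plot data: a class-weighted random forest (standing in for a balanced random forest, which normally undersamples the majority class within each bootstrap) against logistic regression. Variable names and effect sizes are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical plot-level data: harvest probability increases with standing volume
# and decreases with nearby population density.
rng = np.random.default_rng(8)
n = 1000
X = np.column_stack([
    rng.uniform(0, 500, n),     # population density near the plot
    rng.uniform(0, 200, n),     # standing volume (resource quality)
    rng.uniform(0, 45, n),      # slope
])
harvest = rng.binomial(1, 1 / (1 + np.exp(-(0.01 * X[:, 1] - 0.005 * X[:, 0] - 1.5))))

brf = RandomForestClassifier(n_estimators=300, class_weight="balanced_subsample",
                             random_state=0)
lr = LogisticRegression(max_iter=1000)

print("balanced RF  AUC:", cross_val_score(brf, X, harvest, cv=5,
                                            scoring="roc_auc").mean())
print("logistic reg AUC:", cross_val_score(lr, X, harvest, cv=5,
                                            scoring="roc_auc").mean())
```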
