  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
201

Data analysis and multiple imputation for two-level nested designs

Bailey, Brittney E. 25 October 2018
No description available.
202

Methodologies for Missing Data with Range Regressions

Stoll, Kevin Edward 24 April 2019
No description available.
203

Data Classification System Based on Combination Optimized Decision Tree : A Study on Missing Data Handling, Rough Set Reduction, and FAVC Set Integration / Dataklassificeringssystem baserat på kombinationsoptimerat beslutsträd : En studie om saknad datahantering, grov uppsättningsreduktion och FAVC-uppsättningsintegration

Lu, Xuechun January 2023
Data classification is a novel data analysis technique that extracts valuable, potentially useful information from databases. It has found extensive applications in various domains, including finance, insurance, government, education, transportation, and defense. There are several methods available for data classification, with decision tree algorithms being among the most widely used. These algorithms are based on instance-based inductive learning and offer advantages such as rule extraction, low computational complexity, and the ability to highlight important decision attributes, leading to high classification accuracy. A decision tree algorithm is therefore employed here to solve classification problems. However, the existing decision tree algorithm exhibits limitations such as low computational efficiency and multi-valued bias. Therefore, a data classification system based on an optimized decision tree algorithm written in Python and a data storage system based on PostgreSQL were developed. The proposed algorithm surpasses traditional classification algorithms in terms of dimensionality reduction, attribute selection, and scalability. Ultimately, a combined optimization decision tree classifier system is introduced, which exhibits superior performance compared to the widely used ID3 algorithm. The improved decision tree algorithm has both theoretical and practical significance for data mining applications.
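The multi-valued bias named in the abstract above can be illustrated with a small sketch. It assumes the standard textbook definitions of information gain (the ID3 split criterion) and gain ratio (the C4.5 correction); it is not the thesis's optimized algorithm, and the toy data and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_gain(attribute, labels):
    """Information gain of splitting `labels` by `attribute` (ID3 criterion)."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(attribute, labels):
    """Gain ratio: information gain divided by the entropy of the split itself."""
    split_info = entropy(attribute)
    return info_gain(attribute, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: `row_id` is unique per row and carries no real signal.
labels = ["yes", "yes", "no", "no", "yes", "no"]
row_id = [1, 2, 3, 4, 5, 6]
outlook = ["sun", "sun", "rain", "rain", "sun", "sun"]

for name, attr in [("row_id", row_id), ("outlook", outlook)]:
    print(name, round(info_gain(attr, labels), 3), round(gain_ratio(attr, labels), 3))
```

On this toy data, information gain ranks the many-valued `row_id` attribute above `outlook`, while gain ratio reverses the ranking, which is the bias the abstract refers to.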
204

Temporally-Embedded Deep Learning Model for Health Outcome Prediction

Boursalie, Omar January 2021
Deep learning models are increasingly used to analyze health records to model disease progression. Two characteristics of health records present challenges to developers of deep learning-based medical systems. First, the veracity of the estimation of missing health data must be evaluated to optimize the performance of deep learning models. Second, the currently most successful deep learning diagnostic models, called transformers, lack a mechanism to analyze the temporal characteristics of health records. In this thesis, these two challenges are investigated using a real-world medical dataset of longitudinal health records from 340,143 patients over ten years called MIIDD: McMaster Imaging Information and Diagnostic Dataset. To address missing data, the performance of imputation models (mean, regression, and deep learning) was evaluated on a real-world medical dataset. Next, techniques from adversarial machine learning were used to demonstrate how imputation can have a cascading negative impact on a deep learning model. Then, the strengths and limitations of evaluation metrics from the statistical literature (qualitative, predictive accuracy, and statistical distance) to evaluate deep learning-based imputation models were investigated. This research can serve as a reference to researchers evaluating the impact of imputation on their deep learning models. To analyze the temporal characteristics of health records, a new model was developed and evaluated called DTTHRE: Decoder Transformer for Temporally-Embedded Health Records Encoding. DTTHRE predicts patients' primary diagnoses by analyzing their medical histories, including the elapsed time between visits. The proposed model successfully predicted patients' primary diagnosis in their final visit with improved predictive performance (78.54 +/- 0.22%) compared to existing models in the literature.
DTTHRE also increased the training examples available from limited medical datasets by predicting the primary diagnosis for each visit (79.53 +/- 0.25%) with no additional training time. This research contributes towards the goal of disease predictive modeling for clinical decision support. / Dissertation / Doctor of Philosophy (PhD) / In this thesis, two challenges using deep learning models to analyze health records are investigated using a real-world medical dataset. First, an important step in analyzing health records is to estimate missing data. We investigated how imputation can have a cascading negative impact on a deep learning model's performance. A comparative analysis was then conducted to investigate the strengths and limitations of evaluation metrics from the statistical literature to assess deep learning-based imputation models. Second, the most successful deep learning diagnostic models to date, called transformers, lack a mechanism to analyze the temporal characteristics of health records. To address this gap, we developed a new temporally-embedded transformer to analyze patients' medical histories, including the elapsed time between visits, to predict their primary diagnoses. The proposed model successfully predicted patients' primary diagnosis in their final visit with improved predictive performance (78.54 +/- 0.22%) compared to existing models in the literature.
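As a rough illustration of the imputation comparison described in the abstract above, the following sketch hides known values, imputes them with a mean model and a simple regression model, and scores both by predictive accuracy (RMSE). The synthetic data, the missing-completely-at-random mask, and all function names are assumptions made for illustration; they do not reproduce the MIIDD dataset or the thesis's models.

```python
import math
import random


def mean_impute(y_obs, n_missing):
    """Fill every missing value with the mean of the observed values."""
    mean = sum(y_obs) / len(y_obs)
    return [mean] * n_missing


def regression_impute(x_obs, y_obs, x_missing):
    """Fill missing values with predictions from a least-squares line y = a + b*x."""
    n = len(x_obs)
    mx, my = sum(x_obs) / n, sum(y_obs) / n
    sxx = sum((x - mx) ** 2 for x in x_obs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_obs, y_obs))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * x for x in x_missing]


def rmse(pred, truth):
    """Root-mean-square error between imputed and true values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


# Synthetic data (an assumption): y depends linearly on x plus Gaussian noise.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 * xi + 1.0 + rng.gauss(0, 1.0) for xi in x]

# Hide roughly 30% of y completely at random, keeping the truth for scoring.
hidden = [rng.random() < 0.3 for _ in x]
x_obs = [xi for xi, h in zip(x, hidden) if not h]
y_obs = [yi for yi, h in zip(y, hidden) if not h]
x_mis = [xi for xi, h in zip(x, hidden) if h]
y_true = [yi for yi, h in zip(y, hidden) if h]

rmse_mean = rmse(mean_impute(y_obs, len(x_mis)), y_true)
rmse_reg = rmse(regression_impute(x_obs, y_obs, x_mis), y_true)
print(f"mean imputation RMSE: {rmse_mean:.3f}")
print(f"regression imputation RMSE: {rmse_reg:.3f}")
```

Predictive accuracy is only one of the metric families the abstract names; a fuller comparison would also look at the distribution of the imputed values, for example with a statistical distance.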
205

Missing Data Treatments in Multilevel Latent Growth Model: A Monte Carlo Simulation Study

Jiang, Hui 25 September 2014
No description available.
206

Assessment of Soil Corrosion in Underground Pipelines via Statistical Inference

Yajima, Ayako 10 September 2015
No description available.
207

Modeling Smooth Time-Trajectories for Camera and Deformable Shape in Structure from Motion with Occlusion

Gotardo, Paulo Fabiano Urnau 28 September 2010
No description available.
208

A Monte Carlo Study of Missing Data Treatments for an Incomplete Level-2 Variable in Hierarchical Linear Models

Kwon, Hyukje 20 July 2011
No description available.
209

Navigating the Risks of Dark Data : An Investigation into Personal Safety

Gautam, Anshu January 2023
With the exponential proliferation of data, there has been a surge in data generation from diverse sources, including social media platforms, websites, mobile devices, and sensors. However, not all data is readily visible or accessible to the public, leading to the emergence of the concept known as "dark data." This type of data can exist in structured or unstructured formats and can be stored in various repositories, such as databases, log files, and backups. The reasons behind data being classified as "dark" can vary, encompassing factors such as limited awareness, insufficient resources or tools for data analysis, or a perception of irrelevance to current business operations. This research employs a qualitative research methodology incorporating audio/video recordings and personal interviews to gather data, aiming to gain insights into individuals' understanding of the risks associated with dark data and their behaviors concerning the sharing of personal information online. Through the thematic analysis of the collected data, patterns and trends in individuals' risk perceptions regarding dark data become evident. The findings of this study illuminate the multiple dimensions of individuals' risk perceptions and their influence on attitudes towards sharing personal information in online contexts. These insights provide valuable understanding of the factors that shape individuals' decisions concerning data privacy and security in the digital era. By contributing to the existing body of knowledge, this research offers a deeper comprehension of the interplay between dark data risks, individuals' perceptions, and their behaviors pertaining to online information sharing. The implications of this study can inform the development of strategies and interventions aimed at fostering informed decision-making and ensuring personal safety in an increasingly data-centric world.
210

Methodological Issues in Design and Analysis of Studies with Correlated Data in Health Research

Ma, Jinhui 04 1900
Correlated data with complex association structures arise from longitudinal studies and cluster randomized trials. However, some methodological challenges in the design and analysis of such studies or trials have not been overcome. In this thesis, we address three of the challenges: 1) Power analysis for population-based longitudinal studies investigating gene-environment interaction effects on chronic disease: For longitudinal studies investigating gene-environment interaction in disease susceptibility and progression, rigorous statistical power estimation is crucial to ensure that such studies are scientifically useful and cost-effective, since human genome epidemiology is expensive. However, conventional sample size calculations for longitudinal studies can seriously overestimate the statistical power by overlooking measurement error, unmeasured etiological determinants, and competing events that can impede the occurrence of the event of interest. 2) Comparing the performance of different multiple imputation strategies for missing binary outcomes in cluster randomized trials: Though researchers have proposed various strategies to handle missing binary outcomes in cluster randomized trials (CRTs), comprehensive guidelines on the selection of the most appropriate or optimal strategy are not available in the literature. 3) Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: Both population-averaged and cluster-specific models are commonly used for analyzing binary outcomes in CRTs. However, little attention has been paid to their accuracy and efficiency when analyzing data with missing outcomes. The objective of this thesis is to provide researchers with recommendations and guidance for future research in handling the above issues. / Doctor of Philosophy (PhD)
