41

Estimation du délai de guérison statistique chez les patients atteints de cancer / Estimation of statistical time-to-cure in cancer patients

Romain, Gaëlle 10 December 2019 (has links)
Three million people in France live with a personal history of cancer and face difficulties in accessing loans and insurance. Since 2016, the French law « modernisation de notre système de santé » has set the "right to be forgotten" (the time beyond which insurance applicants with a history of cancer no longer have to declare it) at 10 years after the end of treatment. From a statistical point of view, this delay can be considered the time from which the excess mortality due to cancer durably disappears; after this time, the net survival curves reach a plateau corresponding to the proportion of cured patients. Verification of this cure hypothesis rests on two criteria: a negligible excess mortality rate and graphical confirmation of the existence of a plateau. We proposed a new definition of time-to-cure as the time from which the probability of belonging to the cured group reaches 95%. The first aim of this thesis was to estimate time-to-cure for each cancer site by sex and age, using population-based data from the FRANCIM cancer registries network.
Time-to-cure was below 12 years for most sites complying with the cure hypothesis. It was 5 years or less, and even zero for some age groups, for skin melanoma and for testicular and thyroid cancers. The criteria used to verify the cure hypothesis are subjective, however, and time-to-cure is not estimated directly by pre-existing cure models. A new cure model was therefore developed that includes time-to-cure as a parameter to be estimated, in order to address the question of statistical cure objectively and to allow direct estimation of time-to-cure. The second objective of the thesis was to compare, in controlled situations in which the excess mortality rate became null, the performance of this new model with that of two other cure models. The net survival and cure fraction estimated by the models were compared with the theoretical values used to simulate the data. Under strict conditions of application, the new model estimates time-to-cure directly, with performance as satisfactory as that of the other models.
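The definitions above fit the standard mixture cure model. The sketch below uses that model with assumed notation (\pi for the cure fraction, S_u for the survival of uncured patients); only the 95% threshold is taken from the abstract itself.

```latex
% Net survival under a mixture cure model: a fraction \pi of patients is
% cured, the rest follow the survival function of the uncured, S_u(t).
S_n(t) = \pi + (1 - \pi)\, S_u(t)

% Probability of belonging to the cured group, given survival to time t:
P(\mathrm{cured} \mid T > t) = \frac{\pi}{S_n(t)}

% Time-to-cure as defined in the abstract: the first time at which this
% probability reaches 95%.
\mathrm{TTC} = \inf\Bigl\{\, t \ge 0 : \frac{\pi}{S_n(t)} \ge 0.95 \,\Bigr\}
```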
42

Discrete survival models with flexible link functions for age at first marriage among women in Swaziland

Nevhungoni, Thambeleni Portia 18 May 2019 (has links)
MSc (Statistics) / Department of Statistics / This study explores the use of flexible link functions in discrete survival models through a simulation study and an application to the Swaziland Demographic and Health Survey (SDHS) data. The objective is to perform simulation exercises comparing the effectiveness of different families of link functions and to construct a discrete multilevel survival model for age at first marriage among women in Swaziland using a flexible link function. The Pareto hazard model and the Pregibon and Gosset families of link functions were considered in models with and without unobserved heterogeneity. The Pareto model, in which the family parameter is estimated from the data, was found to outperform the other models, followed by the Pregibon and the Gosset families of link functions. The results from both the simulation study and the analysis of the SDHS data illustrate that misspecification of the link function biases the estimates, demonstrating the importance of choosing the right link. The findings reveal that women who are highly educated, who stay in the Manzini and Shiselweni regions, or who reside in urban areas were more likely to marry later than their counterparts in Swaziland. The results also reveal that the proportion of early first marriages is declining, since the difference among birth cohorts is found to be very large, with women of younger cohorts marrying later than women of older cohorts. / NRF
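A minimal sketch of the discrete-time survival setup the abstract describes: each woman is expanded into person-period records and the hazard is fitted as a binary GLM. The complementary log-log link below is a common stand-in, not the Pregibon or Gosset families the thesis actually studies, and the toy data and variable names are hypothetical.

```python
# Discrete-time survival as a binary regression on person-period data.
# The cloglog link is a common default; a flexible link family would
# replace it. Toy data and names are illustrative only.
import pandas as pd
import statsmodels.api as sm

# Toy sample: age at first marriage (or censoring age) and a covariate.
women = pd.DataFrame({
    "exit_age": [19, 23, 30, 25, 21, 28, 18, 26],
    "married":  [1, 1, 0, 1, 1, 0, 1, 1],   # 0 = censored at exit_age
    "urban":    [0, 1, 1, 0, 0, 1, 0, 1],
})

# Expand to one row per woman per year at risk (risk starts at age 15).
rows = []
for _, w in women.iterrows():
    for age in range(15, int(w["exit_age"]) + 1):
        rows.append({
            "age": age,
            "urban": w["urban"],
            "event": int(w["married"] == 1 and age == w["exit_age"]),
        })
pp = pd.DataFrame(rows)

# Discrete-time hazard model: P(first marriage at age t | still unmarried).
X = sm.add_constant(pp[["age", "urban"]].astype(float))
model = sm.GLM(pp["event"], X,
               family=sm.families.Binomial(link=sm.families.links.CLogLog()))
print(model.fit().summary())
```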
43

Properties of Hurdle Negative Binomial Models for Zero-Inflated and Overdispersed Count data

Bhaktha, Nivedita January 2018 (has links)
No description available.
44

The Impact of Consumer Behaviour on Technological Change and the Market Structure - An Evolutionary Simulation Study

Buschle, Nicole-Barbara 28 June 2002 (has links)
This thesis shows that consumers' behaviour has a decisive impact on the innovative behaviour of firms and on the development of industry. As a framework, an evolutionary simulation model is chosen, and market interactions are modelled according to a search theoretic approach.
45

Visual Analytics of Big Data from Molecular Dynamics Simulation

Rajendran, Catherine Jenifer Rajam 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Protein malfunction can cause human diseases, which makes proteins targets in the process of drug discovery. In-depth knowledge of how a protein functions contributes widely to understanding the mechanisms of these diseases. Protein function is determined by protein structure and its dynamic properties. Protein dynamics refers to the constant physical movement of the atoms in a protein, which may result in transitions between different conformational states; these conformational transitions are critically important for proteins to function. Understanding protein dynamics can help us understand and interfere with conformational states and transitions, and thus with protein function. If we can understand the mechanism of a protein's conformational transition, we can design molecules to regulate this process and thereby regulate protein function for new drug discovery. Protein dynamics can be simulated by molecular dynamics (MD) simulations. The MD simulation data generated are spatio-temporal and therefore very high-dimensional. In analyzing the data, distinguishing the various atomic interactions within a protein by interpreting their 3D coordinate values plays a significant role. Because the data are so large, the essential step is to develop more efficient algorithms to reduce dimensionality, along with user-friendly visualization tools for finding patterns and trends that are not usually attainable by traditional data-processing methods. Given the typically allosteric, long-range nature of the interactions that lead to large conformational transitions, pinpointing the underlying forces and pathways responsible for a global conformational transition at the atomic level is very challenging. To address these problems, various analytical techniques were applied to the simulation data to better understand the mechanism of protein dynamics at the atomic level, through a new program called Probing Long-distance Interactions by Tapping into Paired-Distances (PLITIP). PLITIP contains a set of new tools based on the analysis of paired distances, which removes the interference of the translation and rotation of the protein itself and can therefore capture absolute changes within the protein. Firstly, we developed a tool called Decomposition of Paired Distances (DPD). This tool generates a distance matrix of all paired residues from our simulation data; because it is built from paired distances, the matrix is not subject to the interference of the protein's translation or rotation and captures absolute changes within the protein. DPD then decomposes this matrix using Principal Component Analysis (PCA) to reduce dimensionality and to capture the largest structural variation. To showcase how DPD works, we analyzed two protein systems, HIV-1 protease and 14-3-3σ, both of which display substantial structural changes and conformational transitions in their MD simulation trajectories. In both cases, the largest structural variation and conformational transition were captured by the first principal component. In addition, structural clustering and ranking of representative frames by their PC1 values revealed the long-distance nature of the conformational transition and pinpointed the key candidate regions that might be responsible for the large conformational transitions.
Secondly, to facilitate identification of the long-distance path, we developed a tool called Pearson Coefficient Spiral (PCP), which generates and visualizes Pearson coefficients measuring the linear correlation between any two sets of residue pairs. PCP allows users to fix one residue pair and examine the correlation of its change with that of other residue pairs. Thirdly, we developed a set of visualization tools that generate paired atomic distances for the shortlisted candidate residues and capture significant interactions among them. The first is the Residue Interaction Network Graph for Paired Atomic Distances (NG-PAD), which not only generates paired atomic distances for the shortlisted candidate residues but also displays significant interactions as a network graph for convenient visualization. The second, the Chord Diagram for Interaction Mapping (CD-IP), maps the interactions onto protein secondary-structure elements to further narrow down the important interactions. The third, Distance Plotting for Direct Comparison (DP-DC), plots any two paired distances of the user's choice, at either the residue or the atomic level, to facilitate identification of similar or opposite patterns of distance change over simulation time. Together, the PLITIP tools enabled us to identify critical residues contributing to the large conformational transitions in both the HIV-1 protease and 14-3-3σ proteins. Besides this major project, a side project developing tools to study protein pseudo-symmetry is also reported. It has been proposed that symmetry provides protein stability, opportunities for allosteric regulation, and even functionality. This tool helps answer why proteins deviate from perfect symmetry and how that deviation can be quantified.
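A minimal sketch of the paired-distance idea behind DPD, assuming per-frame residue coordinates in a NumPy array; the function names and toy trajectory are illustrative, not PLITIP's actual interface.

```python
# Sketch of the DPD idea: build a translation/rotation-invariant
# paired-distance vector per trajectory frame, then reduce with PCA.
import numpy as np
from sklearn.decomposition import PCA

def paired_distances(frame):
    """Upper triangle of the residue-residue distance matrix for one frame.

    frame: (n_residues, 3) array of representative-atom coordinates.
    Pairwise distances are invariant to global translation and rotation.
    """
    diff = frame[:, None, :] - frame[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(frame), k=1)
    return dist[iu]

# Toy trajectory: 200 frames of 50 residues, a random walk for illustration.
rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(200, 50, 3)) * 0.05, axis=0) \
       + rng.normal(size=(50, 3))

X = np.stack([paired_distances(f) for f in traj])   # (frames, n_pairs)

# PC1 captures the largest structural variation across frames.
pca = PCA(n_components=2)
pc = pca.fit_transform(X)
print("explained variance ratios:", pca.explained_variance_ratio_)
```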
46

Performance of supertree methods for estimating species trees

Wang, Yuancheng January 2010 (has links)
Phylogenetics is the study of ancestor-descendant relationships among different groups of organisms, for example, species or populations of interest. The datasets involved are usually sequence alignments of various subsets of taxa for various genes. A major task in phylogenetics is to combine estimated gene trees from many sampled loci into an overall estimate of the species tree topology. Eventually, one can construct the tree of life that depicts the ancestor-descendant relationships for all known species around the world. If there is missing data or incomplete sampling in the datasets, then supertree methods can be used to assemble gene trees with different subsets of taxa into an estimated overall species tree topology. In this study, we assume that gene tree discordance is due solely to incomplete lineage sorting under the multispecies coalescent model (Degnan and Rosenberg, 2009). We examine the performance of the most commonly used supertree method (Wilkinson et al., 2009), matrix representation with parsimony (MRP), to explore its statistical properties in this setting. In particular, we show that MRP is not statistically consistent: as the number of gene trees increases, MRP becomes more likely to return an estimated species tree topology other than the true one. In some situations, using longer branch lengths, randomly deleting taxa, or even introducing mutation can improve the performance of MRP so that the matching species tree topology is recovered more often. In conclusion, MRP is a supertree method that can handle large amounts of conflict in the input gene trees; however, it is not statistically consistent when gene trees arising from the multispecies coalescent model are used to estimate species trees.
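A minimal sketch of the matrix-representation step of MRP, assuming gene trees are given as clade sets; the subsequent parsimony search, normally run in standard phylogenetics software, is omitted, and all names are illustrative.

```python
# MRP's matrix-representation step: each non-trivial clade of each gene
# tree becomes a binary character (1 = taxon inside the clade,
# 0 = taxon outside but present in that tree, ? = taxon absent).
# The combined matrix is then analyzed with parsimony (not shown).

def mrp_matrix(gene_trees, all_taxa):
    """gene_trees: list of (taxa_in_tree, clades), clades as frozensets."""
    columns = []
    for taxa_in_tree, clades in gene_trees:
        for clade in clades:
            col = {}
            for taxon in all_taxa:
                if taxon not in taxa_in_tree:
                    col[taxon] = "?"        # missing from this gene tree
                elif taxon in clade:
                    col[taxon] = "1"
                else:
                    col[taxon] = "0"
            columns.append(col)
    return columns

# Two toy gene trees on overlapping taxon sets:
#   tree 1: ((A,B),(C,D)) -> clades {A,B} and {C,D}
#   tree 2: ((A,C),E)     -> clade {A,C}
gene_trees = [
    ({"A", "B", "C", "D"}, [frozenset({"A", "B"}), frozenset({"C", "D"})]),
    ({"A", "C", "E"},      [frozenset({"A", "C"})]),
]
all_taxa = sorted({"A", "B", "C", "D", "E"})

matrix = mrp_matrix(gene_trees, all_taxa)
for taxon in all_taxa:
    print(taxon, "".join(col[taxon] for col in matrix))
```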
47

Estimation simplifiée de la variance dans le cas de l’échantillonnage à deux phases / Simplified variance estimation in two-phase sampling

Béliveau, Audrey 08 1900 (has links)
In this thesis we study the problem of variance estimation for the double expansion estimator and for calibration estimators under two-phase designs. We propose a variance decomposition different from the one usually used in two-phase sampling, which leads to a simplified variance estimator. We study the conditions under which the simplified variance estimators are valid, considering the following particular cases: (1) a Poisson design at the second phase, (2) a two-stage design, (3) simple random sampling without replacement at both phases, and (4) simple random sampling without replacement at the second phase. We show that a crucial condition for the validity of the simplified estimators under designs (1) and (2) is that the first-phase sampling fraction be negligible (or small). Under designs (3) and (4), we show that for some calibration estimators the simplified variance estimator is valid when the first-phase sampling fraction is small, provided the sample size is sufficiently large. Furthermore, we show that the simplified variance estimators can be obtained in an alternative way using the reversed approach (Fay, 1991; Shao and Steel, 1999). Finally, we conduct simulation studies to support the theoretical results.
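For context, a hedged sketch of the usual notation: the double expansion estimator and the classical two-phase variance decomposition that the thesis replaces with a different one. The subscripting conventions below are assumptions, not the thesis's own.

```latex
% Double expansion estimator of a population total in two-phase sampling:
% \pi_{1k} is the first-phase inclusion probability of unit k and
% \pi_{2k|s_1} its second-phase inclusion probability given the
% first-phase sample s_1.
\hat{Y}_{DE} = \sum_{k \in s_2} \frac{y_k}{\pi_{1k}\,\pi_{2k|s_1}}

% Classical two-phase variance decomposition (conditioning on s_1):
V\bigl(\hat{Y}_{DE}\bigr)
  = V_1\!\left[\, E_2\bigl(\hat{Y}_{DE} \mid s_1\bigr) \right]
  + E_1\!\left[\, V_2\bigl(\hat{Y}_{DE} \mid s_1\bigr) \right]
```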
48

Estimation utilisant les polynômes de Bernstein / Estimation using Bernstein polynomials

Tchouake Tchuiguep, Hervé 03 1900 (has links)
This thesis presents the Bernstein estimators, which are recent alternatives to the classical estimators of distribution functions and densities. More precisely, we study their various properties and compare them with those of the empirical distribution function and of the kernel estimator. We determine an asymptotic expression for the first two moments of the Bernstein estimator of the distribution function. As with the classical estimators, we show that this estimator satisfies the Chung-Smirnov property under certain conditions. We then show that the Bernstein estimator is better than the empirical distribution function in terms of mean squared error. Turning to the asymptotic behaviour of the Bernstein estimators, we show that, for a suitable choice of the degree of the polynomial, these estimators are asymptotically normal. Numerical studies on some classical distributions confirm that the Bernstein estimators may be preferable to the classical estimators.
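A sketch of the standard Bernstein estimator of a distribution function for data supported on [0,1], which is presumably the object studied here; the exact notation is assumed.

```latex
% Bernstein estimator of a distribution function F on [0,1]: smooth the
% empirical distribution function \hat{F}_n with Bernstein polynomials
% of degree m.
\hat{F}_{m,n}(x) = \sum_{k=0}^{m} \hat{F}_n\!\left(\frac{k}{m}\right)
                   \binom{m}{k} x^{k} (1-x)^{m-k}, \qquad x \in [0,1]

% Differentiating gives the corresponding Bernstein density estimator:
\hat{f}_{m,n}(x) = m \sum_{k=0}^{m-1}
    \left[ \hat{F}_n\!\left(\tfrac{k+1}{m}\right)
         - \hat{F}_n\!\left(\tfrac{k}{m}\right) \right]
    \binom{m-1}{k} x^{k} (1-x)^{m-1-k}
```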
49

Theoretical and practical considerations for implementing diagnostic classification models

Kunina-Habenicht, Olga 25 August 2010 (has links)
Cognitive diagnostic classification models (DCMs) have been developed to assess the cognitive processes underlying assessment responses. This dissertation provides theoretical and practical considerations for the estimation of DCMs in educational applications by investigating several important, underexplored issues. To avoid the problems associated with retrofitting DCMs to already existing data, test construction for DMA, a newly developed mathematics assessment for primary school, was based on a-priori defined Q-matrices. In this dissertation we compared DCMs with established psychometric models and investigated the incremental validity of DCM profiles over traditional IRT scores. Furthermore, we addressed the verification of the Q-matrix definition and examined the impact of invalid Q-matrix specification on item and respondent parameter recovery, as well as the sensitivity of selected fit measures. To address these issues, one simulation study and two empirical studies illustrating applications of several DCMs were conducted. In the first study, we applied DCMs in the general diagnostic modelling framework and compared those models to factor analysis models. In the second study, we implemented a complex simulation study investigating the implications of Q-matrix misspecification for parameter recovery and classification accuracy of DCMs in the log-linear framework. In the third study, we applied the results of the simulation study to a practical application based on data from 2032 students who took the DMA.
Presenting arguments for an additional gain of DCMs over traditional psychometric models remains challenging. We found only a negligible incremental validity of the multivariate proficiency profiles compared with the one-dimensional IRT ability estimate. Findings from the simulation study revealed that invalid Q-matrix specifications led to decreased classification accuracy, and that information-based fit indices were sensitive to strong model misspecifications.
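A hedged sketch of the item response function in the log-linear DCM framework (the LCDM) for an item whose Q-matrix row specifies two attributes; this is the standard textbook form, not an equation taken from the dissertation.

```latex
% LCDM item response function for item i measuring binary attributes
% \alpha_1 and \alpha_2 (as indicated by the item's Q-matrix row):
\operatorname{logit} P\bigl(X_{i} = 1 \mid \boldsymbol{\alpha}\bigr)
  = \lambda_{i,0}
  + \lambda_{i,1}\,\alpha_{1}
  + \lambda_{i,2}\,\alpha_{2}
  + \lambda_{i,12}\,\alpha_{1}\alpha_{2}

% Constraining \lambda_{i,1} = \lambda_{i,2} = 0 while keeping the
% interaction term yields a DINA-type item; dropping the interaction
% term gives a compensatory (additive) item.
```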
50

Evaluating the error of measurement due to categorical scaling with a measurement invariance approach to confirmatory factor analysis

Olson, Brent 05 1900 (has links)
It has previously been determined that using 3 or 4 points on a categorized response scale fails to produce a continuous distribution of scores. However, there has been no evidence, thus far, revealing the number of scale points that yields an approximately or sufficiently continuous distribution. This study provides evidence for the level of categorization at which discrete scales become directly comparable to continuous scales in terms of their measurement properties. To do this, we first introduced a novel procedure for simulating discretely scaled data that was both informed and validated through the principles of the Classical True Score Model. Second, we employed a measurement invariance (MI) approach to confirmatory factor analysis (CFA) to directly compare the measurement quality of continuously scaled factor models with that of discretely scaled models. The simulated design conditions varied with respect to item-specific variance (low, moderate, high), random error variance (none, moderate, high), and discrete scale categorization (the number of scale points ranged from 3 to 101). A population-analogue approach was taken with respect to sample size (N = 10,000). We conclude that there are conditions under which response scales with 11 to 15 points can reproduce the measurement properties of a continuous scale; using response scales with more than 15 points may be, for the most part, unnecessary. Scales with 3 to 10 points introduce a significant level of measurement error, and caution should be taken when employing such scales. The implications of this research and future directions are discussed.
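A minimal sketch of the simulation idea under the Classical True Score Model; the variance settings and the simple correlation check are illustrative assumptions, not the study's exact design or its MI/CFA analysis.

```python
# Generate continuous scores under the Classical True Score Model
# (X = T + E), discretize onto k scale points, and check how well the
# discrete scores preserve the continuous ones.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000                              # population analogue, as in the study

true_score = rng.normal(0.0, 1.0, n)    # T
error = rng.normal(0.0, 0.5, n)         # E, a moderate error variance
x = true_score + error                  # X = T + E

def discretize(scores, k):
    """Map continuous scores onto k equally spaced scale points (0..k-1)."""
    edges = np.linspace(scores.min(), scores.max(), k + 1)
    return np.digitize(scores, edges[1:-1])

for k in (3, 5, 11, 15, 101):
    r = np.corrcoef(x, discretize(x, k))[0, 1]
    print(f"{k:>3} scale points: correlation with continuous scores = {r:.4f}")
```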
