  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Investigating Crustal Deformation Associated With The North America-Pacific Plate Boundary In Southern California With GPS Geodesy

Spinler, Joshua C. January 2014 (has links)
The three largest earthquakes in the last 25 years in southern California occurred on faults located adjacent to the southern San Andreas fault, with the M7.3 1992 Landers and M7.1 1999 Hector Mine earthquakes occurring in the eastern California shear zone (ECSZ) in the Mojave Desert, and the M7.2 2010 El Mayor-Cucapah earthquake occurring along the Laguna Salada fault in northern Baja California, Mexico. The locations of these events, near to but not along the southern San Andreas fault (SSAF), are unusual in that the last major event on the SSAF occurred more than 300 years ago, with an estimated recurrence interval of 215 ± 25 years. The focus of this dissertation is to address the present-day deformation field along the North America-Pacific plate boundary in southern California and northern Baja California through the analysis of GPS data, together with elastic block and viscoelastic earthquake models, to determine fault slip rates and rheological properties of the lithosphere in the plate boundary zone. We accomplish this in three separate studies. The first study looks at how strain is partitioned northwards along strike from the southern San Andreas fault near the Salton Sea. We find that estimated slip rates on the southern San Andreas decrease from ~23 mm/yr in the south to ~8 mm/yr as the fault passes through San Gorgonio Pass to the northwest, while ~13-18 mm/yr of slip is partitioned onto NW-SE trending faults of the ECSZ where the Landers and Hector Mine earthquakes occurred. This speaks directly to San Andreas earthquake hazards, as a reduction in the slip rate would require more time between events to build up enough slip deficit to generate a large-magnitude earthquake. The second study focuses on inferring the rheological structure beneath the Salton Trough region. This is accomplished through analysis of postseismic deformation observed in GPS data collected before and after the 2010 El Mayor-Cucapah earthquake. By determining the slip rates on each of the major crustal faults prior to the earthquake, we are able to model the pre-earthquake velocity field for comparison with velocities measured at sites constructed post-earthquake. We then determine how individual site velocities have changed in the 3 years following the earthquake, with implications for the rate at which the lower crust and upper mantle viscously relax through time. We find that the viscosity of the lower crust is at least an order of magnitude higher than that of the uppermost mantle, and hypothesize that this is due to mafic material emplaced at the base of the crust as the spreading center developed beneath the Salton Trough since about 6 Ma. The final study investigates crustal deformation and fault slip rates for faults in the northern Mojave and southern Walker Lane regions of the ECSZ. Previous geodetic studies estimated slip rates roughly double those inferred via geological dating methods in this region for NW-striking strike-slip faults, but significantly smaller than geologic estimates for the Garlock fault. Through construction of a detailed elastic block model that includes only active fault structures, and application of a new, dense GPS velocity field in this region, we are able to estimate slip rates for the strike-slip faults in the ECSZ that are much closer to those reported from geology.
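As a rough illustration of how geodetic block models tie surface velocities to fault slip rates, the sketch below evaluates the classical screw-dislocation (Savage-Burford) interseismic velocity profile for a single locked strike-slip fault; the slip rate, locking depth, and site distances are illustrative values, not results from the dissertation.

```python
import numpy as np

def interseismic_velocity(x_km, slip_rate_mm_yr, locking_depth_km):
    """Fault-parallel surface velocity near a single strike-slip fault
    (Savage & Burford, 1973 screw-dislocation model).

    x_km: distance from the fault trace (signed).
    Returns velocity in mm/yr in a fault-centered reference frame.
    """
    return (slip_rate_mm_yr / np.pi) * np.arctan(np.asarray(x_km) / locking_depth_km)

# Example: a ~23 mm/yr fault locked to 15 km depth; velocities at hypothetical
# GPS sites located 5, 20, and 100 km from the trace (illustrative numbers only).
print(interseismic_velocity([5.0, 20.0, 100.0], 23.0, 15.0))
```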
2

A Universal Design for Learning (UDL) based Literature Circle (LC) model: effects on higher-order reading comprehension skills and student engagement in diverse sixth-eighth grade classrooms

Bendu, Charles Gibao 08 April 2015 (has links)
This study investigated outcomes related to students’ reading comprehension and higher-order critical thinking skills, and students’ academic and intellectual engagement, following the implementation of a literature circles pedagogical model based on the Three-Block Model of UDL. Fifty-nine (59) students attending three suburban public middle schools took part in the study. The study adopted a mixed-design approach to data collection and analysis, with quantitative data collected from all students and qualitative data collected from a purposively selected sub-sample of 24 students (12 in each of the treatment and control classes). Intervention and control groups were assessed pre- and post-intervention on measures of reading comprehension using classroom-based assessments, which were triangulated with qualitative data from pre- and post-intervention semi-structured student interviews exploring students’ academic and intellectual engagement. Quantitative data were analyzed using repeated-measures MANOVAs to determine treatment effects for both groups, while qualitative data were transcribed and analyzed thematically using a case study approach. Quantitative results showed a small but significant increase in reading comprehension outcomes for proficient and typical readers in treatment groups compared to their counterparts in control classes, and a significantly greater increase in reading comprehension outcomes for students in treatment classes who are culturally and linguistically diverse (CLD) and for struggling readers. These findings were corroborated by the qualitative results, which showed that students’ academic and intellectual engagement increased in the treatment classes for both proficient readers and struggling readers.
3

Stochastic Block Model Dynamics

Nithish Kumar Kumar (10725294) 29 April 2021 (has links)
The past few years have seen an increasing focus on fairness and the long-term impact of algorithmic decision making in the context of machine learning, artificial intelligence, and other disciplines. In this thesis, we model hiring processes in enterprises and organizations using dynamic mechanism design. Using a stochastic block model to simulate the workings of a hiring process, we study fairness and long-term evolution in the system.
We first present multiple results on a deterministic variant of our model, including convergence and an accurate approximate solution describing the state of the deterministic variant after any time period has elapsed. Using the differential equation method, it can be shown that this deterministic variant is in turn an accurate approximation of the evolution of our stochastic block model with high probability.
Finally, we derive upper and lower bounds on the expected state at each time step, and further show that in the long-term limit these upper and lower bounds themselves converge to the state evolution of the deterministic system. These results offer conclusions on the long-term behavior of our model, thereby allowing reasoning on how fairness in organizations could be achieved. We conclude that without sufficient, systematic incentives, under-represented groups will wane from organizations over time.
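The stochastic block model at the core of this setup can be sampled in a few lines. The sketch below is a generic Bernoulli SBM sampler with made-up block sizes and edge probabilities, not the specific hiring-dynamics parameterization of the thesis.

```python
import numpy as np

def sample_sbm(block_sizes, B, rng=None):
    """Sample an undirected adjacency matrix from a stochastic block model.

    block_sizes: list of community sizes, e.g. [60, 40].
    B: symmetric matrix of edge probabilities between blocks.
    """
    rng = np.random.default_rng(rng)
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    n = labels.size
    # Edge probability for each node pair is read off the block matrix.
    P = B[labels[:, None], labels[None, :]]
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int), labels

# Two groups with strong within-group and weak between-group connectivity
# (illustrative parameters, not taken from the thesis).
A, z = sample_sbm([60, 40], np.array([[0.30, 0.05], [0.05, 0.25]]), rng=0)
print(A.shape, A.sum() // 2, "edges")
```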
4

Coclustering for the analysis of pharmacovigilance massive datasets

Robert, Valérie 06 June 2017 (has links)
This thesis gathers methodological contributions to the statistical analysis of large pharmacovigilance datasets. These datasets produce large, sparse matrices, and these two characteristics are the main statistical challenges in modelling them. The first part of the thesis is dedicated to the coclustering of the pharmacovigilance contingency table using the normalized Poisson latent block model. The objective is, on the one hand, to provide pharmacologists with smaller areas of interest to explore more precisely and, on the other hand, to supply prior information that can be used when analysing the individual-level pharmacovigilance data. Within this framework, a partially Bayesian parameter estimation procedure for the model is detailed and model selection criteria are developed to choose the model best suited to the data. Because the datasets are so large, we also propose a procedure to explore the coclustering model space in a non-exhaustive but relevant way. Additionally, to assess the performance of the algorithms, a coclustering index that is computable in practice for large numbers of clusters is developed. These statistical tools are not specific to pharmacovigilance and can be useful for any coclustering analysis. The second part of the thesis addresses the statistical analysis of the individual-level data, which are more numerous but also richer in information. The objective is to form classes of individuals according to their drug profiles, together with subgroups of adverse effects and drugs possibly in interaction, thereby overcoming the coprescription and masking phenomena that can affect existing methods based on the contingency table. In addition, interactions between several adverse effects are taken into account. We therefore propose the multiple latent block model, which provides a simultaneous coclustering of the rows and columns of two binary data tables while imposing the same row partition on both. We discuss the assumptions inherent in this new model and state sufficient conditions for its identifiability. We then present a parameter estimation procedure and develop associated model selection criteria. Moreover, a simulation model for individual-level pharmacovigilance data is proposed, which makes it possible to compare the methods and study their limits. Finally, the proposed methodology for handling individual pharmacovigilance data is described and applied to a sample of the French pharmacovigilance database covering 2002 to 2010.
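For readers unfamiliar with the latent block model, the following sketch draws a toy drug-by-adverse-event contingency table from a plain (unnormalized) Poisson latent block model; the normalized variant used in the thesis additionally conditions on the table margins, and all parameters here are invented for illustration.

```python
import numpy as np

def sample_poisson_lbm(n_rows, n_cols, row_props, col_props, gamma, rng=None):
    """Sample a contingency table from a plain Poisson latent block model.

    row_props, col_props: mixing proportions of row and column classes.
    gamma: matrix of Poisson rates, one per (row class, column class) block.
    """
    rng = np.random.default_rng(rng)
    z = rng.choice(len(row_props), size=n_rows, p=row_props)   # row classes
    w = rng.choice(len(col_props), size=n_cols, p=col_props)   # column classes
    rates = gamma[z[:, None], w[None, :]]                      # per-cell Poisson rate
    return rng.poisson(rates), z, w

# Toy drug-by-adverse-event table with 2 row classes and 3 column classes
# (all parameters are illustrative).
gamma = np.array([[5.0, 0.5, 0.5],
                  [0.5, 3.0, 0.2]])
X, z, w = sample_poisson_lbm(100, 40, [0.6, 0.4], [0.3, 0.4, 0.3], gamma, rng=1)
print(X.shape, X.sum())
```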
5

Unsupervised random walk node embeddings for network block structure representation

Lin, Christy 25 September 2021 (has links)
There has been an explosion of network data in the physical, chemical, biological, computational, and social sciences in the last few decades. Node embeddings, i.e., Euclidean-space representations of nodes in a network, make it possible to apply tools and algorithms from multivariate statistics and machine learning, originally developed for Euclidean-space data, to network data. Random walk node embeddings are a class of recently developed node embedding techniques where the vector representations are learned by optimizing objective functions involving skip-bigram statistics computed from random walks on the network. They have been applied to many supervised learning problems such as link prediction and node classification and have demonstrated state-of-the-art performance. Yet, their properties remain poorly understood. This dissertation studies random walk based node embeddings in an unsupervised setting within the context of capturing hidden block structure in the network, i.e., learning node representations that reflect their patterns of adjacencies to other nodes. This doctoral research (i) develops VEC, a random walk based unsupervised node embedding algorithm, and a series of relaxations, and experimentally validates their performance for the community detection problem under the Stochastic Block Model (SBM); (ii) characterizes the ergodic limits of the embedding objectives to create non-randomized versions; and (iii) analyzes the embeddings for expected SBM networks and establishes certain concentration properties of the limiting ergodic objective in the large-network asymptotic regime. Comprehensive experimental results on real-world and SBM random networks are presented to illustrate and compare the distributional and block-structure properties of node embeddings generated by VEC and related algorithms. As a step towards theoretical understanding, it is proved that for the variants of VEC with ergodic limits and convex relaxations, the embedding Grammian of the expected network of a two-community SBM has rank at most 2. Further experiments reveal that these extensions yield embeddings whose distribution is Gaussian-like, centered at the node embeddings of the expected network within each community, and which concentrate in the linear degree-scaling regime as the number of nodes increases.
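As a rough sketch of the random-walk embedding idea, the code below generates uniform random walks, accumulates skip-bigram co-occurrence counts, and factorizes a log-count matrix with a truncated SVD. This is a simplified stand-in for the word2vec-style optimization used by VEC, not the algorithm itself; all function names and parameters are illustrative.

```python
import numpy as np

def random_walks(A, num_walks, walk_len, rng=None):
    """Uniform random walks on an undirected graph given by adjacency matrix A."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    neighbors = [np.flatnonzero(A[i]) for i in range(n)]
    walks = []
    for _ in range(num_walks):
        for start in range(n):
            walk = [start]
            for _ in range(walk_len - 1):
                nbrs = neighbors[walk[-1]]
                if nbrs.size == 0:
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

def embed(A, dim=2, window=5, num_walks=10, walk_len=40, rng=0):
    """Skip-bigram co-occurrence counts from walks, then a truncated SVD."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for walk in random_walks(A, num_walks, walk_len, rng):
        for i, u in enumerate(walk):
            for v in walk[max(0, i - window): i]:   # nodes within the window
                C[u, v] += 1
                C[v, u] += 1
    M = np.log1p(C)                                 # smoothed log co-occurrence
    U, S, _ = np.linalg.svd(M)
    return U[:, :dim] * np.sqrt(S[:dim])            # rank-`dim` node embeddings
```

On an SBM graph, the resulting embeddings could then be clustered with, say, k-means to attempt community recovery, mirroring the community detection experiments described above.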
6

Impact of sampling on structure inference in networks: application to seed exchange networks and to ecology

Tabouy, Timothée 30 September 2019 (has links)
In this thesis we are interested in studying the stochastic block model (SBM) in the presence of missing data. We propose a classification of missing data into two categories, Missing At Random and Not Missing At Random, for latent variable models, following the framework described by D. Rubin. In addition, we describe several network sampling strategies and their distributions. Inference for SBMs with missing data is carried out through an adaptation of the EM algorithm: EM with a variational approximation. Identifiability is established for several of the SBM variants with missing data, as well as consistency and asymptotic normality of the maximum-likelihood and variational-approximation estimators in the case where each dyad (pair of nodes) is sampled independently and with equal probability. We also consider SBMs with covariates, their inference in the presence of missing data, and how to proceed when covariates are not available to conduct the inference. Finally, all our methods are implemented in an R package available on CRAN, together with complete documentation on its use.
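A minimal sketch of the simplest sampling design mentioned above, in which every dyad is observed independently with the same probability (a Missing At Random design), might look like this; the function and variable names are illustrative, not taken from the R package.

```python
import numpy as np

def sample_dyads_mar(A, rho, rng=None):
    """Observe each dyad (node pair) independently with probability rho.

    Returns an adjacency matrix where unobserved dyads are np.nan,
    i.e. the simplest Missing-At-Random sampling design.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    observed = rng.random((n, n)) < rho
    observed = np.triu(observed, k=1)
    observed = observed | observed.T          # dyads are sampled symmetrically
    A_obs = np.where(observed, A.astype(float), np.nan)
    np.fill_diagonal(A_obs, np.nan)           # no self-loops observed
    return A_obs

# Example: observe 70% of the dyads of a small random graph (illustrative).
rng = np.random.default_rng(2)
A = (rng.random((20, 20)) < 0.1).astype(int)
A = np.triu(A, 1); A = A + A.T
print(np.isnan(sample_dyads_mar(A, 0.7, rng=3)).mean())
```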
7

Low-rank Matrix Estimation

Fan, Xing 01 January 2024 (has links) (PDF)
The first part of this dissertation focuses on matrix-covariate regression models. Although they have been studied in many existing works, classical statistical and computational methods for estimating the regression coefficients are strongly affected by high-dimensional matrix-valued covariates. To address these issues, we propose a framework of matrix-covariate regression models based on a low-rank constraint and an additional regularization for structured signals, covering both continuous and binary responses. In the second part, we examine a Mixture Multilayer Stochastic Block Model (MMLSBM), where layers can be grouped into sets of similar networks and each group of networks is endowed with a unique Stochastic Block Model. The objective is to partition the multilayer network into clusters of similar layers and identify communities within those layers. We present an alternative approach called the Alternating Minimization Algorithm (ALMA), which aims to simultaneously recover the layer partition and estimate the matrices of connection probabilities for the distinct layers. In the last part, we demonstrate the effectiveness of the projected gradient descent algorithm. First, its local convergence rate is independent of the condition number. Second, when the objective function is rank-2r restricted L-smooth and μ-strongly convex with L/μ < 3, projected gradient descent with an appropriate step size converges linearly to the solution. Moreover, a perturbed version of this algorithm effectively navigates away from saddle points, converging to an approximate solution or a second-order local minimizer across a wide range of step sizes. Furthermore, we establish that there are no spurious local minimizers in estimating asymmetric low-rank matrices when the objective function satisfies L/μ < 3.
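As an illustration of projected gradient descent on a low-rank set, the sketch below implements singular value projection for matrix completion: a gradient step on the observed entries followed by a truncated-SVD projection onto rank-r matrices. It is a generic textbook variant under invented toy data, not the specific objectives or step-size analysis of the dissertation.

```python
import numpy as np

def svp_matrix_completion(Y, mask, rank, step=1.0, iters=200):
    """Projected gradient descent (singular value projection) for low-rank
    matrix completion: gradient step on observed entries, then projection
    onto rank-`rank` matrices via a truncated SVD."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        grad = mask * (X - Y)                 # gradient of 0.5*||P_Omega(X - Y)||^2
        Z = X - step * grad
        U, S, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U[:, :rank] * S[:rank]) @ Vt[:rank]   # best rank-r approximation
    return X

# Toy example: recover a rank-2 matrix from 60% of its entries (illustrative).
rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = (rng.random(truth.shape) < 0.6).astype(float)
est = svp_matrix_completion(mask * truth, mask, rank=2)
print(np.linalg.norm(est - truth) / np.linalg.norm(truth))
```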
8

A MIP Approach for Community Detection in the Stochastic Block Model

BRENO SERRANO DE ARAUJO 04 November 2020 (has links)
The Degree-Corrected Stochastic Block Model (DCSBM) is a popular model for generating random graphs with community structure given an expected degree sequence. The standard approach of community detection algorithms based on the DCSBM is to search for the model parameters that are most likely to have produced the observed network data, via maximum likelihood estimation (MLE). Current techniques for the MLE problem are heuristics and therefore do not guarantee convergence to the optimum. We present mathematical programming formulations and exact solution methods that can provably find the model parameters and community assignments of maximum likelihood given an observed graph. We compare the proposed exact methods with classical heuristic algorithms based on expectation-maximization (EM). The solutions given by exact methods give us a principled way of recognizing when heuristic solutions are sub-optimal and of measuring how far they are from optimality.
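To make the notion of an exact maximum-likelihood fit concrete, the sketch below evaluates the profile log-likelihood of a plain (non-degree-corrected) Bernoulli SBM and finds the optimal assignment by exhaustive enumeration. This is only feasible for tiny graphs and is not the MIP formulation of the thesis, but it is exact in the same sense; all names are illustrative.

```python
import itertools
import numpy as np

def sbm_profile_loglik(A, z, k):
    """Profile log-likelihood of a Bernoulli SBM for assignment z:
    each block probability is replaced by its MLE, edges/pairs."""
    ll = 0.0
    for a in range(k):
        for b in range(a, k):
            ia, ib = np.flatnonzero(z == a), np.flatnonzero(z == b)
            if a == b:
                pairs = len(ia) * (len(ia) - 1) / 2
                edges = A[np.ix_(ia, ia)].sum() / 2   # each edge counted twice
            else:
                pairs = len(ia) * len(ib)
                edges = A[np.ix_(ia, ib)].sum()
            if pairs == 0:
                continue
            p = edges / pairs
            if 0 < p < 1:                              # 0*log(0) terms contribute 0
                ll += edges * np.log(p) + (pairs - edges) * np.log(1 - p)
    return ll

def exact_mle_assignment(A, k=2):
    """Exhaustive search over assignments: only feasible for tiny graphs,
    unlike the MIP methods discussed above, but exact in the same sense."""
    n = A.shape[0]
    best = max(itertools.product(range(k), repeat=n),
               key=lambda z: sbm_profile_loglik(A, np.array(z), k))
    return np.array(best)
```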
9

Estimation and model selection for the latent block model

Brault, Vincent 30 September 2014 (has links)
Classification aims to partition data sets into subsets that are as homogeneous as possible: the observations in a class should resemble one another more than they resemble observations in other classes. The problem becomes more difficult when the statistician wants to define groups on both the individuals and the variables. The latent block model defines a distribution for each crossing of a class of objects and a class of variables, and the observations are assumed to be independent conditionally on the choice of these classes. However, the joint distribution of the labels cannot be factorized, which prevents computation of the log-likelihood and the use of the EM algorithm. Several methods and criteria exist to recover these partitions, some frequentist, others Bayesian, some stochastic, others not. In this thesis, we first proposed sufficient conditions for the identifiability of the model. We then studied two algorithms proposed to circumvent the problem with the EM algorithm: VEM (Govaert and Nadif, 2008) and SEM-Gibbs (Keribin, Celeux and Govaert, 2010). In particular, we analyzed their combination and highlighted reasons why the algorithms degenerate (that is, return empty classes). By choosing judicious priors, we then proposed a Bayesian adaptation that limits this phenomenon. In particular, we used a Gibbs sampler, for which we propose a stopping criterion based on the Brooks-Gelman (1998) statistic. We also proposed an adaptation of the Largest Gaps algorithm (Channarond et al., 2012). Following their proofs, we showed that the resulting label and parameter estimators are consistent as the numbers of rows and columns tend to infinity. Furthermore, we proposed a method to select the numbers of row and column classes, whose estimates are also consistent provided the numbers of rows and columns are very large. To estimate the numbers of classes, we studied the ICL criterion (Integrated Completed Likelihood), for which we derived an exact form. After studying its asymptotic approximation, we proposed a BIC criterion (Bayesian Information Criterion), and we conjecture that the two criteria select the same results and that these estimates are consistent, a conjecture supported by theoretical and empirical results. Finally, we compared the different combinations and proposed a methodology for coclustering analysis of data.
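For concreteness, here is a bare-bones sketch of one SEM-Gibbs sweep for a Bernoulli latent block model, in the spirit of the algorithm analyzed above: sample row classes given column classes, then column classes given row classes, then update the parameters from the completed data. It includes no safeguard against the empty-class degeneracy discussed in the abstract, and the add-one smoothing in the parameter update is an arbitrary choice for the sketch.

```python
import numpy as np

def sem_gibbs_sweep(X, z, w, pi, rho, alpha, rng):
    """One SEM-Gibbs sweep for a Bernoulli latent block model on binary table X.
    z, w: current row and column class labels; pi, rho: class proportions;
    alpha[g, h]: Bernoulli parameter of block (g, h)."""
    n, d = X.shape
    G, H = len(pi), len(rho)
    # --- sample row classes given column classes ---
    for i in range(n):
        logp = np.log(pi).copy()
        for g in range(G):
            a = alpha[g, w]                       # per-column Bernoulli parameters
            logp[g] += np.sum(X[i] * np.log(a) + (1 - X[i]) * np.log(1 - a))
        p = np.exp(logp - logp.max()); p /= p.sum()
        z[i] = rng.choice(G, p=p)
    # --- sample column classes given row classes ---
    for j in range(d):
        logp = np.log(rho).copy()
        for h in range(H):
            a = alpha[z, h]
            logp[h] += np.sum(X[:, j] * np.log(a) + (1 - X[:, j]) * np.log(1 - a))
        p = np.exp(logp - logp.max()); p /= p.sum()
        w[j] = rng.choice(H, p=p)
    # --- update parameters (add-one smoothing keeps alpha strictly in (0, 1)) ---
    pi = np.bincount(z, minlength=G) / n
    rho = np.bincount(w, minlength=H) / d
    for g in range(G):
        for h in range(H):
            block = X[np.ix_(z == g, w == h)]
            alpha[g, h] = (block.sum() + 1) / (block.size + 2)
    return z, w, pi, rho, alpha
```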
10

How do principals support implementation of an inclusive school reform?

Epp, Brent A. 17 March 2015 (has links)
This qualitative study examines how principals support the implementation of the Three-Block Model of Universal Design for Learning (Katz, 2012a), a framework for inclusive school reform. Principals can support inclusive practice through the ways they use the systems and structures that fall under their control (Katz, 2012a). Instructional leadership also plays a crucial part in implementing inclusive school reform (Leithwood & Riehl, 2005). Data were collected through semi-structured interviews with five Manitoba principals involved in implementation of the Three-Block Model of UDL. Principals were asked about leadership and how they manage systems and structures under their control. Recommendations for practice are made, including the need for the school to be organized to support inclusive practice, for principals to make developing people a key task, and for principals to be highly involved in classroom instruction within the school.
