1

Predictive Accuracy of Linear Models with Ordinal Regressors

Modin Larsson, Jim January 2016 (has links)
This paper considers four approaches to ordinal predictors in linear regression and evaluates how they compare in predictive accuracy. The two most common treatments, dummy coding and classical linear regression on assigned level scores, are compared with two refined methods: penalized smoothed coefficients and a generalized additive model with cubic splines. A simulation study assesses all four on the basis of predictive performance. Our results show that the dummy-based methods surpass the numeric ones at small sample sizes; as the sample size increases, however, the differences between the methods diminish. Tendencies toward overfitting are identified among the dummy methods. We conclude that the choice of method should not only be context-driven but made in light of all the characteristics of the data.
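To make the contrast concrete, here is a minimal sketch of the two typical treatments on simulated data; the data-generating process, sample size, and number of levels are illustrative, not those of the paper's simulation study:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, levels = 200, 5

# Ordinal predictor whose effect on the response is mildly nonlinear,
# so equally spaced numeric scores are a slight misspecification.
x = rng.integers(1, levels + 1, size=n)
y = 0.8 * np.sqrt(x) + rng.normal(scale=0.3, size=n)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

# Treatment 1: numeric level scores (assumes equal spacing between levels).
num = LinearRegression().fit(x_tr.reshape(-1, 1), y_tr)
rmse_num = mean_squared_error(y_te, num.predict(x_te.reshape(-1, 1))) ** 0.5

# Treatment 2: dummy coding (one coefficient per level, no spacing assumption).
cats = range(1, levels + 1)
d_tr = pd.get_dummies(pd.Categorical(x_tr, categories=cats), drop_first=True, dtype=float)
d_te = pd.get_dummies(pd.Categorical(x_te, categories=cats), drop_first=True, dtype=float)
dum = LinearRegression().fit(d_tr, y_tr)
rmse_dum = mean_squared_error(y_te, dum.predict(d_te)) ** 0.5

print(f"numeric-score RMSE: {rmse_num:.3f}, dummy-coding RMSE: {rmse_dum:.3f}")
```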
2

Power Analysis of Continuous Data Capture in BeePi, a Solar-Powered Multi-Sensor Electronic Beehive Monitoring System for Langstroth Beehives

Shah, Keval 01 May 2017 (has links)
This thesis describes the power analysis of an electronic beehive monitoring system. The system was designed to run on either a UB12120 12 V 12 Ah standard lead-acid battery or an Anker (TM) Astro E7 5 V lithium-ion battery, and the power requirements of both configurations were analyzed. The batteries are recharged by a Renogy 50 W 12 V monocrystalline solar panel. Power analysis was performed with both batteries to estimate the system's efficiency. It indicates that the Anker (TM) Astro E7 26800 mAh 5 V lithium-ion battery runs the system for approximately six hours longer than the lead-acid battery. Moreover, the lithium-ion battery is more compact, lighter, more efficient, and has a longer cycle life. Using lithium-ion batteries will likely result in fewer hardware components and a smaller environmental footprint.
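As a rough illustration of the comparison, the sketch below estimates runtime from nominal battery capacity and an assumed average system draw; the draw and the usable-capacity fractions are hypothetical placeholders, not measurements from the thesis:

```python
# Back-of-the-envelope runtime comparison for the two batteries described
# above. AVG_DRAW_W and the usable-capacity fractions are illustrative
# assumptions, not values reported in the thesis.

AVG_DRAW_W = 4.0  # hypothetical average draw of the monitoring system, in watts

batteries = {
    # name: (nominal cell voltage V, capacity Ah, usable fraction of capacity)
    "UB12120 lead-acid": (12.0, 12.0, 0.50),    # lead-acid: ~50% usable depth of discharge
    "Anker Astro E7 Li-ion": (3.7, 26.8, 0.85),  # Li-ion cells are 3.7 V nominal; 26.8 Ah
}

for name, (volts, amp_hours, usable) in batteries.items():
    energy_wh = volts * amp_hours * usable
    runtime_h = energy_wh / AVG_DRAW_W
    print(f"{name}: ~{energy_wh:.0f} Wh usable -> ~{runtime_h:.1f} h at {AVG_DRAW_W} W")
```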
3

Adjusting for covariates in zero-inflated gamma and zero-inflated log-normal models for semicontinuous data

Mills, Elizabeth Dastrup 01 May 2013 (has links)
Semicontinuous data consist of a combination of a point mass at zero and a positively skewed continuous distribution. This type of non-negative distribution arises in many fields but presents unique challenges for analysis: the data cannot be modeled with strictly positive distributions, yet unbounded distributions are also likely a poor fit. Two-part models accommodate both the zero values and the positive continuous values. In this dissertation, we compare zero-inflated gamma (ZIG) and zero-inflated log-normal (ZILN) two-part models. For both, the probability that an outcome is non-zero is modeled via logistic regression; the distribution of the non-zero outcomes is then modeled via gamma regression with a log link for ZIG and via log-normal regression for ZILN.

We propose tests that combine the two parts of the ZIG and ZILN models in meaningful ways for a two-group comparison, and we compare these tests in terms of observed Type 1 error rates and power under both correctly specified and misspecified ZIG and ZILN models. Tests falling under two main hypotheses are examined. First, we consider two-part tests derived from a two-part hypothesis of no difference between the groups in either the probability of non-zero values or the mean of the non-zero values. The second type are mean-based tests, which combine the two parts of the model in ways related to the overall group means of the semicontinuous variable. When not adjusting for covariates, two tests are developed, based on a difference of means (DM) and a ratio of means (RM). When adjusting for covariates, mean-based tests are developed that marginalize over the values of the adjusting covariates; under this framework, two ratio-of-means statistics are proposed and examined: an average of the subject-specific ratios of means (RMSS) and a ratio of the marginal group means (RMMAR).

Simulations are used to compare Type 1 error and power for these tests and for standard two-group comparison tests. The results show that when the ZIG and ZILN models are misspecified and the coefficient of variation (CoV) and/or the sample size is large, Type 1 error and power differ between the misspecified and correctly specified models. Specifically, when ZILN data with high CoV or large sample size are analyzed as ZIG, Type 1 error rates are prohibitively high; conversely, when ZIG data are analyzed as ZILN under these scenarios, power is much lower for the ZILN analyses than for the ZIG analyses. Examination of Q-Q plots shows, however, that in these settings distinguishing between ZIG and ZILN data can be relatively straightforward. When the coefficient of variation is small, it is harder to distinguish between the models, but the differences in Type 1 error and power between misspecified and correctly specified models are also slight. Finally, we use the proposed methods to analyze a data set involving Parkinson's disease (PD) and driving; several of these methods show that PD subjects exhibit poorer lane-keeping ability than control subjects.
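The two-part structure just described maps directly onto off-the-shelf GLM software. Below is a minimal sketch of fitting a ZIG model on simulated data, assuming statsmodels is available; the covariate, coefficients, and sample size are illustrative, not those of the dissertation:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
X = sm.add_constant(x)

# Simulate semicontinuous outcomes: a logistic zero part and a gamma positive part.
p_nonzero = 1 / (1 + np.exp(-(0.5 + 0.8 * x)))
nonzero = rng.uniform(size=n) < p_nonzero
mu = np.exp(1.0 + 0.5 * x)  # gamma mean on the log scale
shape = 2.0
y = np.where(nonzero, rng.gamma(shape, mu / shape), 0.0)

# Part 1: logistic regression for P(Y > 0).
logit_fit = sm.Logit((y > 0).astype(float), X).fit(disp=0)

# Part 2: gamma regression with a log link on the positive outcomes only.
pos = y > 0
gamma_fit = sm.GLM(y[pos], X[pos],
                   family=sm.families.Gamma(link=sm.families.links.Log())).fit()

# The overall mean combines the two parts: E[Y] = P(Y > 0) * E[Y | Y > 0].
e_y = logit_fit.predict(X) * gamma_fit.predict(X)
print(logit_fit.params, gamma_fit.params, e_y[:5])
```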
4

Constraining 3D Petroleum Reservoir Models to Petrophysical Data, Local Temperature Observations, and Gridded Seismic Attributes with the Ensemble Kalman Filter (EnKF)

Zagayevskiy, Yevgeniy Unknown Date
No description available.
5

Překážkové modely v neživotním pojištění / Hurdle models in non-life insurance

Tian, Cheng January 2018 (has links)
Most articles present hurdle models only for count data; we are motivated to present hurdle models for semicontinuous data as well, since semicontinuous data are also common in non-life insurance. The thesis deals with the parameterization of various hurdle models for semicontinuous data, in addition to count data, in non-life insurance. The two components of a hurdle model are modeled separately: the hurdle component is modeled by logistic regression, and for semicontinuous data the continuous component is modeled by several different regressions. The parameters of each component are estimated by maximum likelihood. Model selection is discussed before the theoretical approaches are applied to vehicle insurance data. Finally, we obtain predicted values based on the fitted models; the predictions give insurance companies a general, though not precise, basis for setting premiums.
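As a sketch of the semi-continuous case, the following fits a hurdle model on simulated claim costs, assuming statsmodels: logistic regression for the hurdle and a log-normal model (OLS on log cost) for the positive part, combined into an expected pure premium. The simulated data and the age covariate are illustrative, not the thesis's vehicle insurance data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
age = rng.uniform(18, 70, size=n)
X = sm.add_constant((age - 40) / 10)

# Simulated claim costs: most policies have no claim (the "hurdle");
# positive claims are roughly log-normal.
p_claim = 1 / (1 + np.exp(-(-1.0 + 0.3 * X[:, 1])))
has_claim = rng.uniform(size=n) < p_claim
cost = np.where(has_claim, rng.lognormal(mean=6.0 + 0.2 * X[:, 1], sigma=0.8), 0.0)

# Hurdle component: logistic regression for P(cost > 0).
hurdle = sm.Logit((cost > 0).astype(float), X).fit(disp=0)

# Continuous component: OLS on log(cost) for the positive claims.
pos = cost > 0
lognorm = sm.OLS(np.log(cost[pos]), X[pos]).fit()
sigma2 = lognorm.scale  # residual variance estimate

# Pure premium per policy: P(claim) * E[cost | claim], using the
# log-normal mean exp(mu + sigma^2 / 2).
premium = hurdle.predict(X) * np.exp(lognorm.predict(X) + sigma2 / 2)
print(premium[:5].round(2))
```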
6

Continuous data assimilation for Navier-Stokes-alpha model = Assimilação contínua de dados para o modelo Navier-Stokes-alpha

Albanez, Débora Aparecida Francisco, 1984- 04 October 2014 (has links)
Advisors: Milton da Costa Lopes Filho, Helena Judith Nussenzveig Lopes / Doctoral thesis - Universidade Estadual de Campinas, Instituto de Matemática Estatística e Computação Científica / Abstract: Motivated by the presence of a finite number of determining parameters (degrees of freedom), such as modes, nodes, and local spatial averages for dissipative dynamical systems, especially the Navier-Stokes equations, we present in this thesis a new continuous data assimilation algorithm for the three-dimensional Navier-Stokes-alpha model, which consists of introducing a general type of approximating interpolation operator (constructed from observational measurements) into the Navier-Stokes-alpha equations. The main result provides conditions on the finite-dimensional spatial resolution of the collected data sufficient to guarantee that the approximating solution, built from these data, converges over time to the unknown reference solution (the physical reality). These conditions are given in terms of physical parameters such as the kinematic viscosity, the size of the domain, and the forcing term. / Doctorate in Mathematics
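The algorithm described in the abstract is a nudging scheme: a feedback term proportional to the mismatch between the observed interpolant and the computed solution is added to the model equations. Below is a minimal sketch of the same idea on the Lorenz-63 system (a standard dissipative toy model, not the Navier-Stokes-alpha equations), where only the first component is observed; the gain mu and step size are illustrative choices:

```python
import numpy as np

def lorenz(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

dt, steps, mu = 0.001, 50000, 50.0
truth = np.array([1.0, 1.0, 1.0])
assim = np.array([-5.0, 5.0, 20.0])  # deliberately wrong initial condition

for k in range(steps):
    truth = truth + dt * lorenz(truth)
    # Observe only the first component (a crude "interpolant" of the truth)
    # and nudge the assimilating solution toward it.
    obs = truth[0]
    feedback = np.array([mu * (obs - assim[0]), 0.0, 0.0])
    assim = assim + dt * (lorenz(assim) + feedback)

print("final error:", np.linalg.norm(truth - assim))
```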
7

Mise au point de techniques de traitement de données en continu pour l’identification des composantes de débit à l’exutoire des bassins versants urbains : étude de cas des bassins versants Django Rheinhardt et Ecully / Development of processing techniques for continuous data to identify flow components at the outlet of urban catchments : case study of Django Reinhardt and Ecully watershed

Dorval, Farah Altagracia 20 June 2011 (has links)
The objective of this thesis is to develop, test, and validate methods, techniques, and tools that can process and decompose hydrographs, in dry and wet weather, in order to understand, represent, and predict the dynamics associated with these flow components in urbanized watersheds. The methodology is based on rainfall and runoff data, including qualitative measures of the flow (conductivity, pH, and turbidity), acquired continuously as part of the Field Observatory for Urban Hydrology (OTHU) for two watersheds in Lyon: Django Reinhardt (Chassieu) and Ecully. The continuous data collected during dry weather from these two watersheds were analyzed using wavelet transforms. These methods, combined with signal-processing analysis, revealed intra- and inter-daily periodic components in the measured flows. These components were then characterized and used as the basis for a typology of dry-weather hydrographs for each study site. Methods, techniques, and tools for processing and analyzing data sets and calibrating rainfall-runoff models were then used to propose two models representing, respectively: (i) the component related to the runoff contribution for the two study sites, and (ii) the component related to parasitic water infiltration during rainfall events. The typology of dry-weather hydrographs, the rainfall-runoff model, and the infiltration-inflow model were implemented in a hydrological modeling platform called "Hydrobox". Simulated and measured flow values were then compared. The comparison results show the importance of taking into account the particular signature carried by each component in order to improve the understanding and representation of the dynamics related to hydrological processes in urbanized watersheds.
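Here is a minimal sketch of the wavelet-denoising step applied to a simulated dry-weather flow signal, assuming the PyWavelets package; the sampling step, mother wavelet, and thresholding rule are illustrative choices, not the exact settings of the thesis:

```python
import numpy as np
import pywt

# Simulated dry-weather flow: a daily cycle plus measurement noise.
# The 2-minute sampling step and noise level are illustrative.
t = np.arange(0, 7 * 24 * 60, 2) / 60.0  # one week, in hours
flow = (10 + 3 * np.sin(2 * np.pi * t / 24)
        + np.random.default_rng(3).normal(0, 0.8, t.size))

# Multi-level wavelet decomposition, soft-threshold the detail coefficients,
# then reconstruct: the classic wavelet denoising recipe.
coeffs = pywt.wavedec(flow, "db4", level=6)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745   # noise estimate from finest details
thresh = sigma * np.sqrt(2 * np.log(flow.size))  # universal threshold
den_coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(den_coeffs, "db4")[: flow.size]

print("noise std estimate:", round(sigma, 3))
```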
8

Design för översikt av kontinuerliga dataflöden : En studie om informationsgränssnitt för energimätning till hjälp för fastighetsbolag / Design for an overview of continuous data flows: A study of information interfaces for energy metering to support real estate companies

Pettersson, Emma, Karlsson, Johanna January 2018 (has links)
Software and interfaces are today a natural part of our everyday lives. Developing usable and successful interfaces is in companies' interest, as it can lead to more, and more satisfied, customers. The problem addressed in this report is based on user surveys conducted on an energy-monitoring information interface used by people in the real estate industry. The company that owns the software conducted a survey indicating that the software's usability needed improvement, and the project team was assigned to develop it further. The further development is based on DeLone and McLean's (2003) Information System Success Model as well as the concepts of information design, usability, and featuritis. From these, the theoretical background was created that formed the basis for the qualitative interviews and questionnaires, and also for the interface proposals finally presented in the project (see Figure 6). The results of the study showed that users and support staff had rather different experiences of the software. Further conclusions about how an information interface should be designed to support the user were the following: it should follow conventional design patterns and be consistent throughout the software; it should use an adapted and clear language; and it should either be so clear and intuitive that anyone can understand the software or offer a clear manual.
9

Statistical inference for joint modelling of longitudinal and survival data

Li, Qiuju January 2014 (has links)
In longitudinal studies, data collected within a subject or cluster are inherently correlated, and special care is needed to account for this correlation in the analysis. Within this framework, three topics are discussed in this thesis.

In Chapter 2, the joint modelling of a multivariate longitudinal process consisting of different types of outcomes is discussed. In the large cohort study of the UK North Staffordshire osteoarthritis project, longitudinal trivariate outcomes of continuous, binary, and ordinal data are observed at baseline, year 3, and year 6. Instead of analysing each process separately, joint modelling of the trivariate outcomes is proposed to account for their inherent association, by introducing random effects and a covariance matrix G. The influence of the covariance matrix G on statistical inference for the fixed-effects parameters is investigated within the Bayesian framework. The study shows that joint modelling of the multivariate longitudinal process reduces bias and provides more reliable results than modelling each process separately.

Along with longitudinal measurements taken intermittently, a counting process of events in time is often observed during a longitudinal study. It is of interest to investigate the relationship between time to event and the longitudinal process; on the other hand, measurements of the longitudinal process may be truncated by terminating events such as death. It may therefore be crucial to jointly model the survival and longitudinal data. A popular approach is to use linear mixed-effects models for a longitudinal process of continuous outcomes and a Cox regression model for the survival data, under certain standard assumptions. In Chapter 3, we investigate the influence on statistical inference for the survival data when the assumption of mutual independence of the random errors in the linear mixed-effects model is violated. The study uses the conditional score estimation approach, which provides robust estimators and is computationally advantageous. A generalised sufficient statistic for the random effects is proposed to account for the correlation remaining among the random errors, characterized by the data-driven method of modified Cholesky decomposition. The simulation study shows that this provides nearly unbiased estimation and efficient statistical inference.

Chapter 4 seeks to account for both the current and the past information of the longitudinal process in the survival part of the joint model. Over the last 15 to 20 years it has been popular, even standard, to assume that the longitudinal process affects the counting process of events only through its current value, which, as recognised in more recent studies, need not always be true. An integral over the trajectory of the longitudinal process, along with a weight curve, is proposed to account for both current and past information, improving inference and reducing the underestimation of the effects of the longitudinal process on the hazard. A plausible approach to statistical inference for the proposed models is presented, along with a real data analysis and a simulation study.
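The modified Cholesky decomposition used in Chapter 3 factors a covariance matrix Sigma as C D C', with C unit lower triangular (generalised autoregressive parameters) and D diagonal (innovation variances). Here is a minimal numpy sketch of that factorization, with an illustrative AR(1)-type covariance rather than data from the thesis:

```python
import numpy as np

def modified_cholesky(sigma):
    """Factor a covariance matrix as sigma = C @ D @ C.T, with C unit
    lower triangular and D diagonal, obtained by rescaling the columns
    of the standard Cholesky factor."""
    L = np.linalg.cholesky(sigma)  # standard Cholesky: sigma = L @ L.T
    d = np.diag(L)                 # square roots of the innovation variances
    C = L / d                      # rescale columns to get a unit diagonal
    D = np.diag(d ** 2)
    return C, D

# Illustrative AR(1)-like covariance for 4 repeated measures.
rho, m = 0.6, 4
sigma = rho ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))

C, D = modified_cholesky(sigma)
assert np.allclose(C @ D @ C.T, sigma)
print(np.round(C, 3), np.round(np.diag(D), 3), sep="\n")
```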
10

Étapes préliminaires à l’élaboration de systèmes d’aide au diagnostic automatisé de l’hypoxémie aigüe pédiatrique / Preliminary steps toward automated diagnostic decision-support systems for acute pediatric hypoxemia

Sauthier, Michaël Sébastien 08 1900 (has links)
Acute hypoxemic respiratory failure (AHRF) is one of the most frequent causes of admission to pediatric intensive care units. It is related to several mechanisms, the most serious of which is lesional pulmonary edema leading to pediatric acute respiratory distress syndrome (ARDS), which accounts for 5-10% of patients admitted to intensive care. Currently, international guidelines for the management of ARDS are under-implemented owing to missed or late diagnosis, which is probably partly responsible for prolonged mechanical ventilation in pediatric ARDS.
In order to improve the criteria for assessing AHRF in children, and possibly their outcome, we aimed to improve the early diagnosis of ARDS in children; to automate an organ-failure severity score (the PELOD-2 score) that can be used as a primary endpoint in research in place of mortality, which is low in this population; and to predict prolonged ventilation in the most fragile population, neonates. To achieve these objectives, we: 1) optimized a unique high-temporal-resolution database; 2) validated a continuous oxygenation index usable in real time and robust across all values of pulse oximetry; 3) validated a computerized version of the PELOD-2 score usable as a primary outcome in research; 4) developed a predictive model of persistent AHRF due to influenza; 5) proposed a definition of prolonged ventilation in pediatrics applicable regardless of the age and term of the child; and 6) studied the outcome of newborns with prolonged ventilation and proposed a predictive model of the most severe subgroup. The methods used across these studies combined big-data techniques for clustering, synchronization, and normalization of continuous data. We also used descriptive statistics, linear and logistic regression, random forests and their derivatives, deep learning, and empirical optimization of mathematical equations to develop and validate predictive models. The interpretation of the models and the importance of each variable were quantified either by analyzing their coefficients (conventional statistics) or by permuting or masking the variables in the case of machine-learning models. In conclusion, this work on the automatic recognition and prognostication of AHRF in children will allow me to develop, validate, and implement a real-time decision-support system for AHRF in pediatrics.
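As an illustration of a continuous, saturation-based oxygenation marker (the thesis's own validated index is not reproduced here), the sketch below computes the standard SpO2/FiO2 (S/F) ratio on simulated minute-by-minute monitor data; the masking rule and the screening cut-off are commonly cited conventions, and all values are illustrative:

```python
import numpy as np

def sf_ratio(spo2, fio2):
    """SpO2/FiO2 (S/F) ratio, a standard non-invasive oxygenation marker.
    SpO2 in percent, FiO2 as a fraction (0.21-1.0). SpO2 values above 97%
    are masked because the SpO2-PaO2 relationship flattens there."""
    spo2 = np.asarray(spo2, dtype=float)
    fio2 = np.asarray(fio2, dtype=float)
    sf = spo2 / fio2
    return np.where(spo2 > 97, np.nan, sf)

# One hour of minute-by-minute monitor data (illustrative values).
spo2 = np.clip(94 + np.random.default_rng(4).normal(0, 2, 60), 80, 100)
fio2 = np.full(60, 0.40)

sf = sf_ratio(spo2, fio2)
# A commonly cited screening cut-off: S/F <= 264 roughly corresponds
# to PaO2/FiO2 <= 300 (ARDS range).
flags = int(np.nansum(sf <= 264))
print(f"median S/F: {np.nanmedian(sf):.0f}, minutes flagged: {flags}")
```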
