181 |
Characterization and mitigation of radiation damage on the Gaia Astrometric Field. Brown, Scott William, January 2011.
In November 2012, the European Space Agency (ESA) is planning to launch Gaia, a mission designed to measure the astrometric properties of over a billion stars with microarcsecond accuracy. Microarcsecond astrometry requires extremely accurate positional measurements of individual stellar transits on the focal plane, which can be disrupted by radiation-induced Charge Transfer Inefficiency (CTI). Gaia will suffer radiation damage that impacts the science performance, which has led to a series of Radiation Campaigns (RCs) being carried out by industry to investigate these issues. The goal of this thesis is to rigorously assess these campaigns and to establish how CTI should be handled in the data processing. We begin in Chapter 1 by giving an overview of astrometry and photometry, introducing the concept of stellar parallax, and establishing why observing from space is paramount for performing global, absolute astrometry. As demonstrated by Hipparcos, the concept is sound. After reviewing the Gaia payload and discussing how astrometric and photometric parameters are determined in practice, we introduce the issue of radiation-induced CTI and how it may be dealt with. The on-board mitigation strategies are investigated in detail in Chapter 2. Here we analyse the effects of radiation damage as a function of magnitude with and without a diffuse optical background, charge injection and the use of gates, and also discover a number of calibration issues. Some of these issues are expected to be removed during flight testing; others will have to be dealt with as part of the data processing, e.g. CCD stitches and the charge injection tail. In Chapter 3 we turn to the physical properties of a Gaia CCD. Using data from RC2 we probe the density of traps (i.e. damaged sites) in each pixel and, for the first time, measure the Full Well Capacity of the Supplementary Buried Channel, a part of every Gaia pixel that constrains the passage of faint signals away from the bulk of traps throughout the rest of the pixel. The Data Processing and Analysis Consortium (DPAC) is currently adopting a 'forward modelling' approach to calibrate radiation damage in the data processing. This incorporates a Charge Distortion Model (CDM), which is investigated in Chapter 4. We find that although the CDM performs well, there are a number of degeneracies in the model parameters, which may be probed further with better experimental data and a more realistic model. Another way of assessing the performance of a CDM is explored in Chapter 5. Using a Monte Carlo approach we test how well the CDM can extract accurate image parameters. It is found that the CDM must be highly robust to achieve even a moderate degree of accuracy, and that the fitting is limited by assigning finite window sizes to the image shapes. Finally, in Chapter 6 we summarise our findings on the campaign analyses, the on-board mitigation strategies and how well we are currently able to handle radiation damage in the data processing.
|
182 |
Metody geomarketingu / Geomarketing methods. Voráč, Michal, January 2014.
The aim of this application-oriented master's thesis is to demonstrate the benefits of combining data analysis techniques with geodata processing to support business decisions. Two solutions are presented in conclusion, both more attractive than the starting situation: the first is oriented towards maximising the success ratio across transactions, while the second is oriented towards the business value of each transaction. The R programming language is used extensively throughout the thesis, together with ArcGIS Online in its final part.
|
183 |
Možnosti využitia Business Intelligence nástrojov v cloude / Possibilities of using Business Intelligence tools in the cloud. Roman, Martin, January 2012.
The thesis is devoted to Business Intelligence tools in the cloud, one of the main trends in this area. Cloud BI tools became widely used once companies began to realise the importance of data analysis for gaining a competitive advantage. The high cost of implementing traditional BI led companies to look for tools that can be implemented and operated at significantly lower cost and outside their own infrastructure. The practical part of the thesis is focused on the analysis and comparison of currently available cloud BI tools. The analysis is based on practical examples intended to test the selected solutions for use by small and medium enterprises. Each selected solution is analysed from several points of view and rated. Besides this analysis, the other main goal of the thesis is to identify the best solution for companies and to define the potential benefits and limitations that may follow from its implementation and operation.
|
184 |
Análise de dados utilizando a medida de tempo de consenso em redes complexas / Data analysis using the consensus time measure for complex networks. Huertas Lopez, Jean Pierre, 30 March 2011.
Networks are powerful representations for many complex systems, where nodes represent elements of the system and edges represent connections between them. Complex networks can be defined as large-scale graphs with a non-trivial distribution of connections. An important topic in complex networks is community detection. Although community detection has yielded good results in data clustering analysis with groups of diverse shapes, there are still some difficulties in representing a data set as a network. Another recent topic is the characterisation of simplicity in complex networks. Few studies have been reported in this area; however, the topic is highly relevant, since it allows the simplicity of the structure of connections in a region of nodes, or in the entire network, to be analysed. Moreover, by analysing the simplicity of dynamic networks over time, it is possible to understand how the network evolves in terms of simplicity. Considering the network as a coupled dynamical system of agents, this work proposes a distance measure based on the consensus time in the presence of a leader in a coupled network. Using this distance measure, a community detection method for data clustering analysis and a simplicity analysis method for complex networks are proposed. Furthermore, a technique for building sparse networks for data clustering is proposed. The methods have been tested with artificial and real data, obtaining promising results.
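The abstract does not spell out the exact consensus dynamics or the precise definition of the consensus-time distance, so the sketch below is only one plausible reading of the idea: pin a leader node, run discrete-time consensus updates over the network, and record how long each node takes to approach the leader's state. The update rule, step size, tolerance and toy graph are all assumptions made for illustration.

```python
import numpy as np

def consensus_times(A, leader, eps=0.1, tol=1e-3, max_steps=10_000):
    """Run leader-follower consensus x <- x - eps * L x with the leader pinned
    at 1, and record for each node the first step at which it comes within
    tol of the leader's value. These times act as a distance-like measure
    from every node to the chosen leader."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A              # combinatorial graph Laplacian
    x = np.zeros(n)
    x[leader] = 1.0
    times = np.full(n, np.inf)
    times[leader] = 0.0
    for t in range(1, max_steps + 1):
        x = x - eps * (L @ x)
        x[leader] = 1.0                         # the leader's state never changes
        newly = (times == np.inf) & (np.abs(x - 1.0) < tol)
        times[newly] = t
        if np.all(times < np.inf):
            break
    return times

# toy example: two triangles joined by a single bridge edge (2-3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# nodes in the leader's own triangle converge earlier than nodes across the
# bridge, so the recorded times separate the two communities
print(consensus_times(A, leader=0))
```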
|
185 |
Identifying the factors that affect the severity of vehicular crashes by driver age. Tollefson, John Dietrich, 01 December 2016.
Vehicular crashes are the leading cause of death for young adult drivers; however, very little life-course research focuses on drivers in their 20s. Moreover, most analyses of crash data are limited to simple correlation and regression analysis. This thesis proposes a data-driven approach that uses machine-learning techniques to further enhance the quality of analysis.
We examine over 10 years of data from the Iowa Department of Transportation, transforming it into a format suitable for analysis. From there, crashes are discretized by the ages of the drivers involved. In doing this, we hope to better uncover the relationship between driver age and the factors present in a given crash.
We use machine-learning algorithms to determine the important attributes for each age group, with the goal of improving the predictive power of individual methods (a sketch of this idea follows below). The thesis follows a Knowledge Discovery workflow: the data are preprocessed and transformed into a usable state, after which data mining is performed to discover results and produce knowledge.
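The abstract does not name the specific machine-learning algorithms used, so the following is only an illustrative sketch of ranking important attributes per age group; the file name, column names, age bins, and the choice of a random forest are assumptions, not the thesis's actual pipeline.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# hypothetical crash records: file layout, column names, and age bins are illustrative only
crashes = pd.read_csv("iowa_crashes.csv")
crashes["age_group"] = pd.cut(crashes["driver_age"],
                              bins=[14, 19, 24, 34, 54, 120],
                              labels=["teen", "20-24", "25-34", "35-54", "55+"])

features = [c for c in crashes.columns
            if c not in ("severity", "driver_age", "age_group")]

# fit one model per age group and rank the attributes it relies on
for group, subset in crashes.groupby("age_group", observed=True):
    X = pd.get_dummies(subset[features])        # one-hot encode categorical attributes
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X, subset["severity"])
    ranked = pd.Series(model.feature_importances_, index=X.columns)
    print(group, ranked.sort_values(ascending=False).head(5), sep="\n")
```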
We hope to use this knowledge to improve predictive performance for the different driver age groups, using around 60 variables for most data sets and 10 variables for some. We also explore future directions in which these data could be analyzed.
|
186 |
A joint model of an internal time-dependent covariate and bivariate time-to-event data with an application to Muscular Dystrophy Surveillance, Tracking and Research Network data. Liu, Ke, 01 December 2015.
Joint modeling of a single event time response with a longitudinal covariate dates back to the 1990s. The three basic types of joint modeling formulations are selection models, pattern mixture models and shared parameter models, of which shared parameter models are the most widely used. One type of shared parameter model (Joint Model I) uses unobserved random effects to jointly model a longitudinal sub-model and a survival sub-model, assessing the impact of an internal time-dependent covariate on the time-to-event response.
Motivated by the Muscular Dystrophy Surveillance, Tracking and Research Network (MD STARnet), we constructed a new model (Joint Model II) to jointly analyze correlated bivariate time-to-event responses associated with an internal time-dependent covariate in the Frequentist paradigm. This model exhibits two distinctive features: 1) a correlation between the bivariate time-to-event responses and 2) a time-dependent internal covariate in both survival models. Developing a model that sufficiently accommodates both characteristics poses a challenge. To address this challenge, in addition to the random variables that account for the association between the time-to-event responses and the internal time-dependent covariate, a Gamma frailty random variable was used to account for the correlation between the two event time outcomes. To estimate the model parameters, we adopted the Expectation-Maximization (EM) algorithm. We built a complete joint likelihood function with respect to both latent variables and observed responses. The Gauss-Hermite quadrature method was employed to approximate the two-dimensional integrals in the E-step of the EM algorithm, and a maximum profile likelihood estimation method was implemented in the M-step. The bootstrap method was then applied to estimate the standard errors of the estimated model parameters. Simulation studies were conducted to examine the finite sample performance of the proposed methodology. Finally, the proposed method was applied to MD STARnet data to assess the impact of shortening fractions and steroid use on the onsets of scoliosis and mental health issues.
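One building block mentioned here, two-dimensional Gauss-Hermite quadrature for integrals over latent random effects, can be illustrated in isolation. The sketch below approximates an expectation over a bivariate normal random effect; the test function and covariance matrix are made up for the sanity check, and the full joint likelihood of the thesis is not reproduced.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gh_expectation_2d(g, Sigma, n_nodes=15):
    """Approximate E[g(b)] for b ~ N(0, Sigma) (bivariate) with a tensor
    product of Gauss-Hermite rules, the same kind of approximation used for
    the two-dimensional integrals in an E-step over latent random effects."""
    x, w = hermgauss(n_nodes)                   # nodes/weights for the exp(-x^2) weight
    L = np.linalg.cholesky(Sigma)
    total = 0.0
    for xi, wi in zip(x, w):
        for xj, wj in zip(x, w):
            b = np.sqrt(2.0) * L @ np.array([xi, xj])   # map nodes to N(0, Sigma)
            total += wi * wj * g(b)
    return total / np.pi

# sanity check: E[exp(b1 + b2)] for a correlated bivariate normal has the
# closed form exp(0.5 * var(b1 + b2))
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
approx = gh_expectation_2d(lambda b: np.exp(b[0] + b[1]), Sigma)
exact = np.exp(0.5 * (Sigma[0, 0] + Sigma[1, 1] + 2 * Sigma[0, 1]))
print(approx, exact)   # the two values should agree to several decimals
```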
|
187 |
Panel data analysis of fuel price elasticities to vehicle-miles traveled for first-year participants of the national evaluation of a mileage-based road user charge study. Hatz, Charles Nicholas, II, 01 July 2011.
The impact of fuel price changes can be seen in practically all sectors of the United States economy. Fuel prices directly and indirectly influence the daily life of most Americans, and the national economy, as well as the high standard of living we have come to enjoy in the United States, runs on gasoline. Since the late 1990s the days of cheap oil and $1.00 gallons of gas have clearly been over, and understanding the influence of fuel prices is more important now than ever. Since 1998, regular gasoline prices have increased by $0.22 per gallon per year on average, with little evidence suggesting this trend will slow down or reverse substantially. This drastic and permanent change to the status quo of fuel prices has potentially rendered traditional knowledge of fuel price elasticities inapplicable to current analysis. Obtaining accurate measures of fuel price elasticities is important because they serve as a measure of personal mobility and can be related to the quality of life the public is experiencing. Price elasticities are also used in determining the future revenue available for surface transportation projects. Traditionally, short-run fuel price elasticities are thought to be inelastic, allowing transportation agencies to largely ignore short-run fuel price changes when planning future projects and evaluating their economic feasibility. Using driving data collected from the National Evaluation of a Mileage-based Road User Charge Study, the fuel price elasticity of vehicle-miles traveled (VMT), as well as the sensitivity to gas prices relative to a historical high price, was estimated for the first-year study participants using a panel data set approach with linear regression (a sketch of this kind of estimate is given below). The short-run fuel price elasticity of VMT was determined to be -1.71, with a range of -1.93 to -1.48. The elasticities found are substantially larger in magnitude than the average short-run fuel price elasticity of -0.45, but can be rationalized by the impact that poor economic conditions, as well as the historically high fuel prices experienced prior to the research's time frame, had on individuals' driving behavior. The results suggest that current short-run elasticities are not inelastic; if this trend continues, transportation agencies must re-evaluate how they predict the future funding available for surface transportation projects.
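As a rough illustration of estimating a fuel price elasticity from panel data by linear regression, a minimal sketch follows. The log-log fixed-effects specification, the file and column names, and the use of statsmodels are assumptions for illustration, not the study's exact model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical panel: one row per participant per month; column names assumed
panel = pd.read_csv("ruc_study_panel.csv")
panel["log_vmt"] = np.log(panel["vmt"])
panel["log_price"] = np.log(panel["fuel_price"])

# log-log specification with participant and month fixed effects: the
# coefficient on log_price is read directly as the short-run fuel price
# elasticity of VMT
model = smf.ols("log_vmt ~ log_price + C(participant_id) + C(month)",
                data=panel).fit(cov_type="cluster",
                                cov_kwds={"groups": panel["participant_id"]})
print(model.params["log_price"])
```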
|
188 |
Vitesses de convergence en inférence géométrique / Rates of Convergence for Geometric Inference. Aamari, Eddie, 01 September 2017.
Some datasets exhibit non-trivial geometric or topological features that can be interesting to infer. This thesis deals with non-asymptotic rates for the estimation of various geometric quantities associated with a submanifold M ⊂ R^D. In each setting, we are given an i.i.d. n-sample with common distribution P supported on M. We study the optimal rates of estimation of the submanifold M itself for the loss given by the Hausdorff distance, of the reach τ_M, of the tangent space T_X M, and of the second fundamental form II_X^M, for X ∈ M both deterministic and random. The rates are given in terms of the sample size n, the intrinsic dimension of M, and its smoothness. In the process, we obtain stability results for existing reconstruction techniques, a denoising procedure, and results on the geometry of the reach τ_M. An extension of Assouad's lemma is presented, allowing minimax lower bounds to be derived in singular frameworks.
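For reference, the minimax risk that "optimal rates of estimation for the Hausdorff loss" refers to can be written as follows; the model class notation is assumed, since the abstract does not fix it.

```latex
% Minimax risk for estimating the support M of P under the Hausdorff loss.
% The class \mathcal{P} below is a placeholder for a statistical model of
% distributions supported on d-dimensional submanifolds of R^D (its exact
% definition varies across the settings of the thesis):
\[
  R_n \;=\; \inf_{\hat{M}} \, \sup_{P \in \mathcal{P}}
  \; \mathbb{E}_{P^{\otimes n}} \bigl[ d_H(M, \hat{M}) \bigr],
\]
% where the infimum runs over all estimators \hat{M} built from the n-sample
% and d_H is the Hausdorff distance. Non-asymptotic upper and lower bounds on
% R_n, expressed in terms of n, the intrinsic dimension of M and its
% smoothness, are the kind of results the abstract refers to.
```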
|
189 |
High-dimensional statistical data integration. Qu, Zhe, January 2019.
Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A representative model for the integrative analysis of multiple data types is to decompose each data matrix into a low-rank common-source matrix generated by latent factors shared across all data types, a low-rank distinctive-source matrix corresponding to each data type, and an additive noise matrix. We propose a novel decomposition method, called decomposition-based generalized canonical correlation analysis, which appropriately defines those matrices by imposing a desirable orthogonality constraint on the distinctive latent factors so that the common latent factors are sufficiently captured. To further delineate the common and distinctive patterns between two data types, we propose another new decomposition method, called common and distinctive pattern analysis. This method takes into account the common and distinctive information between the coefficient matrices of the common latent factors. We develop consistent estimation approaches for both proposed decompositions under high-dimensional settings and demonstrate their finite-sample performance via extensive simulations. We illustrate the superiority of the proposed methods over the state of the art with real-world data examples obtained from The Cancer Genome Atlas and the Human Connectome Project.
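The decomposition described here can be written schematically as follows; the symbols are chosen for illustration and are not necessarily the notation used in the dissertation.

```latex
% Schematic form of the decomposition for K data types measured on the same
% n objects (notation assumed for illustration). Each data matrix
% X_k (n x p_k) splits into a common-source part driven by latent factors Z
% shared across data types, a distinctive-source part driven by type-specific
% factors Z_k, and noise:
\[
  X_k \;=\; \underbrace{Z W_k^{\top}}_{\text{common source}}
        \;+\; \underbrace{Z_k V_k^{\top}}_{\text{distinctive source}}
        \;+\; E_k ,
  \qquad k = 1, \dots, K,
\]
% with the orthogonality constraint on the distinctive factors
% (e.g. Z^{\top} Z_k = 0) ensuring that the shared factors Z capture all of
% the common variation.
```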
|
190 |
Three Essays on Firm Responses to Climate Change. January 2020.
Evidence is mounting of the need to address and reverse the effects of environmental neglect. Perhaps the strongest evidence of the need for environmental stewardship comes from ever more frequent extreme weather events, ranging from the deadly wildfires scorching Greece and California to the extreme heatwaves in Japan. Scientists have concluded that rising global temperatures contributed to the probability and severity of about two thirds of such extreme natural events that occurred between 2004 and 2018.
The operations management literature on environmental issues has typically focused on the "win-win" approach, with a multitude of papers investigating a link between sustainability and firm performance. This dissertation takes a different approach by investigating firm responses to climate change. The first two essays explore firm emissions goals and the last essay investigates firm emissions performance.
The first essay identifies firm-level determinants of greenhouse gas (GHG) reduction targets. The essay leverages the Behavioral Theory of the Firm (BTOF) and argues for two additional determinants, Data Stratification and Science-Based Targets, unique to GHG emissions. Utilizing system generalized method of moments on a dataset from the Carbon Disclosure Project for the years 2011-2017, the paper finds partial confirmation of the BTOF and support for the two additional determinants of firm GHG emission goals.
The second essay is an exploratory study that seeks to understand the factors behind firm participation in the Science-Based Targets (SBT) initiative by combining primary and secondary data analysis. The study is a working paper, with the primary data collection still to be completed. The secondary data analysis begins with a review of the literature, which suggested four potential factors: ISO 14001 certification, Customer Engagement, Emission Credit Purchases, and the presence of Absolute Emissions Targets. Preliminary results using panel logistic regression suggest that Emission Credit Purchases and Absolute Emissions Targets influence SBT participation.
The third essay seeks to understand whether stakeholder pressure drives firm GHG emissions reductions. It relies on Stakeholder Theory and classification schemes proposed in the management literature to divide stakeholders, based on their relationship with the firm, into three groups: primary, secondary, and public. Random effects estimation results provide evidence that primary and public stakeholder pressure affects firm GHG emissions. Doctoral Dissertation, Business Administration, 2020.
|