41 |
Panel Regression Models for Causal Analysis in Structural Equation Modeling: Recent Developments and Applications
Andersen, Henrik Kenneth Bent Axel 08 September 2022
Establishing causal relationships is arguably the most important task of the social sciences. While the relationship between the social sciences and the concept of causality has been rocky, the randomized experiment gives us a concrete definition of a causal effect as the difference in outcomes due to the researcher's intervention. However, many interesting questions cannot be easily examined using experiments. Feasibility and ethics limit the use of randomized experiments in some situations and retrospective questions, i.e., working from the observed outcome to uncover the cause, require a different logic. Observational studies in which we observe pairs of variables without any intervention lend themselves to such situations but come with many difficulties. That is, it is not immediately clear whether an observed relationship between two variables is due to a true causal effect, or whether the relationship is due to other common causes.
Panel data describe repeated observations of the same units over time. They offer a powerful framework for approaching causal questions with observational data. Panel analysis essentially allows us to use each unit as its own control. In an experiment, random assignment to either a treatment or a control group makes both groups equal, in expectation, on all characteristics. Similarly, if we compare the same individual pre- and post-treatment, then the two are equal at least on the things that do not change over time, such as sex, date of birth, nationality, etc.
Structural equation modeling (SEM) is a group of statistical methods for assessing relationships between variables, often at the latent (unobserved) variable level. The use of SEM for panel analysis allows for a great deal of flexibility. Latent variables can be incorporated to account for measurement error and rule out alternative models.
This dissertation focuses on the use of panel data in SEM for causal analysis. It comprises an introduction, four main chapters and a conclusion.
After a short introduction (Chapter 1) outlining the goals and scope of the dissertation, Chapter 2 provides an overview of the topic of causality in the social sciences. Since the randomized experiment is often not feasible in social research, special emphasis has been placed on non-experimental, i.e., observational data. The chapter outlines some competing views on causality with non-experimental data, then discusses the two currently dominant frameworks for causal analysis, potential outcomes and directed graphs. It goes on to outline empirical methods and notes their compatibility with SEM.
Chapter 3 discusses how panel data can be used to deal with unobserved time-invariant heterogeneity, i.e., stable characteristics that might normally confound analyses. It attempts to show in detail how basic panel regression in SEM works. It also discusses some issues that are not normally addressed outside of SEM, e.g., measurement error in observed variables, effects that change over time, model comparisons, etc. This discussion of the more basic panel regression setup provides a sort of basis for the more complex discussion in the following chapters.
Chapter 4 compares and contrasts several ways to model dynamic processes, where the outcome at a particular point in time may affect future outcomes or even the presumed cause later on. It shows that popular, recently proposed modeling techniques have much to do with their older counterparts. In fact, the newer modeling techniques do not seem to offer any benefit with regard to estimating the causal effects of interest. The chapter focuses on arguably common situations in which the newer techniques may have serious drawbacks.
Chapter 5 provides an applied example. It looks to better assess the causal effect of environmental attitudes on environmental behaviour (mobility, consumption, willingness to sacrifice). It touches on many of the aspects from the previous chapters, including the use of latent variables for constructs that are not directly observable, unobserved time-invariant confounders, state dependence (feedback from outcome to outcome), and reverse causality (feedback from outcome to cause). It shows that failure to account for time-invariant confounders leads to biased estimates of the effect of attitudes on behaviour. After controlling for these factors, the effects disappear in terms of mobility and consumption behaviour: when a person's attitudes become more positive, their behaviour does not become more environmentally-friendly. There is, however, a fairly robust effect of attitudes on willingness to sacrifice, even after controlling for unobserved time-invariant confounders, state dependence and reverse causality. This suggests changing attitudes do affect willingness to make sacrifices, holding potential time-invariant confounders, outcome to outcome feedback (essentially habits), as well as some time-varying confounders constant.
Finally, Chapter 6 summarizes the previous chapters and provides an outlook for future work.
1. Introduction
2. Causal Inference in the Social Sciences
3. A Closer Look at Random and Fixed Effects Panel Regression in Structural Equation Modeling Using lavaan
4. Equivalent Approaches to Dealing with Unobserved Heterogeneity in Cross-Lagged Panel Models?
5. Re-Examining the Effect of Environmental Attitudes on Behaviour in a Panel Setting
6. Conclusion
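The idea in the abstract above of using each unit as its own control can be illustrated with a small sketch. It assumes the classic within (fixed-effects) transformation, not the SEM/lavaan setup the dissertation actually works with, and all names and numbers are hypothetical toy values chosen for illustration.

```python
from collections import defaultdict


def pooled_slope(x, y):
    """Ordinary least-squares slope of y on x, ignoring the panel structure."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den


def within_slope(units, x, y):
    """Slope of y on x after subtracting each unit's own mean from x and y.

    Demeaning removes anything that is constant within a unit, which is the
    sense in which each unit acts as its own control.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # unit -> [sum x, sum y, count]
    for u, xi, yi in zip(units, x, y):
        totals[u][0] += xi
        totals[u][1] += yi
        totals[u][2] += 1
    num = den = 0.0
    for u, xi, yi in zip(units, x, y):
        sx, sy, n = totals[u]
        dx = xi - sx / n
        dy = yi - sy / n
        num += dx * dy
        den += dx * dx
    return num / den


# Hypothetical toy panel: the true effect of x on y is 2, but each unit has a
# stable offset (alpha) that is correlated with its level of x.
units = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alpha = {"a": 10.0, "b": -10.0}
y = [2.0 * xi + alpha[u] for u, xi in zip(units, x)]

naive = pooled_slope(x, y)         # confounded by the stable offsets
within = within_slope(units, x, y)  # offsets cancel out after demeaning
```

In this toy panel the pooled slope even has the wrong sign, while the within slope recovers the effect used to generate the data. Time-varying confounders are not removed by this transformation.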
|
42 |
Geo-L: Topological Link Discovery for Geospatial Linked Data Made Easy
Zinke-Wehlmann, Christian, Kirschenbaum, Amit 04 May 2023
Geospatial linked data are an emerging domain, with growing interest in research and industry. There is an increasing number of publicly available geospatial linked data resources, which can also be interlinked and easily integrated with private and industrial linked data on the web. The present paper introduces Geo-L, a system for the discovery of RDF spatial links based on topological relations. Experiments show that the proposed system improves on state-of-the-art spatial linking processes in terms of mapping time and accuracy, as well as resource retrieval efficiency and robustness.
|
43 |
Statistics preserving spatial interpolation methods for missing precipitation data
Unknown Date
Deterministic and stochastic weighting methods are commonly used for estimating missing precipitation rain gauge data based on values recorded at neighboring gauges. However, these spatial interpolation methods are seldom checked for their ability to preserve site and regional statistics. Such statistics are primarily defined by spatial correlations and other site-to-site statistics in a region. Preservation of site and regional statistics represents a means of assessing the validity of missing precipitation estimates at a site. This study evaluates the efficacy of traditional interpolation methods for estimation of missing data in preserving site and regional statistics. New optimal spatial interpolation methods intended to preserve these statistics are also proposed and evaluated in this study. Rain gauge sites in the state of Kentucky are used as a case study, and several error and performance measures are used to evaluate the trade-offs between accuracy of estimation and preservation of site and regional statistics. / by Husayn El Sharif. / Thesis (M.S.C.S.)--Florida Atlantic University, 2012. / Includes bibliography.
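As an illustration of the kind of deterministic weighting method the abstract refers to, the sketch below uses inverse distance weighting, a common textbook choice. The abstract does not say which methods the thesis evaluates, so this choice, the function name, the coordinates and the precipitation values are all assumptions made for illustration.

```python
import math


def idw_estimate(target, neighbors, power=2.0):
    """Estimate a missing value at `target` from neighboring gauges.

    `neighbors` is a list of ((x, y), value) pairs. Each gauge is weighted by
    1 / distance**power, so closer gauges contribute more to the estimate.
    """
    num = den = 0.0
    for (gx, gy), value in neighbors:
        d = math.hypot(gx - target[0], gy - target[1])
        if d == 0.0:
            return value  # a gauge at the target location is used directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical gauges: one at distance 1 reporting 10 mm, one at distance 2
# reporting 20 mm. With power=2 the weights are 1 and 0.25.
gauges = [((1.0, 0.0), 10.0), ((0.0, 2.0), 20.0)]
estimate = idw_estimate((0.0, 0.0), gauges)
```

Because the estimate is a weighted average, it always lies between the smallest and largest neighboring value, which is one reason such estimates can distort site statistics like variance.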
|
44 |
Analýza burzovních dat / Analysis of Stock Exchange Data
Prajer, Jiří January 2007
The thesis describes the stock exchange environment, the system and its basic operating principles. It then focuses on stock exchange data and its analysis. The author describes the development of technical analysis, covering the classical theory and classical graphical methods, modern graphical methods, technical indicators and finally the latest analytical methods, the so-called artificial intelligence methods. The research focuses on prediction on the real stock market using artificial intelligence methods and knowledge of modern technical analysis.
|
45 |
Studying the Oligomerization of the Kinase Domain of Ephrin type-B Receptor 2 Using Analytical Ultracentrifugation and Development of a Program for Analysis of Acquired Data
Lundberg, Alexander January 2014
Ephrin type-B receptor 2 (EphB2) is a receptor tyrosine kinase which phosphorylates proteins and thereby regulates cell migration, vascular development, axon guidance, synaptic plasticity, and the formation of borders between tissues. It has been found to be overexpressed in several cancers, which makes it an interesting protein to study. In this thesis, the EphB2 kinase domain (KD) and the juxtamembrane segment with kinase domain (JMS-KD) have been expressed, purified and studied using analytical ultracentrifugation to evaluate the oligomerisation of the KD and how the double mutation S677/680A affects it. A program for data analysis has been written and used to analyse the acquired data. The calculated values of the dissociation constant were 2.94±1.04 mM for the KD wild type and 3.46±2.26 mM for the JMS-KD wild type. Due to various problems with the measurements, no data were acquired on the double mutant, and not enough data were obtained to draw any conclusions. Additional experiments will be needed to understand the oligomerisation of this intriguing protein.
|
46 |
Structural condition monitoring and damage identification with artificial neural network
Bakhary, Norhisham January 2009
Many methods have been developed and studied to detect damage through changes in the dynamic response of a structure. Owing to their capability to recognize patterns and to handle non-linear and non-unique problems, Artificial Neural Networks (ANN) have received increasing attention for detecting damage in structures based on vibration modal parameters. Most successful applications of ANN for damage detection reported so far are limited to numerical examples and small controlled experimental examples. This is because of two main constraints on its practical application to real structures: 1) the inevitable existence of uncertainties in vibration measurement data and finite element modeling of the structure, which may lead to erroneous prediction of structural conditions; and 2) the enormous computational effort required to reliably train an ANN model when it involves structures with many degrees of freedom. Therefore, most applications of ANN in damage detection are limited to structural systems with a small number of degrees of freedom and quite significant damage levels. In this thesis, a probabilistic ANN model is proposed to take into consideration the uncertainties in the finite element model and measured data. Rosenblueth's point estimate method is used to reduce the calculations in training and testing the probabilistic ANN model. The accuracy of the probabilistic model is verified by Monte Carlo simulations. Using the probabilistic ANN model, the statistics of the stiffness parameters can be predicted, which are used to calculate the probability of damage existence (PDE) in each structural member. The reliability and efficiency of this method are demonstrated using both numerical and experimental examples. In addition, a parametric study is carried out to investigate the sensitivity of the proposed method to different damage levels and to different uncertainty levels.
Because training an ANN model requires enormous computational effort when the number of degrees of freedom is relatively large, a substructuring approach employing multi-stage ANN is proposed to tackle the problem. With this method, a structure is divided into several substructures and each substructure is assessed separately with an independently trained ANN model. Once the damaged substructures are identified, second-stage ANN models are trained for these substructures to identify the damage locations and severities of the structural elements in the substructures. Both numerical and experimental examples are used to demonstrate the probabilistic multi-stage ANN methods. It is found that this substructuring ANN approach greatly reduces the computational effort while increasing the damage detectability because a fine element mesh can be used. It is also found that the probabilistic model gives better damage identification than the deterministic approach. A sensitivity analysis is also conducted to investigate the effect of substructure size, support condition and different uncertainty levels on the damage detectability of the proposed method. The results demonstrate that the detectability level of the proposed method is independent of the structure type, but dependent on the boundary condition, substructure size and uncertainty level.
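The point estimate method mentioned in the abstract can be sketched in its basic two-point form. The sketch assumes uncorrelated, symmetrically distributed inputs and applies the method to a plain function rather than to ANN training, so it is not the thesis's implementation. The function names and the toy response are hypothetical.

```python
from itertools import product


def two_point_estimate(func, means, stds):
    """Approximate the mean and variance of func(X) from 2**n evaluations.

    Each input is set to mean - std or mean + std, every combination is
    evaluated once, and all combinations receive equal weight. This is the
    basic two-point scheme for uncorrelated, symmetric inputs.
    """
    values = []
    for signs in product((-1.0, 1.0), repeat=len(means)):
        point = [m + s * sd for m, s, sd in zip(means, signs, stds)]
        values.append(func(point))
    weight = 1.0 / len(values)
    mean = sum(weight * v for v in values)
    second_moment = sum(weight * v * v for v in values)
    return mean, second_moment - mean ** 2


def toy_response(p):
    """Hypothetical stand-in for a structural response: linear in two inputs."""
    return 3.0 * p[0] + 2.0 * p[1]


# Four evaluations replace a full Monte Carlo run for this two-input example.
pem_mean, pem_var = two_point_estimate(toy_response, [1.0, 2.0], [0.5, 1.0])
```

For a linear response the scheme reproduces the exact mean and variance. For non-linear responses it is an approximation, which is consistent with the abstract's use of Monte Carlo simulations as a check.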
|
47 |
Search for the production of a Higgs boson in association with top quarks and decaying into a b-quark pair and b-jet identification with the ATLAS experiment at LHC / Recherche du boson de Higgs produit en association avec des quarks top dans le canal de désintégration bb et identification des jets de saveur b dans l’expérience Atlas au LHC
Calvet, Thomas 08 November 2017
In July 2012, the ATLAS and CMS experiments announced the discovery of a new particle, with a mass of about 125 GeV, compatible with the Standard Model Higgs boson. In order to assess whether the observed particle is the one predicted by the Standard Model, the couplings of this Higgs boson to fermions have to be measured. In particular, the top quark has the strongest Yukawa coupling to the Higgs boson.
The associated production of a Higgs boson with a pair of top quarks (ttH) gives direct access to this coupling. The ttH process is accessible for the first time in Run 2 of the LHC thanks to an upgrade of the detector and the increase of the center-of-mass energy to 13 TeV. This thesis presents the search for ttH events with the Higgs boson decaying to a pair of b-quarks using data collected by the ATLAS detector in 2015 and 2016. The description of the background and the extraction of the ttH signal in data are obtained from a statistical fit of the predictions to the data. In particular, the tt+jets background is the largest source of uncertainty on the signal and is scrutinized. The identification of jets originating from b-quarks, called b-tagging, is a vital input to the search for ttH(H->bb) events because of the four b-quarks in the final state. For Run 2, the definition of b-flavoured jets in Monte Carlo simulations is revisited to improve the understanding of b-tagging algorithms and their performance. Standard b-tagging algorithms do not separate jets originating from a single b-quark from those originating from two b-quarks. A specific method has therefore been developed and is presented in this thesis.
|
48 |
Možnosti využití otevřených dat pro Competitive Intelligence / Possibilities of using open data for Competitive Intelligence
Kolinger, Martin January 2015
The diploma thesis addresses the issue of open data in relation to the field of Competitive Intelligence (CI). The focus is primarily on assessing the impact of open data on the corporate sector and the obstacles that arise when obtaining data made accessible by the public sector. The main objective of the work is to provide the reader with clear information about the level of development of the public sector and the perception of openness by the corporate environment on a national and international scale, and about the positive and negative impacts that accompany the process of opening data. This helps the reader acquire a comprehensive understanding of the development and current level of the open data initiative. The objective was achieved by exploring the environment and conducting a survey among small and medium-sized software development companies, which, through the applications they created, became notional propagators of the idea of public sector openness among ordinary citizens. The major contribution of this thesis is the information gathered through research, the survey and questionnaire, and a PESTLE analysis of the environment, which together reveal the circumstances that influence the development of open data in the public sector and the possibilities for its use by the private sector. The first part of this thesis deals with the basic terms and principles of the examined sector, including a detailed description of various frameworks and the mutual relationships between the terms. The following part adds information on the national legislative environment and outlines the issue of open licenses and the relations between them. The next part offers a closer look at the field of open data: principles of disclosure, life cycle, benefits and the risks that accompany this initiative. 
Finally, to make the thesis complete, a few examples were added. The final part of the thesis includes a discussion on the impact of open data on the corporate sector in relation to the CI sector and an analysis of the attitudes of the corporate sector towards the level of openness of the public sector by conducting a questionnaire survey. The result of the analysis includes the findings on the attitudes, conditions and needs of the corporate sector towards the public sector in regard to access to open agendas. The data obtained can, for example, be used as supporting material for further investigation.
|
49 |
Zpracování zákaznických dat a jejich využití / Processing and utilizing of customer data
Bartelová, Jana January 2012
The topic of this master's dissertation is data mining of customer data for marketing purposes within an enterprise. The information resulting from this process is then used to create targeted marketing campaigns. Nowadays, identifying and exploiting customers' needs is vital for any enterprise. With that in mind, the theoretical part of this dissertation focuses primarily on different methods of data analysis such as segmentation, profiling, customer scoring and determining customer value. A significant portion of this part focuses on web analysis, which studies customers' web browsing behaviour. The practical part of this dissertation is based on a case study of a specific e-shop. The case study identifies and solves problems in implementing e-mail campaigns. Solving these problems using Silverpop Engage brings new opportunities for e-mailing. The main goal of this dissertation is to show new opportunities for using behavioural data in running e-mail campaigns.
|
50 |
Automatizovaná syntéza stromových struktur z reálných dat / Automated Synthesis of Tree Structures from Real Data
Želiar, Dušan January 2019
This master's thesis deals with the analysis of tree-structured data. The aim of the thesis is to design and implement a tool for automated detection of relations among samples of real data, considering their tree structure and node values. The output of the tool is a prescription for the automated synthesis of data for testing purposes. The tool is part of the Testos platform developed at FIT BUT.
|