11 |
Uma proposta para medição de complexidade e estimação de custos de segurança em procedimentos de tecnologia da informação / An approach to measure the complexity and estimate the cost associated to Information Technology Security Procedures. Moura, Giovane Cesar Moreira, January 2008.
IT security has become a major concern for organizations over recent years. However, satisfactory security levels do not come without large investments, both in the acquisition of tools that satisfy particular security requirements and in procedures that are usually complex to deploy and maintain in a protected infrastructure. The scientific community has recently proposed models and techniques to estimate the complexity of IT configuration procedures, aware that they represent a significant operational cost, often dominating the total cost of ownership. However, despite the central role played by security in this context, it had not been the subject of investigation to date. To address this issue, this work applies a configuration complexity model proposed in the literature to estimate the impact of security on the complexity of IT procedures. The proposal was materialized through a prototype complexity analyzer called Security Complexity Analyzer (SCA). As a proof of concept and technical feasibility, the SCA was used to evaluate the complexity of real-life security scenarios. In addition, a study was conducted to investigate the relation between the metrics proposed in the complexity model and the time spent by the administrator while executing security procedures, using a quantitative model built with multiple linear regression, in order to predict the costs associated with security.
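The cost-prediction step described in this abstract lends itself to a compact illustration. The sketch below is not the thesis's actual SCA implementation: it fits a least-squares regression mapping hypothetical complexity metrics (number of actions, context switches, and parameters the administrator must remember, all invented here) to execution time, then predicts the time cost of a new procedure.

```python
import numpy as np

# Hypothetical complexity metrics for five security procedures (illustrative only):
# columns = [number of actions, context switches, parameters to remember]
X = np.array([
    [12, 3, 5],
    [25, 7, 9],
    [8, 2, 3],
    [30, 9, 14],
    [18, 5, 7],
], dtype=float)
# Observed administrator execution time in minutes (also illustrative)
y = np.array([14.0, 31.0, 9.0, 42.0, 22.0])

# Fit time ~ b0 + b1*actions + b2*switches + b3*parameters by least squares
A = np.hstack([np.ones((X.shape[0], 1)), X])
coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

# Predict the time cost of a new procedure from its complexity scores
new_procedure = np.array([1.0, 20.0, 6.0, 8.0])  # leading 1.0 is the intercept term
print(f"predicted execution time: {new_procedure @ coef:.1f} min")
```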
|
12 |
Algorithmes d'optimisation sans dérivées à caractère probabiliste ou déterministe : analyse de complexité et importance en pratique / Derivative-free optimization methods based on probabilistic and deterministic properties: complexity analysis and numerical relevance. Royer, Clément, 4 November 2016.
Randomization has had a major impact on the latest developments in the field of numerical optimization, partly due to the rise of machine learning applications. In this increasingly popular context, classical nonlinear programming algorithms have indeed been outperformed by variants relying on randomness. The cost of these variants is usually lower than that of the traditional schemes; however, theoretical guarantees may not be straightforward to carry over from the deterministic to the randomized setting. Complexity analysis is a useful tool in the latter case, as it helps in providing estimates of the convergence speed of a given scheme, which implies some form of convergence. Such a technique has also gained attention from the deterministic optimization community thanks to recent findings in the nonconvex case, as it brings supplementary indicators on the behavior of an algorithm.
In this thesis, we investigate the practical enhancement of derivative-free optimization algorithms through the introduction of random elements within those frameworks, as well as the numerical impact of their complexity results. We focus on direct-search methods, one of the main classes of derivative-free algorithms, yet our analysis applies to a wide range of derivative-free methods. We propose probabilistic variants of the classical properties required to ensure convergence of the studied methods, and highlight the practical efficiency gains that result from their lower consumption of function evaluations. First-order concerns form the basis of our analysis, which we apply to unconstrained and linearly constrained problems. The observed gains lead us to additionally take second-order considerations into account. Using complexity properties of derivative-free schemes, we develop several frameworks in which second-order information is exploited. Both a deterministic and a probabilistic analysis can be performed on these schemes; the latter is an opportunity to introduce supplementary probabilistic properties, together with their impact on numerical efficiency and robustness.
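To make the idea of probabilistic properties in direct search concrete, the following sketch implements a generic direct-search loop that polls a few random unit directions per iteration and accepts a step under a sufficient-decrease test. It is a minimal illustration of probabilistic descent, not the specific algorithms analyzed in the thesis; the parameter choices (two random directions, expansion factor 2, contraction factor 0.5, forcing function alpha squared) are assumptions.

```python
import numpy as np

def direct_search_probabilistic(f, x0, alpha0=1.0, m=2, max_evals=2000,
                                gamma=2.0, theta=0.5, seed=0):
    """Direct search that polls m random unit directions per iteration and
    accepts a trial point under a sufficient-decrease condition (alpha**2).
    The step size grows after a success and shrinks after a failed poll."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    alpha, fx, evals = alpha0, f(x0), 1
    while evals < max_evals and alpha > 1e-8:
        success = False
        for _ in range(m):
            d = rng.standard_normal(x.size)
            d /= np.linalg.norm(d)                 # random unit polling direction
            trial = x + alpha * d
            ftrial = f(trial)
            evals += 1
            if ftrial < fx - alpha ** 2:           # sufficient decrease
                x, fx, success = trial, ftrial, True
                break
        alpha = gamma * alpha if success else theta * alpha
    return x, fx

# Example: minimize a shifted quadratic without derivatives
xbest, fbest = direct_search_probabilistic(lambda z: float(np.sum((z - 1.0) ** 2)),
                                           x0=np.zeros(5))
print(xbest.round(3), fbest)
```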
|
15 |
Commodities agrícolas do agronegócio brasileiro: análise multifractal e análise da complexidade diante da crise financeira mundial subprime 2008/2009 (Agricultural commodities of Brazilian agribusiness: multifractal and complexity analysis in the face of the 2008/2009 subprime global financial crisis). Jale, Jader da Silva, 1 June 2015.
The growth of the world economy, driven by emerging countries, especially China, has generated significant changes in the commodities market since 2002. Commodity prices have shown a significant increase, reflecting the fierce conditions of supply and demand for these products, driven by climatic phenomena that have negatively affected the supply and by the growth rate of demand. The global financial crisis began in the US market and eventually became the worst global financial crisis since 1929 (the crash of the New York Stock Exchange). The bankruptcy of the Lehman Brothers investment bank on September 15, 2008 marks the transformation of the international financial crisis, after which Brazil saw a large reduction of international credit, accompanied by a sharp increase in the dollar exchange rate. Considering that the agricultural sector is of fundamental importance to economic health, being a major investor in environmental and rural technologies, Brazil cannot succumb to the idea of a slowdown in this sector: in 2008 Brazilian agribusiness represented 36.7% of exports, generated 37% of jobs, and accounted for 28% of gross domestic product (GDP). This work investigates the asynchrony of returns, the information transfer, and the behavior of the cross-correlations of six agricultural commodities of Brazilian agribusiness, for the periods before the global financial crisis (2006-2009) and after the crisis (2010-2014). The Cross-Sample Entropy method was used to quantify the asynchrony among the commodity return series. In addition, the Multifractal Detrended Cross-Correlation Analysis (MF-DCCA), Multifractal Detrended Fluctuation Analysis (MF-DFA), and Detrended Cross-Correlation Analysis (DCCA) methods were used to investigate cross-correlations and autocorrelations in the return series. The results of the multifractal analysis show that, for all time series, the multifractality decreased after the global financial crisis, indicating a smaller range of the scale-invariant fluctuations, except for cotton, which exhibits precisely the opposite behavior. Based on the results obtained, it can be concluded that multifractal analysis and complexity analysis can be useful in studies of the dynamics of Brazilian agribusiness, given its importance within the global economic scenario, whether for the adoption of monetary and fiscal policies by the responsible economic agents or by the federal government.
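As an illustration of the detrended cross-correlation machinery mentioned in this abstract, the sketch below computes the DCCA cross-correlation coefficient between two synthetic return series at a single scale, using linear detrending over non-overlapping windows. It is a simplified, monofractal version of the MF-DCCA/MF-DFA analysis used in the thesis, and the synthetic "soy" and "corn" series are invented for the example.

```python
import numpy as np

def detrended_covariance(x, y, scale):
    """Average detrended covariance of the integrated profiles of x and y
    over non-overlapping windows of length `scale` (linear detrending)."""
    xp, yp = np.cumsum(x - x.mean()), np.cumsum(y - y.mean())
    t = np.arange(scale)
    covs = []
    for w in range(len(x) // scale):
        sl = slice(w * scale, (w + 1) * scale)
        rx = xp[sl] - np.polyval(np.polyfit(t, xp[sl], 1), t)
        ry = yp[sl] - np.polyval(np.polyfit(t, yp[sl], 1), t)
        covs.append(np.mean(rx * ry))
    return np.mean(covs)

def rho_dcca(x, y, scale):
    """DCCA cross-correlation coefficient: detrended covariance normalized
    by the detrended fluctuations of each series."""
    fxy = detrended_covariance(x, y, scale)
    fxx = detrended_covariance(x, x, scale)
    fyy = detrended_covariance(y, y, scale)
    return fxy / np.sqrt(fxx * fyy)

# Two synthetic return series sharing a common factor (invented data)
rng = np.random.default_rng(1)
common = rng.standard_normal(2000)
soy = 0.7 * common + 0.3 * rng.standard_normal(2000)
corn = 0.6 * common + 0.4 * rng.standard_normal(2000)
print(f"rho_DCCA at scale 32: {rho_dcca(soy, corn, 32):.3f}")
```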
|
16 |
Real-time Wind Direction Filtering for Sailboat Race Tracking. Nielsen, Emil, January 2015.
In this paper, an algorithm that calculates the wind direction from the headings of the competing boats during fleet races is proposed. The algorithm is based on a 1-D spatial convolution and is named Convolution Based Direction Filtering (CBDF). The CBDF algorithm is used in the TracTrac race client that broadcasts sailboat races in real time. The fact that the proposed algorithm runs in polynomial time makes it suitable for use as a real-time application inside TracTrac, even for large fleets. More concretely, we show that the worst-case time complexity of the CBDF algorithm is O(n²), where n > 0 is the number of boats in competition. It is also shown that in more realistic sailing scenarios, the CBDF algorithm is in fact linear.
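The abstract does not spell out the CBDF computation, so the following is only a plausible sketch of a convolution-based direction filter: it builds a circular histogram of boat headings, smooths it with a 1-D circular convolution, locates the two dominant heading modes (the two tacks), and returns their bisector as the wind direction estimate. Bin width, kernel width, and the peak-suppression window are all assumptions.

```python
import numpy as np

def estimate_wind_direction(headings_deg, kernel_width=15):
    """Smooth a 1-degree circular histogram of headings with a boxcar kernel
    (circular 1-D convolution), find the two dominant modes, and return
    their circular bisector as the wind direction estimate."""
    bins = np.zeros(360)
    for h in np.asarray(headings_deg) % 360:
        bins[int(h) % 360] += 1.0
    kernel = np.ones(kernel_width) / kernel_width
    # wrap-around padding makes the convolution circular
    padded = np.concatenate([bins[-kernel_width:], bins, bins[:kernel_width]])
    smooth = np.convolve(padded, kernel, mode="same")[kernel_width:-kernel_width]
    first = int(np.argmax(smooth))
    # suppress a window around the first mode before searching for the second
    masked = smooth.copy()
    for k in range(-45, 46):
        masked[(first + k) % 360] = -np.inf
    second = int(np.argmax(masked))
    # circular bisector of the two tack headings
    diff = ((second - first + 180) % 360) - 180
    return (first + diff / 2.0) % 360

# Example: a fleet tacking upwind around a northerly (0-degree) wind
rng = np.random.default_rng(2)
headings = np.concatenate([rng.normal(45, 5, 30), rng.normal(315, 5, 30)])
print(f"estimated wind direction: {estimate_wind_direction(headings):.1f} degrees")
```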
|
17 |
Complexity Analysis of Physiological Time Series with Applications to Neonatal Sleep Electroencephalogram Signals. Li, Chang, 8 March 2013.
No description available.
|
18 |
Modélisation 3D automatique d'environnements : une approche éparse à partir d'images prises par une caméra catadioptrique / Automatic 3D modeling of environments: a sparse approach from images taken by a catadioptric camera. Yu, Shuda, 3 June 2013.
The automatic 3D modeling of an environment from images is still an active topic in computer vision. Standard methods have three steps: moving a camera in the environment to take an image sequence, reconstructing the geometry, and applying a dense stereo method to obtain a surface model of the environment. In the second step, interest points are detected and matched in the images, then the camera poses and a sparse cloud of 3D points corresponding to the interest points are simultaneously estimated. In the third step, all pixels of the images are used to reconstruct a surface of the environment, e.g. by estimating a dense cloud of 3D points. Here we propose to generate a surface directly from the sparse point cloud and the visibility information provided by the geometry reconstruction step. The advantages are low time and space complexities, which is useful e.g. for obtaining compact models of large and complete environments such as a city. To do so, a surface reconstruction method is proposed that sculpts a 3D Delaunay triangulation of the reconstructed points. The visibility information is used to classify the tetrahedra as free space or matter. A surface is then extracted so as to best separate these tetrahedra, using a greedy method and a minority of Steiner points.
The 2-manifold constraint is enforced on the surface to allow standard surface post-processing such as denoising and refinement by photo-consistency optimization. This method is also extended to the incremental case: each time a new key-frame is selected in the input video, new 3D points and a new camera pose are estimated, then the reconstructed surface is updated. We study the time complexity in both cases (incremental and non-incremental). In experiments, a low-cost catadioptric camera is used to generate textured 3D models of complete environments including buildings, ground, and vegetation. A drawback of our methods is that thin scene components, such as tree branches and electricity poles, cannot be correctly reconstructed.
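The visibility-based classification of tetrahedra can be sketched as follows, assuming the sparse points, camera centres, and (camera, observed point) visibility pairs produced by the geometry-reconstruction step are available. The sketch marks every tetrahedron crossed by a viewing ray as free space by sampling points along the ray; the greedy 2-manifold surface extraction and Steiner-point insertion of the actual method are omitted, and the synthetic scene is invented for illustration.

```python
import numpy as np
from scipy.spatial import Delaunay

def classify_tetrahedra(points, cameras, visibility, n_samples=64):
    """Build a 3-D Delaunay triangulation of the sparse points and mark every
    tetrahedron crossed by a camera-to-point viewing ray as free space; the
    remaining tetrahedra are kept as matter."""
    tri = Delaunay(points)
    free = np.zeros(len(tri.simplices), dtype=bool)
    steps = np.linspace(0.0, 1.0, n_samples, endpoint=False)[:, None]
    for cam_idx, pt_idx in visibility:            # (camera, observed point) pairs
        ray = cameras[cam_idx] + steps * (points[pt_idx] - cameras[cam_idx])
        hit = tri.find_simplex(ray)               # -1 where the sample is outside the hull
        free[hit[hit >= 0]] = True
    return tri, free

# Tiny synthetic scene: random points, each seen from one of two camera centres
rng = np.random.default_rng(3)
pts = rng.uniform(-1.0, 1.0, size=(60, 3))
cams = np.array([[0.0, 0.0, 3.0], [3.0, 0.0, 0.0]])
vis = [(0, i) for i in range(30)] + [(1, i) for i in range(30, 60)]
tri, free = classify_tetrahedra(pts, cams, vis)
print(f"{free.sum()} of {len(free)} tetrahedra classified as free space")
```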
|
19 |
Generalized N-body problems: a framework for scalable computation. Riegel, Ryan Nelson, 13 January 2014.
In the wake of the Big Data phenomenon, the computing world has seen a number of computational paradigms developed in response to the sudden need to process ever-increasing volumes of data. Most notably, MapReduce has proven quite successful in scaling out an extensible class of simple algorithms to even hundreds of thousands of nodes. However, there are some tasks---even embarrassingly parallelizable ones---that neither MapReduce nor any existing automated parallelization framework is well-equipped to perform. For instance, any computation that (naively) requires consideration of all pairs of inputs becomes prohibitively expensive even when parallelized over a large number of worker nodes.
Many of the most desirable methods in machine learning and statistics exhibit these kinds of all-pairs or, more generally, all-tuples computations; accordingly, their application in the Big Data setting may seem beyond hope. However, a new algorithmic strategy inspired by breakthroughs in computational physics has shown great promise for a wide class of computations dubbed generalized N-body problems (GNBPs). This strategy, which involves the simultaneous traversal of multiple space-partitioning trees, has been applied to a succession of well-known learning methods, accelerating each asymptotically and by orders of magnitude. Examples of these include all-k-nearest-neighbors search, k-nearest-neighbors classification, k-means clustering, EM for mixtures of Gaussians, kernel density estimation, kernel discriminant analysis, kernel machines, particle filters, the n-point correlation, and many others. For each of these problems, no overall faster algorithms are known. Further, these dual- and multi-tree algorithms compute either exact results or approximations to within specified error bounds, a rarity amongst fast methods.
This dissertation aims to unify a family of GNBPs under a common framework in order to ease implementation and future study. We start by formalizing the problem class and then describe a general algorithm, the generalized fast multipole method (GFMM), capable of solving all problems that fit the class, though with varying degrees of speedup. We then show O(N) and O(log N) theoretical run-time bounds that may be obtained under certain conditions. As a corollary, we derive the tightest known general-dimensional run-time bounds for exact all-nearest-neighbors and several approximated kernel summations.
Next, we implement a number of these algorithms in a commercial database, empirically demonstrating dramatic asymptotic speedup over their conventional SQL implementations. Lastly, we implement a fast, parallelized algorithm for kernel discriminant analysis and apply it to a large dataset (40 million points in 4D) from the Sloan Digital Sky Survey, identifying approximately one million quasars with high accuracy. This exceeds the previous largest catalog of quasars in size by a factor of ten and has since been used in a follow-up study to confirm the existence of dark energy.
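The core pruning idea behind dual-tree algorithms for generalized N-body problems can be illustrated with a small all-nearest-neighbors example. The sketch below is not the dissertation's GFMM; it builds a simple bounding-box tree and prunes a (query node, reference node) pair whenever the minimum possible distance between their boxes exceeds the worst current neighbor distance within the query node. Leaf size and the synthetic data are assumptions.

```python
import numpy as np

class Node:
    """Bounding-box tree node over a subset of point indices (median split)."""
    def __init__(self, pts, idx, leaf_size=16):
        self.idx = idx
        self.lo, self.hi = pts[idx].min(axis=0), pts[idx].max(axis=0)
        self.left = self.right = None
        if len(idx) > leaf_size:
            dim = int(np.argmax(self.hi - self.lo))            # widest dimension
            order = idx[np.argsort(pts[idx, dim])]
            mid = len(order) // 2
            self.left = Node(pts, order[:mid], leaf_size)
            self.right = Node(pts, order[mid:], leaf_size)

def min_box_dist(a, b):
    """Lower bound on the distance between any point of box a and any of box b."""
    gap = np.maximum(0.0, np.maximum(a.lo - b.hi, b.lo - a.hi))
    return float(np.linalg.norm(gap))

def dual_tree_allnn(pts, q, r, best):
    """Dual-tree all-nearest-neighbors: prune a (query, reference) node pair
    when the boxes are farther apart than the worst current best distance
    among the query node's points."""
    if min_box_dist(q, r) > best[q.idx].max():
        return                                                 # prune this pair
    if q.left is None and r.left is None:                      # leaf-leaf base case
        d = np.linalg.norm(pts[q.idx][:, None, :] - pts[r.idx][None, :, :], axis=2)
        d[q.idx[:, None] == r.idx[None, :]] = np.inf           # exclude self-pairs
        best[q.idx] = np.minimum(best[q.idx], d.min(axis=1))
    elif q.left is None:
        dual_tree_allnn(pts, q, r.left, best); dual_tree_allnn(pts, q, r.right, best)
    elif r.left is None:
        dual_tree_allnn(pts, q.left, r, best); dual_tree_allnn(pts, q.right, r, best)
    else:
        for qc in (q.left, q.right):
            for rc in (r.left, r.right):
                dual_tree_allnn(pts, qc, rc, best)

# All-nearest-neighbors on synthetic data, checked against brute force
rng = np.random.default_rng(4)
pts = rng.standard_normal((1000, 3))
root = Node(pts, np.arange(len(pts)))
best = np.full(len(pts), np.inf)
dual_tree_allnn(pts, root, root, best)
brute = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
np.fill_diagonal(brute, np.inf)
print("matches brute force:", np.allclose(best, brute.min(axis=1)))
```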
|
20 |
Feature Selection under Multicollinearity & Causal Inference on Time Series. Bhattacharya, Indranil, January 2017.
In this work, we study and extend algorithms for Sparse Regression and Causal Inference problems. Both problems are fundamental in the area of Data Science.
The goal of the regression problem is to find the "best" relationship between an output variable and input variables, given samples of the input and output values. We consider sparse regression under a high-dimensional linear model with strongly correlated variables, a situation that cannot be handled well by many existing model selection algorithms. We study the performance of popular feature selection algorithms such as LASSO, Elastic Net, BoLasso, and Clustered Lasso, as well as Projected Gradient Descent algorithms, under this setting in terms of their running time, stability, and consistency in recovering the true support. We also propose a new feature selection algorithm, BoPGD, which first clusters the features based on their sample correlation and then performs sparse estimation using a bootstrapped variant of the projected gradient descent method with projection onto the non-convex L0 ball. We attempt to characterize the efficiency and consistency of our algorithm by performing a host of experiments on both synthetic and real-world datasets.
Discovering causal relationships, beyond mere correlation, is widely recognized as a fundamental problem. Causal inference problems use observations to infer the underlying causal structure of the data generating process. The input to these problems is either a multivariate time series or i.i.d. sequences, and the output is a feature causal graph where the nodes correspond to the variables and the edges capture the direction of causality. For high-dimensional datasets, determining the causal relationships becomes a challenging task because of the curse of dimensionality. Graphical modeling of temporal data based on the concept of "Granger causality" has gained much attention in this context. The blend of Granger methods with model selection techniques such as LASSO enables efficient discovery of a "sparse" subset of causal variables in high-dimensional settings. However, these temporal causal methods use an input parameter L, the maximum time lag, which is the maximum gap in time between the occurrence of the output phenomenon and the causal input stimulus. In many situations of interest, the maximum time lag is not known, and indeed, finding the range of causal effects is an important problem. In this work, we propose and evaluate a data-driven and computationally efficient method for Granger causality inference in the Vector Auto Regressive (VAR) model without foreknowledge of the maximum time lag. We present two algorithms, Lasso Granger++ and Group Lasso Granger++, which not only construct the hypothesis feature causal graph but also simultaneously estimate a value of the maximum lag L for each variable by balancing the trade-off between "goodness of fit" and "model complexity".
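The projection onto the non-convex L0 ball mentioned above amounts to hard thresholding, which the following sketch illustrates with a plain projected-gradient (iterative hard thresholding) loop on a synthetic, mildly collinear design. It omits the clustering and bootstrap stages of BoPGD and the Granger-causality part of the work; the step size, sparsity level, and data are assumptions.

```python
import numpy as np

def l0_projected_gradient(X, y, k, iters=500):
    """Sparse regression by projected gradient descent on the non-convex L0
    ball: a gradient step on the least-squares loss followed by hard
    thresholding that keeps only the k largest-magnitude coefficients."""
    n, p = X.shape
    step = 0.9 * n / np.linalg.norm(X, 2) ** 2   # conservative step, below 1/L
    w = np.zeros(p)
    for _ in range(iters):
        w = w - step * (X.T @ (X @ w - y)) / n   # gradient step on (1/2n)||Xw - y||^2
        keep = np.argsort(np.abs(w))[-k:]        # projection onto the L0 ball
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        w[~mask] = 0.0
    return w

# Synthetic design with a shared factor (mild multicollinearity) and a 5-sparse signal
rng = np.random.default_rng(5)
n, p = 200, 100
X = rng.standard_normal((n, p)) + 0.3 * rng.standard_normal((n, 1))
w_true = np.zeros(p)
w_true[:5] = [3.0, -2.0, 1.5, 2.5, -1.0]
y = X @ w_true + 0.1 * rng.standard_normal(n)
w_hat = l0_projected_gradient(X, y, k=5)
print("recovered support:", np.flatnonzero(w_hat))
```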
|