251

A NEW INDEPENDENCE MEASURE AND ITS APPLICATIONS IN HIGH DIMENSIONAL DATA ANALYSIS

Ke, Chenlu 01 January 2019 (has links)
This dissertation addresses three related topics. First, we propose a novel class of independence measures for testing independence between two random vectors, based on the discrepancy between the conditional and the marginal characteristic functions. If one of the variables is categorical, our asymmetric index extends the classical ANOVA to a kernel ANOVA that can test the more general hypothesis of equal distributions among groups. The index is also applicable when both variables are continuous. Second, we develop a sufficient variable selection procedure based on the new measure in a large-p-small-n setting. Our approach incorporates marginal information between each predictor and the response as well as joint information among predictors. As a result, our method is better able to select all truly active variables than marginal selection methods. Furthermore, our procedure can handle both continuous and discrete responses with mixed-type predictors. We establish the sure screening property of the proposed approach under mild conditions. Third, we focus on a model-free sufficient dimension reduction approach using the new measure. Our method does not require strong assumptions on predictors and responses. An algorithm is developed to find the dimension reduction directions using sequential quadratic programming. We illustrate the advantages of our new measure and its two applications in high dimensional data analysis through numerical studies across a variety of settings.
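The dissertation's index is built on the discrepancy between conditional and marginal characteristic functions; that exact construction is not reproduced here. As a rough illustration of how characteristic-function-based dependence measures are used in practice, the sketch below implements the closely related (but distinct) distance covariance together with a permutation test; the function names and the simulated data are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def _centered_distances(z):
    """Pairwise Euclidean distances, double-centered (row, column and grand means removed)."""
    z = np.atleast_2d(z)
    if z.shape[0] == 1:          # a 1-D input arrived as a single row; make it a column
        z = z.T
    d = squareform(pdist(z))
    return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

def distance_covariance(x, y):
    """Sample distance covariance between two random vectors (V-statistic form)."""
    a, b = _centered_distances(x), _centered_distances(y)
    return np.sqrt(np.mean(a * b))

def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation test of independence based on the distance-covariance statistic."""
    rng = np.random.default_rng(seed)
    observed = distance_covariance(x, y)
    null = [distance_covariance(x, y[rng.permutation(len(y))]) for _ in range(n_perm)]
    return (1 + sum(v >= observed for v in null)) / (n_perm + 1)

# Toy check: y depends on x nonlinearly, so the test should reject independence.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 3))
y = np.sin(x[:, :1]) + 0.1 * rng.normal(size=(200, 1))
print(permutation_pvalue(x, y))
```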
252

Development of an Unified Message Broker for Financial Applications in the Context of PSD2 : Capitalizing from PSD2 Through Data Retrieval and Analysis

Johansson, Fredrik January 2017 (has links)
The EU has recently proposed a new directive called PSD2, which requires banks to provide APIs to their systems, allowing third-party providers to write financial applications and, with the customer's permission, access customer data held by the banks. PSD2 becomes effective on 18 January 2018 and is expected to open up the market and remove the monopoly and ownership that banks currently have over their customers. As it is a directive, it affects all EU countries as well as the countries of the European Economic Area, such as Norway, Switzerland, Liechtenstein and Iceland. The directive thus creates a business opportunity for actors that take the initiative and find a way to monetize the new situation. The report presents a qualitative approach to developing a proof of concept that shows how an actor can build a solution which acts as a source of information and performs analysis to derive valuable insights into consumers' behaviour. The insights gained from this master's thesis open up new paths for innovation and competition between actors providing similar services.
253

Nonlinear Hierarchical Models for Longitudinal Experimental Infection Studies

Singleton, Michael David 01 January 2015 (has links)
Experimental infection (EI) studies, involving the intentional inoculation of animal or human subjects with an infectious agent under controlled conditions, have a long history in infectious disease research. Longitudinal infection response data often arise in EI studies designed to demonstrate vaccine efficacy, explore disease etiology, pathogenesis and transmission, or understand the host immune response to infection. Viral loads, antibody titers, symptom scores and body temperature are a few of the outcome variables commonly studied. Longitudinal EI data are inherently nonlinear, often with single-peaked response trajectories with a common pre- and post-infection baseline. Such data are frequently analyzed with statistical methods that are inefficient and arguably inappropriate, such as repeated measures analysis of variance (RM-ANOVA). Newer statistical approaches may offer substantial gains in accuracy and precision of parameter estimation and power. We propose an alternative approach to modeling single-peaked, longitudinal EI data that incorporates recent developments in nonlinear hierarchical models and Bayesian statistics. We begin by introducing a nonlinear mixed model (NLMM) for a symmetric infection response variable. We employ a standard NLMM assuming normally distributed errors and a Gaussian mean response function. The parameters of the model correspond directly to biologically meaningful properties of the infection response, including baseline, peak intensity, time to peak and spread. Through Monte Carlo simulation studies we demonstrate that the model outperforms RM-ANOVA on most measures of parameter estimation and power. Next we generalize the symmetric NLMM to allow modeling of variables with asymmetric time course. We implement the asymmetric model as a Bayesian nonlinear hierarchical model (NLHM) and discuss advantages of the Bayesian approach. Two illustrative applications are provided. Finally we consider modeling of viral load. For several reasons, a normal-errors model is not appropriate for viral load. We propose and illustrate a Bayesian NLHM with the individual responses at each time point modeled as a Poisson random variable with the means across time points related through a Tricube mean response function. We conclude with discussion of limitations and open questions, and a brief survey of broader applications of these models.
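As a hedged illustration of the Gaussian mean response described above, the sketch below fits the four-parameter curve (baseline, peak intensity, time to peak, spread) to one simulated trajectory by nonlinear least squares; a full NLMM would additionally place subject-level random effects on these parameters, which is beyond this snippet, and the simulated data are not from the dissertation.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_response(t, baseline, intensity, peak_time, spread):
    """Four-parameter Gaussian mean response: shared baseline plus a single peak."""
    return baseline + intensity * np.exp(-((t - peak_time) ** 2) / (2 * spread ** 2))

# Simulated symptom-score trajectory for one subject (days post inoculation).
rng = np.random.default_rng(0)
t = np.arange(0, 15, 0.5)
truth = gaussian_response(t, baseline=1.0, intensity=4.0, peak_time=5.0, spread=2.0)
y = truth + rng.normal(scale=0.3, size=t.size)

# Fit the mean function; the parameters map directly to biologically meaningful quantities.
popt, _ = curve_fit(gaussian_response, t, y, p0=[1.0, 3.0, 6.0, 1.5])
print(dict(zip(["baseline", "intensity", "peak_time", "spread"], popt.round(2))))
```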
254

Low-dimensional data analysis and clustering by means of Delaunay triangulation / Analyse et clustering de données en basse dimension par triangulation de Delaunay

Razafindramanana, Octavio 05 December 2014 (has links)
This thesis proposes and discusses several solutions to the problem of low-dimensional point cloud analysis and clustering. These solutions are based on the analysis of the Delaunay triangulation. Two types of approaches are presented and discussed. The first follows a classical three-step approach: 1) the construction of a proximity graph that embeds topological information, 2) the derivation of statistical information from this graph, and 3) the removal of elements that are irrelevant with respect to this statistical information. The impact of different simplicial-complex-based measures, i.e. measures not based solely on the graph, is discussed. Evaluation is carried out in terms of point cloud clustering quality as well as handwritten character recognition rates. The second type consists of one-pass approaches that extract clusters while the Delaunay triangulation is being constructed.
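A minimal sketch of the classical three-step idea described above (proximity graph from a Delaunay triangulation, edge-length statistics, removal of overly long edges), assuming a simple mean-plus-one-standard-deviation cutoff that the thesis does not necessarily use:

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def delaunay_clusters(points, n_std=1.0):
    """Cluster 2-D points by pruning unusually long Delaunay edges (illustrative threshold)."""
    tri = Delaunay(points)
    # Step 1: collect the unique edges of the triangulation (the proximity graph).
    edges = set()
    for simplex in tri.simplices:
        for i in range(3):
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            edges.add((a, b))
    edges = np.array(list(edges))
    # Step 2: statistical information on edge lengths.
    lengths = np.linalg.norm(points[edges[:, 0]] - points[edges[:, 1]], axis=1)
    # Step 3: remove edges deemed irrelevant (too long), then read off connected components.
    keep = lengths < lengths.mean() + n_std * lengths.std()
    graph = coo_matrix((np.ones(keep.sum()), (edges[keep, 0], edges[keep, 1])),
                       shape=(len(points), len(points)))
    n_clusters, labels = connected_components(graph, directed=False)
    return n_clusters, labels

rng = np.random.default_rng(0)
blobs = np.vstack([rng.normal(loc, 0.3, size=(100, 2)) for loc in ([0, 0], [4, 0], [2, 3])])
# Three well-separated blobs; pruning the long inter-blob edges should leave roughly 3 components.
print(delaunay_clusters(blobs)[0])
```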
255

An investigation of average stable ranks : On plane geometric objects and financial transaction data / En undersökning av den genomsnittliga stabila rangen hos plana geometriska figurer och finansiella transaktioner

Odelius, Linn January 2020 (has links)
This thesis concerns the topological features of plane geometric shapes and financial transaction data. Topological properties of the data, such as homology groups and their stable ranks, are analysed. It is investigated how to describe differences between data sets mathematically, and it is found that stable ranks can be used to capture these differences. Subsampling is introduced as a way to apply stochastic methods to geometric structures, and it is found that the average stable rank can be used to differentiate data sets. Furthermore, the sensitivity of average stable ranks to random noise is explored, and it is studied how adding a single point changes the average stable ranks of geometric shapes and financial transaction data. A method to incorporate categorical data within the analysis is introduced. The theory is applied to financial transaction data with the objective of understanding whether there are topological differences between fraudulent and legitimate transactions which can be used to classify them.
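The following is a deliberately simplified, degree-zero-only stand-in for the averaging idea: for H0 the rank of a Vietoris-Rips filtration at scale t is just the number of single-linkage clusters alive at t, so averaging that curve over random subsamples mimics the subsampling scheme described above. The scales, subsample sizes, and point clouds are illustrative; this is not the thesis's actual stable-rank pipeline.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

def h0_rank_curve(points, scales):
    """Number of connected components of the Vietoris-Rips complex at each scale;
    for H0 this equals the number of single-linkage clusters alive at that distance."""
    merge_heights = np.sort(linkage(pdist(points), method="single")[:, 2])
    # Components at scale t = n_points minus the number of merges at height <= t.
    return np.array([len(points) - np.searchsorted(merge_heights, t, side="right")
                     for t in scales])

def average_h0_rank(points, scales, sample_size=50, n_samples=100, seed=0):
    """Average the H0 rank curve over random subsamples, a rough stand-in for the
    averaged stable rank used to compare data sets."""
    rng = np.random.default_rng(seed)
    curves = []
    for _ in range(n_samples):
        idx = rng.choice(len(points), size=sample_size, replace=False)
        curves.append(h0_rank_curve(points[idx], scales))
    return np.mean(curves, axis=0)

rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 300)
circle = np.c_[np.cos(theta), np.sin(theta)]       # points on a circle
disk = rng.uniform(-1, 1, size=(300, 2))           # points filling a square
scales = np.linspace(0.01, 1.0, 50)
# The two shapes yield different averaged rank curves, which is what lets the summary
# distinguish data sets.
print(average_h0_rank(circle, scales)[:5], average_h0_rank(disk, scales)[:5])
```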
256

Statistical Analysis of Structured High-dimensional Data

Sun, Yizhi 05 October 2018 (has links)
High-dimensional data such as multi-modal neuroimaging data and large-scale networks carry a vast amount of information and can be used to test various scientific hypotheses or discover important patterns in complicated systems. While considerable efforts have been made to analyze high-dimensional data, existing approaches often rely on simple summaries that can miss important information, and many challenges in modeling complex structures in data remain unaddressed. In this dissertation, we focus on analyzing structured high-dimensional data, including functional data with important local regions and network data with community structures. The first part of this dissertation concerns the detection of "important" regions in functional data. We propose a novel Bayesian approach that enables region selection in the functional data regression framework. The selection of regions is achieved by encouraging sparse estimation of the regression coefficient, where nonzero regions correspond to regions that are selected. To achieve sparse estimation, we adopt a compactly supported and potentially over-complete basis to capture local features of the regression coefficient function, and assume a spike-and-slab prior on the coefficients of the basis functions. To encourage continuous shrinkage of nearby regions, we assume an Ising hyper-prior which takes into account the neighboring structure of the basis functions. This neighboring structure is represented by an undirected graph. We perform posterior sampling through Markov chain Monte Carlo algorithms. The practical performance of the proposed approach is demonstrated through simulations as well as near-infrared and sonar data. The second part of this dissertation focuses on constructing diversified portfolios using stock return data from the Center for Research in Security Prices (CRSP) database maintained by the University of Chicago. Diversification is a risk management strategy that involves mixing a variety of financial assets in a portfolio. This strategy helps reduce the overall risk of the investment and improve the performance of the portfolio. To construct portfolios that effectively diversify risk, we first construct a co-movement network using the correlations between stock returns over a training time period. Correlation characterizes the synchrony among stock returns and thus helps us understand whether two or more stocks have common risk attributes. Based on the co-movement network, we apply multiple network community detection algorithms to detect groups of stocks with common co-movement patterns. Stocks within the same community tend to be highly correlated, while stocks across different communities tend to be less correlated. A portfolio is then constructed by selecting stocks from different communities. The average return of the constructed portfolio over a testing time period is finally compared with the S&P 500 market index. Our constructed portfolios demonstrate outstanding performance during a non-crisis period (2004-2006) and good performance during a financial crisis period (2008-2010). / PHD / High-dimensional data, which consist of data points with a tremendous number of features (also known as attributes, independent variables, or explanatory variables), bring challenges to statistical analysis because of their high dimensionality and complicated structure. In this dissertation, I consider two types of high-dimensional data. The first type is functional data, in which each observation is a function. The second type is network data, whose internal structure can be described as a network. I aim to detect "important" regions in functional data using a novel statistical model, and I treat stock market data as network data to construct quality portfolios efficiently.
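A hedged sketch of the portfolio-construction idea from the second part of the abstract: build a co-movement graph by thresholding return correlations, detect communities with a modularity-based algorithm, and pick stocks from different communities. The correlation threshold, the particular community-detection algorithm, and the toy data are illustrative choices, not the ones used in the dissertation.

```python
import numpy as np
import pandas as pd
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def diversified_picks(returns: pd.DataFrame, corr_threshold=0.5, per_community=1, seed=0):
    """Build a co-movement graph from return correlations, detect communities,
    and select a few stocks from each community (illustrative parameters)."""
    corr = returns.corr()
    g = nx.Graph()
    g.add_nodes_from(returns.columns)
    for i, a in enumerate(returns.columns):
        for b in returns.columns[i + 1:]:
            if corr.loc[a, b] > corr_threshold:        # edge = strong co-movement
                g.add_edge(a, b, weight=corr.loc[a, b])
    communities = greedy_modularity_communities(g, weight="weight")
    rng = np.random.default_rng(seed)
    # Diversify by drawing from different communities rather than from a single cluster.
    return [rng.choice(sorted(c), size=min(per_community, len(c)), replace=False).tolist()
            for c in communities]

# Toy example with two blocks of correlated "stocks" driven by two common factors.
rng = np.random.default_rng(2)
f1, f2 = rng.normal(size=(2, 500))
cols = {f"A{i}": f1 + 0.3 * rng.normal(size=500) for i in range(4)}
cols.update({f"B{i}": f2 + 0.3 * rng.normal(size=500) for i in range(4)})
print(diversified_picks(pd.DataFrame(cols)))
```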
257

Acoustic emission monitoring of fiber reinforced bridge panels

Flannigan, James Christopher January 1900 (has links)
Master of Science / Department of Mechanical and Nuclear Engineering / Youqi Wang / Two fiber reinforced polymer (FRP) bridge deck specimens were analyzed by means of acoustic emission (AE) monitoring during a series of loading cycles performed at various locations on the composite sandwich panels' surfaces. These panels were subjected to loads intended to test their structural response and characteristics without exposing them to a failure scenario. This allowed the sensors to record multiple data sets without having to be placed on multiple panels whose differing characteristics could alter the recorded signals. The objective throughout the analysis was to determine how the acoustic signals respond to loading cycles and how various events affect the acoustic data. In the process of performing this examination, several steps were taken, including threshold application, data collection, and sensor-location analysis. The thresholds are important for reducing the size of the data files while retaining information that could indicate structurally significant events. Equally important is determining where and how the sensors should be placed on the panels in relation to other sensors, panel features, and supporting beams. The data were analyzed with respect to the response to applied loads, joint effects, and failure. Using previously developed techniques, the gathered information was also analyzed to identify the type of failure that could be occurring within the structure itself. This somewhat aided the analysis after an unplanned failure event occurred, helping to determine what cause or causes might have led to it. The analyses were separated into four sets, starting with a basic analysis to determine correlations with the applied loads. This was followed by joint and sensor-location analyses, both of which used a two-panel setup. The last set was created upon matrix failure of the panel and the subsequent investigation.
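A small illustrative sketch of the thresholding step mentioned above, with invented field names and an arbitrary amplitude floor; the real monitoring system applies its own thresholds at acquisition time and uses the vendor's hit attributes.

```python
import numpy as np

# Simulated AE hits: arrival time (s) and amplitude (dB). Field names are hypothetical.
rng = np.random.default_rng(0)
hits = np.array([(rng.uniform(0, 60), rng.uniform(30, 100)) for _ in range(5000)],
                dtype=[("time_s", float), ("amp_db", float)])

threshold_db = 45.0                                   # illustrative value, not the study's setting
kept = hits[hits["amp_db"] >= threshold_db]           # discard low-amplitude hits to shrink files
print(f"retained {len(kept)} of {len(hits)} hits")

# Hit counts binned over time give a coarse view of how AE activity tracks the loading cycles.
counts, edges = np.histogram(kept["time_s"], bins=12)
print(counts)
```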
258

The Development and the Evaluation of a Quasi-Real Time Decision Aid Tool

Leite, Nelson Paiva Oliveira, Lopes, Leonardo Mauricio de Faria, Walter, Fernando 10 1900 (has links)
ITC/USA 2009 Conference Proceedings / The Forty-Fifth Annual International Telemetering Conference and Technical Exhibition / October 26-29, 2009 / Riviera Hotel & Convention Center, Las Vegas, Nevada / In an experimental flight test campaign, the use of a real-time Ground Telemetry System (GTS) provides mandatory support for three essential services: a) safe storage of Flight Test Instrumentation (FTI) data in the event of a critical aircraft failure; b) monitoring of critical flight safety parameters to avoid accidents; and c) monitoring of selected parameters that validate all test points. On the operational side, test ranges typically work in two phases: a) in real time, where the GTS crew performs test validation and test-point selection with telemetry data; and b) post-mission, where the engineering crew performs data analysis and reduction with airborne recorded data. This process is time-consuming because recorded data have to be downloaded, converted to Engineering Units (EU), sliced, filtered and processed. The main reason for using this less efficient process is that real-time telemetry data are less reliable than recorded data (i.e. they contain more noise and some dropouts). With the introduction of new technologies (i.e. i-NET), the telemetry link can be very reliable, so the GTS could perform data reduction analysis immediately after the receipt of all valid test points, while the aircraft is still flying, in a quasi-real-time environment. To achieve this goal, the Brazilian Flight Test Group (GEEV), together with EMBRAER and with the support of Financiadora de Estudos e Projetos (FINEP), started the development of a series of decision aid tools that perform data reduction analysis in the GTS in quasi-real time. This paper presents the development and evaluation of a tool used in an Air Data System Calibration Flight Test Campaign. The application receives telemetry data over either a TCP/IP or a SCRAMnet network, performs data analysis and test-point validation in real time, and, when all points are gathered, performs the data reduction analysis and automatically creates HTML-formatted test reports. The tool evaluation was carried out during the instruction flights of the 2009 Brazilian Flight Test School (CEV). The results show a substantial efficiency gain for the overall flight test campaign.
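A toy sketch of the quasi-real-time workflow described above: validate incoming test points against limits, then emit an HTML report once all points are gathered. The parameter names, limits, and simulated samples are placeholders for the TCP/IP or SCRAMnet stream and the campaign's own validation criteria.

```python
import html

def validate_point(sample, limits):
    """A test point is accepted when every monitored parameter stays within its limits."""
    return all(lo <= sample[name] <= hi for name, (lo, hi) in limits.items())

def html_report(points, path="test_report.html"):
    """Write a minimal HTML summary once all valid test points have been gathered."""
    rows = "".join(
        f"<tr><td>{i}</td><td>{html.escape(str(p))}</td></tr>" for i, p in enumerate(points, 1))
    with open(path, "w") as f:
        f.write(f"<html><body><h1>Air data calibration - test points</h1>"
                f"<table border='1'><tr><th>#</th><th>Parameters</th></tr>{rows}</table>"
                f"</body></html>")

# Simulated telemetry samples standing in for the real-time stream (hypothetical parameters).
limits = {"airspeed_kt": (100, 300), "altitude_ft": (1000, 20000)}
stream = [{"airspeed_kt": 180, "altitude_ft": 5000},
          {"airspeed_kt": 95,  "altitude_ft": 5000},    # out of limits -> rejected
          {"airspeed_kt": 220, "altitude_ft": 12000}]
accepted = [s for s in stream if validate_point(s, limits)]
html_report(accepted)
print(f"{len(accepted)} valid test points written to report")
```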
259

Designing an Object-Oriented Data Processing Network

Yang, Hsueh-szu, Sadia, Nathan, Kupferschmidt, Benjamin 10 1900 (has links)
ITC/USA 2008 Conference Proceedings / The Forty-Fourth Annual International Telemetering Conference and Technical Exhibition / October 27-30, 2008 / Town and Country Resort & Convention Center, San Diego, California / There are many challenging aspects to processing data from a modern high-performance data acquisition system. The sheer diversity of data formats and protocols makes it very difficult to create a data processing application that can properly decode and display all types of data. Many different tools need to be harnessed to process and display all types of data, and each type of data needs to be displayed on the correct type of display. In particular, it is very hard to synchronize the display of different types of data. This tends to be an error-prone, complex and very time-consuming process. This paper discusses a solution to the problem of decoding and displaying many different types of data in the same system. This solution is based on the concept of a linked network of data processing nodes. Each node performs a particular task in the data decoding and/or analysis process. By chaining these nodes together in the proper sequence, we can define a complex decoder from a set of simple building blocks. This greatly increases the flexibility of the data visualization system while allowing for extensive code reuse.
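A minimal sketch of the linked-node idea: each node implements one processing step and pushes its result to the next node, so a complex decoder is assembled by chaining simple building blocks. The node types below are invented examples, not the paper's actual decoders.

```python
from abc import ABC, abstractmethod

class Node(ABC):
    """One stage in the processing network; nodes are chained so each output feeds the next."""
    def __init__(self):
        self.next_node = None

    def link(self, node):
        self.next_node = node
        return node                       # returning the linked node allows a.link(b).link(c)

    def push(self, data):
        result = self.process(data)
        if self.next_node is not None:
            self.next_node.push(result)

    @abstractmethod
    def process(self, data): ...

class FrameDecoder(Node):
    def process(self, data):
        # Pretend the raw frame is a comma-separated record; real decoders vary by format.
        return dict(zip(("channel", "value"), data.split(",")))

class UnitConverter(Node):
    def process(self, data):
        data["value"] = float(data["value"]) * 0.3048      # e.g. feet -> metres
        return data

class Display(Node):
    def process(self, data):
        print(f"{data['channel']}: {data['value']:.1f} m")
        return data

# Chain simple building blocks into a decoder, as the paper describes.
head = FrameDecoder()
head.link(UnitConverter()).link(Display())
head.push("altitude,12000")
```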
260

An investigation of the effect of the European currency union (Euro) on sectoral trade : an application of the gravity model of trade

Awa, Ruth January 2015 (has links)
The introduction of the single currency (euro) in Europe has been referred to as the ‘world’s largest economic experiment’ and has led to major research on the effects of adopting a common currency on economic activity, with considerable emphasis on its effect on trade flows at the macroeconomic level. However, the investigation of the euro’s effect on individual sectors has received very little attention, and this provides the motivation for the research. The main contribution of this thesis is to the sectoral analysis of the single currency’s effect on bilateral trade flows, specifically its effect on the transport equipment manufacturing sector. To achieve this, a comparison of the different estimation methods applied in the gravity model literature is employed to investigate this effect and to identify the factors affecting trade in this sector. This study uses a panel data set comprising the most recent information on bilateral trade for the EU15 countries from 1990 to 2008. This research aims to build on the results obtained in previous studies by employing a more refined empirical methodology and associated tests. The purpose of the tests is to ensure that the euro’s effect on trade is isolated from the other pro-trade policies of the European integration process, particularly the introduction of the Single Market. A desirable feature of this approach is that, whereas other studies limit their attention to a particular issue (zero trade flows, time trends, sectoral analysis, cross-correlation, etc.) and very few, if any, apply a selection of techniques, this study combines them. Overall, the results demonstrate that the single currency’s effect on trade in this sector is limited, with only the fixed-effects formulation with year dummy variables showing a significant positive effect of the euro. An obvious policy implication for countries looking to adopt a single currency is that they should be cautious regarding the potential for growth in intra-bloc trade in a particular sector, although they will benefit from the ongoing process of integration.
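A hedged sketch of the kind of fixed-effects gravity regression with year dummies referred to above, run on synthetic data with invented column names; the thesis's actual specification, sample, and estimators differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic bilateral-trade panel; the columns are illustrative, not the thesis's data set.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "trade":  rng.lognormal(3, 1, n),        # bilateral trade flow in the sector
    "gdp_i":  rng.lognormal(10, 1, n),       # exporter GDP
    "gdp_j":  rng.lognormal(10, 1, n),       # importer GDP
    "euro":   rng.integers(0, 2, n),         # 1 if both partners use the euro
    "pair":   rng.integers(0, 50, n),        # country-pair identifier (fixed effects)
    "year":   rng.integers(1990, 2009, n),
})

# Log-linear gravity equation with country-pair fixed effects and year dummies;
# time-invariant distance would be absorbed by the pair effects, so it is omitted here.
model = smf.ols("np.log(trade) ~ np.log(gdp_i) + np.log(gdp_j) + euro + C(pair) + C(year)",
                data=df).fit()
print(model.params["euro"], model.pvalues["euro"])
```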
