121 |
Text mining Twitter social media for Covid-19: Comparing latent semantic analysis and latent Dirichlet allocation. Sheikha, Hassan January 2020 (has links)
In this thesis, the Twitter social media platform is mined for information about the Covid-19 outbreak during the month of March, from the 3rd to the 31st. 100,000 tweets were collected from Harvard's open-source data and recreated using Hydrate. This data is analyzed further using different Natural Language Processing (NLP) methodologies, such as term frequency-inverse document frequency (TF-IDF), lemmatizing, tokenizing, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Furthermore, the results of the LSA and LDA algorithms are dimension-reduced data that are clustered using the HDBSCAN and K-Means clustering algorithms for later comparison. Different methodologies are used to determine the optimal parameters for the algorithms. This is all done in the Python programming language, as there are libraries supporting this research, the most important being scikit-learn. The frequent words of each cluster are then displayed and compared with factual data regarding the outbreak to discover whether there are any correlations. The factual data is collected by the World Health Organization (WHO) and visualized in graphs on ourworldindata.org. Correlations with the results are also sought in news articles to find any significant moments and to see whether they affected the top words in the clustered data. The news articles with good timelines used for correlating incidents are those of NBC News and the New York Times. The results show no direct correlations with the data reported by the WHO; however, looking into the timelines reported by news sources, some correlation can be seen with the clustered data. Also, the combination of LDA and HDBSCAN yielded the most desirable results in comparison to the other combinations of dimension reduction and clustering methods. This was largely due to the use of GridSearchCV on LDA to determine the ideal parameters for the LDA models on each dataset, as well as to how well HDBSCAN clusters its data in comparison to K-Means.
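The pipeline this abstract describes (TF-IDF features, LDA tuned with GridSearchCV, then clustering of the topic representation) can be sketched as follows. This is an illustrative reconstruction, not the thesis's actual code: the toy corpus, the parameter grid, and the use of K-Means in place of HDBSCAN (which lives in the separate hdbscan package or scikit-learn 1.3+) are all assumptions.

```python
# Hypothetical sketch of the described pipeline: TF-IDF features,
# LDA with GridSearchCV for parameter selection, then clustering of
# the reduced-dimensional topic representation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV
from sklearn.cluster import KMeans

tweets = [
    "covid cases rising in march",
    "who declares pandemic outbreak",
    "stay home wash hands covid",
    "vaccine research news update",
] * 25  # toy corpus standing in for the 100,000 hydrated tweets

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(tweets)

# GridSearchCV picks the topic count via LDA's held-out (log-)likelihood score.
grid = GridSearchCV(
    LatentDirichletAllocation(random_state=0),
    param_grid={"n_components": [2, 4, 8]},
    cv=3,
)
grid.fit(X)
X_topics = grid.best_estimator_.transform(X)  # reduced-dimensional data

# Cluster the topic vectors; HDBSCAN could be substituted for K-Means here.
labels = KMeans(
    n_clusters=grid.best_estimator_.n_components, n_init=10, random_state=0
).fit_predict(X_topics)
print(X_topics.shape)
```

The frequent words per cluster could then be read off by aggregating the TF-IDF vocabulary within each label group.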
|
122 |
Popis Restricted Boltzmann machine metody ve vztahu se statistickou fyzikou a jeho následné využití ve zpracování spektroskopických dat / Interconnection of Restricted Boltzmann machine method with statistical physics and its implementation in the processing of spectroscopic data. Vrábel, Jakub January 2019 (has links)
This thesis deals with the connections between statistical physics and machine learning, with emphasis on fundamental principles and their consequences. It further addresses the general properties of spectroscopic data and how these should be taken into account in advanced data processing. The beginning of the work is devoted to deriving the partition function of a statistical system and to studying the Ising model via the mean-field approach. Subsequently, alongside a basic introduction to machine learning, the equivalence between the Ising model and the Hopfield network, a machine learning model, is shown. At the end of the theoretical part, the Restricted Boltzmann Machine (RBM) is derived from the Hopfield network. The suitability of RBMs for processing spectroscopic data is discussed and demonstrated on the dimension reduction of such data. The results are compared with the commonly used Principal Component Analysis (PCA), together with an assessment of the approach and possibilities for further improvement.
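The RBM-versus-PCA dimension-reduction comparison described above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the thesis's implementation: the synthetic "spectra", channel count, and hidden-unit count are all invented, and `BernoulliRBM` expects inputs scaled to [0, 1].

```python
# Illustrative sketch: reduce the dimension of toy "spectra" with an RBM
# and with PCA, yielding comparable low-dimensional representations.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
spectra = rng.random((200, 64))  # 200 toy spectra, 64 channels, values in [0, 1]

rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20, random_state=0)
hidden = rbm.fit_transform(spectra)   # 8-dimensional hidden-unit representation

pca = PCA(n_components=8).fit(spectra)
scores = pca.transform(spectra)       # 8 principal-component scores

print(hidden.shape, scores.shape)     # both (200, 8)
```

Downstream tasks (classification, visualization) can then be run on `hidden` and `scores` to compare the two reductions.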
|
123 |
Swap Book Hedging using Stochastic Optimisation with Realistic Risk Factors. Nordin, Rickard, Mårtensson, Emil January 2021 (has links)
Market makers such as large banks are exposed to market risk in fixed income by acting as counterparties for customers that enter swap contracts. This master's thesis addresses the problem of creating a cost-effective hedge for a realistic swap book of a market maker in a multiple-yield-curve setting. The proposed hedge model is the two-stage stochastic optimisation problem created by Blomvall and Hagenbjörk (2020). Systematic term structure innovations (components) are estimated using six different component models, including principal component analysis (PCA), independent component analysis (ICA) and rotations of principal components. The component models are evaluated with a statistical test that uses daily swap rate observations from the European swap market. The statistical test shows that, for both FRA and IRS contracts, a rotation of regular principal components describes swap rate innovations more accurately than regular PCA. The hedging model is applied to an FRA and an IRS swap book separately, with daily rebalancing, over the period 2013-06-21 to 2021-05-11. The model produces a highly effective hedge for the tested component methods. However, replacing the PCA components with improved components does not improve the hedge. The study is conducted in collaboration with two other master's theses, each carried out at a separate bank. This thesis is done in collaboration with Swedbank, and the simulated swap book is based on the exposure of a typical swap book at Swedbank, which is why the European swap market is studied.
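The idea of extracting systematic term-structure components with PCA can be illustrated on simulated data. This is a minimal, hypothetical sketch: the level/slope/curvature factor construction and all parameters below are assumptions standing in for the European swap-rate innovations studied in the thesis.

```python
# Illustrative PCA extraction of term-structure components from simulated
# daily yield-curve innovations driven by level, slope and curvature factors.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
tenors = np.linspace(1, 30, 15)                       # 15 tenors, 1y to 30y

level = rng.normal(0, 1.0, (1000, 1)) * np.ones_like(tenors)
slope = rng.normal(0, 0.5, (1000, 1)) * (tenors / 30)
curve = rng.normal(0, 0.3, (1000, 1)) * ((tenors / 30 - 0.5) ** 2)
noise = rng.normal(0, 0.05, (1000, 15))
innovations = level + slope + curve + noise           # 1000 simulated days

pca = PCA(n_components=3).fit(innovations)
# With three true factors, three components explain nearly all variance.
print(pca.explained_variance_ratio_.sum())
```

A rotation of these components, as evaluated in the thesis, would correspond to applying an orthogonal transform to `pca.components_` before feeding them to the hedge optimisation.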
|
124 |
Parallel Algorithms for Machine Learning. Moon, Gordon Euhyun 02 October 2019 (has links)
No description available.
|
125 |
A Homogenized Bending Theory for Prestrained Plates. Böhnlein, Klaus, Neukamm, Stefan, Padilla-Garza, David, Sander, Oliver 22 February 2024 (has links)
The presence of prestrain can have a tremendous effect on the mechanical behavior of slender structures. Prestrained elastic plates show spontaneous bending in equilibrium—a property that makes such objects relevant for the fabrication of active and functional materials. In this paper we study microheterogeneous, prestrained plates that feature non-flat equilibrium shapes. Our goal is to understand the relation between the properties of the prestrained microstructure and the global shape of the plate in mechanical equilibrium. To this end, we consider a three-dimensional, nonlinear elasticity model that describes a periodic material that occupies a domain with small thickness. We consider a spatially periodic prestrain described in the form of a multiplicative decomposition of the deformation gradient. By simultaneous homogenization and dimension reduction, we rigorously derive an effective plate model as a Γ-limit for vanishing thickness and period. That limit has the form of a nonlinear bending energy with an emergent spontaneous curvature term. The homogenized properties of the bending model (bending stiffness and spontaneous curvature) are characterized by corrector problems. For a model composite—a prestrained laminate composed of isotropic materials—we investigate the dependence of the homogenized properties on the parameters of the model composite. Secondly, we investigate the relation between the parameters of the model composite and the set of shapes with minimal bending energy. Our study reveals a rather complex dependence of these shapes on the composite parameters. For instance, the curvature and principal directions of these shapes depend on the parameters in a nonlinear and discontinuous way; for certain parameter regions we observe uniqueness and non-uniqueness of the shapes. We also observe size effects: The geometries of the shapes depend on the aspect ratio between the plate thickness and the composite period.
As a second application of our theory, we study a problem of shape programming: We prove that any target shape (parametrized by a bending deformation) can be obtained (up to a small tolerance) as an energy minimizer of a composite plate, which is simple in the sense that the plate consists of only finitely many grains that are filled with a parametrized composite with a single degree of freedom.
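Schematically, the limit energy described in this abstract (a nonlinear bending energy with an emergent spontaneous curvature term) takes the following generic form; the notation here is illustrative, not the paper's exact one:

```latex
% Generic form of a homogenized bending energy with spontaneous curvature.
% II_y is the second fundamental form of the bending deformation y : S -> R^3;
% Q_hom (bending stiffness) and B_eff (spontaneous curvature) are the effective
% quantities characterized by corrector problems.
E_{\mathrm{hom}}(y) \;=\; \int_{S} Q_{\mathrm{hom}}\!\bigl( x,\, \mathrm{II}_y(x) - B_{\mathrm{eff}}(x) \bigr)\, \mathrm{d}x
```

Minimizers balance the plate's preference for the spontaneous curvature `B_eff` against the rigidity encoded in `Q_hom`, which is why the energy-minimizing shapes depend so sensitively on the composite parameters.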
|
126 |
Proteomics and Machine Learning for Pulmonary Embolism Risk with Protein Markers. Awuah, Yaa Amankwah 01 December 2023 (has links) (PDF)
This thesis investigates protein markers linked to pulmonary embolism risk using proteomics and statistical methods, employing both unsupervised and supervised machine learning techniques. The research analyzes existing datasets, identifies significant features, and examines gender differences with MANOVA, which reveals significant differences between genders. Principal Component Analysis reduces the number of variables from 378 to 59, and a Random Forest classifier achieves 70% accuracy. These findings contribute to the understanding of pulmonary embolism and may lead to diagnostic biomarkers; applying proteomics in this way holds promise for clinical practice and research.
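The PCA-plus-Random-Forest pipeline described above can be sketched on synthetic data. Everything here is an assumption for illustration (the dataset, the label construction, and the cross-validation setup); only the feature count of 378 and the 59-component reduction come from the abstract.

```python
# Hedged sketch: compress 378 synthetic protein markers to 59 principal
# components, then score a Random Forest classifier on the reduced data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 378))              # 300 samples, 378 protein markers
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # synthetic risk label

X_reduced = PCA(n_components=59).fit_transform(X)
scores = cross_val_score(RandomForestClassifier(random_state=0), X_reduced, y, cv=5)
print(scores.mean())
```

On real proteomics data, the gender effect found via MANOVA would argue for stratifying or including sex as a covariate before this step.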
|
127 |
Measuring Group Separability in Geometrical Space for Evaluation of Pattern Recognition and Dimension Reduction Algorithms. Acevedo, Aldo, Duran, Claudio, Kuo, Ming-Ju, Ciucci, Sara, Schroeder, Michael, Cannistraci, Carlo Vittorio 22 January 2024 (has links)
Evaluating group separability is fundamental to pattern recognition. A plethora of dimension reduction (DR) algorithms has been developed to reveal the emergence of geometrical patterns in a low-dimensional space, where high-dimensional sample similarities are approximated by geometrical distances. However, statistical measures to evaluate the group separability attained by DR representations are missing. Traditional cluster validity indices (CVIs) might be applied in this context, but they present multiple limitations because they are not specifically tailored for DR. Here, we introduce a new rationale called projection separability (PS), which provides a methodology expressly designed to assess the group separability of data samples in a DR geometrical space. Using this rationale, we implemented a new class of indices named projection separability indices (PSIs) based on four statistical measures: Mann-Whitney U-test p-value, Area Under the ROC Curve, Area Under the Precision-Recall Curve, and Matthews Correlation Coefficient. The PSIs were compared to six representative cluster validity indices and one geometrical separability index using seven nonlinear datasets and six different DR algorithms. The results provide evidence that the implemented statistical measures designed on the basis of the PS rationale are more accurate than the other indices and can be adopted not only for evaluating and comparing group separability of DR results but also for fine-tuning DR algorithms' hyperparameters. Finally, we introduce a second methodological innovation termed trustworthiness, a statistical evaluation that accounts for separability uncertainty and associates with each index's measure a p-value that expresses the significance level in comparison to a null model.
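The projection-separability idea can be illustrated in a few lines: samples in a two-dimensional DR space are projected onto the line connecting the two group centroids, and separability is scored on that one-dimensional projection. This is a sketch of the rationale under stated assumptions (toy Gaussian groups, two of the four statistical measures), not the authors' implementation.

```python
# Illustrative projection-separability scores: project DR-space samples onto
# the centroid-to-centroid line, then score the 1d projection with a
# Mann-Whitney U test p-value and an ROC AUC.
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
a = rng.normal([0, 0], 0.5, (50, 2))    # group A in the DR space
b = rng.normal([3, 3], 0.5, (50, 2))    # group B in the DR space

direction = b.mean(axis=0) - a.mean(axis=0)
direction /= np.linalg.norm(direction)
proj = np.vstack([a, b]) @ direction    # 1d projections along the centroid line
labels = np.r_[np.zeros(50), np.ones(50)]

u_p = mannwhitneyu(proj[:50], proj[50:]).pvalue
auc = roc_auc_score(labels, proj)
print(u_p, auc)  # well-separated groups: tiny p-value, AUC near 1.0
```

Poorly separated groups would instead give a large p-value and an AUC near 0.5, which is what makes these measures usable for comparing DR results or tuning hyperparameters.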
|
128 |
Adaptive Mixture Estimation and Subsampling PCA. Liu, Peng January 2009 (has links)
No description available.
|
129 |
Bending models of nematic liquid crystal elastomers: Gamma-convergence results in nonlinear elasticity. Griehl, Max 22 May 2024 (has links)
We consider thin bodies made from elastomers and nematic liquid crystal elastomers. Starting from a nonlinear 3d hyperelastic model, and using the Gamma-convergence method, we derive lower dimensional models for 2d and 1d. The limit models describe the interplay between free liquid crystal orientations and bending deformations.
1 Introduction
1.1 Main results and structure of the text
1.2 Survey of the literature
1.2.1 Dimension reduction in nonlinear elasticity
1.2.2 Relation to other bending regime results in detail
1.2.3 Relation to other Gamma-convergence results of LCEs
2 Liquid crystal elastomers
2.1 Properties
2.2 Modeling
3 Rods
3.1 Setup and statement of analytical main results
3.1.1 The 3d-model and assumptions
3.1.2 The effective 1d-model
3.1.3 The Gamma-convergence result without boundary conditions
3.1.4 Boundary conditions for y
3.1.5 Weak and strong anchoring of n
3.1.6 Definition and properties of the effective coefficients
3.2 Numerical 1d-model exploration
3.3 Dimensional analysis and scalings
3.3.1 Non-dimensionalization and rescaling
3.3.2 Scaling assumptions
3.3.3 Dimensional analysis and applicability of the 1d-model
3.4 Smooth approximation of framed curves
3.5 Proofs
3.5.1 Compactness: proofs of Theorem 3.1.3 (a) and Proposition 3.1.4 (a)
3.5.2 Lower bound: proof of Theorem 3.1.3 (b)
3.5.3 Upper bound: proofs of Theorem 3.1.3 (c) and Proposition 3.1.4 (b)
3.5.4 Anchoring: proof of Proposition 3.1.5
3.5.5 Properties of the effective coefficients
4 Plates
4.1 Setup and statement of analytical main results
4.1.1 The 3d-model and assumptions
4.1.2 The effective 2d-model
4.1.3 The Gamma-convergence result without boundary conditions
4.1.4 Definition and properties of the effective coefficients
4.1.5 Boundary conditions for y
4.1.6 Weak and strong anchoring of n
4.2 Analytical and numerical 2d-model exploration
4.2.1 Analytical 2d-model exploration
4.2.2 Numerical 2d-model exploration
4.3 Dimensional analysis and scalings
4.3.1 Non-dimensionalization and rescaling
4.3.2 Scaling assumptions
4.3.3 Dimensional analysis and applicability
4.4 Geometry and approximation of bending deformations
4.4.1 Proofs of the geometric properties in the smooth case
4.4.2 Proof for the smooth approximations
4.5 Proofs
4.5.1 Compactness: proofs of Theorems 4.1.1 (a) and 4.1.8 (a)
4.5.2 Lower bound: proof of Theorem 4.1.1 (b)
4.5.3 Upper bound: proofs of Theorem 4.1.1 (c) and Theorem 4.1.8 (b)
4.5.4 Properties of the effective coefficients
4.5.5 Anchorings
4.5.6 Approximation of nonlinear strains: proof of Proposition 4.5.4
5 Conclusions and outlooks
Bibliography
|
130 |
Tracer transport in fractured porous media : Homogenization, dimension reduction, and simulation of a coupled system of adsorption-diffusion-convection equations. Agenorwoth, Samuel January 2024 (has links)
We propose derivations of several models of adsorption-convection-diffusion type describing transport in fractured porous media, and simulate some of them numerically. As a starting point, we consider a basic scenario in which the tracer (i.e. the chemical substance of interest) crosses a heterogeneous porous medium made of a regular part and a fissure. The fissure is in our case a straight thin-layer fracture. We focus exclusively on reducing the dimension of the fracture to a line, aiming to derive the correct limit equations and transmission conditions. We employ formal two-scale homogenization asymptotics to derive reduced effective models. These reduced effective models can become useful tools for the engineering community, as they are easy to approximate numerically.
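As a generic illustration of why such reduced models are easy to approximate numerically, a one-dimensional adsorption-diffusion-convection equation, u_t + v u_x = D u_xx - k u, can be discretized with an explicit upwind/central finite-difference scheme in a few lines. All parameters, the initial pulse, and the periodic boundary conditions below are assumptions for illustration, not the thesis's actual model.

```python
# Explicit finite-difference sketch for u_t + v u_x = D u_xx - k u:
# first-order upwind convection, central diffusion, linear adsorption sink.
import numpy as np

v, D, k = 1.0, 0.01, 0.1                  # convection, diffusivity, adsorption
nx, L, T = 200, 1.0, 0.3
dx = L / nx
dt = 0.4 * min(dx / v, dx**2 / (2 * D))   # respect CFL and diffusion limits

x = np.linspace(0, L, nx)
u = np.exp(-((x - 0.2) ** 2) / 0.002)     # initial tracer pulse

t = 0.0
while t < T:
    conv = -v * (u - np.roll(u, 1)) / dx                       # upwind (v > 0)
    diff = D * (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    u = u + dt * (conv + diff - k * u)                         # periodic BCs
    t += dt

print(u.max(), u.min())  # pulse decays and stays nonnegative
```

With the stable time step chosen above, the update keeps all coefficients nonnegative, so the scheme preserves positivity of the tracer concentration.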
|