131 |
High Dimensional Data Methods in Industrial Organization Type Discrete Choice Models
Lopez Gomez, Daniel Felipe, 11 August 2022
No description available.
|
132 |
A Hug for Humanity: Metamodernism and Masculinity on Television in Ted Lasso
Koford, Kennedy Lesley, 06 December 2022
The Apple TV+ sports comedy Ted Lasso has been a hit among fans, who flock to the show and its positive messages. The show offers a refreshing tone, one that promotes positivity and optimism even when faced with the reality of a cynical world. By using the analytical perspective of metamodernism to understand its popularity, scholars and fans alike can gain a deeper understanding of the show's core message, which mimics metamodernism's oscillation between the cynical and the sincere. Foundational scholars in the emerging study of metamodernism, such as Robin Van Den Akker, Timotheus Vermeulen, and Alison Gibbons, are examined in this paper to define metamodernism and to show how it can be used to draw deeper meaning from Ted Lasso. Metamodernism is then used to examine how masculinity operates within the show. The work of scholars Robert Hanke, Lynn C. Spangler, and Amanda D. Lotz establishes the existing scholarship on men in television. Ted Lasso showcases competing notions of masculinity in a balance that this paper calls the "metamodern masculine." This masculinity mirrors the traits of metamodernism and represents a progressive masculinity, one that balances the hegemonic and new masculinities discussed by these scholars. The themes of the show, when examined through metamodernism, point to a hopeful future, falling in line with a pattern in emerging media and speaking to a need for hope among audiences and society.
|
133 |
Basis Risk in Variable Annuities
Li, Wenchu (0009-0008-5877-6350), 08 1900
This dissertation provides a comprehensive and practical analysis of basis risk in the U.S. variable annuity (VA) market and examines effective fund mapping strategies that mitigate basis risk while controlling the associated transaction costs. Variable annuities are personal savings and investment products with long-term guarantees that expose life insurers to extensive financial risks. Liabilities associated with VA guarantees are the largest liability component faced by U.S. life insurers and have raised concerns among VA providers and regulators, and the hedging of these guarantee liabilities is impeded by the existence of basis risk.
I examine 1,892 registered VA-underlying mutual funds and two VA separate accounts to estimate the basis risk faced by U.S. VA providers at the individual fund level and the separate account level. To evaluate the degree to which basis risk can be mitigated, I consider various proxy instrument sets and assess different variable selection models. LASSO regression proves most effective at identifying the most suitable (combination of) mapping instruments that minimize basis risk, compared with other test-based and screening-based models. I supplement it with the Sure Independence Screening (SIS) procedure to further limit the number of instruments required in the hedging strategies, and modify it by introducing the diff LASSO regression, which restricts the changes in instrument allocations across rebalancing periods and therefore controls transaction costs.
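A minimal sketch of the two-stage idea described above, SIS screening followed by a LASSO fit of fund returns on proxy instruments. The simulated returns, instrument count, and screening size are illustrative assumptions, not the dissertation's data or settings.

```python
# Sketch: SIS screening, then LASSO fund mapping. All data are simulated;
# a real application would use fund and proxy-instrument (e.g., ETF) returns.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n_days, n_instruments = 500, 200
X = rng.standard_normal((n_days, n_instruments))   # candidate instrument returns
beta = np.zeros(n_instruments)
beta[:3] = [0.6, 0.3, 0.1]                         # fund tracks 3 instruments
y = X @ beta + 0.05 * rng.standard_normal(n_days)  # fund returns + basis noise

# Stage 1: Sure Independence Screening -- keep the instruments with the
# largest marginal correlation to the fund's returns.
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_instruments)])
keep = np.argsort(corr)[-30:]                      # screen down to 30 candidates

# Stage 2: LASSO on the screened set selects the final mapping portfolio.
lasso = LassoCV(cv=5).fit(X[:, keep], y)
weights = dict(zip(keep[lasso.coef_ != 0], lasso.coef_[lasso.coef_ != 0]))
print("mapped instruments and weights:", weights)

# The residual std of the fitted mapping is one simple gauge of remaining
# basis risk after hedging with the selected instruments.
print("residual std:", np.std(y - lasso.predict(X[:, keep])))
```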
I show that VA providers can reduce their exposure to basis risk by applying data-analytic techniques in their mapping process, by hedging with ETFs instead of futures contracts, and through diversification at the separate account level. Combining the traditional fund mapping method with the machine learning algorithm, the proposed portfolio mapping strategy is efficient at reducing basis risk in VA separate accounts while controlling the tractability and transaction costs of the mapping and hedging procedure, and it readily accommodates newly developed VA funds as well as the varying compositions of separate accounts. Overall, this study shows that U.S. VA providers can mitigate basis risk to a greater extent than the limited literature on this topic has suggested. / Business Administration/Risk Management and Insurance
|
134 |
Ribosomally Synthesized and Post-Translationally Modified Peptides as Potential Scaffolds for Peptide Engineering
Bursey, Devan, 01 March 2019
Peptides are small proteins that are crucial in many biological pathways such as antimicrobial defense, hormone signaling, and virulence. They often exhibit tight specificity for their targets and therefore have great therapeutic potential. Many peptide-based therapeutics are currently available, and the demand for this type of drug is expected to continue to increase. In order to satisfy the growing demand for peptide-based therapeutics, new engineering approaches to generate novel peptides should be developed. Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a group of peptides that have the potential to be effective scaffolds for in vivo peptide engineering projects. These natural RiPP peptides are enzymatically endowed with post-translational modifications (PTMs) that result in increased stability and greater target specificity. Many RiPPs, such as microcin J25 and micrococcin, can tolerate considerable amino acid sequence randomization while still being capable of receiving unique post-translational modifications. This thesis describes how we successfully engineered E. coli to produce the lasso peptide microcin J25 using a two-plasmid inducible expression system. In addition, we characterized the protein-protein interactions between PTM enzymes in the synthesis of micrococcin. The first step in micrococcin synthesis is the alteration of cysteines to thiazoles on the precursor peptide TclE. This step is accomplished by three proteins: TclI, TclJ, and TclN. We found that a 4-membered protein complex is formed consisting of TclI, TclJ, TclN, and TclE. Furthermore, the TclI protein functions as a central adaptor joining two other enzymes in the Tcl pathway with the substrate peptide.
|
135 |
Comparing Variable Selection Algorithms On Logistic Regression – A Simulation
SINGH, KEVIN, January 2021
When we try to understand why some schools perform worse than others, whether Covid-19 has struck some demographics harder, or whether income correlates with increased happiness, we may turn to regression to better understand how these variables are correlated. To capture the true relationship between variables, we may use variable selection methods to ensure that the variables which have an actual effect are included in the model. Choosing the right method for variable selection is vital: without it there is a risk of including variables that have little to do with the dependent variable, or of excluding variables that are important. Failing to capture the true effects would paint a picture disconnected from reality. To mitigate this risk, a simulation study was conducted to find out which variable selection algorithms to apply for more accurate inference. The algorithms tested are stepwise regression, backward elimination, and lasso regression. Lasso performed worst when applied to a small sample but performed best when applied to larger samples; backward elimination and stepwise regression had very similar results.
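A toy version of such a comparison on simulated logistic-regression data; the sample size, effect sizes, and 0.05 threshold below are illustrative assumptions, not the thesis's simulation design.

```python
# Backward elimination vs. LASSO on a simulated logistic regression:
# three predictors carry real signal, seven are noise.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p = 300, 10
X = rng.standard_normal((n, p))
true_beta = np.array([1.5, -1.0, 0.8] + [0.0] * (p - 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

# Backward elimination: repeatedly drop the least significant predictor
# until every remaining p-value clears the threshold.
cols = list(range(p))
while cols:
    fit = sm.Logit(y, sm.add_constant(X[:, cols])).fit(disp=0)
    pvals = fit.pvalues[1:]              # skip the intercept
    worst = int(np.argmax(pvals))
    if pvals[worst] < 0.05:
        break
    cols.pop(worst)
print("backward elimination keeps:", cols)

# LASSO: L1-penalised logistic regression shrinks noise coefficients to zero.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print("lasso keeps:", np.flatnonzero(lasso.coef_))
```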
|
136 |
Learning Sparse Graphs for Data Prediction
Rommedahl, David; Lindström, Martin, January 2020
Graph structures can often be used to describe complex data sets. In many applications, the graph structure is not known but must be inferred from data. Furthermore, real-world data is often naturally described by sparse graphs. In this project, we have aimed at recreating the results described in previous work, namely to learn a graph that can be used for prediction using an ℓ1-penalised LASSO approach. We also propose different methods for learning and evaluating the graph. We have evaluated the methods on synthetic data and real-world Swedish temperature data. The results show that we are unable to recreate the results of the previous research team, but we manage to learn sparse graphs that can be used for prediction. Further work is needed to verify our results. / Bachelor's thesis in electrical engineering 2020, KTH, Stockholm
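The abstract does not spell out the estimator beyond "ℓ1-penalised LASSO", so here is a minimal sketch using the closely related graphical lasso on synthetic Gaussian data; the chain graph and sample size are assumptions for illustration, with the synthetic nodes standing in for, e.g., temperature stations.

```python
# Learn a sparse graph from data with an l1-penalised (graphical lasso)
# estimator, then use it for prediction via Gaussian conditioning.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(2)
# Ground truth: a sparse precision matrix encoding a 5-node chain graph.
prec = np.eye(5)
for i in range(4):
    prec[i, i + 1] = prec[i + 1, i] = 0.4
cov = np.linalg.inv(prec)
X = rng.multivariate_normal(np.zeros(5), cov, size=400)

model = GraphicalLassoCV().fit(X)
# Nonzero off-diagonal entries of the estimated precision matrix are the
# learned edges; zeros mean conditional independence.
edges = np.abs(model.precision_) > 1e-4
np.fill_diagonal(edges, False)
print("learned adjacency:\n", edges.astype(int))

# Prediction use: the conditional mean of node 0 given the others follows
# directly from the precision matrix (Gaussian conditioning).
P = model.precision_
coef = -P[0, 1:] / P[0, 0]
print("weights for predicting node 0:", coef)
```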
|
137 |
STATISTICAL METHODS FOR VARIABLE SELECTION IN THE CONTEXT OF HIGH-DIMENSIONAL DATA: LASSO AND EXTENSIONS
Yang, Xiao Di, 10 1900
With the advance of technology, the collection and storage of data have become routine, and huge amounts of data are increasingly produced from biological experiments. The advent of DNA microarray technologies has enabled scientists to measure the expression of tens of thousands of genes simultaneously, and single nucleotide polymorphisms (SNPs) are being used in genetic association studies with a wide range of phenotypes, for example complex diseases. These high-dimensional problems are becoming more and more common, and the "large p, small n" problem, in which there are more variables than samples, is currently a challenge that many statisticians face. Penalized variable selection is an effective approach to the "large p, small n" problem. In particular, the Lasso (least absolute shrinkage and selection operator) proposed by Tibshirani has become an effective method for this type of problem. The Lasso works well for covariates that can be treated individually; when the covariates are grouped, it does not work well. Elastic net, group lasso, group MCP, and group bridge are extensions of the Lasso. Group lasso enforces sparsity at the group level, rather than at the level of the individual covariates. Group bridge and group MCP produce sparse solutions both at the group level and at the level of the individual covariates within a group. Our simulation study shows that group lasso forces complete grouping, group MCP encourages grouping only to a slight extent, and group bridge is somewhere in between. If one expects the proportion of nonzero group members to be greater than one-half, group lasso may be a good choice; otherwise group MCP would be preferred. If one expects this proportion to be close to one-half, one may wish to use group bridge. A real data analysis is also conducted on genetic variation (SNP) data to find associations between SNPs and West Nile disease. / Master of Science (MSc)
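A small sketch of the mechanism that separates these penalties: the group-lasso proximal operator shrinks each group's coefficients as a block, so whole groups hit zero together (group MCP and group bridge differ by applying their own penalty functions to the group norm). The coefficients, grouping, and penalty level below are invented for illustration.

```python
# Block soft-thresholding, the proximal operator of the group-lasso penalty
# lam * sum_g ||beta_g||_2: small-norm groups are zeroed out entirely.
import numpy as np

def group_soft_threshold(beta, groups, lam):
    out = beta.copy()
    for g in groups:
        norm = np.linalg.norm(beta[g])
        out[g] = 0.0 if norm <= lam else (1 - lam / norm) * beta[g]
    return out

beta = np.array([0.9, 0.8, 0.05, -0.04, 1.2, -0.1])
groups = [[0, 1], [2, 3], [4, 5]]   # three groups of two covariates each
print(group_soft_threshold(beta, groups, lam=0.3))
# Group 2 (small norm) is zeroed as a block; groups 1 and 3 are shrunk but kept.
```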
|
138 |
Knowledge-fused Identification of Condition-specific Rewiring of Dependencies in Biological Networks
Tian, Ye, 30 September 2014
Gene network modeling is one of the major goals of systems biology research: it targets the middle layer of active biological systems that orchestrates the activities of genes and proteins, and it can provide critical information to bridge the gap between causes and effects, which is essential for explaining the mechanisms underlying disease. Among network construction tasks, the rewiring of relevant network structure plays a critical role in determining the behavior of diseases. To systematically characterize the selectively activated regulatory components and mechanisms, modeling tools must be able to effectively distinguish significant rewiring from random background fluctuations. While differential dependency networks cannot be constructed from existing knowledge alone, effective incorporation of prior knowledge into data-driven approaches can improve the robustness and biological relevance of network inference. Existing studies on protein-protein interactions and biological pathways provide constantly accumulating, rich domain knowledge. Although incorporating biological prior knowledge into network learning algorithms can effectively leverage domain knowledge, such knowledge is neither condition-specific nor error-free; it serves only as an aggregated source of partially validated evidence gathered under diverse experimental conditions. Hence, direct incorporation of imperfect and non-specific prior knowledge into specific problems is error-prone and theoretically problematic.
To address this challenge, we propose a novel mathematical formulation that enables the incorporation of prior knowledge into the structural learning of biological networks as Gaussian graphical models, utilizing the strengths of both measurement data and prior knowledge. We propose a novel strategy to estimate and control the impact of unavoidable false positives in the prior knowledge, one that fully exploits the evidence from data while obtaining a "second opinion" through efficient consultation of the prior knowledge. By proposing a significance assessment scheme to detect statistically significant rewiring in the learned differential dependency network, our method can assign edge-specific p-values and specify edge types that indicate one of six biological scenarios. The data-knowledge jointly inferred gene networks are relatively simple to interpret, yet still convey considerable biological information. Experiments on extensive simulation data and comparisons with peer methods demonstrate the effectiveness of the knowledge-fused differential dependency network (KDDN) in revealing statistically significant rewiring in biological networks, leveraging data-driven evidence and existing biological knowledge while remaining robust to false positive edges in the prior knowledge.
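A rough sketch of the central idea, prior knowledge lowering the penalty on supported edges, using node-wise (neighbourhood) lasso with the standard column-rescaling trick for per-coefficient weights. The data, the prior, and the weight value are assumptions for illustration; this is not the paper's exact Gaussian-graphical-model formulation or its false-positive control scheme.

```python
# Knowledge-weighted sparse network learning: edges reported in pathway
# databases get a smaller l1 penalty than unsupported edges.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p = 200, 6
X = rng.standard_normal((n, p))
X[:, 0] += 0.8 * X[:, 1]            # true dependency between genes 0 and 1

# Prior knowledge: edge (0, 1) is partially validated, so its penalty
# weight is < 1; unknown edges keep the full penalty.
weights = np.ones(p)
weights[1] = 0.3                    # trust the prior, but not blindly

# Neighbourhood selection for node 0: weighted lasso of X0 on the rest.
# min ||y - Xb||^2 + lam * sum_j w_j |b_j| is equivalent to a standard
# lasso on columns scaled by 1/w_j, with coefficients mapped back.
target, others = X[:, 0], X[:, 1:]
w = weights[1:]
fit = Lasso(alpha=0.1).fit(others / w, target)
coef = fit.coef_ / w
# index 0 of `others` corresponds to node 1
print("neighbours of node 0:", np.flatnonzero(np.abs(coef) > 1e-8))
```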
We also made significant efforts to disseminate the developed methods and tools to the research community. We developed an accompanying R package and Cytoscape plugin to provide both batch-processing ability and user-friendly graphical interfaces. With these comprehensive software tools, we apply our method to several practically important biological problems: studying how yeast responds to stress, tracing the origin of ovarian cancer, and evaluating drug treatment effectiveness, among other broader biological questions. In the yeast stress-response study, our findings corroborated the existing literature. A network distance measure defined on the KDDN provided a novel hypothesis on the origin of high-grade serous ovarian cancer. The KDDN was also used in a novel integrated study of network biology and imaging to evaluate drug treatment of brain tumors. Applications to many other problems also yielded promising biological results. / Ph. D.
|
139 |
Bridging Machine Learning and Experimental Design for Enhanced Data Analysis and Optimization
Guo, Qing, 19 July 2024
Experimental design is a powerful tool for gathering highly informative observations using a small number of experiments. The demand for smart data collection strategies is increasing due to the need to save time and budget, especially in online experiments and machine learning. However, traditional experimental design methods fall short of systematically assessing the effects of changing variables. Within Artificial Intelligence (AI) specifically, the challenge lies in assessing the impacts of model structures and training strategies on task performance with a limited number of trials. This shortfall underscores the need to develop novel approaches. On the other side, the optimal design criterion in the classic design literature has typically been model-based, which restricts the flexibility of experimental design strategies. Machine learning's inherent flexibility, however, can empower the efficient estimation of metrics using nonparametric and optimization techniques, thereby broadening the horizons of experimental design.
In this dissertation, the aim is to develop a set of novel methods that bridge the merits of these two domains: 1) applying ideas from statistical experimental design to enhance data efficiency in machine learning, and 2) leveraging powerful deep neural networks to optimize experimental design strategies.
This dissertation consists of 5 chapters. Chapter 1 provides a general introduction to mutual information, fractional factorial design, hyper-parameter tuning, multi-modality, etc. In Chapter 2, I propose a new mutual information estimator FLO by integrating techniques from variational inference (VAE), contrastive learning, and convex optimization. I apply FLO to broad data science applications, such as efficient data collection, transfer learning, fair learning, etc. Chapter 3 introduces a new design strategy called multi-layer sliced design (MLSD) with the application of AI assurance. It focuses on exploring the effects of hyper-parameters under different models and optimization strategies. Chapter 4 investigates classic vision challenges via multimodal large language models by implicitly optimizing mutual information and thoroughly exploring training strategies. Chapter 5 concludes this proposal and discusses several future research topics. / Doctor of Philosophy / In the digital age, artificial intelligence (AI) is reshaping our interactions with technology through advanced machine learning models. These models are complex, often opaque mechanisms that present challenges in understanding their inner workings. This complexity necessitates numerous experiments with different settings to optimize performance, which can be costly. Consequently, it is crucial to strategically evaluate the effects of various strategies on task performance using a limited number of trials. The Design of Experiments (DoE) offers invaluable techniques for investigating and understanding these complex systems efficiently. Moreover, integrating machine learning models can further enhance the DoE. Traditionally, experimental designs pre-specify a model and focus on finding the best strategies for experimentation. This assumption can restrict the adaptability and applicability of experimental designs. However, the inherent flexibility of machine learning models can enhance the capabilities of DoE, unlocking new possibilities for efficiently optimizing experimental strategies through an information-centric approach. Moreover, the information-based method can also be beneficial in other AI applications, including self-supervised learning, fair learning, transfer learning, etc. The research presented in this dissertation aims to bridge machine learning and experimental design, offering new insights and methodologies that benefit both AI techniques and DoE.
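The abstract does not give FLO's form, so as a point of reference here is the generic contrastive (InfoNCE-style) lower bound on mutual information that such estimators build on, evaluated with a fixed bilinear critic on correlated Gaussians. Everything below is a textbook illustration under assumed parameters, not the dissertation's estimator.

```python
# Contrastive (InfoNCE-style) lower bound on mutual information.
import numpy as np

rng = np.random.default_rng(4)
rho, n = 0.8, 1000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

def infonce(x, y, scale):
    # critic f(x, y) = scale * x * y, scored on all n x n pairings
    scores = scale * np.outer(x, y)
    joint = np.diag(scores)                      # matched pairs (x_i, y_i)
    log_norm = np.log(np.mean(np.exp(scores), axis=1))
    return np.mean(joint - log_norm)             # lower-bounds I(X; Y)

# scale = rho maximizes the bound over purely bilinear critics here
print("InfoNCE bound:", infonce(x, y, scale=rho))
print("true MI      :", -0.5 * np.log(1 - rho**2))
# The bound sits below the true value (roughly 0.32 vs 0.51); learning a
# richer critic, as estimators in this family do, tightens it.
```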
|
140 |
Independent Component Analysis with a Sparse Mixing Matrix
Billette, Marc-Olivier, 06 1900
Independent component analysis (ICA) is a statistical analysis method whose main goal is to express the observed data (mixtures of sources) as a linear transformation of latent variables (sources) assumed to be non-Gaussian and mutually independent. In some applications, the mixtures can be grouped so that mixtures belonging to the same group are functions of the same sources. This implies that the coefficients of each column of the mixing matrix can be grouped according to these same groups, and that all the coefficients of some of these groups are zero. In other words, the mixing matrix is assumed to be group-sparse. This assumption facilitates interpretation and improves the accuracy of the ICA model. In this context, we propose to solve the ICA problem with a group-sparse mixing matrix using a method based on the adaptive group LASSO, which penalizes the norm of each group of coefficients with adaptive weights. In this thesis, we highlight the utility of our method for applications in brain imaging, specifically magnetic resonance imaging. Through simulations, we illustrate with an example the effectiveness of our method at shrinking the non-significant groups of coefficients in the mixing matrix to zero. We also show that the accuracy of the proposed method is greater than that of the maximum likelihood estimator penalized by the adaptive LASSO when the mixing matrix is group-sparse.
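To make the penalty concrete, a small sketch of the adaptive group-LASSO shrinkage step applied to the row-groups of each mixing-matrix column. The matrix, grouping, and tuning values are invented, and a full ICA fit would alternate this shrinkage with a likelihood-based update of the mixing matrix.

```python
# Adaptive group-LASSO shrinkage of a mixing matrix whose rows (mixtures)
# are pre-assigned to groups: weights come from an initial estimate, so
# weakly loaded groups get a large weight and are driven exactly to zero.
import numpy as np

rng = np.random.default_rng(5)
A_init = np.array([[0.9, 0.0],     # group 1: mixtures 1-2, load on source 1
                   [0.8, 0.1],
                   [0.05, 1.1],    # group 2: mixtures 3-4, load on source 2
                   [-0.02, 0.9]])
groups, gamma, lam = [[0, 1], [2, 3]], 1.0, 0.2

A = A_init.copy()
for j in range(A.shape[1]):          # each column of the mixing matrix
    for g in groups:
        block = A_init[g, j]
        w = 1.0 / max(np.linalg.norm(block), 1e-8) ** gamma  # adaptive weight
        norm = np.linalg.norm(A[g, j])
        shrink = max(0.0, 1 - lam * w / norm) if norm > 0 else 0.0
        A[g, j] = shrink * A[g, j]
print(np.round(A, 3))
# Groups with small initial norm are zeroed exactly; strongly loaded
# groups are barely shrunk, recovering the group-sparse pattern.
```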
|