251. Knowledge-fused Identification of Condition-specific Rewiring of Dependencies in Biological Networks / Tian, Ye, 30 September 2014
Gene network modeling is one of the major goals of systems biology research. It targets the middle layer of active biological systems that orchestrates the activities of genes and proteins, and can provide critical information to bridge the gap between causes and effects, which is essential for explaining the mechanisms underlying disease. Among network construction tasks, the rewiring of relevant network structure plays a critical role in determining the behavior of diseases. To systematically characterize the selectively activated regulatory components and mechanisms, modeling tools must be able to distinguish significant rewiring from random background fluctuations. While differential dependency networks cannot be constructed from existing knowledge alone, effective incorporation of prior knowledge into data-driven approaches can improve the robustness and biological relevance of network inference. Existing studies on protein-protein interactions and biological pathways provide a constantly accumulating body of rich domain knowledge. However, biological prior knowledge is neither condition-specific nor error-free; it serves only as an aggregated source of partially validated evidence gathered under diverse experimental conditions. Hence, direct incorporation of imperfect and non-specific prior knowledge into specific problems is prone to errors and theoretically problematic.
To address this challenge, we propose a novel mathematical formulation that enables the incorporation of prior knowledge into structural learning of biological networks as Gaussian graphical models, exploiting the strengths of both measurement data and prior knowledge. We propose a novel strategy to estimate and control the impact of unavoidable false positives in the prior knowledge, one that fully exploits the evidence from data while obtaining a "second opinion" through efficient consultation of the prior knowledge. By proposing a significance assessment scheme to detect statistically significant rewiring of the learned differential dependency network, our method can assign edge-specific p-values and specify edge types indicating one of six biological scenarios. The data-knowledge jointly inferred gene networks are relatively simple to interpret, yet still convey considerable biological information. Experiments on extensive simulation data and comparisons with peer methods demonstrate the effectiveness of the knowledge-fused differential dependency network (KDDN) in revealing statistically significant rewiring in biological networks, leveraging data-driven evidence and existing biological knowledge while remaining robust to false positive edges in the prior knowledge.
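The core differential-dependency idea can be illustrated, well outside the authors' KDDN formulation (which fuses prior knowledge into the learning itself), by fitting a sparse Gaussian graphical model per condition and comparing edge sets. Everything below — the two toy conditions, the penalty `alpha`, the edge threshold — is an illustrative assumption, not the thesis' method.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
p, n = 6, 500

# Two conditions share the same genes but one dependency is rewired:
# edge (0,1) is active in condition A, edge (0,2) in condition B.
prec_a = np.eye(p); prec_a[0, 1] = prec_a[1, 0] = 0.4
prec_b = np.eye(p); prec_b[0, 2] = prec_b[2, 0] = 0.4
xa = rng.multivariate_normal(np.zeros(p), np.linalg.inv(prec_a), n)
xb = rng.multivariate_normal(np.zeros(p), np.linalg.inv(prec_b), n)

def edges(x, alpha=0.05):
    """Edge set of an l1-penalized Gaussian graphical model fit."""
    prec = GraphicalLasso(alpha=alpha).fit(x).precision_
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(prec[i, j]) > 0.02}

# Symmetric difference of the two edge sets = candidate rewiring.
rewired = edges(xa) ^ edges(xb)
print(rewired)
```

KDDN additionally tests each candidate edge for significance and consults prior-knowledge networks; the naive symmetric difference above is exactly the step that benefits from that machinery.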
We also made significant efforts to disseminate the developed methods and tools to the research community. We developed an accompanying R package and Cytoscape plugin to provide both batch-processing capability and a user-friendly graphical interface. With these comprehensive software tools, we applied our method to several practically important biological problems: studying how yeast responds to stress, tracing the origin of ovarian cancer, evaluating drug treatment effectiveness, and other broader biological questions. In the yeast stress response study, our findings corroborated the existing literature. A network distance measure defined on the basis of KDDN provided a novel hypothesis on the origin of high-grade serous ovarian cancer. KDDN was also used in a novel study integrating network biology and imaging to evaluate drug treatment of brain tumors. Applications to many other problems also yielded promising biological results. / Ph. D.
252. A Theory-Based, Data-Driven Selection for the Regularization Parameter for LASSO / Coutinho, Daniel Martins, 25 March 2021
We provide a new way to select the regularization parameter for the LASSO and adaLASSO. It is based on theory and incorporates an estimate of the noise variance. We show theoretical properties of the procedure and Monte Carlo simulations demonstrating that it is able to handle more variables in the active set than other popular options for the regularization parameter.
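The thesis' exact rule is not reproduced in the abstract, but the flavor of a theory-based, variance-aware choice can be sketched with the classical rate λ ∝ σ̂·sqrt(2·log(p)/n). The pilot fit used to estimate σ, the constant 1.1, and the simulated data are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:3] = [2.0, -1.5, 1.0]   # sparse truth
y = X @ beta + rng.standard_normal(n)              # noise sd = 1

# Pilot fit to estimate the noise sd, then a theory-based lambda:
# lam = c * sigma_hat * sqrt(2 log(p) / n), a standard rate from lasso theory.
pilot = Lasso(alpha=0.1).fit(X, y)
sigma_hat = np.std(y - pilot.predict(X))
lam = 1.1 * sigma_hat * np.sqrt(2 * np.log(p) / n)

fit = Lasso(alpha=lam).fit(X, y)
active = np.flatnonzero(fit.coef_)
print(lam, active)
```

The point of a data-driven σ̂ is that λ automatically scales with the noise level instead of being tuned by cross-validation alone.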
253. Studies on Subword-based Low-Resource Neural Machine Translation: Segmentation, Encoding, and Decoding / Song, Haiyue, 25 March 2024
Kyoto University / New-system Course Doctorate / Doctor of Informatics / Degree No. Kou 25423 / Joho-haku No. 861 / 新制||情||144 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Program-Specific Professor Sadao Kurohashi; Professor Tatsuya Kawahara; Professor Ko Nishino / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
254. Unified Multidimensional Approach to the Inverse Problem for Acoustic Source Identification / Le Magueresse, Thibaut, 11 February 2016
Experimental characterization of acoustic sources is one of the essential steps in reducing the noise produced by industrial machinery. The aim of the thesis is to develop a complete procedure to localize and quantify both stationary and non-stationary sound sources radiating on a surface mesh by back-propagating a pressure field measured with a microphone array. The inverse problem is difficult to solve because it is generally ill-conditioned and subject to many sources of error. In this context, it is crucial to rely on a realistic description of the direct sound propagation model. In the frequency domain, the equivalent source method has been adapted to the acoustic imaging problem in order to estimate the transfer functions between the sources and the antenna, taking wave scattering around the object of interest into account. In the time domain, propagation is modeled as a convolution product between the source and an impulse response described in the time-wavenumber domain. Since the underdetermined acoustic inverse problem calls for all available prior knowledge about the source field, a Bayesian approach was adopted. Available a priori information about the acoustic sources was put into equations, and it was shown that accounting for their spatial sparsity or their omnidirectional radiation can significantly improve the results. Under the stated hypotheses, the inverse problem solution takes the regularized Tikhonov form. The regularization parameter was estimated by an empirical Bayesian approach; its superiority over methods commonly used in the literature was demonstrated through numerical and experimental studies. In the presence of high variability of the signal-to-noise ratio over time, it was shown that the parameter's value must be updated to obtain a satisfactory solution. Finally, introducing a missing variable that reflects partial ignorance of the propagation model improved, under certain conditions, the estimation of the complex source amplitudes in the presence of model errors. The proposed developments were applied, in situ, to characterize the sound power radiated per component of an automotive power train using the Bayesian focusing method within the Ecobex project. The cyclostationary acoustic field generated by an automotive fan was finally analyzed with the real-time near-field acoustic holography method.
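A bare-bones frequency-domain version of the back-propagation step might look like the sketch below. It uses plain Tikhonov with a fixed λ; the thesis' empirical-Bayes parameter estimation and equivalent-source transfer model are not reproduced, and the transfer matrix, source amplitudes, and noise level are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
m_mics, n_src = 30, 20
# Synthetic complex transfer matrix from source grid points to microphones.
A = rng.standard_normal((m_mics, n_src)) + 1j * rng.standard_normal((m_mics, n_src))
q_true = np.zeros(n_src, dtype=complex)
q_true[3], q_true[11] = 1.0, 0.5j        # two active sources on the mesh
noise = 0.05 * (rng.standard_normal(m_mics) + 1j * rng.standard_normal(m_mics))
p_meas = A @ q_true + noise              # measured pressures at the array

def tikhonov(A, b, lam):
    """Regularized back-propagation: q = (A^H A + lam I)^(-1) A^H b."""
    n = A.shape[1]
    return np.linalg.solve(A.conj().T @ A + lam * np.eye(n), A.conj().T @ b)

q_hat = tikhonov(A, p_meas, lam=0.1)
print(np.abs(q_hat).round(2))
```

In the thesis λ is not hand-picked as here but inferred from the data, and sparsity-promoting priors replace the quadratic penalty when the source field is known to be spatially sparse.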
255. The Impact of a Curious Type of Smoothness Conditions on Convergence Rates in l1-Regularization / Bot, Radu Ioan; Hofmann, Bernd, 31 January 2013
Tikhonov-type regularization of linear and nonlinear ill-posed problems in abstract spaces under sparsity constraints has gained considerable attention in recent years. Since, under some weak assumptions, all regularized solutions are sparse when the l1-norm is used as the penalty term, l1-regularization has been studied by numerous authors, although the non-reflexivity of the Banach space l1 and the fact that such a penalty functional is not strictly convex lead to serious difficulties. We consider the case in which the sparsity assumption is narrowly missed: the solutions may have an infinite number of nonzero but fast-decaying components. For that case we formulate and prove convergence rate results for the l1-regularization of nonlinear operator equations. In this context, we outline the situations of Hölder rates and of an exponential decay of the solution components.
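For readers new to l1-regularization of operator equations, here is a minimal iterative soft-thresholding (ISTA) sketch for the linear, finite-dimensional case (the paper itself treats nonlinear operators in abstract spaces), with a fast-decaying truth that "narrowly misses" sparsity in the paper's sense. Matrix sizes, the decay rate, and λ are illustrative assumptions.

```python
import numpy as np

def ista(A, y, lam, steps=500):
    """Iterative soft-thresholding for min 0.5 ||Ax - y||^2 + lam ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        g = A.T @ (A @ x - y)              # gradient of the smooth part
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((60, 100)) / np.sqrt(60)
# "Narrowly missed" sparsity: every component is nonzero but decays fast.
x_true = np.array([(-1.0) ** k * 2.0 ** (-k) for k in range(100)])
y = A @ x_true
x_hat = ista(A, y, lam=0.01)
print(np.linalg.norm(x_hat - x_true))
```

The convergence-rate results in the paper quantify how fast such reconstruction errors shrink as the noise level and λ go to zero, with the decay speed of the tail components governing the achievable rate.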
256. The Filippov Moments Solution on the Intersection of Two and Three Manifolds / Difonzo, Fabio Vito, 7 January 2016
In this thesis, we study the Filippov moments solution for differential equations with discontinuous right-hand side. In particular, our aim is to define a suitable Filippov sliding vector field on a co-dimension $2$ manifold $\Sigma$, the intersection of two co-dimension $1$ manifolds with linearly independent normals, and then to study the dynamics provided by this selection. Chapter 1 motivates our interest in the subject, presenting several problems from control theory, non-smooth dynamics, vehicle motion, and neural networks; we then introduce the co-dimension $1$ case and basic notation, from which we set up our specific problem in its most general context. In Chapter 2 we propose and compare several approaches to selecting a Filippov sliding vector field in the particular case of nodally attractive $\Sigma$; amongst these proposals, in Chapter 3 we focus on what we call the \emph{moments solution}, the main and novel mathematical object presented and studied in this thesis. There, we extend the validity of the moments solution to $\Sigma$ attractive under general sliding conditions, proving results on the smoothness of the Filippov sliding vector field on $\Sigma$, tangential exit at first-order exit points, and uniqueness at potential exit points among all other admissible solutions. In Chapter 4 we propose a completely new and different perspective from which to look at the problem: we study minimum-variation solutions for Filippov sliding vector fields in $\mathbb{R}^{3}$, taking advantage of the relatively simple form of the Euler-Lagrange equation provided by the analysis, and of the orbital equivalence available when $\Sigma$ has no equilibrium points on it; we then remove this assumption and extend our results. In Chapter 5, examples and numerical implementations corroborate our theoretical results and show that selecting a Filippov sliding vector field on $\Sigma$ without the required properties of smoothness and exit at first-order exit points leads to dynamics that make no sense, developing undesirable singularities. Finally, Chapter 6 presents an extension of the moments method to co-dimension $3$ and higher: this is the first result providing a unique admissible solution for this problem.
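As background for the co-dimension 2 problem the thesis studies, here is the classical co-dimension 1 construction it generalizes: on Σ = {h = 0}, Filippov's sliding field is the unique convex combination of the two fields tangent to Σ. (In co-dimension 2 this combination is no longer unique, which is precisely what the moments solution resolves.) The two toy fields below are assumptions for illustration.

```python
import numpy as np

# Co-dimension 1 Filippov sliding: on Sigma = {x : h(x) = 0}, take
# f_s = (1 - a) f1 + a f2 with a chosen so that grad(h) . f_s = 0 (tangency).
def sliding_field(f1, f2, grad_h, x):
    n = grad_h(x)
    v1, v2 = f1(x), f2(x)
    a = (n @ v1) / (n @ (v1 - v2))   # Filippov's convex coefficient
    return (1 - a) * v1 + a * v2

# Toy example: h(x) = x[1]; both fields push toward Sigma, so sliding occurs.
f1 = lambda x: np.array([1.0, -1.0])   # vector field used where h > 0
f2 = lambda x: np.array([0.5, 2.0])    # vector field used where h < 0
grad_h = lambda x: np.array([0.0, 1.0])

x = np.array([0.0, 0.0])
fs = sliding_field(f1, f2, grad_h, x)
print(fs)   # second component is 0: the sliding field is tangent to Sigma
```

Here a = 1/3 and the sliding field is (5/6, 0), staying on Σ as required; on an intersection of two manifolds, two tangency conditions must be met with three or more convex weights, hence the selection problem.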
257. Using Regularization for Error Reduction in GRACE Gravity Estimation / Save, Himanshu Vijay, 2 June 2010
The Gravity Recovery and Climate Experiment (GRACE) is a joint National Aeronautics and Space Administration / Deutsches Zentrum für Luft- und Raumfahrt (NASA/DLR) mission to map the time-variable and mean gravity field of the Earth, launched on March 17, 2002. The nature of the gravity field inverse problem amplifies the noise in the data, which creeps into the mid and high degree-and-order harmonic coefficients of the Earth's monthly gravity field estimates and makes the GRACE estimation problem ill-posed. These errors, due to the use of imperfect models and data noise, manifest as peculiar north-south striping in the monthly global maps of equivalent water height.

To reduce these errors, this study develops a methodology based on the Tikhonov regularization technique, using the L-curve method in combination with the orthogonal transformation method. The L-curve is a popular aid for determining a suitable value of the regularization parameter when solving linear discrete ill-posed problems with Tikhonov regularization. However, the computational effort required to determine the L-curve can be prohibitive for a large-scale problem like GRACE. This study implements a parameter-choice method using Lanczos bidiagonalization, a computationally inexpensive approximation to the L-curve called the L-ribbon, which projects the large estimation problem onto a problem about two orders of magnitude smaller. Using knowledge of the characteristics of the systematic errors in GRACE solutions, this study designs a new regularization matrix that reduces the systematic errors without attenuating the signal. The regularization matrix constrains the geopotential coefficients as a function of their degree and order. The regularization algorithms are implemented in a parallel computing environment. A five-year time series of the candidate regularized solutions shows markedly reduced systematic errors without any reduction in the variability signal compared to the unconstrained solutions. The variability signals in the regularized series show good agreement with hydrological models in small and medium-sized river basins, and also reveal non-seasonal signals in the oceans without the need for post-processing.
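A toy version of the L-curve machinery can be sketched as follows, with SVD filter factors standing in for the study's Lanczos-based L-ribbon and a crude farthest-from-chord rule standing in for a proper curvature-based corner finder. The problem size, spectrum, and noise level are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 50
# Ill-posed toy problem: a rapidly decaying spectrum mimics the noise
# amplification in the mid/high degree-and-order gravity coefficients.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((m, m)))
s = 10.0 ** np.linspace(0, -6, m)
A = U @ np.diag(s) @ V.T
x_true = V @ rng.standard_normal(m)
b = A @ x_true + 1e-4 * rng.standard_normal(m)

def tikhonov(lam):
    """Filter-factor form of the Tikhonov solution of min ||Ax-b||^2 + lam ||x||^2."""
    x = V @ (s / (s**2 + lam) * (U.T @ b))
    return np.linalg.norm(A @ x - b), np.linalg.norm(x), x

# Sample the L-curve in log-log space and take the "corner" as the point
# farthest from the chord joining the endpoints (a crude curvature proxy).
lams = 10.0 ** np.linspace(-14, 0, 80)
pts = np.array([[np.log(r), np.log(xn)] for r, xn, _ in map(tikhonov, lams)])
d = pts[-1] - pts[0]
dist = np.abs(d[0] * (pts[:, 1] - pts[0, 1]) - d[1] * (pts[:, 0] - pts[0, 0]))
lam_star = lams[np.argmax(dist)]

err = lambda lam: np.linalg.norm(tikhonov(lam)[2] - x_true)
print(lam_star, err(lam_star), err(1e-16))
```

Evaluating 80 full Tikhonov solutions is affordable here; for GRACE-sized systems it is not, which is why the study approximates the curve with the Lanczos-based L-ribbon on a much smaller projected problem.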
258. Regularization Methods for Predicting an Ordinal Response Using Longitudinal High-dimensional Genomic Data / Hou, Jiayi, 25 November 2013
Ordinal scales are commonly used to measure health status and disease-related outcomes in hospital settings as well as in translational medical research. Notable examples include cancer staging, a five-category ordinal scale indicating tumor size, node involvement, and likelihood of metastasizing, and the Glasgow Coma Scale (GCS), an ordinal measure giving a reliable and objective assessment of a patient's conscious state. In addition, repeated measurements are common in clinical practice for tracking and monitoring the progression of complex diseases. Classical ordinal modeling methods based on the likelihood approach have contributed to the analysis of data in which the response categories are ordered and the number of covariates (p) is smaller than the sample size (n). With genomic technologies being increasingly applied to obtain more accurate diagnoses and prognoses, a new type of data, known as high-dimensional data, in which the number of covariates (p) is much larger than the number of samples (n), is being generated. However, statistical methodologies and computational software are lacking for analyzing high-dimensional data with an ordinal or longitudinal ordinal response. In this thesis, we develop a regularization algorithm to build a parsimonious model for predicting an ordinal response. In addition, we combine the classical ordinal model for longitudinal measurements with cutting-edge data-mining tools for a comprehensive understanding of the causes of complex disease at both the molecular and environmental levels. Moreover, we develop a corresponding R package for general use. The algorithm was applied to several real datasets as well as to simulated data to demonstrate its efficiency in variable selection and its precision in prediction and classification.

The four real datasets are from: 1) the National Institute of Mental Health Schizophrenia Collaborative Study; 2) the San Diego Health Services Research Example; 3) a gene expression experiment to understand 'Decreased Expression of Intelectin 1 in The Human Airway Epithelium of Smokers Compared to Nonsmokers' by Weill Cornell Medical College; and 4) the National Institute of General Medical Sciences Inflammation and the Host Response to Burn Injury Collaborative Study.
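The penalized cumulative-logit idea behind such methods can be sketched as follows. A smoothed l1 penalty and a generic optimizer stand in for the thesis' algorithm and R package, and the three-category data, penalty weight, and threshold parametrization are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(5)
n, p = 300, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[0], beta_true[1] = 1.5, -1.5   # only two informative covariates
eta = X @ beta_true + rng.logistic(size=n)
y = np.digitize(eta, [-1.0, 1.0])        # three ordered categories: 0 < 1 < 2

def objective(params, lam):
    t0 = params[0]
    t1 = t0 + np.exp(params[1])          # keeps the thresholds ordered
    beta = params[2:]
    z = X @ beta
    c0, c1 = expit(t0 - z), expit(t1 - z)   # cumulative P(y <= 0), P(y <= 1)
    probs = np.select([y == 0, y == 1], [c0, c1 - c0], default=1.0 - c1)
    nll = -np.sum(np.log(np.clip(probs, 1e-12, None)))
    return nll + lam * np.sum(np.sqrt(beta**2 + 1e-8))  # smoothed l1 penalty

res = minimize(objective, np.zeros(p + 2), args=(5.0,), method="L-BFGS-B")
beta_hat = res.x[2:]
print(np.round(beta_hat, 2))
```

With p in the thousands (the genomic setting), the thesis' dedicated regularization algorithm replaces this generic optimizer, but the proportional-odds likelihood plus sparsity penalty is the same structural idea.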
259. Graph-based Regularization in Machine Learning: Discovering Driver Modules in Biological Networks / Gao, Xi, 1 January 2015
The curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, Mendel's laws, and the Punnett square to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer it using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise, and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, and train classification models that serve two goals: 1) class prediction for previously unseen samples; and 2) knowledge discovery of the underlying causes of class differences. Applying our methods in genetic studies can help scientists take advantage of existing biological networks, generate diagnoses with higher accuracy, and discover the driver networks behind the differences. We propose three new graph-based regularization algorithms. The Graph Connectivity Constrained AdaBoost algorithm combines a connectivity module, a deletion function, and a model retraining procedure with the AdaBoost classifier. The Graph-regularized Linear Programming Support Vector Machine integrates a penalty term based on a submodular graph-cut function into the linear classifier's objective function. Proximal Graph LogisticBoost adds lasso and graph-based penalties to the logistic risk function of an ensemble classifier. Tests of our models on simulated biological datasets show that the proposed methods produce accurate, sparse classifiers and can help discover true genetic differences between phenotypes.
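One common form of graph-based regularization (a Laplacian smoothness term, not necessarily the dissertation's submodular-cut or boosting variants) adds β^T L β to a classifier's loss, encouraging features connected in a known network to share weight. The toy network, data, penalty weight, and plain gradient descent below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 200, 6
# Feature graph: features 0-1-2 form a connected "driver module".
edges = [(0, 1), (1, 2)]
L = np.zeros((p, p))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1           # combinatorial graph Laplacian

X = rng.standard_normal((n, p))
beta_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = (X @ beta_true + rng.logistic(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Minimize logistic loss + lam * beta^T L beta by plain gradient descent.
beta = np.zeros(p)
lam, step = 0.5, 0.01
for _ in range(2000):
    g = X.T @ (sigmoid(X @ beta) - y) / n + 2 * lam * L @ beta
    beta -= step * g
print(beta.round(2))
```

The Laplacian term penalizes weight differences across edges, so signal concentrated on a connected module survives shrinkage better than isolated spurious features — the intuition behind "driver module" discovery.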
260. Modifications of Stochastic Objects / Kadlec, Karel, January 2012
In this thesis, we are concerned with modifications of stochastic processes and of random probability measures. The first chapter is devoted to modifications of a stochastic process with paths in the space of continuous functions, modifications of a submartingale with paths in the set of right-continuous functions with finite left-hand limits, and separable modifications of a stochastic process. The second chapter focuses on the regularization of a random probability measure into a Markov kernel. In particular, we work with random probability measures on Borel subsets of a Polish space, or on a Radon separable topological space.