Global ETD Search

1	Convex Analysis And Flows In Infinite Networks Wattanataweekul, Hathaikarn 13 May 2006 We study the existence of flows in infinite networks and extend basic theorems due to Gale and Hoffman and to Ford and Fulkerson. The classical approach to finite networks uses a constructive combinatorial algorithm that has become known as the labelling algorithm. Our approach to infinite networks involves Hahn–Banach type theorems on the existence of certain linear functionals. Thus the main tools are from the theory of functional and convex analysis. In Chapter II, we discuss sublinear and linear functionals on real vector spaces in the spirit of the work of König. In particular, a generalization of König's minimum theorem is established. Our theory leads to some useful interpolation results. We also establish a variant of the main interpolation theorem in the context of convex cones. We reformulate the results of Ford–Fulkerson and Gale–Hoffman in terms of certain additive and biadditive set functions. In Chapter III, we show that the space of all additive set functions may be canonically identified with the dual space of a space of certain step functions and that the space of all biadditive set functions may be identified with the dual space of a space of certain step functions in two variables. Our work on additive set functions is in the spirit of classical measure theory, while the case of biadditive set functions resembles the theory of product measures. In Chapter IV, we develop an extended version of the Gale–Hoffman theorem on the existence of flows in infinite networks in a setting of measure-theoretic flavor. This general flow theorem is one of our central results. We discuss, as an application of our flow theorem, a Ford–Fulkerson type result on maximal flows and minimal cuts in infinite networks containing sources and sinks. 
In addition, we present applications to flows in locally finite networks and to the existence of antisymmetric flows under certain natural conditions. We conclude with a discussion of the case of triadditive set functions. In the appendix, we review briefly the classical theory of maximal flows and minimal cuts in networks with finitely many nodes. convex analysis network flows
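For readers unfamiliar with the finite case that the abstract above takes as its starting point, the following is a minimal sketch of a labelling-style maximum-flow computation (breadth-first augmenting paths). It is an illustration of the classical finite theory only, not code from the thesis; the function name `max_flow_min_cut`, the dictionary-based network format, and the example network are assumptions made for this sketch, and the source and sink are assumed to be distinct.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Return (flow value, source side of a minimum cut) for a finite network.

    capacity maps arcs (u, v) to non-negative capacities (hypothetical format).
    """
    # Residual capacities, with a zero-capacity reverse arc for every arc.
    residual = {}
    neighbours = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    def label_from_source():
        # Breadth-first labelling: label[v] is the node from which v was reached.
        label = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbours.get(u, ()):
                if v not in label and residual[(u, v)] > 0:
                    label[v] = u
                    queue.append(v)
        return label

    value = 0
    while True:
        label = label_from_source()
        if sink not in label:
            # No augmenting path is left: the labelled nodes form the
            # source side of a minimum cut.
            return value, set(label)
        # Trace the augmenting path back from the sink and find its bottleneck.
        path = []
        v = sink
        while label[v] is not None:
            path.append((label[v], v))
            v = label[v]
        bottleneck = min(residual[arc] for arc in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        value += bottleneck


# Small illustrative network (made up for this sketch).
example = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow_value, source_side = max_flow_min_cut(example, "s", "t")
cut_capacity = sum(c for (u, v), c in example.items()
                   if u in source_side and v not in source_side)
assert flow_value == cut_capacity  # max-flow equals min-cut on this example
```

The final assertion checks the max-flow min-cut equality on the example: the capacity of the arcs leaving the labelled set matches the flow value.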
2	Sobre medidas unicamente maximizantes e outras questões em otimização ergódica [On uniquely maximizing measures and other questions in ergodic optimization] Spier, Thomás Jung January 2016 In this thesis we study dynamical systems from the viewpoint of ergodic optimization. We analyze the problem of maximizing integrals of potentials with respect to probabilities that are invariant under the dynamics. We show that every ergodic measure is uniquely maximizing for some potential. We also verify that the set of potentials with exactly one maximizing measure is residual. These results are obtained through techniques of ergodic theory and convex analysis. Ergodic optimization Probabilities Convex analysis Uniquely maximizing probabilities
5	A Study of Machine Learning Approaches for Biomedical Signal Processing Shen, Minjie 10 June 2021 The introduction of high-throughput molecular profiling technologies provides the capability of studying diverse biological systems at the molecular level. However, due to various limitations of measurement instruments, data preprocessing is often required in biomedical research. Improper preprocessing will have a negative impact on downstream analytical tasks. This thesis studies two important preprocessing topics: missing value imputation and between-sample normalization. Missing data is a major issue in quantitative proteomics data analysis. While many methods have been developed for imputing missing values in high-throughput proteomics data, comparative assessment of the accuracy of existing methods remains inconclusive, mainly because the true missing mechanisms are complex and the existing evaluation methodologies are imperfect. Moreover, few studies have provided an outlook on current and future development. We first report an assessment of eight representative methods collectively targeting three typical missing mechanisms. The selected methods are compared on both realistic simulation and real proteomics datasets, and the performance is evaluated using three quantitative measures. We then discuss fused regularization matrix factorization, a popular low-rank matrix factorization framework with similarity and/or biological regularization, which is extendable to integrating multi-omics data such as gene expressions or clinical variables. We further explore the potential application of convex analysis of mixtures, a biologically inspired latent variable modeling strategy, to missing value imputation. The preliminary results on proteomics data are provided together with an outlook into future development directions. 
While a few winners emerged from our comparative assessment, data-driven evaluation of imputation methods is imperfect because performance is evaluated indirectly on artificially missing or masked values rather than authentic missing values. Imputation accuracy may vary with signal intensity. Fused regularization matrix factorization provides a way to incorporate external information. Convex analysis of mixtures presents a biologically plausible new approach. Data normalization is essential to ensure accurate inference and comparability of gene expressions across samples or conditions. Ideally, gene expressions should be rescaled based on consistently expressed reference genes. However, for normalizing biologically diverse samples, the most commonly used reference genes have exhibited striking expression variability, and distribution-based approaches can be problematic when differentially expressed genes are significantly asymmetric. We introduce a Cosine score based iterative normalization (Cosbin) strategy to normalize biologically diverse samples. The between-sample normalization is based on iteratively identified consistently expressed genes, where differentially expressed genes are sequentially eliminated according to scale-invariant Cosine scores. We evaluate the performance of Cosbin and four other representative normalization methods (Total count, TMM/edgeR, DESeq2, DEGES/TCC) on both idealistic and realistic simulation data sets. Cosbin consistently outperforms the other methods across various performance criteria. Implemented in open-source R scripts and applicable to grouped or individual samples, the Cosbin tool will allow biologists to detect subtle yet important molecular signals across known or novel phenotypic groups. / Master of Science / Data preprocessing is often required due to various limitations of measurement instruments in biomedical research. 
This thesis studies two important preprocessing topics: missing value imputation and between-sample normalization. Missing data is a major issue in quantitative proteomics data analysis. Imputation is the process of substituting estimates for missing values. We propose a more realistic assessment workflow which can preserve the original data distribution, and then assess eight representative general-purpose imputation strategies. We explore two biologically inspired imputation approaches: fused regularization matrix factorization (FRMF) and convex analysis of mixtures (CAM) imputation. FRMF integrates external information such as clinical variables and multi-omics data into imputation, while CAM imputation incorporates biological assumptions. We show that the integration of biological information improves the imputation performance. Data normalization is required to ensure correct comparison. For gene expression data, between-sample normalization is needed. We propose a Cosine score based iterative normalization (Cosbin) strategy to normalize biologically diverse samples. We show that Cosbin significantly outperforms other methods in both ideal and realistic simulations. Implemented in open-source R scripts and applicable to grouped or individual samples, the Cosbin tool will allow biologists to detect subtle yet important molecular signals across known or novel cell types. bioinformatics imputation matrix factorization convex analysis normalization machine learning
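The abstract above describes the Cosine scores as scale-invariant. The sketch below illustrates that property only; it is not the Cosbin implementation. It assumes, for illustration, that a gene is scored by the cosine between its vector of group-mean expressions and the all-ones vector, so that a consistently expressed gene scores close to 1. The function names and the toy values are hypothetical.

```python
import math


def cosine_to_ones(values):
    """Cosine between a non-negative expression vector and the all-ones vector.

    Assumption for this sketch: the all-ones vector stands for "equal
    expression in every group", so a score near 1 suggests a consistently
    expressed gene.
    """
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return 0.0
    return sum(values) / (norm * math.sqrt(len(values)))


def rank_by_consistency(gene_means):
    """Return gene names ordered from most to least consistently expressed."""
    return sorted(gene_means, key=lambda g: cosine_to_ones(gene_means[g]), reverse=True)


# Toy group means for three hypothetical genes across three groups.
toy = {
    "geneA": [5.0, 5.0, 5.0],   # identical in every group
    "geneB": [5.0, 4.0, 6.0],   # mild variation
    "geneC": [10.0, 1.0, 1.0],  # strongly group-specific
}
ranking = rank_by_consistency(toy)

# Scale invariance: multiplying a gene's values by a constant leaves its score unchanged.
assert math.isclose(cosine_to_ones([10.0, 1.0, 1.0]), cosine_to_ones([20.0, 2.0, 2.0]))
```

Because the score depends only on the direction of the vector, genes with very different overall abundance can be compared on the same footing, which is the sense in which such a score is scale-invariant.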
6	Design and Implementation of Convex Analysis of Mixtures Software Suite Meng, Fan 10 September 2012 Various convex analysis of mixtures (CAM) based algorithms have been developed to address real-world blind source separation (BSS) problems and have been shown to perform well in previous papers. This thesis reports the implementation of a comprehensive software package, CAM-Java, which contains three different CAM-based algorithms: CAM compartment modeling (CAM-CM), CAM non-negative independent component analysis (CAM-nICA), and CAM non-negative well-grounded component analysis (CAM-nWCA). The implementation work includes translating the MATLAB-coded algorithms to open-source R alternatives and building a user-friendly graphical user interface (GUI) that integrates the three algorithms, which is accomplished by adopting the Java Swing API. In order to combine the R and Java modules, the open-source project RCaller is used to establish the low-level connection between the R and Java environments. In addition, specific R scripts and Java classes are implemented to pass parameters and input data from Java to R, run R scripts in the Java environment, read R results back into Java, display R-generated figures, and so on. Furthermore, system stream redirection and multi-threading techniques are used to build a simple window in the Java GUI that displays R messages. The final version of the software runs smoothly and stably, and the CAM-CM results on both simulated and real DCE-MRI data are quite close to those of the original MATLAB algorithms. The whole GUI-based open-source software is easy to use and can be freely distributed among the communities. Technical details of both the R and Java module implementations are also discussed, which provides some good examples of how to develop software that combines complicated, up-to-date algorithms with a decent, user-friendly GUI in scientific or engineering research fields. 
/ Master of Science R script Graphic User Interface Convex Analysis of Mixtures Compartment Modeling
7	On the Generalizations of Gershgorin's Theorem Lee, Sang-Gu 01 May 1986 This paper deals with generalizations of Gershgorin's theorem. The theorem is investigated and generalized in terms of contour integrals, directed graphs, convex analysis, and block matrices. These results are shown to apply to some specified matrices, such as stable and stochastic matrices, and some examples show the relationship among the resulting eigenvalue inclusion regions. generalization Gershgorin Theorem contour integral directed graph convex analysis block matrix Mathematics
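As background for the entry above, the classical Gershgorin theorem states that every eigenvalue of a square matrix lies in at least one disc centred at a diagonal entry, with radius equal to the sum of the absolute values of the other entries in that row. The sketch below checks this on a small example; it illustrates the classical statement only, not the generalizations studied in the paper, and the function names and the example matrix are assumptions made for this sketch.

```python
import cmath


def gershgorin_discs(matrix):
    """Return one (centre, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        # Radius: sum of absolute values of the off-diagonal entries in row i.
        radius = sum(abs(x) for j, x in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


def eigenvalues_2x2(matrix):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def in_some_disc(z, discs, tol=1e-12):
    """True if the complex number z lies in at least one of the discs."""
    return any(abs(z - centre) <= radius + tol for centre, radius in discs)


# Example matrix chosen for illustration.
A = [[4, 1], [2, -3]]
discs = gershgorin_discs(A)  # discs centred at 4 (radius 1) and -3 (radius 2)
assert all(in_some_disc(z, discs) for z in eigenvalues_2x2(A))
```

The closed-form 2x2 eigenvalue formula keeps the example free of external dependencies; for larger matrices a numerical eigenvalue routine would be needed.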
8	Metabolic design of dynamic bioreaction models Provost, Agnès 06 November 2006 This thesis is concerned with the derivation of bioprocess models intended for engineering purposes. In contrast with other techniques, the methodology used to derive a macroscopic model is based on available intracellular information. This information is extracted from the metabolic network describing the intracellular metabolism. The aspects of metabolic regulation are modeled by representing the metabolism of cultured cells with several metabolic networks. Here we present a systematic methodology for deriving macroscopic models when such metabolic networks are known. A separate model is derived for each “phase” of the culture. Each of these models relies upon a set of macroscopic bioreactions that summarizes the information contained in the corresponding metabolic network. Such a set of macroscopic bioreactions is obtained by translating the set of Elementary Flux Modes, which are well-known tools in the Systems Biology community. The Elementary Flux Modes are described in the theory of Convex Analysis. They represent pathways across metabolic networks. Once the set of Elementary Flux Modes is computed and translated into macroscopic bioreactions, a general model could be obtained for the type of culture under investigation. However, depending on the size and the complexity of the metabolic network, such a model could contain hundreds, and even thousands, of bioreactions. Since the kinetics of each such bioreaction involve at least one parameter that needs to be identified, the reduction of the general model to a more manageable size is desirable. Convex Analysis provides further results that allow for the selection of a macroscopic bioreaction subset. This selection is based on the data collected from the available experiments. The selected bioreactions then allow for the construction of a model for the experiments at hand. 
Elementary flux modes Convex analysis Dynamical modelling Metabolic flux analysis Metabolic network
9	Learning Statistical and Geometric Models from Microarray Gene Expression Data Zhu, Yitan 01 October 2009 In this dissertation, we propose and develop innovative data modeling and analysis methods for extracting meaningful and specific information about disease mechanisms from microarray gene expression data. To provide a high-level overview of gene expression data for easy and insightful understanding of data structure, we propose a novel statistical data clustering and visualization algorithm that is comprehensively effective for multiple clustering tasks and that overcomes some major limitations of existing clustering methods. The proposed clustering and visualization algorithm performs progressive, divisive hierarchical clustering and visualization, supported by hierarchical statistical modeling, supervised/unsupervised informative gene/feature selection, supervised/unsupervised data visualization, and user/prior knowledge guidance through human-data interactions, to discover cluster structure within complex, high-dimensional gene expression data. For the purpose of selecting suitable clustering algorithm(s) for gene expression data analysis, we design an objective and reliable clustering evaluation scheme to assess the performance of clustering algorithms by comparing their sample clustering outcome to phenotype categories. Using the proposed evaluation scheme, we compared the performance of our newly developed clustering algorithm with those of several benchmark clustering methods, and demonstrated the superior and stable performance of the proposed clustering algorithm. To identify the underlying active biological processes that jointly form the observed biological event, we propose a latent linear mixture model that quantitatively describes how the observed gene expressions are generated by a process of mixing the latent active biological processes. We prove a series of theorems to show the identifiability of the noise-free model. 
Based on relevant geometric concepts, convex analysis and optimization, gene clustering, and model stability analysis, we develop a robust blind source separation method that fits the model to the gene expression data and subsequently identifies the underlying biological processes and their activity levels under different biological conditions. Based on the experimental results obtained on cancer, muscle regeneration, and muscular dystrophy gene expression data, we believe that the research work presented in this dissertation not only contributes to the engineering research areas of machine learning and pattern recognition, but also provides novel and effective solutions to many biomedical research problems, improving the understanding of disease mechanisms. / Ph. D. Blind Source Separation Convex Analysis and Optimization Gene Expressions Clustering Evaluation Data Clustering and Visualization
10	Machine Learning Approaches for Modeling and Correction of Confounding Effects in Complex Biological Data Wu, Chiung Ting 09 June 2021 With the huge volume of biological data generated by new technologies and the booming of new machine learning based analytical tools, we expect to advance life science and human health at an unprecedented pace. Unfortunately, there is a significant gap between the complex raw biological data from real life and the data required by mathematical and statistical tools. This gap arises from two fundamental and universal problems in biological data that are both related to confounding effects. The first is the intrinsic complexity of the data. An observed sample could be the mixture of multiple underlying sources, and we may be interested in only one or some of the sources. The second type of complexity comes from the acquisition process of the data. Different samples may be gathered at different times and/or from different locations. Therefore, each sample is associated with a specific distortion that must be carefully addressed. These confounding effects obscure the signals of interest in the acquired data. Specifically, this dissertation addresses the two major challenges in removing confounding effects: alignment and deconvolution. Liquid chromatography–mass spectrometry (LC-MS) is a standard method for proteomics and metabolomics analysis of biological samples. Unfortunately, it suffers from various changes in the retention time (RT) of the same compound in different samples, and these must be subsequently corrected (aligned) during data processing. Classic alignment methods, such as those in the popular XCMS package, often assume a single time-warping function for each sample. Thus, the potentially varying RT drift for compounds with different masses in a sample is neglected in these methods. Moreover, the systematic change in RT drift across run order is often not considered by alignment algorithms. 
Therefore, these methods cannot effectively correct all misalignments. To utilize this information, we develop an integrated reference-free profile alignment method, neighbor-wise compound-specific Graphical Time Warping (ncGTW), that can detect misaligned features and align profiles by leveraging expected RT drift structures and compound-specific warping functions. Specifically, ncGTW uses individualized warping functions for different compounds and assigns constraint edges on warping functions of neighboring samples. We applied ncGTW to two large-scale metabolomics LC-MS datasets, where it identified many misaligned features and successfully realigned them. These features would otherwise be discarded or left uncorrected by existing methods. When the desired signal is buried in a mixture, deconvolution is needed to recover the pure sources. Many biological questions can be better addressed when the data is in the form of individual sources, instead of mixtures. Though there are some promising supervised deconvolution methods, when there is no a priori information, unsupervised deconvolution is still needed. Among current unsupervised methods, Convex Analysis of Mixtures (CAM) is the most theoretically solid and strongest performing one. However, there are some major limitations of this method. Most importantly, the overall time complexity can be very high, especially when analyzing a large dataset or a dataset with many sources. Also, since there are some stochastic and heuristic steps, the deconvolution result is not accurate enough. To address these problems, we redesigned the modules of CAM. In the feature clustering step, we propose a clustering method, radius-fixed clustering, which not only controls the spatial size of each cluster but also identifies outliers at the same time. Therefore, the disadvantages of K-means clustering, such as instability and the need to specify the number of clusters, are avoided. 
Moreover, when identifying the convex hull, we replace Quickhull with linear programming, which decreases the computation time significantly. To avoid the step in optimal simplex identification that is both heuristic and approximate, we propose a greedy search strategy instead. The experimental results demonstrate a vast improvement in computation time. The accuracy of the deconvolution is also shown to be higher than that of the original CAM. / Doctor of Philosophy / Due to the complexity of biological data, there are two major pre-processing steps: alignment and deconvolution. The alignment step corrects the time- and location-related data acquisition distortion by aligning the detected signals to a reference signal. Though many alignment methods have been proposed for biological data, most of them fail to consider the relationships among samples carefully. This structural information can help alignment when the data is noisy and/or irregular. To utilize this information, we develop a new method, Neighbor-wise Compound-specific Graphical Time Warping (ncGTW), inspired by graph theory. This new alignment method not only utilizes the structural information but also provides a reference-free solution. We show that the performance of our new method is better than that of other methods in both simulations and real datasets. When the signal is from a mixture, deconvolution is needed to recover the pure sources. Many biological questions can be better addressed when the data is in the form of single sources, instead of mixtures. There is a classic unsupervised deconvolution method: Convex Analysis of Mixtures (CAM). However, there are some limitations of this method. For example, the time complexity of some steps is very high. Thus, when facing a large dataset or a dataset with many sources, the computation time would be extremely long. Also, since there are some stochastic and heuristic steps, the deconvolution result may not be accurate enough. 
We improved CAM, and the experimental results show that the speed and accuracy of the deconvolution are significantly improved. bioinformatics multiple alignment deconvolution unsupervised learning convex analysis feature selection tissue heterogeneity
