31 |
Alinhamento de seqüências biológicas com o uso de algoritmos genéticos / Alignment of biological sequences using genetic algorithms. Ogata, Sabrina Oliveira 14 March 2005 (has links)
Universidade Federal de São Carlos / The comparison of genome sequences from different organisms is one of the computational applications most frequently used by molecular biologists. This operation supports other processes such as the determination of the three-dimensional structure of proteins. In recent years, a significant increase has been observed in the use of Evolutionary Algorithms, especially Genetic Algorithms (GAs), for optimization problems such as DNA or protein sequence alignment. GAs are search techniques inspired by the mechanisms of natural selection and genetics. In this work, a GA was used to develop a computational tool for the pairwise sequence alignment problem. A benchmark comparison of this program with the bl2seq tool (from the Blast package) is presented, using sequences of varying similarity and length; in some specific cases the GA outperformed bl2seq. This work contributes to the field of Bioinformatics with a new methodology that can serve as an additional option for sequence analysis in genome projects and for refining the results obtained by other commonly used tools such as Blast.
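As a rough illustration of the approach described above, the sketch below evolves a pairwise alignment with a toy genetic algorithm in Python. The gap-column encoding, the scoring scheme (match +2, mismatch -1, gap -2), the truncation selection, and the omission of crossover are all simplifying assumptions for illustration; they are not the operators or parameters of the tool developed in the thesis.

```python
import random

MATCH, MISMATCH, GAP = 2, -1, -2      # toy scoring scheme (assumed)

def make_individual(n1, n2, L):
    # an individual = which of the L alignment columns hold gaps in each sequence
    return (sorted(random.sample(range(L), L - n1)),
            sorted(random.sample(range(L), L - n2)))

def express(seq, gaps, L):
    gapset, it = set(gaps), iter(seq)
    return ["-" if i in gapset else next(it) for i in range(L)]

def fitness(ind, s1, s2, L):
    a, b = express(s1, ind[0], L), express(s2, ind[1], L)
    score = 0
    for x, y in zip(a, b):
        score += GAP if "-" in (x, y) else (MATCH if x == y else MISMATCH)
    return score

def mutate(ind, L):
    g1, g2 = list(ind[0]), list(ind[1])
    g = random.choice([g1, g2])
    if g:                              # move one gap to a currently free column
        i = random.randrange(len(g))
        g[i] = random.choice([c for c in range(L) if c not in g])
        g.sort()
    return g1, g2

def ga_align(s1, s2, pop_size=60, gens=300):
    L = max(len(s1), len(s2)) + min(len(s1), len(s2)) // 2   # heuristic length
    pop = [make_individual(len(s1), len(s2), L) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda ind: fitness(ind, s1, s2, L), reverse=True)
        elite = pop[: pop_size // 2]                         # truncation selection
        pop = elite + [mutate(random.choice(elite), L) for _ in elite]
    best = max(pop, key=lambda ind: fitness(ind, s1, s2, L))
    return "".join(express(s1, best[0], L)), "".join(express(s2, best[1], L))

a, b = ga_align("GATTACA", "GCATGCA")
print(a)
print(b)
```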
|
32 |
Extended stochastic dynamics : theory, algorithms, and applications in multiscale modelling and data science. Shang, Xiaocheng January 2016 (has links)
This thesis addresses the sampling problem in a high-dimensional space, i.e., the computation of averages with respect to a defined probability density that is a function of many variables. Such sampling problems arise in many application areas, including molecular dynamics, multiscale models, and Bayesian sampling techniques used in emerging machine learning applications. Of particular interest are thermostat techniques, in the setting of a stochastic-dynamical system, that preserve the canonical Gibbs ensemble defined by an exponentiated energy function. In this thesis we explore theory, algorithms, and numerous applications in this setting. We begin by comparing numerical methods for particle-based models. The class of methods considered includes dissipative particle dynamics (DPD) as well as a newly proposed stochastic pairwise Nosé-Hoover-Langevin (PNHL) method. Splitting methods are developed and studied in terms of their thermodynamic accuracy, two-point correlation functions, and convergence. When computational efficiency is measured by the ratio of thermodynamic accuracy to CPU time, we report significant advantages in simulation for the PNHL method compared to popular alternative schemes in the low-friction regime, without degradation of convergence rate. We propose a pairwise adaptive Langevin (PAdL) thermostat that fully captures the dynamics of DPD and thus can be directly applied in the setting of momentum-conserving simulation. These methods are potentially valuable for nonequilibrium simulation of physical systems. We again report substantial improvements in both equilibrium and nonequilibrium simulations compared to popular schemes in the literature. We also discuss the proper treatment of the Lees-Edwards boundary conditions, an essential part of modelling shear flow. We also study numerical methods for sampling probability measures in high dimension where the underlying model is only approximately identified with a gradient system. These methods are important in multiscale modelling and in the design of new machine learning algorithms for inference and parameterization for large datasets, challenges which are increasingly important in "big data" applications. In addition to providing a more comprehensive discussion of the foundations of these methods, we propose a new numerical method for the adaptive Langevin/stochastic gradient Nosé-Hoover thermostat that achieves a dramatic improvement in numerical efficiency over the most popular stochastic gradient methods reported in the literature. We demonstrate that the newly established method inherits a superconvergence property (fourth order convergence to the invariant measure for configurational quantities) recently demonstrated in the setting of Langevin dynamics. Furthermore, we propose a covariance-controlled adaptive Langevin (CCAdL) thermostat that can effectively dissipate parameter-dependent noise while maintaining a desired target distribution. The proposed method achieves a substantial speedup over popular alternative schemes for large-scale machine learning applications.
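For readers unfamiliar with the splitting methods discussed above, the following minimal sketch implements the well-known BAOAB splitting of single-particle Langevin dynamics, which samples the canonical ensemble exp(-U(q)/kT) in the sense discussed in the abstract. It is a baseline scheme only: the PNHL, PAdL, and CCAdL methods proposed in the thesis are pairwise and adaptive thermostats that this sketch does not implement, and the double-well potential and all parameters are assumptions for illustration.

```python
import numpy as np

def baoab(grad_U, q0, n_steps=50_000, h=0.1, gamma=1.0, kT=1.0, seed=0):
    """Sample exp(-U(q)/kT) with the BAOAB splitting of Langevin dynamics."""
    rng = np.random.default_rng(seed)
    q = np.array(q0, dtype=float)
    p = np.zeros_like(q)
    c1 = np.exp(-gamma * h)                      # O-step damping
    c2 = np.sqrt(kT * (1.0 - c1 * c1))           # O-step noise amplitude
    f = -grad_U(q)
    samples = np.empty((n_steps,) + q.shape)
    for i in range(n_steps):
        p += 0.5 * h * f                         # B: half kick
        q += 0.5 * h * p                         # A: half drift (unit mass)
        p = c1 * p + c2 * rng.standard_normal(q.shape)  # O: exact OU step
        q += 0.5 * h * p                         # A: half drift
        f = -grad_U(q)
        p += 0.5 * h * f                         # B: half kick
        samples[i] = q
    return samples

# double-well potential U(q) = (q^2 - 1)^2, so grad U = 4q(q^2 - 1)
traj = baoab(lambda q: 4.0 * q * (q * q - 1.0), q0=[0.0])
print("sampled <q^2>:", float((traj ** 2).mean()))
```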
|
33 |
A Pairwise Comparison Matrix Framework for Large-Scale Decision Making. January 2013 (has links)
A Pairwise Comparison Matrix (PCM) is used to compute the relative priorities of criteria or alternatives and is an integral component of widely applied decision-making tools: the Analytic Hierarchy Process (AHP) and its generalized form, the Analytic Network Process (ANP). However, a PCM suffers from several issues that limit its application to large-scale decision problems: (1) the curse of dimensionality, i.e., a large number of pairwise comparisons must be elicited from a decision maker (DM); (2) inconsistent preferences; and (3) imprecise preferences, both of which may arise from the limited cognitive power of DMs. This dissertation proposes a PCM framework for large-scale decisions that addresses these limitations in three phases. The first phase proposes a binary integer program (BIP) to intelligently decompose a PCM into several mutually exclusive subsets using interdependence scores. As a result, the number of pairwise comparisons is reduced and the consistency of the PCM is improved. Since the subsets are disjoint, the most independent pivot element is identified to connect all subsets and derive the global weights of the elements of the original PCM. The proposed BIP is applied to both the AHP and ANP methodologies. However, the optimal number of subsets is provided subjectively by the DM and is therefore subject to biases and judgement errors. The second phase proposes a trade-off PCM decomposition methodology that splits a PCM into an optimally determined number of subsets. A BIP is proposed to balance the time savings from reduced pairwise comparisons and the level of PCM inconsistency against the accuracy of the weights. The methodology is applied to the AHP to demonstrate its advantages and is compared with established methodologies. In the third phase, a beta distribution is proposed to generalize a wide variety of imprecise pairwise comparison distributions via a method-of-moments approach. A nonlinear programming model is then developed that calculates PCM element weights, simultaneously maximizing the preferences of the DM and minimizing inconsistency. Comparison experiments on datasets collected from the literature validate the proposed methodology. / Dissertation/Thesis / Ph.D. Industrial Engineering 2013
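The core PCM computation referenced above is standard and easy to sketch: derive priority weights from the principal eigenvector of a reciprocal comparison matrix and measure inconsistency with Saaty's consistency ratio. The sketch below assumes Saaty's classical 1-9 scale and random-index table; the judgments in the example matrix are hypothetical, and the decomposition BIPs proposed in the dissertation are not shown.

```python
import numpy as np

# Saaty's random consistency index, indexed by matrix size n
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A):
    """Principal-eigenvector priorities and consistency ratio of a PCM."""
    A = np.asarray(A, dtype=float)
    vals, vecs = np.linalg.eig(A)
    k = int(np.argmax(vals.real))             # principal eigenvalue
    w = np.abs(vecs[:, k].real)
    w /= w.sum()                              # normalised priority weights
    n = A.shape[0]
    ci = (vals[k].real - n) / (n - 1)         # consistency index
    return w, ci / RI[n]                      # (weights, consistency ratio)

# reciprocal 4x4 PCM on Saaty's 1-9 scale (illustrative judgments)
A = [[1,   3,   5,   7],
     [1/3, 1,   3,   5],
     [1/5, 1/3, 1,   3],
     [1/7, 1/5, 1/3, 1]]
w, cr = ahp_weights(A)
print("weights:", np.round(w, 3), " CR:", round(cr, 3))  # CR < 0.1 is acceptable
```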
|
34 |
Extensões no método de comparação indireta aos pares para otimização de produtos com variáveis sensoriais / Extensions to the indirect pairwise comparison method for the optimization of products with sensory variables. Dutra, Camila Costa January 2007 (has links)
In the optimization of industrial products and processes, several quality measures must be considered simultaneously. In the food sector, product evaluation takes into account sensory panel data in addition to the usual quality measures. This thesis analyzes a method developed specifically for the optimization of products with sensory variables: the Indirect Pairwise Comparison (IPC) method. For sensory data collection the IPC is based on the pairwise comparison of samples; for data analysis, the method uses elements of the AHP (Analytic Hierarchy Process). Extensions are proposed to the IPC in order to improve its reliability and applicability: adaptations to different procedures for sensory data collection, validation of the reference values used in the IPC's data analysis framework, and tables of reference values for special cases where the IPC method is applied. The proposed improvements are illustrated through a practical application in a food company, where the aim is to optimize the development of a chocolate bar; the IPC is used to determine the percentage of each ingredient in the product recipe.
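A minimal sketch of the data flow the IPC shares with AHP-style analysis: paired preference counts from a sensory panel are turned into a reciprocal comparison matrix, from which formulation weights are derived (here by the geometric-mean method). The win counts, the +0.5 smoothing, and the choice of prioritization method are illustrative assumptions, not the IPC's actual reference values or procedures.

```python
import numpy as np

# wins[i][j] = panelists preferring formulation i over j (hypothetical counts)
wins = np.array([[ 0, 14,  9],
                 [ 6,  0,  5],
                 [11, 15,  0]], dtype=float)

n = wins.shape[0]
A = np.ones((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            # win ratio as pairwise dominance estimate; +0.5 avoids zeros
            A[i, j] = (wins[i, j] + 0.5) / (wins[j, i] + 0.5)

# geometric-mean prioritization of the reciprocal matrix
w = np.prod(A, axis=1) ** (1.0 / n)
w /= w.sum()
print("formulation weights:", np.round(w, 3))
```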
|
35 |
Preference elicitation from pairwise comparisons in multi-criteria decision making. Siraj, Sajid January 2011 (has links)
Decision making is an essential activity for humans and often becomes complex in the presence of uncertainty or insufficient knowledge. This research aims at estimating preferences using pairwise comparisons. A decision maker uses pairwise comparisons when he or she is unable to directly assign criteria weights or scores to the available options. The judgments provided in pairwise comparisons may not always be consistent, for several reasons. Experimentation has been used to obtain statistical evidence on the widely used consistency measures, and the results highlight the need for new ones. Two new consistency measures, termed congruence and dissonance, are proposed to aid the decision maker in the process of elicitation. Inconsistencies in pairwise comparisons are of two types, cardinal and ordinal, and it is shown that both can be improved with the help of these two measures. A heuristic method is then devised to detect and remove intransitive judgments. The results suggest that the devised method is feasible for improving ordinal consistency and is computationally more efficient than optimization-based methods. There are situations in which revision of judgments is not allowed and prioritization is required without attempting to remove inconsistency. A new prioritization method is proposed using a graph-theoretic approach. Although the performance of the proposed prioritization method was found to be comparable to other approaches, it has a practical limitation in terms of computation time. As a consequence, the problem of prioritization is explored as an optimization problem. A new method based on multi-objective optimization is formulated that offers multiple non-dominated solutions and outperforms all other relevant methods for inconsistent sets of judgments. A priority estimation tool (PriEsT) has been developed that implements the proposed consistency measures and prioritization methods. To show the benefits of PriEsT, a case study involving telecom infrastructure selection is presented.
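Ordinal inconsistency of the kind targeted above can be detected with a simple triad scan; the sketch below is a generic illustration, not the congruence/dissonance measures or the heuristic proposed in the thesis. It assumes a reciprocal matrix in which A[i][j] > 1 encodes a preference for i over j.

```python
from itertools import combinations

def intransitive_triads(A):
    """List ordinal 3-cycles (a>b, b>c, c>a) in a pairwise comparison matrix.

    A[i][j] > 1 encodes 'i preferred to j'; ties (A[i][j] == 1) break no cycle.
    """
    n = len(A)
    beats = [[A[i][j] > 1 for j in range(n)] for i in range(n)]
    cycles = []
    for a, b, c in combinations(range(n), 3):
        for x, y, z in ((a, b, c), (a, c, b)):   # the two cyclic orientations
            if beats[x][y] and beats[y][z] and beats[z][x]:
                cycles.append((x, y, z))
    return cycles

# judgments with a deliberate cycle: 0>1, 1>2, but 2>0 (hypothetical)
A = [[1,   3, 1/4],
     [1/3, 1,   2],
     [4, 1/2,   1]]
print(intransitive_triads(A))  # -> [(0, 1, 2)]
```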
|
36 |
On Enumeration of Tree-Like Graphs and Pairwise Compatibility Graphs / 木状グラフ及び対互換性グラフの列挙. Naveed, Ahmed Azam 23 March 2021 (has links)
Kyoto University / New system, doctoral program / Doctor of Informatics / 甲第23322号 / 情博第758号 / 新制||情||129 (University Library) / Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Hiroshi Nagamochi, Professor Yoshito Ohta, Professor Nobuo Yamashita / Eligible under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
|
37 |
Mnohorozměrné modely extrémních hodnot a jejich aplikace v hydrologii / Multivariate extreme value models and their application in hydrology. Drápal, Lukáš January 2014 (has links)
This thesis deals with multivariate extreme value theory. First, concepts for modelling block maxima and threshold excesses in the univariate case are reviewed. In the multivariate setting, the point process approach is chosen to model dependence. The dependence structure of multivariate extremes is described by a spectral measure or an exponent function, and models for asymptotically dependent variables are provided. A construction principle from Ballani and Schlather (2011) is discussed, and based on this discussion the pairwise beta model introduced by Cooley et al. (2010) is modified to provide greater flexibility. The models are applied to data from nine hydrological stations in northern Moravia previously analysed by Jarušková (2009). Use of the new pairwise beta model is justified, as it brings a substantial improvement in log-likelihood. The models are also compared via the Bayesian model selection introduced by Sabourin et al. (2013).
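The univariate block-maxima concept reviewed above is straightforward to sketch with SciPy: fit a GEV distribution to annual maxima and read off a return level. The synthetic Gumbel daily series and 365-day blocks are assumptions for illustration; the multivariate pairwise beta modelling that is the thesis's actual contribution is not shown.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
daily = rng.gumbel(loc=10.0, scale=3.0, size=50 * 365)   # synthetic daily flows
annual_max = daily.reshape(50, 365).max(axis=1)          # block maxima, block = year

# fit the GEV; scipy's shape c corresponds to -xi in the usual parameterisation
c, loc, scale = stats.genextreme.fit(annual_max)

# 100-year return level = the (1 - 1/100) quantile of the fitted GEV
rl100 = stats.genextreme.isf(1.0 / 100.0, c, loc=loc, scale=scale)
print(f"shape={c:.3f} loc={loc:.2f} scale={scale:.2f} 100-yr level={rl100:.2f}")
```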
|
38 |
Bayes Factors for the Proposition of a Common Source of Amphetamine Seizures. Pawar, Yash January 2021 (has links)
This thesis addresses the challenge of comparing amphetamine materials to determine whether they originate from the same source or from different sources, using pairwise ratios of peak areas within each material's chromatogram and modelling the differences between the ratios for each comparison as the basis for evaluation. An existing method that uses these ratios to compute the sum of significant differences for each comparison of materials is evaluated. The evaluation shows that the distributions for comparisons of samples originating from the same source and for comparisons of samples originating from different sources overlap, leading to uncertainty in the conclusions. In this work, the differences between the peak-area ratios are modelled using a feature-based approach. Because the feature space is large, discriminant analysis methods such as Linear Discriminant Analysis (LDA) and Partial Least Squares Discriminant Analysis (PLS-DA) are implemented to perform classification via dimensionality reduction. The nearest shrunken centroid method, a variant of the nearest centroid classifier that classifies on shrunken feature centroids, is also applied. All methods are analysed on their classification of classes +1 (samples originate from the same source) and -1 (samples originate from different sources). Likelihood ratios for each class under each method are also evaluated using the Empirical Cross-Entropy (ECE) method to determine the robustness of the classifiers. All three models perform fairly well in terms of classification, with LDA being the most robust and reliable in its predictions.
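A minimal sketch of the feature-based classification step: LDA applied to synthetic absolute differences of pairwise peak-area ratios, with classes +1 (same source) and -1 (different sources). The feature count, the stand-in Gaussian data, and the even/odd train-test split are assumptions; real chromatogram features, PLS-DA, nearest shrunken centroids, and the ECE evaluation are not reproduced here.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
n_pairs, n_features = 400, 45        # 45 pairwise peak-area ratios (assumed)

# stand-in data: same-source comparisons (+1) show small ratio differences,
# different-source comparisons (-1) show larger ones
y = np.repeat([1, -1], n_pairs // 2)
spread = np.where(y == 1, 0.3, 1.0)[:, None]
X = np.abs(rng.normal(0.0, 1.0, (n_pairs, n_features)) * spread)

lda = LinearDiscriminantAnalysis()
lda.fit(X[::2], y[::2])              # train on even rows, test on odd rows
print("held-out accuracy:", lda.score(X[1::2], y[1::2]))
# classes_ is sorted ([-1, +1]), so column 1 is P(same source)
print("P(same source), first 3 test pairs:", lda.predict_proba(X[1::2][:3])[:, 1])
```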
|
39 |
Calculating power for the Finkelstein and Schoenfeld test statistic. Zhou, Thomas J. 07 March 2022 (has links)
The Finkelstein and Schoenfeld (FS) test is a popular generalized pairwise comparison approach for analyzing prioritized composite endpoints (i.e., endpoints whose components are assessed in order of clinical importance). Power and sample size estimation for the FS test, however, is generally done via simulation studies. This simulation approach can be extremely computationally burdensome, and the burden is compounded by an increasing number of endpoint components and increasing sample size. We propose an analytic solution for calculating power and sample size for commonly encountered two-component hierarchical composite endpoints. The power formulas are derived assuming underlying population-level distributions for each component outcome, providing a computationally efficient and practical alternative to the standard simulation approach. The proposed analytic approach is extended to derive conditional power formulas, which are used in combination with the promising-zone methodology to perform sample size re-estimation in the setting of adaptive clinical trials. Prioritized composite endpoints with more than two components are also investigated. Extensive Monte Carlo simulation studies demonstrate that the performance of the proposed analytic approach is consistent with that of the standard simulation approach. We also demonstrate through simulations that the proposed methodology possesses generally desirable properties, including robustness to mis-specified underlying distributional assumptions. We illustrate the proposed methods by calculating power and sample size for the Transthyretin Amyloidosis Cardiomyopathy Clinical Trial (ATTR-ACT) and the EMPULSE trial of empagliflozin treatment for acute heart failure.
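The following sketch illustrates the simulation approach that the analytic formulas above are designed to replace: a simplified Finkelstein-Schoenfeld net score on a two-component hierarchical endpoint (survival, then a continuous score), with power estimated by Monte Carlo over a permutation reference. The censoring rule, effect sizes, and permutation test are illustrative assumptions, and the pure-Python loops would be far too slow at realistic trial sizes, which is precisely the motivation for the analytic formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

def fs_score(trt, ctl):
    """Simplified Finkelstein-Schoenfeld net score: each (treatment, control)
    pair is compared on survival first; pairs undecided there are compared on
    a secondary continuous score (higher is better). Censoring handling is
    deliberately crude; this is a sketch, not a production implementation."""
    wins = losses = 0
    for t_time, t_event, t_score in trt:
        for c_time, c_event, c_score in ctl:
            if c_event and t_time > c_time:      # control died first -> win
                wins += 1
            elif t_event and c_time > t_time:    # treatment died first -> loss
                losses += 1
            elif t_score > c_score:              # survival tie -> 2nd component
                wins += 1
            elif c_score > t_score:
                losses += 1
    return wins - losses

def simulate_arm(n, hazard, mu, follow_up=3.0):
    """(time, event, score) triples: exponential survival, normal score."""
    t = rng.exponential(1.0 / hazard, n)
    return np.column_stack([np.minimum(t, follow_up), t < follow_up,
                            rng.normal(mu, 1.0, n)])

def estimated_power(n=40, sims=30, perms=60, alpha=0.05):
    """Monte Carlo power: share of trials where a permutation test rejects."""
    hits = 0
    for _ in range(sims):
        data = np.vstack([simulate_arm(n, 0.10, 0.4),   # treatment arm
                          simulate_arm(n, 0.15, 0.0)])  # control arm
        obs = abs(fs_score(data[:n], data[n:]))
        null = [abs(fs_score(*np.split(rng.permutation(data), 2)))
                for _ in range(perms)]
        hits += obs > np.quantile(null, 1 - alpha)
    return hits / sims

print("estimated power:", estimated_power())
```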
|
40 |
Probabilistic Extensions of the Erdős-Ko-Rado Property. Celaya, Anna, Godbole, Anant P., Schleifer, Mandy Rae 01 September 2006 (has links)
The classical Erdős-Ko-Rado (EKR) theorem states that if we choose a family of subsets, each of size k, from a fixed set of size n (n > 2k), then the largest possible pairwise intersecting family has size t = C(n-1, k-1). We consider the probability that a randomly selected family of size t = t_n has the EKR property (pairwise nonempty intersection) as n and k = k_n tend to infinity, the latter at a specific rate. As t gets large, the EKR property is less likely to occur, while as t gets smaller, the EKR property is satisfied with high probability. We derive the threshold value for t using Janson's inequality. Using the Stein-Chen method we show that the distribution of X_0, defined as the number of disjoint pairs of subsets in our family, can be approximated by a Poisson distribution. We extend our results to yield similar conclusions for X_i, the number of pairs of subsets that overlap in exactly i elements. Finally, we show that the joint distribution (X_0, X_1, ..., X_b) can be approximated by a multidimensional Poisson vector with independent components.
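A quick numerical check of the Poisson picture described above: sample random families of k-subsets, count the disjoint pairs X_0, and compare the empirical frequency of the EKR property with the Poisson approximation e^(-lambda). The parameters n = 50, k = 15, t = 40 are chosen so that lambda is of order one; sampling the family with replacement is a simplification.

```python
import math
import random
from itertools import combinations

def disjoint_pairs(family):
    """X_0: number of pairs of sets in the family with empty intersection."""
    return sum(1 for A, B in combinations(family, 2) if not A & B)

n, k, t, trials = 50, 15, 40, 2000   # chosen so lambda is of order one
random.seed(7)
counts = [disjoint_pairs([frozenset(random.sample(range(n), k))
                          for _ in range(t)])          # family sampled with
          for _ in range(trials)]                      # replacement (simplified)

# two random k-subsets are disjoint with probability C(n-k, k) / C(n, k)
p = math.comb(n - k, k) / math.comb(n, k)
lam = math.comb(t, 2) * p            # expected number of disjoint pairs
print(f"lambda = {lam:.3f}, empirical mean of X_0 = {sum(counts) / trials:.3f}")
print(f"P(EKR property) ~ {sum(c == 0 for c in counts) / trials:.3f}"
      f" vs Poisson e^-lambda = {math.exp(-lam):.3f}")
```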
|