1

Negative Correlation Properties for Matroids

Erickson, Alejandro January 2008
In pursuit of negatively associated measures, this thesis focuses on certain negative correlation properties in matroids. In particular, the results presented contribute to the search for matroids which satisfy $$P(\{X:e,f\in X\}) \leq P(\{X:e\in X\})P(\{X:f\in X\})$$ for certain measures, $P$, on the ground set. Let $\mathcal M$ be a matroid with ground set $E$. Let $(y_g:g\in E)$ be a weighting of the ground set and let $$Z = \sum_{X}\left( \prod_{x\in X} y_x\right)$$ be the polynomial which generates $Z$-sets, where $Z \in \{B, I, S\}$ and the sum is over bases, independent sets and spanning sets, respectively. Let $e$ and $f$ be distinct elements of $E$ and let $Z_e$ denote the partial derivative of $Z$ with respect to $y_e$. Then $\mathcal M$ is $Z$-Rayleigh if $Z_eZ_f-ZZ_{ef}\geq 0$ for every positive evaluation of the weights $y_g$. The known elementary results for the B-, I- and S-Rayleigh properties and two special cases called negative correlation and balance are proved. Furthermore, several new results are discussed. In particular, if a matroid is binary on at most nine elements, or paving, or of rank three, then it is I-Rayleigh if it is B-Rayleigh. Sparse paving matroids are B-Rayleigh. The I-Rayleigh difference for graphs on at most seven vertices is a sum of monomials times squares of polynomials, and this same special form holds for all series-parallel graphs.
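The Rayleigh difference $Z_eZ_f-ZZ_{ef}$ defined above can be evaluated numerically for a small example. The following is a minimal sketch, not code from the thesis: the helper names `generating_sum` and `rayleigh_difference` are hypothetical, and the example matroid (the uniform matroid $U_{2,3}$, whose bases are the 2-subsets of a 3-element set) and the weights are chosen only for illustration. Because $Z$ is multilinear, the partial derivative $Z_e$ is the sum over the sets containing $e$ with the factor $y_e$ dropped.

```python
from itertools import combinations
from math import prod

def generating_sum(sets, weights, removed=()):
    """Sum of prod(y_x for x in X minus removed) over all X containing `removed`.

    With removed=() this is Z; with removed=(e,) it is the partial
    derivative Z_e; with removed=(e, f) it is Z_ef (Z is multilinear).
    """
    removed = frozenset(removed)
    return sum(
        prod(weights[x] for x in X - removed)
        for X in sets
        if removed <= X
    )

def rayleigh_difference(sets, weights, e, f):
    """Evaluate Z_e * Z_f - Z * Z_ef at the given weights."""
    Z = generating_sum(sets, weights)
    Z_e = generating_sum(sets, weights, (e,))
    Z_f = generating_sum(sets, weights, (f,))
    Z_ef = generating_sum(sets, weights, (e, f))
    return Z_e * Z_f - Z * Z_ef

# Illustrative example: bases of U_{2,3} on the ground set {a, b, c}.
ground = "abc"
bases = [frozenset(c) for c in combinations(ground, 2)]
weights = {"a": 1, "b": 2, "c": 3}  # an arbitrary positive evaluation

diff = rayleigh_difference(bases, weights, "a", "b")
print(diff)  # equals y_c**2 for this matroid, so 9 here
```

For this matroid the difference simplifies by hand to $y_c^2$, which is nonnegative for every evaluation. A numerical check at one point only illustrates the definition; it does not prove the B-Rayleigh property.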
3

Concentration of measure, negative association, and machine learning

Root, Jonathan 07 December 2016
In this thesis we consider concentration inequalities and the concentration of measure phenomenon from a variety of angles. Sharp tail bounds on the deviation of Lipschitz functions of independent random variables about their mean are well known. We consider variations on this theme for dependent variables on the Boolean cube. In recent years negatively associated probability distributions have been studied as potential generalizations of independent random variables. Results on this class of distributions have been sparse at best, even when restricting to the Boolean cube. We consider the class of negatively associated distributions topologically, as a subset of the general class of probability measures. Both the weak (distributional) topology and the total variation topology are considered, and the simpler notion of negative correlation is investigated. The concentration of measure phenomenon began with Milman's proof of Dvoretzky's theorem, and is therefore intimately connected to the field of high-dimensional convex geometry. Recently this field has found application in the area of compressed sensing. We consider these applications and in particular analyze the use of Gordon's min-max inequality in various compressed sensing frameworks, including the Dantzig selector and the matrix uncertainty selector. Finally we consider the use of concentration inequalities in developing a theoretically sound anomaly detection algorithm. Our method uses a ranking procedure based on KNN graphs of given data. We develop a max-margin learning-to-rank framework to train limited complexity models to imitate these KNN scores. The resulting anomaly detector is shown to be asymptotically optimal in that for any false alarm rate α, its decision region converges to the α-percentile minimum volume level set of the unknown underlying density.
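The idea of ranking points by a KNN-based score can be illustrated with a small sketch. This is not the thesis's algorithm: it omits the max-margin learning-to-rank step entirely and simply scores each point by its average distance to its k nearest neighbours, then flags the top fraction α. The function names `knn_scores` and `flag_anomalies`, the choice of average distance as the score, and the sample data are all assumptions made for illustration.

```python
import math

def knn_scores(points, k):
    """Average Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_anomalies(points, k, alpha):
    """Return the indices of the ceil(alpha * n) points with the largest score."""
    scores = knn_scores(points, k)
    n_flag = math.ceil(alpha * len(points))
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return set(order[:n_flag])

# Illustrative data: a tight cluster near the origin plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
flagged = flag_anomalies(points, k=2, alpha=0.2)
print(flagged)  # the distant point, index 4
```

Here α plays the role of the target false alarm rate: a larger α flags a larger share of the sample.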
4

Extraction optimisée de règles d'association positives et négatives intéressantes / Efficient mining of interesting positive and negative association rules

Papon, Pierre-Antoine 09 June 2016
The purpose of data mining is to extract knowledge from large amounts of data. The extracted knowledge can take different forms. In this work, we seek to extract knowledge only in the form of positive and negative association rules. A negative association rule is a rule in which both the presence and the absence of a variable can be used. By taking the absence of variables into account, we broaden the semantics of the extracted knowledge and recover information that positive association rule mining methods cannot detect. This allows doctors, for example, to find characteristics that prevent a disease from developing, in addition to characteristics that trigger it. However, adding negation raises several challenges. Because the absence of a variable is generally far more common than its presence, computational costs increase exponentially, and so does the risk of extracting a prohibitive number of rules, most of them redundant and uninteresting. To address these problems, our proposal, derived from the reference Apriori algorithm, does not rely on frequent itemsets as other methods do. We define a new type of itemset, the reasonably frequent itemset, which improves the quality of the rules. We also rely on the M_G measure to determine which types of rules to extract and to remove uninteresting rules, and we use meta-rules to infer the interest of a negative rule from a positive one.
Moreover, our algorithm extracts a new type of negative rule that we find interesting: rules whose antecedent and consequent are both conjunctions of negative itemsets. Our study ends with a quantitative and qualitative comparison with other positive and negative association rule mining algorithms on various databases from the literature. Our software ARA (Association Rules Analyzer) facilitates the qualitative analysis of the algorithms by making it possible to compare them intuitively and to apply various quality measures in post-processing. Finally, our proposal improves the extraction in terms of the number and quality of the extracted rules, and also in terms of the search path through the rules.
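To make the notion of a negative association rule concrete, the sketch below computes the standard support and confidence of a rule whose two sides may mix present and absent items. It is an illustration only: it does not implement the thesis's algorithm, its reasonably frequent itemsets, or the M_G measure, and the function names and the toy transactions are assumptions.

```python
def support(transactions, present=(), absent=()):
    """Fraction of transactions with every item in `present` and none in `absent`."""
    present, absent = set(present), set(absent)
    hits = sum(1 for t in transactions if present <= t and not (absent & t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of a rule; each side is a (present, absent) pair of items."""
    (a_pos, a_neg), (c_pos, c_neg) = antecedent, consequent
    both = support(transactions, set(a_pos) | set(c_pos), set(a_neg) | set(c_neg))
    return both / support(transactions, a_pos, a_neg)

# Toy transaction database (hypothetical data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"beer"},
    {"beer", "chips"},
]

# Negative rule: (not bread AND not milk) => (not butter).
negative_conf = confidence(transactions, ((), ("bread", "milk")), ((), ("butter",)))

# Positive rule for comparison: bread => milk.
positive_conf = confidence(transactions, (("bread",), ()), (("milk",), ()))

print(negative_conf, positive_conf)
```

On this toy data the negative rule, whose antecedent and consequent are both conjunctions of absent items, holds in every transaction that matches its antecedent, while the positive rule holds in only half of the transactions containing bread.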
