1 |
Inductive generalisation in case-based reasoning systems. Griffiths, Anthony D. January 1996.
No description available.
|
2 |
Approximation Algorithms and New Models for Clustering and Learning. Awasthi, Pranjal. 01 August 2013.
This thesis is divided into two parts. In part one, we study the k-median and k-means clustering problems. We take a different approach from the traditional worst-case analysis model: we show that by looking at certain well-motivated stable instances, one can design much better approximation algorithms for these problems. Our algorithms achieve arbitrarily good approximation factors on stable instances, something which is provably hard on worst-case instances. We also study a different model for clustering which introduces a limited amount of interaction with the user. Such interactive models are very popular in the context of learning algorithms, but their effectiveness for clustering is not well understood. We present promising theoretical and experimental results in this direction.
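For context on the first part, the following is a minimal sketch of the classical Lloyd's heuristic for the k-means objective, i.e., the baseline that worst-case analyses are usually stated for. It is not the stability-based approximation algorithm described above, and the toy data, k, and iteration count are placeholders chosen purely for illustration.

import random

def lloyds_kmeans(points, k, iters=50, seed=0):
    """Classical Lloyd's heuristic for the k-means objective: minimize the sum of
    squared distances from each point to its nearest center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                      # arbitrary initialization
    for _ in range(iters):
        # Assignment step: attach each point to its nearest current center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = tuple(sum(c) / len(cl) for c in zip(*cl))
    return centers

if __name__ == "__main__":
    # Two obvious clusters in the plane (toy data).
    pts = [(0.1, 0.0), (0.0, 0.2), (0.2, 0.1), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
    print(lloyds_kmeans(pts, k=2))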
The second part of the thesis studies the design of provably good learning algorithms which work under adversarial noise. One of the fundamental problems in this area is to understand the learnability of the class of disjunctions of Boolean variables. We design a learning algorithm which improves on the guarantees of the previously best known result for this problem. In addition, the techniques used seem fairly general and promising to be applicable to a wider class of problems. We also propose a new model for learning with queries. This model restricts the algorithms ability to only ask certain “local” queries. We motivate the need for the model and show that one can design efficient local query algorithms for a wide class of problems.
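As background for the second part, here is the textbook noise-free elimination algorithm for learning a disjunction of Boolean literals. The thesis's contribution concerns the much harder adversarial-noise setting, which this sketch does not address; the target disjunction below is invented for illustration.

# Textbook elimination algorithm for learning a disjunction over n Boolean variables
# from noise-free labeled examples (x, y), where y = 1 iff some literal of the target
# disjunction is satisfied. Not noise-tolerant; shown only as the classical baseline.

def learn_disjunction(n, examples):
    """Keep every literal consistent with all negative examples.
    A literal is a pair (index, sign): sign=True means x_i, sign=False means NOT x_i."""
    candidates = {(i, s) for i in range(n) for s in (True, False)}
    for x, y in examples:
        if y == 0:
            # A negative example falsifies every literal of the target, so any
            # candidate literal satisfied by x cannot belong to the target.
            candidates -= {(i, s) for (i, s) in candidates if x[i] == s}
    return candidates

def predict(candidates, x):
    """The hypothesis is the OR of all remaining candidate literals."""
    return int(any(x[i] == s for (i, s) in candidates))

if __name__ == "__main__":
    # Hypothetical target: x0 OR NOT x2, over n = 4 variables.
    target = lambda x: int(x[0] or not x[2])
    data = [((a, b, c, d), target((a, b, c, d)))
            for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)]
    h = learn_disjunction(4, data)
    print(all(predict(h, x) == y for x, y in data))   # consistent on all examples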
|
3 |
On the Sample Complexity of Privately Learning Gaussians and their Mixtures / Privately Learning Gaussians and their Mixtures. Aden-Ali, Ishaq. January 2021.
Multivariate Gaussians: We provide sample complexity upper bounds for semi-agnostically learning multivariate Gaussians under the constraint of approximate differential privacy. These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution. Our bounds are near-optimal in the case when the covariance is known to be the identity, and conjectured to be near-optimal in the general case. From a technical standpoint, we provide analytic tools for arguing the existence of global "locally small" covers from local covers of the space. These are exploited using modifications of recent techniques for differentially private hypothesis selection.
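For reference, the standard notion of approximate differential privacy invoked throughout this abstract (a background definition, not a result of the thesis) can be stated as follows.

% (epsilon, delta)-differential privacy: the standard definition, stated for reference.
A randomized algorithm $M$ satisfies $(\varepsilon,\delta)$-differential privacy if, for every
pair of datasets $X, X'$ differing in a single record and every measurable set $S$ of outputs,
\[
  \Pr[M(X) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(X') \in S] + \delta .
\]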
Mixtures of Gaussians: We consider the problem of learning mixtures of Gaussians under the constraint of approximate differential privacy. We provide the first sample complexity upper bounds for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. To prove our results, we design a new technique for privately learning mixture distributions. A class of distributions F is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from a distribution f in F, outputs a list of distributions, H, such that one of the distributions in H approximates f. We show that if F is privately list-decodable, then we can privately learn mixtures of distributions in F. Finally, we show that axis-aligned Gaussian distributions are privately list-decodable, thereby proving that mixtures of such distributions are privately learnable.
Thesis / Master of Science (MSc)
Is it possible to estimate an unknown probability distribution given random samples from it? This is a fundamental problem, known as distribution learning (or density estimation), that has been studied by statisticians for decades and has in recent years become a topic of interest for computer scientists. While distribution learning is a mature and well understood problem, in many cases the samples (or data) we observe may consist of sensitive information belonging to individuals, and well-known solutions may inadvertently result in the leakage of private information.
In this thesis we study distribution learning under the assumption that the data is generated from high-dimensional Gaussians (or their mixtures) with the aim of understanding how many samples an algorithm needs before it can guarantee a good estimate. Furthermore, to protect against leakage of private information, we consider approaches that satisfy differential privacy — the gold standard for modern private data analysis.
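One common way to make the informal definition of list-decodability quoted in the abstract above precise is sketched below; the inlier fraction α, accuracy β, and list size L are notation introduced here for orientation only and are not taken from the thesis.

% A possible formalization of (private) list-decodability; parameters are illustrative.
A class $\mathcal{F}$ is $(\alpha,\beta,L)$-list-decodable if there is an algorithm that, given
samples from any distribution of the form $\alpha f + (1-\alpha)\, g$ with $f \in \mathcal{F}$
and $g$ arbitrary (so only an $\alpha$ fraction of the data comes from $f$), outputs a list $H$
with $|H| \le L$ such that $d_{\mathrm{TV}}(f, h) \le \beta$ for some $h \in H$. The class is
privately list-decodable if the algorithm additionally satisfies approximate differential
privacy with respect to its input sample.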
|
4 |
PAC-learning with label noise. Jabbari Arfaee, Shahin. 06 1900.
One of the main criticisms of previously studied label noise models in the PAC-learning framework is the inability of such models to represent the noise in real-world data. In this thesis, we study this problem by introducing a framework for modeling label noise and suggesting four new label noise models. We prove positive learnability results for these noise models when learning simple concept classes, and discuss the difficulty of learning other interesting concept classes under these new models. In addition, we study the previously used general learning algorithm, called the minimum pn-disagreement strategy, that is used to prove learnability results in the PAC-learning framework both in the absence and in the presence of noise. Because of limitations of the minimum pn-disagreement strategy, we propose a new general learning algorithm called the minimum nn-disagreement strategy. Finally, for both the minimum pn-disagreement strategy and the minimum nn-disagreement strategy, we investigate properties of label noise models that provide sufficient conditions for the learnability of specific concept classes.
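The pn- and nn-disagreement strategies are specific to this thesis and are not reproduced here. To illustrate the general kind of strategy being discussed, the following is a sketch of a generic minimum-disagreement (empirical risk minimization) learner over a small finite concept class under the classical random classification noise model; the threshold class, noise rate, and sample size are invented for illustration.

import random

# Concept class: threshold functions h_t(x) = 1 if x >= t else 0, for t on a grid.
# Noise model: each label is flipped independently with probability ETA
# (classical random classification noise; values chosen purely for illustration).
TARGET_T = 0.5                              # unknown target threshold (hypothetical)
ETA = 0.2                                   # label-flip probability, must be < 1/2
GRID = [i / 100 for i in range(101)]        # finite hypothesis class

def draw_sample(m, rng):
    """Draw m examples with noisy labels from the target threshold concept."""
    sample = []
    for _ in range(m):
        x = rng.random()
        y = 1 if x >= TARGET_T else 0
        if rng.random() < ETA:              # random classification noise
            y = 1 - y
        sample.append((x, y))
    return sample

def disagreements(t, sample):
    """Number of sample points on which h_t disagrees with the observed label."""
    return sum(1 for x, y in sample if (1 if x >= t else 0) != y)

def minimum_disagreement_learner(sample):
    """Return the hypothesis in GRID with the fewest disagreements on the sample."""
    return min(GRID, key=lambda t: disagreements(t, sample))

if __name__ == "__main__":
    rng = random.Random(0)
    sample = draw_sample(2000, rng)
    t_hat = minimum_disagreement_learner(sample)
    print(f"learned threshold ~ {t_hat:.2f} (target {TARGET_T})")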
|
5 |
PAC-learning with label noise. Jabbari Arfaee, Shahin. Unknown Date.
No description available.
|
6 |
Efficient PAC-learning for episodic tasks with acyclic state spaces and the optimal node visitation problem in acyclic stochastic digraphs. Bountourelis, Theologos. 19 December 2008.
The first part of this research program concerns the development of customized and easily implementable Probably Approximately Correct (PAC)-learning algorithms for episodic tasks over acyclic state spaces. The defining characteristic of our algorithms is that they explicitly take into consideration the acyclic structure of the underlying state space and the episodic nature of the considered learning task. The first of these two attributes enables a very straightforward and efficient resolution of the "exploration vs. exploitation" dilemma, while the second provides a natural regenerating mechanism that is instrumental in the dynamics of our algorithms. Some additional characteristics that distinguish our algorithms from those developed in the past literature are (i) their direct nature, which eliminates the need for a complete specification of the underlying MDP model and reduces their execution to a very simple computation, and (ii) the unique emphasis that they place on the efficient implementation of the sampling process that is defined by their PAC property.
More specifically, the aforementioned PAC-learning algorithms complete their learning task by implementing a systematic episodic sampling schedule on the underlying acyclic state space. This sampling schedule, combined with the stochastic nature of the transitions taking place, defines the need for efficient routing policies that will help the algorithms complete their exploration program while minimizing, in expectation, the number of executed episodes. The design of an optimal policy that will satisfy a specified pattern of arc visitation requirements in an acyclic stochastic graph, while minimizing the expected number of required episodes, is a challenging problem, even under the assumption that all the branching probabilities involved are known a priori. Hence, the sampling process that takes place in the proposed PAC-learning algorithms gives rise to a novel, very interesting stochastic control/scheduling problem, characterized as the problem of Optimal Node Visitation (ONV) in acyclic stochastic digraphs. The second part of the work presented herein seeks the systematic modelling and analysis of the ONV problem.
The last part of this research program explores the computational merits obtained by heuristic implementations that result from the integration of the ONV problem developments into the PAC-learning algorithms developed in the first part of this work. We study, through numerical experimentation, the relative performance of these resulting heuristic implementations in comparison to (i) the initial version of the PAC-learning algorithms presented in the first part of the research program, and (ii) standard Q-learning algorithm variations provided in the RL literature. The work presented in this last part reinforces and confirms the driving assumption of this research, i.e., that one can design customized RL algorithms of enhanced performance if the underlying problem structure is taken into account.
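For readers unfamiliar with the baseline mentioned in item (ii), the following is a minimal sketch of standard tabular Q-learning on an episodic task with an acyclic state space. The toy DAG, reward scheme, and hyperparameters are invented for illustration and are not taken from the thesis.

import random
from collections import defaultdict

# Toy episodic MDP over an acyclic state space (a small DAG), invented for illustration.
# From each non-terminal state, the chosen action biases which successor is reached;
# state 3 is terminal, and only the transition from state 1 into it is rewarded.
SUCCESSORS = {0: [1, 2], 1: [3], 2: [3]}     # DAG structure
TERMINAL = 3
ALPHA, GAMMA, EPSILON = 0.1, 1.0, 0.1        # learning rate, discount, exploration

def step(state, action, rng):
    """Stochastic transition: the action biases which successor is reached."""
    succ = SUCCESSORS[state]
    if len(succ) == 1:
        nxt = succ[0]
    else:
        nxt = succ[action] if rng.random() < 0.8 else succ[1 - action]
    reward = 1.0 if nxt == TERMINAL and state == 1 else 0.0   # arbitrary reward scheme
    return nxt, reward

def q_learning(episodes=5000, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)                   # Q[(state, action)]
    for _ in range(episodes):
        s = 0                                # every episode starts at the DAG's root
        while s != TERMINAL:
            # epsilon-greedy action selection over the two abstract actions {0, 1}
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[(s, act)])
            s2, r = step(s, a, rng)
            best_next = 0.0 if s2 == TERMINAL else max(Q[(s2, 0)], Q[(s2, 1)])
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s2
    return Q

if __name__ == "__main__":
    Q = q_learning()
    print({k: round(v, 2) for k, v in Q.items()})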
|
7 |
Learning with Recurrent Neural Networks / Lernen mit Rekurrenten Neuronalen Netzen. Hammer, Barbara. 15 September 2000.
This thesis examines so-called folding neural networks as a mechanism for machine learning. Folding networks form a generalization of partial recurrent neural networks such that they are able to deal with tree-structured inputs instead of simple linear lists. In particular, they can handle classical formulas; they were originally proposed for this purpose. After a short explanation of the neural architecture, we show that folding networks are, in principle, well suited as a learning mechanism. This includes three parts: a proof of their universal approximation ability, the aspect of information-theoretical learnability, and an examination of the complexity of training.
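As an informal illustration of what "dealing with tree-structured inputs" means, the following sketch encodes a binary tree by recursively applying a single shared sigmoidal cell to the encodings of the children, in the spirit of folding (recursive) networks. The dimensions, weights, and example tree are made up, and this is not the exact architecture studied in the thesis.

import math
import random

# A minimal "folding" (recursive) encoder over binary trees: one shared cell maps
# (node label, left-child code, right-child code) -> code. All sizes and weights are
# illustrative and untrained.
DIM = 4                                    # dimension of the internal code
random.seed(0)
W = [[random.uniform(-0.5, 0.5) for _ in range(1 + 2 * DIM)] for _ in range(DIM)]

def cell(label, left_code, right_code):
    """Shared sigmoidal unit applied at every tree node."""
    x = [label] + left_code + right_code
    return [1.0 / (1.0 + math.exp(-sum(w_i * x_i for w_i, x_i in zip(row, x))))
            for row in W]

NIL = [0.0] * DIM                          # fixed code for the empty tree

def fold(tree):
    """Recursively encode a tree given as (label, left, right) or None."""
    if tree is None:
        return NIL
    label, left, right = tree
    return cell(label, fold(left), fold(right))

if __name__ == "__main__":
    # Encodes a small formula-like tree; labels are arbitrary numeric symbol codes.
    tree = (1.0, (2.0, (3.0, None, None), (4.0, None, None)), (3.0, None, None))
    print([round(v, 3) for v in fold(tree)])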
Approximation ability: It is shown that any measurable function can be approximated in probability. Explicit bounds on the number of neurons result if only a finite number of points is dealt with. These bounds are new results in the case of simple recurrent networks, too. Several restrictions occur if a function is to be approximated in the maximum norm. Afterwards, we consider briefly the topic of computability. It is shown that a sigmoidal recurrent neural network can compute any mapping in exponential time. However, if the computation is subject to noise, essentially only the capability of tree automata remains.
Information-theoretical learnability: This part contains several contributions to distribution-dependent learnability: the notions of PAC and PUAC learnability, consistent PAC/PUAC learnability, and scale-sensitive versions are considered. We find equivalent characterizations of these terms and examine their respective relations, answering in particular an open question posed by Vidyasagar. It is shown at which level learnability becomes possible merely because of an encoding trick. Two approaches from the literature which can guarantee distribution-dependent learnability even if the VC dimension of the concept class is infinite are generalized to function classes: the function class is stratified according to the input space, or according to a so-called luckiness function which depends on the output of the learning algorithm and the concrete training data.
Afterwards, the VC, pseudo-, and fat-shattering dimensions of folding networks are estimated: we improve some lower bounds for recurrent networks and derive new lower bounds for the pseudodimension, as well as lower and upper bounds for folding networks in general. As a consequence, folding architectures are not distribution-independently learnable. Distribution-dependent learnability, in contrast, can be guaranteed. Explicit bounds on the number of examples which guarantee valid generalization can be derived using the two approaches mentioned above. We examine in which cases these bounds are polynomial. Furthermore, we construct an explicit example of a learning scenario where an exponential number of examples is necessary.
Complexity: It is shown that training a fixed folding architecture with the perceptron activation function is polynomial. Afterwards, a decision problem, the so-called loading problem, which is correlated with neural network training, is examined. For standard multilayer feed-forward networks the following situations turn out to be NP-hard: concerning the perceptron activation function, a classical result from the literature, NP-hardness for varying input dimension, is generalized to arbitrary multilayer architectures. Additionally, NP-hardness can be shown if the input dimension is fixed but the number of neurons may vary in at least two hidden layers. Furthermore, NP-hardness is examined when the number of patterns and the number of hidden neurons are correlated. We finish with a generalization of the classical NP-hardness result mentioned above to the sigmoidal activation function, which is used in practical applications.
|
8 |
PAC-Lernen zur Insolvenzvorhersage und Hotspot-Identifikation / PAC-Learning for insolvency prediction and hotspot identification. Brodag, Thomas. 28 May 2008.
No description available.
|