• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 74
  • 17
  • 16
  • 6
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 138
  • 138
  • 46
  • 34
  • 27
  • 26
  • 26
  • 25
  • 23
  • 19
  • 18
  • 17
  • 16
  • 16
  • 15
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

Duality theory for optimal mechanism design

Giannakopoulos, Ioannis January 2015 (has links)
In this work we present a general duality-theory framework for revenue maximization in additive Bayesian auctions involving multiple items and many bidders whose values for the goods follow arbitrary continuous joint distributions over some multi-dimensional real interval. Although the single-item case has been resolved in a very elegant way by the seminal work of Myerson [1981], optimal solutions involving more items still remain elusive. The framework extends linear programming duality and complementarity to constraints with partial derivatives. The dual system reveals the natural geometric nature of the problem and highlights its connection with the theory of bipartite graph matchings. We demonstrate the power of the framework by applying it to various special monopoly settings where a seller of multiple heterogeneous goods faces a buyer with independent item values drawn from various distributions of interest, to design both exact and approximately optimal selling mechanisms. Previous optimal solutions were only known for up to two and three goods, and a very limited range of distributional priors. The duality framework is used not only for proving optimality, but perhaps more importantly, for deriving the optimal mechanisms themselves. Some of our main results include: the proposal of a simple deterministic mechanism, which we call Straight-Jacket Auction (SJA) and is defined in a greedy, recursive way through natural geometric constraints, for many uniformly distributed goods, where exact optimality is proven for up to six items and general optimality is conjectured; a scheme of sufficient conditions for exact optimality for two-good settings and general independent distributions; a technique for upper-bounding the optimal revenue for arbitrarily many goods, with an application to uniform and exponential priors; and the proof that offering deterministically all items in a single full bundle is the optimal way of selling multiple exponentially i.i.d. items.
102

Techniques combinatoires pour les algorithmes paramétrés et les noyaux, avec applications aux problèmes de multicoupe. / Combinatorial Techniques for Parameterized Algorithms and Kernels, with Applications to Multicut.

Daligault, Jean 05 July 2011 (has links)
Dans cette thèse, nous abordons des problèmes NP-difficiles à l'aide de techniques combinatoires, en se focalisant sur le domaine de la complexité paramétrée. Les principaux problèmes que nous considérons sont les problèmes de Multicoupe et d'Arbre Orienté Couvrant avec Beaucoup de Feuilles. La Multicoupe est une généralisation naturelle du très classique problème de coupe, et consiste à séparer un ensemble donné de paires de sommets en supprimant le moins d'arêtes possible dans un graphe. Le problème d'Arbre Orienté Couvrant avec Beaucoup de Feuilles consiste à trouver un arbre couvrant avec le plus de feuilles possible dans un graphe dirigé. Les résultats principaux de cette thèse sont les suivants. Nous montrons que le problème de Multicoupe paramétré par la taille de la solution est FPT (soluble à paramètre fixé), c'est-à-dire que l'existence d'une multicoupe de taille $k$ dans un graphe à $n$ sommets peut être décidée en temps $f(k)*poly(n)$. Nous montrons que Multicoupe dans les arbres admet un noyau polynomial, c'est-à-dire est réductible aux instances de taille polynomiale en $k$. Nous donnons un algorithme en temps $O^*(3.72^k)$ pour le problème d'Arbre Orienté Couvrant avec Beaucoup de Feuilles et le premier algorithme exponentiel exact non trivial (c'est-à-dire meilleur que $2^n$). Nous fournissons aussi un noyau quadratique et une approximation à facteur constant. Ces résultats algorithmiques sont basés sur des résultats combinatoires et des propriétés structurelles qui concernent, entre autres, les décompositions arborescentes, les mineurs, des règles de réduction et les $s-t$ numberings. Nous présentons des résultats combinatoires hors du domaine de la complexité paramétrée: une caractérisation des graphes de cercle Helly comme les graphes de cercle sans diamant induit, et une caractérisation partielle des classes de graphes 2-bel-ordonnées. / This thesis tackles NP-hard problems with combinatorial techniques, focusing on the framework of Fixed-Parameter Tractability. The main problems considered here are Multicut and Maximum Leaf Out-branching. Multicut is a natural generalisation of the cut problem, and consists in simultaneously separating prescribed pairs of vertices by removing as few edges as possible in a graph. Maximum Leaf Out-branching consists in finding a spanning directed tree with as many leaves as possible in a directed graph. The main results of this thesis are the following. We show that Multicut is FPT when parameterized by the solution size, i.e. deciding the existence of a multicut of size $k$ in a graph with $n$ vertices can be done in time $f(k)*poly(n)$. We show that Multicut In Trees admits a polynomial kernel, i.e. can be reduced to instances of size polynomial in $k$. We give an $O^*(3.72^k)$ algorithm for Maximum Leaf Out-branching and the first non-trivial (better than $2^n$) exact algorithm. We also provide a quadratic kernel and a constant factor approximation algorithm. These algorithmic results are based on combinatorial results and structural properties, involving tree decompositions, minors, reduction rules and $s-t$ numberings, among others. We present results obtained with combinatorial techniques outside the scope of parameterized complexity: a characterization of Helly circle graphs as the diamond-free circle graphs, and a partial characterisation of 2-well-quasi-ordered classes of graphs.
103

Algoritmos para o problema da cobertura por sensores / Algorithms for the sensor cover problem

Barbosa, Rafael da Ponte 12 December 2011 (has links)
Neste trabalho estudamos aspectos algorítmicos do Problema da Cobertura por Sensores. Em linhas gerais, este problema a entrada consiste em uma região a ser monitorada por um conjunto de sensores previamente posicionados, cada qual dotado de bateria com duração limitada, e o objetivo é atribuir a cada sensor um tempo de início, de modo que toda a região seja coberta o maior tempo possível. Focamos nosso estudo no caso unidimensional do problema, chamado Problema da Cobertura de Faixa Restrita, no qual a região a ser monitorada é um intervalo (da reta real). Estudamos diversas variantes, de acordo com os subintervalos que os sensores cobrem (se de tamanhos fixos ou variados), e de acordo com a duração das baterias (se uniformes ou não). Estudamos também o caso preemptivo: quando os sensores podem ser ligados mais de uma vez. Para este último caso, projetamos um algoritmo polinomial bem simples. O Problema da Cobertura de Faixa Restrita é NP-difícil no caso não-preemptivo em que os sensores têm bateria de duração variável. Para este caso, em 2009 Gibson e Varadarajan apresentaram um algoritmo polinomial que provaram ser uma 5-aproximação. Provamos que este algoritmo tem fator de aproximação 4, e mostramos que este fator é justo. Apresentamos também formulações lineares inteiras para este caso, e os resultados computacionais obtidos. / We study the algorithmic aspects of the Sensor Cover Problem. Broadly speaking, in this problem the input consists of a region to be covered by a set of sensors previously positioned, each one powered with a battery of limited duration, and the objective is to assign to each sensor an initial time, so as to cover the given region for as long as possible. We focus our study on the one-dimensional case of the problem, called Restricted Strip Cover Problem, in which the region to be covered is an interval (of the real line). We study several variants, according to the type of the subintervals the sensors cover (if they have fixed length or not), to the duration of the batteries (if uniform or not). We also study the preemptive case: when the sensors can be turned on and off more than once. For this case, we designed a simple polynomial-time algorithm. The Restricted Strip Cover Problem is NP-hard in the non-preemptive case in which the sensors have non-uniform duration batteries. For this case, in 2009 Gibson and Varadarajan designed a polynomial-time algorithm which they proved to be a 5-aproximation. We prove that this algorithm has approximation ratio 4, and show that this ratio is tight. We also present two integer linear formulations for this case, and report on the computational results obtained with this approach.
104

Algoritmos para o problema da cobertura por sensores / Algorithms for the sensor cover problem

Rafael da Ponte Barbosa 12 December 2011 (has links)
Neste trabalho estudamos aspectos algorítmicos do Problema da Cobertura por Sensores. Em linhas gerais, este problema a entrada consiste em uma região a ser monitorada por um conjunto de sensores previamente posicionados, cada qual dotado de bateria com duração limitada, e o objetivo é atribuir a cada sensor um tempo de início, de modo que toda a região seja coberta o maior tempo possível. Focamos nosso estudo no caso unidimensional do problema, chamado Problema da Cobertura de Faixa Restrita, no qual a região a ser monitorada é um intervalo (da reta real). Estudamos diversas variantes, de acordo com os subintervalos que os sensores cobrem (se de tamanhos fixos ou variados), e de acordo com a duração das baterias (se uniformes ou não). Estudamos também o caso preemptivo: quando os sensores podem ser ligados mais de uma vez. Para este último caso, projetamos um algoritmo polinomial bem simples. O Problema da Cobertura de Faixa Restrita é NP-difícil no caso não-preemptivo em que os sensores têm bateria de duração variável. Para este caso, em 2009 Gibson e Varadarajan apresentaram um algoritmo polinomial que provaram ser uma 5-aproximação. Provamos que este algoritmo tem fator de aproximação 4, e mostramos que este fator é justo. Apresentamos também formulações lineares inteiras para este caso, e os resultados computacionais obtidos. / We study the algorithmic aspects of the Sensor Cover Problem. Broadly speaking, in this problem the input consists of a region to be covered by a set of sensors previously positioned, each one powered with a battery of limited duration, and the objective is to assign to each sensor an initial time, so as to cover the given region for as long as possible. We focus our study on the one-dimensional case of the problem, called Restricted Strip Cover Problem, in which the region to be covered is an interval (of the real line). We study several variants, according to the type of the subintervals the sensors cover (if they have fixed length or not), to the duration of the batteries (if uniform or not). We also study the preemptive case: when the sensors can be turned on and off more than once. For this case, we designed a simple polynomial-time algorithm. The Restricted Strip Cover Problem is NP-hard in the non-preemptive case in which the sensors have non-uniform duration batteries. For this case, in 2009 Gibson and Varadarajan designed a polynomial-time algorithm which they proved to be a 5-aproximation. We prove that this algorithm has approximation ratio 4, and show that this ratio is tight. We also present two integer linear formulations for this case, and report on the computational results obtained with this approach.
105

[en] TWO GRAPH OPTIMIZATION PROBLEMS: PIPELINE TRANSPORTATION AND SEARCHING WITH ACCESS COSTS / [pt] DOIS PROBLEMAS DE OTIMIZAÇÃO EM GRAFOS: TRANSPORTE EM REDES DE DUTOS E BUSCA COM CUSTOS DE ACESSOS

ARTUR ALVES PESSOA 07 January 2004 (has links)
[pt] Consideramos dois problemas de otimização combinatória: o problema de transporte em redes de dutos (PTD) e o problema de busca com custos de acesso variados (PBC). No PTD, é dado um grafo orientado G = (N,A) onde cada arco tem um duto associado. Também é dado um conjunto de bateladas, onde cada batelada está inicialmente em um nó ou arco do grafo e tem um nó de destino. Algumas bateladas são chamadas de proteláveis. O objetivo do PTD é encontrar uma sequência de operações que transporte todas as bateladas não-proteláveis aos seus respectivos nós de destino. Primeiro, demonstramos o PTD é NP-difícil, mesmo que o grafo G seja acíclico. Em seguida, apresentamos um algoritmo polinomial chamado de BPA. Este algoritmo resolve o PTDS, uma variação do PTD, para qualquer grafo G. Para grafos acíclicos, o BPA minimiza uma função de custo genérica. Para minimizar o makespan no PTDS, demonstramos que não existe algoritmo polinomial n1-e - aproximado para nenhum E>0, a menos que P = NP, onde n é o tamanho da instância. Este resultado também vale se G é acíclico e planar. No PBC, são dados um vetor ordenado e o custo de acessar cada um de seus n elementos. O objetivo do problema é encontrar uma estratégia de busca que minimize o custo médio com probabilidades uniformes (PBCM) ou o custo do pior caso (PBCN). Em ambos os casos, o melhor algoritmo exato conhecido executa em tempo O(n3) e espaço O(n2). Para o PBCN, apresentamos o algoritmo da razão, que executa em tempo O(n2) e espaço O(n). Este algoritmo sempre obtém uma solução de custo menor ou igual a 41n(n+1)/n, assumindo que a soma dos custos é 1. Além disso, desenvolvemos dois algoritmos aproximados: um para o PBCM e outro para o PBPC. Ambos constroem soluções (2+E+0(1)) - aproximadas, para qualquer E>0, em tempo e espaço O(n). / [en] We consider two combinatorial optimization problems the pipeline transportation problem (PTD) and the problem of searching with different access costs (PBC). In PTD, we are given a directed graph G = (N,A) where each arc corresponds to a pipeline. We are also given a set of batches, each batch being initially located at an arc or node and having a destination node. A subset of these batches are considered as further batches. Our aim is to find a sequence of pipeline operations leading all non-further batches to their corresponding destination nodes. First, we show that PDT is NP-hard, even for the case where G is acyclic. Next, we present a polynomial algorithm called BPA. This algorithm solves PTDS, a variation of PTD, for general graphs. For acyclic graphs, BPA also minimizes a general cost function. For the case of makespan minimization for PTDS, we prove that there is no n1-e - approximate algorithm for any E]0, unless P = NP, where n is the instance size. The previous result also holds if G is both ayclic and planar. In PBC, we are given an ordered vector with n elements and the corresponding access costs. Our aim is to find a search strategy that minimizes either the average cost (PBPC). In both cases, the best known exact algorithm requires in O(n3) time and O(n2) space. For PBCM, we present the ratio algorithm, that requires O(n2) time and O(n3)space. This algorithm always obtains a search strategy with average cost at most 41n(n+1)/n, assuming the sum of all access costs to be 1. Furthermore, we introduce approximation algorithms for both PBCM and PBPC. Both of them give (2+E+0(1)) - approximate solutions, for any E}0, in O(n) time and space.
106

Big Networks: Analysis and Optimal Control

Nguyen, Hung The 01 January 2018 (has links)
The study of networks has seen a tremendous breed of researches due to the explosive spectrum of practical problems that involve networks as the access point. Those problems widely range from detecting functionally correlated proteins in biology to finding people to give discounts and gain maximum popularity of a product in economics. Thus, understanding and further being able to manipulate/control the development and evolution of the networks become critical tasks for network scientists. Despite the vast research effort putting towards these studies, the present state-of-the-arts largely either lack of high quality solutions or require excessive amount of time in real-world `Big Data' requirement. This research aims at affirmatively boosting the modern algorithmic efficiency to approach practical requirements. That is developing a ground-breaking class of algorithms that provide simultaneously both provably good solution qualities and low time and space complexities. Specifically, I target the important yet challenging problems in the three main areas: Information Diffusion: Analyzing and maximizing the influence in networks and extending results for different variations of the problems. Community Detection: Finding communities from multiple sources of information. Security and Privacy: Assessing organization vulnerability under targeted-cyber attacks via social networks.
107

Optimization in Graphs under Degree Constraints. Application to Telecommunication Networks

Sau, Ignasi 16 October 2009 (has links) (PDF)
La première partie de cette thèse s'intéresse au groupage de trafic dans les réseaux de télécommunications. La notion de groupage de trafic correspond à l'agrégation de flux de faible débit dans des conduits de plus gros débit. Cependant, à chaque insertion ou extraction de trafic sur une longueur d'onde il faut placer dans le noeud du réseau un multiplexeur à insertion/extraction (ADM). De plus il faut un ADM pour chaque longueur d'onde utilisée dans le noeud, ce qui représente un coût d'équipements important. Les objectifs du groupage de trafic sont d'une part le partage efficace de la bande passante et d'autre part la réduction du coût des équipements de routage. Nous présentons des résultats d'inapproximabilité, des algorithmes d'approximation, un nouveau modèle qui permet au réseau de pouvoir router n'importe quel graphe de requêtes de degré borné, ainsi que des solutions optimales pour deux scénarios avec trafic all-to-all: l'anneau bidirectionnel et l'anneau unidirectionnel avec un facteur de groupage qui change de manière dynamique. La deuxième partie de la thèse s'intéresse aux problèmes consistant à trouver des sous-graphes avec contraintes sur le degré. Cette classe de problèmes est plus générale que le groupage de trafic, qui est un cas particulier. Il s'agit de trouver des sous-graphes d'un graphe donné avec contraintes sur le degré, tout en optimisant un paramètre du graphe (très souvent, le nombre de sommets ou d'arêtes). Nous présentons des algorithmes d'approximation, des résultats d'inapproximabilité, des études sur la complexité paramétrique, des algorithmes exacts pour les graphes planaires, ainsi qu'une méthodologie générale qui permet de résoudre efficacement cette classe de problèmes (et de manière plus générale, la classe de problèmes tels qu'une solution peut être codé avec une partition d'un sous-ensemble des sommets) pour les graphes plongés dans une surface. Finalement, plusieurs annexes présentent des résultats sur des problèmes connexes.
108

Uncalibrated robotic visual servo tracking for large residual problems

Munnae, Jomkwun 17 November 2010 (has links)
In visually guided control of a robot, a large residual problem occurs when the robot configuration is not in the neighborhood of the target acquisition configuration. Most existing uncalibrated visual servoing algorithms use quasi-Gauss-Newton methods which are effective for small residual problems. The solution used in this study switches between a full quasi-Newton method for large residual case and the quasi-Gauss-Newton methods for the small case. Visual servoing to handle large residual problems for tracking a moving target has not previously appeared in the literature. For large residual problems various Hessian approximations are introduced including an approximation of the entire Hessian matrix, the dynamic BFGS (DBFGS) algorithm, and two distinct approximations of the residual term, the modified BFGS (MBFGS) algorithm and the dynamic full Newton method with BFGS (DFN-BFGS) algorithm. Due to the fact that the quasi-Gauss-Newton method has the advantage of fast convergence, the quasi-Gauss-Newton step is used as the iteration is sufficiently near the desired solution. A switching algorithm combines a full quasi-Newton method and a quasi-Gauss-Newton method. Switching occurs if the image error norm is less than the switching criterion, which is heuristically selected. An adaptive forgetting factor called the dynamic adaptive forgetting factor (DAFF) is presented. The DAFF method is a heuristic scheme to determine the forgetting factor value based on the image error norm. Compared to other existing adaptive forgetting factor schemes, the DAFF method yields the best performance for both convergence time and the RMS error. Simulation results verify validity of the proposed switching algorithms with the DAFF method for large residual problems. The switching MBFGS algorithm with the DAFF method significantly improves tracking performance in the presence of noise. This work is the first successfully developed model independent, vision-guided control for large residual with capability to stably track a moving target with a robot.
109

Sub-Polyhedral Compilation using (Unit-)Two-Variables-Per-Inequality Polyhedra

Upadrasta, Ramakrishna 13 March 2013 (has links) (PDF)
The goal of this thesis is to design algorithms that run with better complexity when compiling or parallelizing loop programs. The framework within which our algorithms operate is the polyhedral model of compilation which has been successful in the design and implementation of complex loop nest optimizers and parallelizing compilers. The algorithmic complexity and scalability limitations of the above framework remain one important weakness. We address it by introducing sub-polyhedral compilation by using (Unit-)Two-Variable-Per-Inequality or (U)TVPI Polyhedra, namely polyhedrawith restricted constraints of the type ax_{i}+bx_{j}\le c (\pm x_{i}\pm x_{j}\le c). A major focus of our sub-polyhedral compilation is the introduction of sub-polyhedral scheduling, where we propose a technique for scheduling using (U)TVPI polyhedra. As part of this, we introduce algorithms that can be used to construct under-aproximations of the systems of constraints resulting from affine scheduling problems. This technique relies on simple polynomial time algorithms to under approximate a general polyhedron into (U)TVPI polyhedra. The above under-approximation algorithms are generic enough that they can be used for many kinds of loop parallelization scheduling problems, reducing each of their complexities to asymptotically polynomial time. We also introduce sub-polyhedral code-generation where we propose algorithms to use the improved complexities of (U)TVPI sub-polyhedra in polyhedral code generation. In this problem, we show that the exponentialities associated with the widely used polyhedral code generators could be reduced to polynomial time using the improved complexities of (U)TVPI sub-polyhedra. The above presented sub-polyhedral scheduling techniques are evaluated in an experimental framework. For this, we modify the state-of-the-art PLuTo compiler which can parallelize for multi-core architectures using permutation and tiling transformations. We show that using our scheduling technique, the above under-approximations yield polyhedra that are non-empty for 10 out of 16 benchmarks from the Polybench (2.0) kernels. Solving the under-approximated system leads to asymptotic gains in complexity, and shows practically significant improvements when compared to a traditional LP solver. We also verify that code generated by our sub-polyhedral parallelization prototype matches the performance of PLuTo-optimized code when the under-approximation preserves feasibility.
110

Stochastic Approximation Algorithms with Set-valued Dynamics : Theory and Applications

Ramaswamy, Arunselvan January 2016 (has links) (PDF)
Stochastic approximation algorithms encompass a class of iterative schemes that converge to a sought value through a series of successive approximations. Such algorithms converge even when the observations are erroneous. Errors in observations may arise due to the stochastic nature of the problem at hand or due to extraneous noise. In other words, stochastic approximation algorithms are self-correcting schemes, in that the errors are wiped out in the limit and the algorithms still converge to the sought values. The rst stochastic approximation algorithm was developed by Robbins and Monro in 1951 to solve the root- nding problem. In 1977 Ljung showed that the asymptotic behavior of a stochastic approximation algorithm can be studied by associating a deterministic ODE, called the associated ODE, and studying it's asymptotic behavior instead. This is commonly referred to as the ODE method. In 1996 Bena•m and Bena•m and Hirsch [1] [2] used the dynamical systems approach in order to develop a framework to analyze generalized stochastic approximation algorithms, given by the following recursion: xn+1 = xn + a(n) [h(xn) + Mn+1] ; (1) where xn 2 Rd for all n; h : Rd ! Rd is Lipschitz continuous; fa(n)gn 0 is the given step-size sequence; fMn+1gn 0 is the Martingale difference noise. The assumptions of [1] later became the `standard assumptions for convergence'. One bottleneck in deploying this framework is the requirement on stability (almost sure boundedness) of the iterates. In 1999 Borkar and Meyn developed a unified set of assumptions that guaranteed both stability and convergence of stochastic approximations. However, the aforementioned frameworks did not account for scenarios with set-valued mean fields. In 2005 Bena•m, Hofbauer and Sorin [3] showed that the dynamical systems approach to stochastic approximations can be extended to scenarios with set-valued mean- fields. Again, stability of the fiterates was assumed. Note that stochastic approximation algorithms with set-valued mean- fields are also called stochastic recursive inclusions (SRIs). The Borkar-Meyn theorem for SRIs [10] As stated earlier, in many applications stability of the iterates is a hard assumption to verify. In Chapter 2 of the thesis, we present an extension of the original theorem of Borkar and Meyn to include SRIs. Specifically, we present two different (yet related) easily-verifiable sets of assumptions for both stability and convergence of SRIs. A SRI is given by the following recursion in Rd: xn+1 = xn + a(n) [yn + Mn+1] ; (2) where 8 n yn 2 H(xn) and H : Rd ! fsubsets of Rdg is a given Marchaud map. As a corollary to one of our main results, a natural generalization of the original Borkar and Meyn theorem is seen to follow. We also present two applications of our framework. First, we use our framework to provide a solution to the `approximate drift problem'. This problem can be stated as follows. When an experimenter runs a traditional stochastic approximation algorithm such as (1), the exact value of the drift h cannot be accurately calculated at every stage. In other words, the recursion run by the experimenter is given by (2), where yn is an approximation of h(xn) at stage n. A natural question arises: Do the errors due to approximations accumulate and wreak havoc with the long-term behavior (convergence) of the algorithm? Using our framework, we show the following: Suppose a stochastic approximation algorithm without errors can be guaranteed to be stable, then it's `approximate version' with errors is also stable, provided the errors are bounded at every stage. For the second application, we use our framework to relax the stability assumptions involved in the original Borkar-Meyn theorem, hence making the framework more applicable. It may be noted that the contents of Chapter 2 are based on [10]. Analysis of gradient descent methods with non-diminishing, bounded errors [9] Let us consider a continuously differentiable function f. Suppose we are interested in nding a minimizer of f, then a gradient descent (GD) scheme may be employed to nd a local minimum. Such a scheme is given by the following recursion in Rd: xn+1 = xn a(n)rf(xn): (3) GD is an important implementation tool for many machine learning algorithms, such as the backpropagation algorithm to train neural networks. For the sake of convenience, experimenters often employ gradient estimators such as Kiefer-Wolfowitz estimator, simultaneous perturbation stochastic approximation, etc. These estimators provide an estimate of the gradient rf(xn) at stage n. Since these estimators only provide an approximation of the true gradient, the experimenter is essentially running the recursion given by (2), where yn is a `gradient estimate' at stage n. Such gradient methods with errors have been previously studied by Bertsekas and Tsitsiklis [5]. However, the assumptions involved are rather restrictive and hard to verify. In particular, the gradient-errors are required to vanish asymptotically at a prescribed rate. This may not hold true in many scenarios. In Chapter 3 of the thesis, the results of [5] are extended to GD with bounded, non-diminishing errors, given by the following recursion in Rd: xn+1 = xn a(n) [rf(xn) + (n)] ; (4) where k (n)k for some fixed > 0. As stated earlier, previous literature required k (n)k ! 0, as n ! 1, at a `prescribed rate'. Sufficient conditions are presented for both stability and convergence of (4). In other words, the conditions presented in Chapter 3 ensure that the errors `do not accumulate' and wreak havoc with the stability or convergence of GD. Further, we show that (4) converges to a small neighborhood of the minimum set, which in turn depends on the error-bound . To the best of our knowledge this is the first time that GD with bounded non-diminishing errors has been analyzed. As an application, we use our framework to present a simplified implementation of simultaneous perturbation stochastic approximation (SPSA), a popular gradient descent method introduced by Spall [13]. Traditional convergence-analysis of SPSA involves assumptions that `couple' the `sensitivity parameters' of SPSA and the step-sizes. These assumptions restrict the choice of step-sizes available to the experimenter. In the context of machine learning, the learning rate may be adversely affected. We present an implementation of SPSA using `constant sensitivity parameters', thereby `decoupling' the step-sizes and sensitivity parameters. Further, we show that SPSA with constant sensitivity parameters can be analyzed using our framework. Finally, we present experimental results to support our theory. It may be noted that contents of Chapter 3 are based on [9]. b(n) a(n) Stochastic recursive inclusions with two timescales [12] There are many scenarios wherein the traditional single timescale framework cannot be used to analyze the algorithm at hand. Consider for example, the adaptive heuristic critic approach to reinforcement learning, which requires a stationary value iteration (for a fixed policy) to be executed between two policy iterations. To analyze such schemes Borkar [6] introduced the two timescale framework, along with a set of sufficient conditions which guarantee their convergence. Perkins and Leslie [8] extended the framework of Borkar to include set-valued mean- fields. However, the assumptions involved were still very restrictive and not easily verifiable. In Chapter 4 of the thesis, we present a generalization of the aforementioned frameworks. The framework presented is more general when compared to the frameworks of [6] and [8], and the assumptions involved are easily verifiable. A SRI with two timescales is given by the following coupled iteration: xn+1 = xn + a(n) un + Mn1+1 ; (5) yn+1 = yn + b(n) vn + Mn2+1 ; (6) where xn 2 R d and yn 2 R k for all n 0; un 2 h(xn; yn) and vn 2 g(xn; yn) for all n 0, where h : Rd Rk ! fsubsets of Rdg and g : Rd Rk ! fsubsets of Rkg are two given Marchaud maps; fa(n)gn 0 and fb(n)gn 0 are the step-size sequences satisfying ! 0 as n ! 1; fMn1+1gn 0 and fMn2+1 gn 0 constitute the Martingale noise terms. Our main contribution is in the weakening of the key assumption that `couples' the behavior of the x and y iterates. As an application of our framework we analyze the two timescale algorithm which solves the `constrained Lagrangian dual optimization problem'. The problem can be stated as thus: Given two functions f : Rd ! R and g : Rd ! Rk, we want to minimize f(x) subject to the condition that g(x) 0. This problem can be stated in the following primal form: inf sup f(x) + T g(x) : (7) 2R 2R0 x d k Under strong duality, solving the above equation is equivalent to solving it's dual: sup inf f(x) + T g(x) : (8) 2Rk x2Rd 0 The corresponding two timescale algorithm to solve the dual is given by: xn+1 = xn a(n) rx f(xn) + nT g(xn) + Mn2+1 ; (9) n+1 = n + b(n) f(xn) + nT g(xn) + Mn1+1 : r We use our framework to show that (9) converges to a solution of the dual given by (8). Further, as a consequence of our framework, the class of objective and constraint functions, for which (9) can be analyzed, is greatly enlarged. It may be noted that the contents of Chapter 4 are based on [12]. Stochastic approximation driven by `controlled Markov' process and temporal difference learning [11] In the field of reinforcement learning, one encounters stochastic approximation algorithms that are driven by Markov processes. The groundwork for analyzing the long-term behavior of such algorithms was laid by Benveniste et. al. [4]. Borkar [7] extended the results of [4] to include algorithms driven by `controlled Markov' processes i.e., algorithms where the `state process' was in turn driven by a time varying `control' process. Another important extension was that multiple stationary distributions were allowed, see [7] for details. The convergence analysis of [7] assumed that the iterates were stable. In reinforcement learning applications, stability is a hard assumption to verify. Hence, the stability assumption poses a bottleneck when deploying the aforementioned framework for the analysis of reinforcement algorithms. In Chapter 5 of the thesis we present sufficient conditions for both stability and convergence of stochastic approximations driven by `controlled Markov' processes. As an application of our framework, sufficient conditions for stability of temporal difference (T D) learning algorithm, an important policy-evaluation method, are presented that are compatible with existing conditions for convergence. The conditions are weakened two-fold in that (a) the Markov process is no longer required to evolve in a finite state space and (b) the state process is not required to be ergodic under a given stationary policy. It may be noted that the contents of Chapter 5 are based on [11].

Page generated in 0.1192 seconds