Global ETD Search

241	Online Learning and Simulation Based Algorithms for Stochastic Optimization Lakshmanan, K January 2012 (has links) (PDF) In many optimization problems, the relationship between the objective and parameters is not known. The objective function itself may be stochastic such as a long-run average over some random cost samples. In such cases finding the gradient of the objective is not possible. It is in this setting that stochastic approximation algorithms are used. These algorithms use some estimates of the gradient and are stochastic in nature. Amongst gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and Smoothed Functional(SF) scheme are widely used. In this thesis we have proposed a novel multi-time scale quasi-Newton based smoothed functional (QN-SF) algorithm for unconstrained as well as constrained optimization. The algorithm uses the smoothed functional scheme for estimating the gradient and the quasi-Newton method to solve the optimization problem. The algorithm is shown to converge with probability one. We have also provided here experimental results on the problem of optimal routing in a multi-stage network of queues. Policies like Join the Shortest Queue or Least Work Left assume knowledge of the queue length values that can change rapidly or hard to estimate. If the only information available is the expected end-to-end delay as with our case, such policies cannot be used. The QN-SF based probabilistic routing algorithm uses only the total end-to-end delay for tuning the probabilities. We observe from the experiments that the QN-SF algorithm has better performance than the gradient and Jacobi versions of Newton based smoothed functional algorithms. Next we consider constrained routing in a similar queueing network. We extend the QN-SF algorithm to this case. We study the convergence behavior of the algorithm and observe that the constraints are satisfied at the point of convergence. We provide experimental results for the constrained routing setup as well. Next we study reinforcement learning algorithms which are useful for solving Markov Decision Process(MDP) when the precise information on transition probabilities is not known. When the state, and action sets are very large, it is not possible to store all the state-action tuples. In such cases, function approximators like neural networks have been used. The popular Q-learning algorithm is known to diverge when used with linear function approximation due to the ’off-policy’ problem. Hence developing stable learning algorithms when used with function approximation is an important problem. We present in this thesis a variant of Q-learning with linear function approximation that is based on two-timescale stochastic approximation. The Q-value parameters for a given policy in our algorithm are updated on the slower timescale while the policy parameters themselves are updated on the faster scale. We perform a gradient search in the space of policy parameters. Since the objective function and hence the gradient are not analytically known, we employ the efficient one-simulation simultaneous perturbation stochastic approximation(SPSA) gradient estimates that employ Hadamard matrix based deterministic perturbations. Our algorithm has the advantage that, unlike Q-learning, it does not suffer from high oscillations due to the off-policy problem when using function approximators. Whereas it is difficult to prove convergence of regular Q-learning with linear function approximation because of the off-policy problem, we prove that our algorithm which is on-policy is convergent. Numerical results on a multi-stage stochastic shortest path problem show that our algorithm exhibits significantly better performance and is more robust as compared to Q-learning. Future work would be to compare it with other policy-based reinforcement learning algorithms. Finally, we develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process(MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multistage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance on this setting and converges to a feasible point. Stochastic Approximation Algorithms Stochastic Optimization Markov Decision Process Reinforcement Learning Algorithm Queueing Networks Queuing Theory Online Q-Learning Algorithm Online Actor-Critic Algorithm Markov Decision Processes Q-learning Algorithm Linear Function Approximation Computer Science
242	Sur deux problèmes d’apprentissage automatique : la détection de communautés et l’appariement adaptatif / On two problems in machine learning : community detection and adaptive matching Gulikers, Lennart 13 November 2017 (has links) Dans cette thèse, nous étudions deux problèmes d'apprentissage automatique : (I) la détection des communautés et (II) l'appariement adaptatif. I) Il est bien connu que beaucoup de réseaux ont une structure en communautés. La détection de ces communautés nous aide à comprendre et exploiter des réseaux de tout genre. Cette thèse considère principalement la détection des communautés par des méthodes spectrales utilisant des vecteurs propres associés à des matrices choisiesavec soin. Nous faisons une analyse de leur performance sur des graphes artificiels. Au lieu du modèle classique connu sous le nom de « Stochastic Block Model » (dans lequel les degrés sont homogènes) nous considérons un modèle où les degrés sont plus variables : le « Degree-Corrected Stochastic Block Model » (DC-SBM). Dans ce modèle les degrés de tous les nœuds sont pondérés - ce qui permet de générer des suites des degrés hétérogènes. Nous étudions ce modèle dans deux régimes: le régime dense et le régime « épars », ou « dilué ». Dans le régime dense, nous prouvons qu'un algorithme basé sur une matrice d'adjacence normalisée réussit à classifier correctement tous les nœuds sauf une fraction négligeable. Dans le régime épars il existe un seuil en termes de paramètres du modèle en-dessous lequel n'importe quel algorithme échoue par manque d'information. En revanche, nous prouvons qu'un algorithme utilisant la matrice « non-backtracking » réussit jusqu'au seuil - cette méthode est donc très robuste. Pour montrer cela nous caractérisons le spectre des graphes qui sont générés selon un DC-SBM dans son régime épars. Nous concluons cette partie par des tests sur des réseaux sociaux. II) Les marchés d'intermédiation en ligne tels que des plateformes de Question-Réponse et des plateformes de recrutement nécessitent un appariement basé sur une information incomplète des deux parties. Nous développons un modèle de système d'appariement entre tâches et serveurs représentant le comportement de telles plateformes. Pour ce modèle nous donnons une condition nécessaire et suffisante pour que le système puisse gérer un certain flux de tâches. Nous introduisons également une politique de « back-pressure » sous lequel le débit gérable par le système est maximal. Nous prouvons que cette politique atteint un débit strictement plus grand qu'une politique naturelle « gloutonne ». Nous concluons en validant nos résultats théoriques avec des simulations entrainées par des données de la plateforme Stack-Overflow. / In this thesis, we study two problems of machine learning: (I) community detection and (II) adaptive matching. I) It is well-known that many networks exhibit a community structure. Finding those communities helps us understand and exploit general networks. In this thesis we focus on community detection using so-called spectral methods based on the eigenvectors of carefully chosen matrices. We analyse their performance on artificially generated benchmark graphs. Instead of the classical Stochastic Block Model (which does not allow for much degree-heterogeneity), we consider a Degree-Corrected Stochastic Block Model (DC-SBM) with weighted vertices, that is able to generate a wide class of degree sequences. We consider this model in both a dense and sparse regime. In the dense regime, we show that an algorithm based on a suitably normalized adjacency matrix correctly classifies all but a vanishing fraction of the nodes. In the sparse regime, we show that the availability of only a small amount of information entails the existence of an information-theoretic threshold below which no algorithm performs better than random guess. On the positive side, we show that an algorithm based on the non-backtracking matrix works all the way down to the detectability threshold in the sparse regime, showing the robustness of the algorithm. This follows after a precise characterization of the non-backtracking spectrum of sparse DC-SBM's. We further perform tests on well-known real networks. II) Online two-sided matching markets such as Q&A forums and online labour platforms critically rely on the ability to propose adequate matches based on imperfect knowledge of the two parties to be matched. We develop a model of a task / server matching system for (efficient) platform operation in the presence of such uncertainty. For this model, we give a necessary and sufficient condition for an incoming stream of tasks to be manageable by the system. We further identify a so-called back-pressure policy under which the throughput that the system can handle is optimized. We show that this policy achieves strictly larger throughput than a natural greedy policy. Finally, we validate our model and confirm our theoretical findings with experiments based on user-contributed content on an online platform. Apprentissage automatique Détection des communautés Réseaux sociaux Degree-corrected stochastic block models Méthodes spectrales Matrices aléatoires Matrice non-backtracking Graphes aléatoires Appariement adaptatif Apprentissage par renforcement Théorie des files d’attente Machine learning Community detection Social networks Degree-corrected stochastic block models Spectral methods Random matrices Nonbacktracking matrix Random graphs Recommendation systems Reinforcement learning Queueing theory 519
243	Quality of Service Aware Mechanisms for (Re)Configuring Data Stream Processing Applications on Highly Distributed Infrastructure / Mécanismes prenant en compte la qualité de service pour la (re)configuration d’applications de traitement de flux de données sur une infrastructure hautement distribuée Da Silva Veith, Alexandre 23 September 2019 (has links) Une grande partie de ces données volumineuses ont plus de valeur lorsqu'elles sont analysées rapidement, au fur et à mesure de leur génération. Dans plusieurs scénarios d'application émergents, tels que les villes intelligentes, la surveillance opérationnelle de grandes infrastructures et l'Internet des Objets (Internet of Things), des flux continus de données doivent être traités dans des délais très brefs. Dans plusieurs domaines, ce traitement est nécessaire pour détecter des modèles, identifier des défaillances et pour guider la prise de décision. Les données sont donc souvent rassemblées et analysées par des environnements logiciels conçus pour le traitement de flux continus de données. Ces environnements logiciels pour le traitement de flux de données déploient les applications sous-la forme d'un graphe orienté ou de dataflow. Un dataflow contient une ou plusieurs sources (i.e. capteurs, passerelles ou actionneurs); opérateurs qui effectuent des transformations sur les données (e.g., filtrage et agrégation); et des sinks (i.e., éviers qui consomment les requêtes ou stockent les données). Nous proposons dans cette thèse un ensemble de stratégies pour placer les opérateurs dans une infrastructure massivement distribuée cloud-edge en tenant compte des caractéristiques des ressources et des exigences des applications. En particulier, nous décomposons tout d'abord le graphe d'application en identifiant quelques comportements tels que des forks et des joints, puis nous le plaçons dynamiquement sur l'infrastructure. Des simulations et un prototype prenant en compte plusieurs paramètres d'application démontrent que notre approche peut réduire la latence de bout en bout de plus de 50% et aussi améliorer d'autres métriques de qualité de service. L'espace de recherche de solutions pour la reconfiguration des opérateurs peut être énorme en fonction du nombre d'opérateurs, de flux, de ressources et de liens réseau. De plus, il est important de minimiser le coût de la migration tout en améliorant la latence. Des travaux antérieurs, Reinforcement Learning (RL) et Monte-Carlo Tree Searh (MCTS) ont été utilisés pour résoudre les problèmes liés aux grands nombres d’actions et d’états de recherche. Nous modélisons le problème de reconfiguration d'applications sous la forme d'un processus de décision de Markov (MDP) et étudions l'utilisation des algorithmes RL et MCTS pour concevoir des plans de reconfiguration améliorant plusieurs métriques de qualité de service. / A large part of this big data is most valuable when analysed quickly, as it is generated. Under several emerging application scenarios, such as in smart cities, operational monitoring of large infrastructure, and Internet of Things (IoT), continuous data streams must be processed under very short delays. In multiple domains, there is a need for processing data streams to detect patterns, identify failures, and gain insights. Data is often gathered and analysed by Data Stream Processing Engines (DSPEs).A DSPE commonly structures an application as a directed graph or dataflow. A dataflow has one or multiple sources (i.e., gateways or actuators); operators that perform transformations on the data (e.g., filtering); and sinks (i.e., queries that consume or store the data). Most complex operator transformations store information about previously received data as new data is streamed in. Also, a dataflow has stateless operators that consider only the current data. Traditionally, Data Stream Processing (DSP) applications were conceived to run in clusters of homogeneous resources or on the cloud. In a cloud deployment, the whole application is placed on a single cloud provider to benefit from virtually unlimited resources. This approach allows for elastic DSP applications with the ability to allocate additional resources or release idle capacity on demand during runtime to match the application requirements.We introduce a set of strategies to place operators onto cloud and edge while considering characteristics of resources and meeting the requirements of applications. In particular, we first decompose the application graph by identifying behaviours such as forks and joins, and then dynamically split the dataflow graph across edge and cloud. Comprehensive simulations and a real testbed considering multiple application settings demonstrate that our approach can improve the end-to-end latency in over 50% and even other QoS metrics. The solution search space for operator reassignment can be enormous depending on the number of operators, streams, resources and network links. Moreover, it is important to minimise the cost of migration while improving latency. Reinforcement Learning (RL) and Monte-Carlo Tree Search (MCTS) have been used to tackle problems with large search spaces and states, performing at human-level or better in games such as Go. We model the application reconfiguration problem as a Markov Decision Process (MDP) and investigate the use of RL and MCTS algorithms to devise reconfiguring plans that improve QoS metrics. Mécanismes Qualité de service (re)configuration Infrastructure hautement distribuée Internet des Objets Réseau edge-cloud Théorie des files d'attente Processus de décision de Markov Reinforcement Learning Mechanisms Quality of Service (re)configuration Data Stream Processing Applications Highly Distributed Infrastructure Internet of Things Edge-cloud infrastructure Queueing theory Markov Decision Process Reinforcement Learning
244	Conception et performance de schémas de coordination dans les réseaux cellulaires / Design and performance of coordination schemes in cellular networks Abbas, Nivine 09 November 2016 (has links) L'interférence entre stations de base est considérée comme le principal facteur limitant les performances des réseaux cellulaires. Nous nous intéressons aux différents schémas de coordination multi-point (CoMP) proposés dans la norme LTE-A pour y faire face, en tenant compte de l'aspect dynamique du trafic et de la mobilité des utilisateurs. Les résultats sont obtenus par l'analyse mathématique de modèles markoviens et par des simulations du système. Nous montrons l'importance de l'algorithme d'ordonnancement sur les performances en présence d'utilisateurs mobiles, pour des services de téléchargement de fichier et de streaming vidéo. Nous proposons un nouvel algorithme d'ordonnancement basé sur la dé-priorisation des utilisateurs mobiles se trouvant en bord de cellule, afin d'améliorer l'efficacité globale du système. Nous montrons ensuite qu'il est intéressant d'activer la technique dite Joint Processing uniquement dans un réseau à forte interférence, son activation dans un réseau à faible interférence pouvant conduire à une dégradation des performances. Nous proposons un nouveau mécanisme de coordination où une cellule ne coopère que lorsque sa coopération apporte un gain moyen de débit suffisant pour compenser les pertes de ressources engendrées. Nous considérons enfin la technique de formation de faisceaux coordonnée. Nous montrons notamment que la coordination n'est pas nécessaire lorsque l'on dispose d'un grand nombre d'antennes par station de base, un simple mécanisme d'ordonnancement opportuniste permettant d'obtenir des performances optimales. Pour un nombre limité d’antennes parstation de base, la coordination est nécessaire afin d’éviter l’interférence entre les faisceaux activés, et permet des gains de performance substantiels. / Interference is still the main limiting factor in cellular networks. We focus on the different coordinated multi-point schemes (CoMP) proposed in the LTE-A standard to cope with interference, taking into account the dynamic aspect of traffic and users’ mobility. The results are obtained by the analysis of Markov models and system-level simulations. We show the important impact of the scheduling strategy on the network performance in the presence of mobile users considering elastic traffic and video streaming. We propose a new scheduler that deprioritizes mobile users at the cell edge, in order to improve the overall system efficiency. We show that it is interesting to activate Joint Processing technique only in a high-interference network, its activation in a low-interference network may lead to performance degradation. We propose a new coordination mechanism, where a cell cooperates only when its cooperation brings a sufficient mean throughput gain, which compensates the extra resource consumption. Finally, we show that the coordination of beams is not necessary when a large number of antennas is deployed at each base station; a simple opportunistic scheduling strategy provides optimal performance. For a limited number of antennas per base station,coordination is necessary to avoid interference between the activated beams, allowing substantial performance gains. Réseaux cellulaires Mobilité Ordonnancement opportuniste Coordination multipoint CoMP Technique de formation de faisceaux Modélisation niveau flot du trafic Théorie des files d'attente Simulation système 2D beamforming Cellular data networks Mobility Opportunistic scheduling Coordinated multipoint CoMP Coordinated beamforming Flow-level modeling Queueing theory System-level simulations Faisceau 2D
245	Entropy Maximisation and Open Queueing Networks with Priority and Blocking. Kouvatsos, Demetres D., Awan, Irfan U. January 2003 (has links) No / A review is carried out on the characterisation and algorithmic implementation of an extended product-form approximation, based on the principle of maximum entropy (ME), for a wide class of arbitrary finite capacity open queueing network models (QNMs) with service and space priorities. A single server finite capacity GE/GE/1/N queue with R (R>1) distinct priority classes, compound Poisson arrival processes (CPPs) with geometrically distributed batches and generalised exponential (GE) service times is analysed via entropy maximisation, subject to suitable GE-type queueing theoretic constraints, under preemptive resume (PR) and head-of-line (HOL) scheduling rules combined with complete buffer sharing (CBS) and partial buffer sharing (PBS) management schemes stipulating a sequence of buffer thresholds {N=(N1,¿,NR),0<Ni¿Ni¿1,i=2,¿,R}. The GE/GE/1/N queue is utilised, in conjunction with GE-type first two moment flow approximation formulae, as a cost-effective building block towards the establishment of a generic ME queue-by-queue decomposition algorithm for arbitrary open QNMs with space and service priorities under repetitive service blocking with random destination (RS-RD). Typical numerical results are included to illustrate the credibility of the ME algorithm against simulation for various network topologies and define experimentally pessimistic GE-type performance bounds. Remarks on the extensions of the ME algorithm to other types of blocking mechanisms, such as repetitive service blocking with fixed destination (RS-FD) and blocking-after-service (BAS), are included. Queueing network models (QNMs); Maximum entropy (ME) principle Preemptive resume (PR) rule Head-of-line (HOL) rule Complete buffer sharing (CBS) scheme Partial buffer sharing (PBS) scheme Compound Poisson process (CPP); Blocking-after-service (BAS) mechanism

Page generated in 0.0469 seconds