81

HIGH-DIMENSIONAL INFERENCE OVER NETWORKS: STATISTICAL AND COMPUTATIONAL GUARANTEES

Yao Ji (19697335) 19 September 2024 (has links)
Distributed optimization problems defined over mesh networks are ubiquitous in signal processing, machine learning, and control. In contrast to centralized approaches, where all information and computation resources are available at a central server, agents in a distributed system can only use locally available information. As a result, effort has gone into the design of efficient distributed algorithms that respect the communication constraints and make coordinated decisions in a fully distributed manner, mostly from a pure optimization perspective. Given the massive sample sizes and high dimensionality of data generated by distributed systems such as social media, sensor networks, and cloud-based databases, it is essential to understand the statistical and computational guarantees of distributed algorithms for solving such high-dimensional problems over a mesh network.

This thesis is a first attempt at studying the behavior of distributed methods in the high-dimensional regime. It consists of two parts: (I) distributed LASSO and (II) distributed stochastic sparse recovery.

In Part (I), we study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high dimension, which allows the ambient dimension to grow faster than the sample size. While there is a vast literature on distributed algorithms applicable to the problem, the statistical and computational guarantees of most of them remain unclear in high dimensions. This thesis provides a first statistical study of Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions (which hold with high probability for standard data-generation models) and suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst case, the total number of communications needed to reach statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, which scales linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high dimensions to obtain favorable rate scalings.

In Part (II), we address the problem of distributed stochastic sparse recovery through stochastic optimization. We develop and analyze stochastic optimization algorithms for problems over a network, modeled as an undirected graph (with no centralized node), where the expected loss is strongly convex with respect to the Euclidean norm and the optimum is sparse. Assuming agents only have access to unbiased estimates of the gradients of the underlying expected objective, and that the stochastic gradients are sub-Gaussian, we use distributed stochastic dual averaging (DSDA) as a building block to develop a fully decentralized restarting procedure for the recovery of sparse solutions over a network. We show that, with high probability, the iterates generated by all agents converge linearly to an approximate solution, quickly eliminating the initial error, and then converge sublinearly to the exact sparse solution in the steady-state stages owing to observation noise. The algorithm asymptotically achieves the optimal convergence rate and the favorable dimension dependence enjoyed by a non-Euclidean centralized scheme. Further, we precisely identify its non-asymptotic convergence rate as a function of characteristics of the objective functions and the network, and we characterize the transient time needed for the algorithm to approach the optimal rate of convergence. We illustrate the performance of the algorithm on the classical problems of sparse linear regression, sparse logistic regression, and low-rank matrix recovery. Numerical experiments demonstrate the tightness of the theoretical results.
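The abstract contrasts DGD in Adapt-Then-Combine and Combine-Then-Adapt forms for decentralized LASSO. The sketch below is a minimal NumPy illustration of the ATC idea (each agent takes a local proximal-gradient step with soft-thresholding, then averages with its neighbors through a mixing matrix); the ring network, synthetic data, step size, and proximal variant are illustrative assumptions, not the exact algorithm or tuning analyzed in the thesis.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def dgd_atc_lasso(A_list, y_list, W, lam=0.1, step=None, iters=500):
    """Illustrative DGD in Adapt-Then-Combine form for a decentralized LASSO.

    Each agent i holds (A_i, y_i); W is a doubly stochastic mixing matrix
    matching the network topology. Update: local gradient step plus soft
    thresholding ('adapt'), then neighborhood averaging ('combine').
    """
    m = len(A_list)
    d = A_list[0].shape[1]
    X = np.zeros((m, d))                      # one row of parameters per agent
    if step is None:                          # crude step size from local Lipschitz constants
        L = max(np.linalg.norm(A.T @ A, 2) / A.shape[0] for A in A_list)
        step = 1.0 / L
    for _ in range(iters):
        Z = np.empty_like(X)
        for i, (A, y) in enumerate(zip(A_list, y_list)):
            grad = A.T @ (A @ X[i] - y) / A.shape[0]               # local least-squares gradient
            Z[i] = soft_threshold(X[i] - step * grad, step * lam)  # adapt
        X = W @ Z                                                  # combine (mix over neighbors)
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, d, s = 5, 40, 200, 5                 # agents, samples per agent, dimension, sparsity
    x_true = np.zeros(d)
    x_true[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
    A_list = [rng.standard_normal((n, d)) for _ in range(m)]
    y_list = [A @ x_true + 0.01 * rng.standard_normal(n) for A in A_list]
    # ring network: uniform weights over self and the two neighbors
    W = np.zeros((m, m))
    for i in range(m):
        W[i, [i, (i - 1) % m, (i + 1) % m]] = 1.0 / 3.0
    X = dgd_atc_lasso(A_list, y_list, W, lam=0.05)
    print("max estimation error across agents:",
          max(np.linalg.norm(X[i] - x_true) for i in range(m)))
```

Reversing the order (mixing the iterates first, then taking the local step on the mixed iterate) roughly corresponds to the CTA form that the abstract compares against.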
82

Deep Learning Based Image Segmentation for Tumor Cell Death Characterization

Forsberg, Elise, Resare, Alexander January 2024 (has links)
This report presents a deep learning based approach for segmenting and characterizing tumor cell death using images provided by the Önfelt lab, which contain NK cells and HL60 leukemia cells. We explore the efficiency of convolutional neural networks (CNNs) in distinguishing between live and dead tumor cells, as well as between different classes of cell death. Three CNN architectures (MobileNetV2, ResNet-18, and ResNet-50) were employed, using transfer learning to optimize performance given the limited size of the available datasets. The networks were trained with two loss functions (weighted cross-entropy and generalized Dice loss) and two optimizers (adaptive moment estimation, Adam, and stochastic gradient descent with momentum, SGDM), with performance evaluated on metrics such as mean accuracy, intersection over union (IoU), and BF score. Our results indicate that MobileNetV2 with cross-entropy loss and the Adam optimizer outperformed the other configurations, demonstrating high mean accuracy. Challenges such as class imbalance, annotation bias, and dataset limitations are discussed, alongside potential future directions to enhance model robustness and accuracy. The successful training of networks capable of classifying all identified types of cell death demonstrates the potential of a deep learning approach to identify different types of cell death, as a tool for analyzing immunotherapeutic strategies and enhancing understanding of NK cell behavior in cancer treatment.
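As a rough illustration of the transfer-learning setup described above, the PyTorch sketch below attaches a small segmentation head to a MobileNetV2 backbone and trains it with weighted cross-entropy and Adam. The class count, head design, and class weights are assumptions made for illustration; this is not the report's actual network configuration or training pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import mobilenet_v2

NUM_CLASSES = 4  # assumed: background, live cell, and two cell-death classes

class MobileNetV2Seg(nn.Module):
    """MobileNetV2 feature extractor with a minimal 1x1-conv segmentation head."""
    def __init__(self, num_classes=NUM_CLASSES, pretrained=False):
        super().__init__()
        weights = "IMAGENET1K_V1" if pretrained else None
        self.backbone = mobilenet_v2(weights=weights).features   # (N, 1280, H/32, W/32)
        self.head = nn.Conv2d(1280, num_classes, kernel_size=1)
    def forward(self, x):
        h, w = x.shape[-2:]
        logits = self.head(self.backbone(x))
        return F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)

def train_step(model, images, masks, optimizer, class_weights):
    """One optimization step with weighted cross-entropy, the report's first loss choice."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)                                   # (N, C, H, W)
    loss = F.cross_entropy(logits, masks, weight=class_weights)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = MobileNetV2Seg(pretrained=False)                 # set True to load ImageNet weights
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    class_weights = torch.tensor([0.2, 1.0, 2.0, 2.0])       # assumed inverse-frequency weights
    images = torch.randn(2, 3, 256, 256)                     # stand-in batch
    masks = torch.randint(0, NUM_CLASSES, (2, 256, 256))
    print("loss:", train_step(model, images, masks, optimizer, class_weights))
```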
83

Μελέτη των RWA και IA-RWA μέσω γενετικών αλγορίθμων / A study of RWA and IA-RWA using genetic algorithms

Μονογιός, Δημήτρης 26 August 2009 (has links)
The recent technological development of optical amplifiers, multiplexers/demultiplexers, optical switches, and other optical devices gives hope that in the near future fully transparent, all-optical WDM (wavelength division multiplexing) networks will be realized that also satisfy the need for large capacity. In such a network, the conversion of the optical signal to electronic and back to optical (OEO) is not used at intermediate nodes, which contributes to more economical implementations of optical networks. In a wavelength-routed WDM network, data are carried over an optical channel, a lightpath, between network nodes connected by optical fibers. In most cases, when a lightpath arrives at a node, optical-to-electronic conversion and back is applied so that the signal can be regenerated after the losses suffered during transmission, or so that it can be analyzed by intermediate electronic devices. In opaque (non-all-optical) networks, data are transferred node by node along the network so that the optical signal can be amplified and regenerated through OEO processing. However, every intermediate processing of the signal in such a network implies very high costs because of the many devices required for OEO processing. This leads to semi-transparent networks, in which amplification and regeneration of the signal are performed not at all intermediate nodes but only at some of them. The ultimate goal, though, is to eliminate electronic conversion altogether, which leads to the implementation of all-optical networks, in which a transmitted signal remains in the optical domain along the entire lightpath, thus avoiding the costly OEO conversion. The search for suitable paths with suitable wavelengths in an all-optical network routed by lightpaths is called Routing and Wavelength Assignment (RWA) and is one of the most important issues in the proper design of such optical networks. The problem becomes particularly complex when the characteristics of the network's physical layer, such as signal attenuation, nonlinear effects, dispersion, etc., whose contribution to the final routing decision is not negligible, must also be taken into account (Impairment-Aware Routing and Wavelength Assignment, IA-RWA). In this work, the RWA problem is studied under static traffic and a Single-Objective Genetic Algorithm (SOGA) is proposed which solves it satisfactorily. The importance of the physical parameters of the problem, and how they affect the performance of the all-optical network, is also highlighted. A new Multi-Objective Genetic Algorithm (MOGA) is then proposed which optimizes the solutions of the IA-RWA problem satisfactorily while taking the physical impairments that affect signal quality into account in an indirect way. In addition, a single-objective genetic algorithm is proposed that uses a transmission-quality assessment tool (Q-TOOL) as a measure during the search for a satisfactory solution.
The remainder of this work is organized as follows: Section 2 gives a brief overview of WDM networks and describes the RWA and IA-RWA problems; Section 3 presents our proposal for solving the RWA problem with genetic algorithms; Section 4 presents our proposal for solving the IA-RWA problem using multi-objective optimization procedures, as well as the optimization of the problem using the Q-TOOL; finally, Section 5 summarizes the work and presents our conclusions.
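To make the single-objective GA formulation more concrete, the sketch below encodes one candidate solution as a (path choice, wavelength) pair per traffic demand over precomputed candidate paths, with a fitness that penalizes wavelength clashes on shared links. The topology, demands, and GA settings are illustrative assumptions, not the SOGA described in the thesis.

```python
import random

# Illustrative inputs: candidate paths per demand (as tuples of links) and wavelength count.
CANDIDATE_PATHS = {
    0: [(("A", "B"), ("B", "C")), (("A", "D"), ("D", "C"))],
    1: [(("A", "B"),), (("A", "D"), ("D", "B"))],
    2: [(("B", "C"),), (("B", "D"), ("D", "C"))],
}
NUM_WAVELENGTHS = 2

def fitness(chromosome):
    """Count wavelength conflicts: two demands using the same wavelength on the same link."""
    used = {}
    conflicts = 0
    for demand, (path_idx, wl) in enumerate(chromosome):
        for link in CANDIDATE_PATHS[demand][path_idx]:
            key = (tuple(sorted(link)), wl)
            conflicts += used.get(key, 0)
            used[key] = used.get(key, 0) + 1
    return -conflicts  # higher is better; 0 means a conflict-free RWA

def random_chromosome():
    return [(random.randrange(len(CANDIDATE_PATHS[d])), random.randrange(NUM_WAVELENGTHS))
            for d in sorted(CANDIDATE_PATHS)]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.2):
    return [(random.randrange(len(CANDIDATE_PATHS[d])), random.randrange(NUM_WAVELENGTHS))
            if random.random() < rate else gene
            for d, gene in enumerate(chrom)]

def soga(pop_size=30, generations=100):
    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == 0:
            break
        parents = population[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = soga()
    print("best (path index, wavelength) per demand:", best, "conflicts:", -fitness(best))
```

An IA-RWA variant would add impairment terms (for example, a per-lightpath quality estimate such as a Q-factor) to the fitness, in the spirit of the MOGA and Q-TOOL-based algorithms proposed in the thesis.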
84

L’extraction de phrases en relation de traduction dans Wikipédia / Extracting sentence pairs in a translation relation from Wikipedia

Rebout, Lise 06 1900 (has links)
Working with comparable corpora can be useful to enhance bilingual parallel corpora. In fact, in such corpora, even if the documents in the target language are not the exact translation of those in the source language, one can still find translated words or sentences. The free encyclopedia Wikipedia is a multilingual comparable corpus of several million documents. Our task is to find a general, endogenous method for extracting a maximum of parallel sentences from this source. We work with the English-French language pair, but our method, which uses no external bilingual resources, can be applied to any other language pair. It can best be described in two steps. The first consists of detecting article pairs that are most likely to contain translations. This is achieved through a neural network trained on a small data set composed of sentence-aligned articles. The second step is to perform the selection of sentence pairs through another neural network whose outputs are then re-interpreted by a combinatorial optimization algorithm and an extension heuristic. The addition of the roughly 560,000 sentence pairs extracted from Wikipedia to the training set of a baseline statistical machine translation system improves the quality of the resulting translations. We make both the aligned data and the extracted corpus available to the scientific community.
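The second step above re-interprets per-pair neural scores with a combinatorial optimization algorithm. As a hedged illustration of that idea, the sketch below treats the scores as a matrix and extracts a one-to-one sentence alignment with the Hungarian algorithm plus a score threshold; the scores, the threshold, and the use of scipy's linear_sum_assignment are stand-ins, not the thesis' exact procedure or its extension heuristic.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def select_parallel_pairs(score_matrix, threshold=0.5):
    """Pick a one-to-one set of (source, target) sentence pairs from classifier scores.

    score_matrix[i, j] is assumed to be a neural-network estimate (in [0, 1]) that
    source sentence i and target sentence j are translations of each other.
    """
    # The Hungarian algorithm maximizes total score via cost minimization.
    rows, cols = linear_sum_assignment(-score_matrix)
    return [(i, j, score_matrix[i, j])
            for i, j in zip(rows, cols)
            if score_matrix[i, j] >= threshold]   # keep only confident pairs

if __name__ == "__main__":
    # Stand-in scores for a 4-sentence French article vs. a 5-sentence English article.
    scores = np.array([
        [0.92, 0.10, 0.05, 0.08, 0.02],
        [0.11, 0.15, 0.88, 0.20, 0.04],
        [0.03, 0.07, 0.10, 0.12, 0.09],   # likely has no translation
        [0.06, 0.81, 0.09, 0.30, 0.11],
    ])
    for i, j, s in select_parallel_pairs(scores, threshold=0.5):
        print(f"source sentence {i} <-> target sentence {j} (score {s:.2f})")
```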
85

Modélisation mathématique et simulation numérique de populations neuronales thalamo-corticales dans le contexte de l'anesthésie générale / Analytical and numerical studies of thalamo-cortical neural population models during general anesthesia

Hashemi, Meysam 14 January 2016 (has links)
Although general anaesthesia is an indispensable tool in today's medical surgery, its precise underlying mechanisms are still unknown. During propofol-induced sedation, the anaesthetic actions at the microscopic scale of single neurons lead to specific changes in macroscopic-scale observables such as electroencephalogram (EEG) signals. For low concentrations of propofol, these characteristic changes comprise increased activity in the delta (0.5-4 Hz) and alpha (8-13 Hz) frequency bands over the frontal head region, but increased delta and decreased alpha power over the occipital region.
In this thesis, we employ thalamo-cortical neural population models and, based on experimental data, model the effects of propofol on the synaptic and extra-synaptic GABAergic receptors located in the cortex and thalamus in order to understand the mechanisms underlying the observed changes in EEG spectral power. It is shown that the models reproduce well the characteristic spectral features observed experimentally. A key finding of this work is that the origin of the delta rhythm is fundamentally different from that of the alpha rhythm. Our results indicate that, depending on the mean potential values of the system resting states, an increase or decrease in the thalamo-cortical gain functions results in an increase or decrease in alpha power, respectively. In contrast, the evolution of the delta power is rather independent of the system resting states; the enhancement of spectral power in the delta band results from increased synaptic or extra-synaptic GABAergic inhibition for both increasing and decreasing nonlinear gain functions. Furthermore, we aim to identify the parameters of a thalamo-cortical model by fitting the model power spectrum to EEG recordings. To this end, we address the task of parameter estimation in models that are described by a set of stochastic ordinary or delay differential equations. Two case studies dealing with noisy pseudo-experimental data are first carried out to compare the performance of different optimization methods. The results of this elaboration show that the method used in this study is able to accurately estimate the independent model parameters while allowing us to avoid the computational cost of numerical integrations. Taken together, the findings of this thesis provide new insights into the mechanisms responsible for the specific changes in EEG patterns that are observed during propofol-induced sedation.
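The parameter-identification step above fits a model power spectrum to an EEG spectrum. The sketch below shows the generic shape of such a fit using a toy linear-response spectrum with an alpha-band resonance and least-squares minimization over log-power; the spectral model, its parameters, and the use of scipy.optimize.minimize are assumptions made for illustration, not the thalamo-cortical model or estimation method developed in the thesis.

```python
import numpy as np
from scipy.optimize import minimize

def model_spectrum(freqs, params):
    """Toy power spectrum: 1/f-like background plus one resonant (alpha-band) peak."""
    log_bg, slope, peak_amp, peak_freq, peak_width = params
    width = abs(peak_width) + 1e-6
    background = np.exp(log_bg) / (1.0 + freqs) ** slope
    peak = peak_amp * np.exp(-0.5 * ((freqs - peak_freq) / width) ** 2)
    return background + peak

def fit_spectrum(freqs, target_psd, x0):
    """Least-squares fit in log-power, a common choice for EEG spectra spanning decades."""
    def loss(params):
        psd = np.clip(model_spectrum(freqs, params), 1e-12, None)
        return np.mean((np.log(psd) - np.log(target_psd)) ** 2)
    return minimize(loss, x0, method="Nelder-Mead")

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    freqs = np.linspace(0.5, 30.0, 120)
    true_params = [2.0, 1.5, 1.2, 10.0, 1.5]          # resonance near 10 Hz (alpha band)
    target = model_spectrum(freqs, true_params) * np.exp(0.1 * rng.standard_normal(freqs.size))
    result = fit_spectrum(freqs, target, x0=[1.0, 1.0, 0.5, 9.0, 2.0])
    print("estimated parameters:", np.round(result.x, 2))
```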
86

A Unified, Configurable, Non-Iterative Guidance System For Launch Vehicles

Rajeev, U P 12 1900 (has links)
A satellite launch vehicle not subjected to any perturbations, external or internal, could be guided along a trajectory by following a stored, pre-computed steering program. In practice, perturbations do occur, and in order to take account of them and to achieve an accurate injection, a closed-loop guidance system is required. The guidance algorithm is developed by solving an optimal control problem. A closed-form solution is difficult because the necessary conditions take the form of Two-Point Boundary Value Problems (TPBVPs) or Multi-Point Boundary Value Problems (MPBVPs). The development of a non-iterative guidance algorithm is taken as the prime objective of this thesis, to ensure reliable on-board implementation. When non-iterative algorithms are required, the usual practice is to approximate the system equations to derive closed-form solutions. In the present work, such approximations cannot be used because the algorithm has to cater to a wide variety of vehicles and missions. The present development adopts an alternative approach by splitting the reconfigurable algorithm development into smaller sub-problems such that each sub-problem has a closed-form solution. The splitting is done in such a way that the solutions of the sub-problems can be used as building blocks to construct the final solution. By adding or removing building blocks, the algorithm can be configured to suit specific requirements. Chapter 1 discusses the motivation and objectives of the thesis and gives a literature survey. In Chapter 2, the Classical Flat Earth (CFE) guidance algorithm is discussed. The assumptions and the nature of the solution are closely analyzed because CFE guidance is used as the baseline for further developments. The new contribution in Chapter 2 is the extension of CFE guidance to a generalized propulsion system in which both liquid and solid engines are present. In Chapter 3, CFE guidance is applied to a mission with large pitch steering angles. The result shows a loss of optimality and performance. An algorithm based on regular perturbation is developed to compensate for the small-angle approximation. The new contribution in Chapter 3 is the development of Regular Perturbation based Flat Earth (RPFE) guidance as an extension of CFE guidance. RPFE guidance can be configured as CFE guidance and FEGP. The algorithms presented up to Chapter 3 are developed to inject a satellite into orbits with unspecified inertial orientation. Communication satellite missions demand injection into an orbit with a specific inertial orientation defined by the argument of perigee. This problem is formulated using the calculus of variations in Chapter 4, and a non-iterative closed-form solution (Predicted-target Flat Earth, or PFE, guidance) is derived for it. In Chapter 5, PFE guidance is extended to a multi-stage vehicle with a constraint on the impact point of the spent lower stage. Since the problem is not analytically solvable, the original problem is split into three sub-problems and solved. Chapter 6 has two parts. The first part gives a theoretical analysis of sub-optimal strategies, with special emphasis on guidance. The behavior of the predicted terminal error and control commands in the presence of plant approximations is theoretically analyzed for a class of optimal control problems, and the results are presented as six theorems. Chapter 7 presents the conclusions and future work.
88

Optimization Algorithms for Deterministic, Stochastic and Reinforcement Learning Settings

Joseph, Ajin George January 2017 (has links) (PDF)
Optimization is a very important field with diverse applications in the physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, estimation, operations research and a wide variety of computing domains. The subject is deeply studied both theoretically and experimentally, and several algorithms are available in the literature. These algorithms, which can be executed (sequentially or concurrently) on a computing machine, explore the space of input parameters to seek high-quality solutions to the optimization problem, with the search mostly guided by certain structural properties of the objective function. In certain situations, the setting might additionally demand the "absolute optimum" or solutions close to it, which makes the task even more challenging. In this thesis, we propose an optimization algorithm which is "gradient-free", i.e., it does not employ any knowledge of the gradient or higher-order derivatives of the objective function, but rather utilizes the objective function values themselves to steer the search. The proposed algorithm is particularly effective in a black-box setting, where a closed-form expression of the objective function is unavailable and gradients or higher-order derivatives are hard to compute or estimate. Our algorithm is inspired by the well-known cross entropy (CE) method. The CE method is a model-based search method for solving continuous/discrete multi-extremal optimization problems where the objective function has minimal structure. The proposed method searches the statistical manifold of the parameters that identify a probability distribution (model) defined over the input space, seeking the degenerate distribution concentrated on the global optima (assumed to be finite in number). In the early part of the thesis, we propose a novel stochastic approximation version of the CE method for the unconstrained optimization problem, where the objective function is real-valued and deterministic. The basis of the algorithm is a stochastic process of model parameters which is probabilistically dependent on the past history, where we reuse all the previous samples obtained in the process up to the current instant based on discounted averaging. This approach can save overall computational and storage cost. Our algorithm is incremental in nature and possesses attractive features such as stability, computational and storage efficiency, and better accuracy. We further investigate, both theoretically and empirically, the asymptotic behaviour of the algorithm and find that it exhibits global optimum convergence for a particular class of objective functions. Further, we extend the algorithm to solve the simulation/stochastic optimization problem. In stochastic optimization, the objective function has a stochastic characteristic, where the underlying probability distribution in most cases is hard to comprehend and quantify. This begets a more challenging optimization problem, where the difficulty is primarily due to the hardness of computing the objective function values for various input parameters with absolute certainty. In this case, one can only hope to obtain noise-corrupted objective function values for various input parameters. Settings of this kind arise in scenarios where the objective function is evaluated using a continuously evolving dynamical system or through a simulation.
We propose a multi-timescale stochastic approximation algorithm, where we integrate an additional timescale to accommodate the noisy measurements and asymptotically attenuate the effects of the measurement noise. We find that if the objective function and the noise involved in the measurements are well behaved and the timescales are compatible, then our algorithm can generate high-quality solutions. In the later part of the thesis, we propose algorithms for reinforcement learning/Markov decision processes (MDPs) using the optimization techniques developed in the earlier stages. The MDP can be considered a generalized framework for modelling planning under uncertainty. We provide a novel algorithm for the problem of prediction in reinforcement learning, i.e., estimating the value function of a given stationary policy of a model-free MDP (with large state and action spaces) using the linear function approximation architecture. Here, the value function is defined as the long-run average of the discounted transition costs. The resource requirement of the proposed method, in terms of computational and storage cost, scales quadratically in the size of the feature set. The algorithm is an adaptation of the multi-timescale variant of the CE method proposed in the earlier part of the thesis for simulation optimization. We also provide both theoretical and empirical evidence to corroborate the credibility and effectiveness of the approach. In the final part of the thesis, we consider a modified version of the control problem in a model-free MDP with large state and action spaces. The control problem most commonly addressed in the literature is to find an optimal policy which maximizes the value function, i.e., the long-run average of the discounted transition payoffs. Contemporary methods also presume access to a generative model/simulator of the MDP, with the hidden premise that observations of the system behaviour in the form of sample trajectories can be obtained with ease from the model. In this thesis, we consider a modified version where the cost function to be optimized is a real-valued performance function (possibly non-convex) of the value function and, additionally, one has to seek the optimal policy without presuming access to the generative model. We propose a stochastic approximation algorithm for this peculiar control problem. The only information we presuppose available to the algorithm is a sample trajectory generated using an a priori chosen behaviour policy. The algorithm is data (sample trajectory) efficient, stable and robust, as well as computationally and storage efficient. We provide a proof of convergence of our algorithm to a high-performing policy relative to the behaviour policy.
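The core of the CE-based approach above is maintaining a parametric sampling distribution and steering it toward the elite (high-value) region of the search space, reusing past information through discounted averaging. The sketch below is a minimal Gaussian cross-entropy loop with exponential smoothing of the mean and variance; the smoothing constant, elite fraction, and test function are illustrative assumptions, and the thesis' stochastic-approximation variant differs in how samples and updates are combined.

```python
import numpy as np

def cross_entropy_minimize(f, dim, iters=100, pop=50, elite_frac=0.2, smooth=0.7, seed=0):
    """Gaussian cross-entropy search with discounted (smoothed) parameter updates.

    At each step: sample candidates, keep the elite fraction with the lowest
    objective values, and blend the elite mean/variance into the sampling model.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    var = np.full(dim, 5.0)
    n_elite = max(1, int(elite_frac * pop))
    for _ in range(iters):
        samples = mean + np.sqrt(var) * rng.standard_normal((pop, dim))
        values = np.apply_along_axis(f, 1, samples)
        elites = samples[np.argsort(values)[:n_elite]]
        # Discounted averaging: the old model parameters retain weight (1 - smooth).
        mean = (1 - smooth) * mean + smooth * elites.mean(axis=0)
        var = (1 - smooth) * var + smooth * elites.var(axis=0) + 1e-6
    return mean, f(mean)

if __name__ == "__main__":
    # Multi-extremal test function (Rastrigin); global minimum at the origin.
    def rastrigin(x):
        return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))
    x_best, f_best = cross_entropy_minimize(rastrigin, dim=5)
    print("estimated minimizer:", np.round(x_best, 3), "objective:", round(float(f_best), 4))
```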
89

Análise e otimização de superfícies seletivas de frequência utilizando redes neurais artificiais e algoritmos de otimização natural / Analysis and optimization of frequency selective surfaces using artificial neural networks and natural optimization algorithms

Cruz, Rossana Moreno Santa 28 September 2009 (has links)
The two-dimensional periodic structures called frequency selective surfaces have been widely investigated because of their filtering properties. Similar to filters that work in the traditional radiofrequency bands, such structures can behave as band-stop or pass-band filters, depending on the elements of the array (patches or apertures, respectively), and can be used in a variety of applications, such as radomes, dichroic reflectors, waveguide filters, artificial magnetic conductors, and microwave absorbers. To provide high-performance filtering properties at microwave bands, electromagnetic engineers have investigated various types of periodic structures: reconfigurable frequency selective screens and multilayered selective filters, as well as periodic arrays printed on anisotropic dielectric substrates and composed of fractal elements. In general, there is no closed-form solution leading directly from a desired frequency response to a corresponding device; thus, the analysis of the scattering characteristics requires the application of rigorous full-wave techniques, such as the integral equation method. Moreover, because of the computational cost of using a full-wave simulator to evaluate the scattering variables of a frequency selective surface, many electromagnetic engineers still use a trial-and-error process until a given design criterion is achieved. As this procedure is very laborious and human-dependent, optimization techniques are required to design practical periodic structures with the desired filter specifications. Some authors have employed neural networks and natural optimization algorithms, such as genetic algorithms and particle swarm optimization, for frequency selective surface design and optimization. The objective of this work is a rigorous study of the electromagnetic behavior of periodic structures, enabling the design of efficient devices for the microwave bands. To this end, artificial neural networks are used together with natural optimization techniques, allowing the accurate and efficient investigation of various types of frequency selective surfaces in a simple and fast manner, and becoming a powerful tool for the design and optimization of such structures.
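As a hedged sketch of the design loop described above (a neural-network surrogate standing in for the full-wave solver inside a natural-optimization search), the code below fits an MLP regressor to synthetic geometry-to-resonant-frequency data and then runs a small genetic algorithm to find a geometry whose predicted resonance matches a target. The synthetic response function, parameter ranges, and GA settings are illustrative assumptions, not the surfaces or data used in the thesis.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for full-wave data: resonant frequency (GHz) of a patch FSS
# as a smooth function of (patch length, cell periodicity, substrate permittivity).
def synthetic_resonance(x):
    length, period, eps_r = x.T
    return 30.0 / (length * np.sqrt((eps_r + 1.0) / 2.0)) + 0.5 * (period - length)

# Train an ANN surrogate on sampled geometries.
X = np.column_stack([rng.uniform(4.0, 12.0, 2000),    # patch length (mm)
                     rng.uniform(8.0, 20.0, 2000),    # cell periodicity (mm)
                     rng.uniform(2.0, 6.0, 2000)])    # relative permittivity
y = synthetic_resonance(X)
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0).fit(X, y)

def ga_design(target_ghz, pop=60, generations=80, mut_scale=0.3):
    """Genetic algorithm over geometries, with fitness = surrogate prediction error."""
    lo = np.array([4.0, 8.0, 2.0])
    hi = np.array([12.0, 20.0, 6.0])
    population = rng.uniform(lo, hi, size=(pop, 3))
    for _ in range(generations):
        errors = np.abs(surrogate.predict(population) - target_ghz)
        parents = population[np.argsort(errors)[: pop // 2]]
        # Uniform crossover + Gaussian mutation, clipped to the parameter box.
        mates = parents[rng.integers(0, len(parents), len(parents))]
        mask = rng.random(parents.shape) < 0.5
        children = np.where(mask, parents, mates) + mut_scale * rng.standard_normal(parents.shape)
        population = np.clip(np.vstack([parents, children]), lo, hi)
    return population[np.argmin(np.abs(surrogate.predict(population) - target_ghz))]

if __name__ == "__main__":
    design = ga_design(target_ghz=5.0)
    print("suggested (length, period, eps_r):", np.round(design, 2),
          "surrogate prediction (GHz):", round(float(surrogate.predict(design[None])[0]), 3))
```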
90

Sistema de inferência Fuzzy para classificação de distúrbios em sinais elétricos / A fuzzy inference system for the classification of disturbances in electrical signals

Aguiar, Eduardo Pestana de 30 August 2011 (has links)
This master's dissertation discusses the use of optimization techniques based on the conjugate gradient and on second-order information for the training of singleton and non-singleton fuzzy inference systems. In addition, the resulting computational solutions are applied to the classification of isolated and multiple disturbances in electrical signals. Computational results, obtained from synthetic data of disturbances in voltage signals, indicate that singleton and non-singleton fuzzy inference systems trained by the considered optimization algorithms present faster convergence and better classification rates than those trained by an optimization algorithm based on first-order information, and are quite competitive with the multilayer perceptron (MLP) artificial neural network and the Bayes classifier.
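To illustrate the kind of training discussed above, the sketch below implements a small singleton (zero-order Takagi-Sugeno-style) fuzzy inference system with Gaussian membership functions and fits its parameters with scipy's conjugate-gradient optimizer on a toy two-class problem. The rule count, features, and data are assumptions made for illustration and do not reproduce the dissertation's classifier or its disturbance data.

```python
import numpy as np
from scipy.optimize import minimize

N_RULES, N_FEATURES = 4, 2

def unpack(theta):
    """Parameters: rule centers, widths (stored on log scale), and singleton consequents."""
    c = theta[: N_RULES * N_FEATURES].reshape(N_RULES, N_FEATURES)
    s = np.exp(theta[N_RULES * N_FEATURES: 2 * N_RULES * N_FEATURES]).reshape(N_RULES, N_FEATURES)
    w = theta[2 * N_RULES * N_FEATURES:]
    return c, s, w

def fis_output(theta, X):
    """Singleton inference: weighted average of rule consequents by Gaussian firing strengths."""
    c, s, w = unpack(theta)
    diff = (X[:, None, :] - c[None, :, :]) / s[None, :, :]
    firing = np.exp(-0.5 * np.sum(diff ** 2, axis=2))     # firing[i, r] for sample i, rule r
    return firing @ w / (firing.sum(axis=1) + 1e-12)

def loss(theta, X, y):
    return np.mean((fis_output(theta, X) - y) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy two-class data: label 1 if the point lies outside the unit circle.
    X = rng.uniform(-2, 2, size=(300, N_FEATURES))
    y = (np.sum(X ** 2, axis=1) > 1.0).astype(float)
    theta0 = np.concatenate([rng.uniform(-2, 2, N_RULES * N_FEATURES),   # centers
                             np.zeros(N_RULES * N_FEATURES),             # log-widths
                             rng.standard_normal(N_RULES)])              # consequents
    result = minimize(loss, theta0, args=(X, y), method="CG")            # conjugate-gradient training
    accuracy = np.mean((fis_output(result.x, X) > 0.5) == (y > 0.5))
    print(f"training MSE: {result.fun:.4f}  accuracy: {accuracy:.2f}")
```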
