Global ETD Search

41	Melhorias de estabilidade numérica e custo computacional de aproximadores de funções valor de estado baseados em estimadores RLS para projeto online de sistemas de controle HDP-DLQR / Numerical Stability and Computational Cost Implications of State Value Functions based on RLS Estimators for Online Design of HDP-DLQR control systems Ferreira, Ernesto Franklin Marçal 08 March 2016 Submitted by Rosivalda Pereira (mrs.pereira@ufma.br) on 2017-06-23T20:34:27Z No. of bitstreams: 1 ErnestoFerreira.pdf: 1744167 bytes, checksum: c125c90e5eb2aab2618350567f88cb31 (MD5) / Made available in DSpace on 2017-06-23T20:34:27Z (GMT). No. of bitstreams: 1 ErnestoFerreira.pdf: 1744167 bytes, checksum: c125c90e5eb2aab2618350567f88cb31 (MD5) Previous issue date: 2016-03-08 / The development and the numerical stability analysis of a new adaptive critic algorithm to approximate the state-value function for online discrete linear quadratic regulator (DLQR) optimal control system design based on heuristic dynamic programming (HDP) are presented in this work. The proposed algorithm makes use of unitary transformations and QR decomposition methods to improve the online learning efficiency in the critic network through the recursive least-squares (RLS) approach. The developed learning strategy improves computational performance in terms of numerical stability and computational cost, with the aim of making possible the real-time implementation of optimal control design methodologies based on actor-critic reinforcement learning paradigms.
The convergence behavior and numerical stability of the proposed online algorithm, called RLSµ-QR-HDP-DLQR, are evaluated by computational simulations on three multiple-input multiple-output (MIMO) models: a third-order model of the autopilot of an F-16 aircraft, a fourth-order RLC circuit with two input voltages and two controllable voltage levels, and a doubly-fed induction generator with six inputs and six outputs for wind energy conversion systems. / Neste trabalho, apresenta-se o desenvolvimento e a análise da estabilidade numérica de um novo algoritmo crítico adaptativo para aproximar a função valor de estado para o projeto do sistema de controle ótimo online, utilizando o regulador linear quadrático discreto (DLQR), com base em programação dinâmica heurística (HDP). O algoritmo proposto faz uso de transformações unitárias e métodos de decomposição QR para melhorar a eficiência da aprendizagem online na rede crítica por meio da abordagem dos mínimos quadrados recursivos (RLS). A estratégia de aprendizagem desenvolvida fornece melhorias no desempenho computacional em termos de estabilidade numérica e custo computacional, que visam tornar possíveis as implementações em tempo real da metodologia do projeto de controle ótimo com base em paradigmas de aprendizado por reforço ator-crítico. O comportamento de convergência e estabilidade numérica do algoritmo online proposto, denominado RLSµ-QR-HDP-DLQR, são avaliados por meio de simulações computacionais em três modelos Múltiplas-Entradas e Múltiplas-Saídas (MIMO), que representam o piloto automático de uma aeronave F-16 de terceira ordem, um circuito de quarta ordem RLC com duas tensões de entrada e dois níveis de tensão controláveis, e um gerador de indução duplamente alimentado com seis entradas e seis saídas para sistemas de conversão de energia eólica.
Programação Dinâmica Aprendizagem por Reforço Programação Dinâmica Heurística Controle Multivariável Controle Ótimo Regulador Linear Quadrático Discreto Mínimos Quadrados Recursivos Decomposição QR Dynamic Programming Reinforcement Learning Heuristic Dynamic Programming Multivariable Control Optimal Control Discrete Linear Quadratic Regulator Recursive Least-Squares Engenharia de Software
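As background for the recursive least-squares (RLS) estimator on which the abstract above builds, the following is a minimal sketch of the conventional covariance-form RLS update, assuming a linear model y = theta · x and a forgetting factor lam. It is not the RLSµ-QR-HDP-DLQR algorithm of the thesis, which according to the abstract replaces this kind of update with unitary transformations and QR decomposition; the name rls_update and all parameters are hypothetical and chosen only for illustration.

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One step of conventional (covariance-form) recursive least squares.

    theta: current parameter estimate (list of floats)
    P:     current symmetric covariance matrix (list of lists)
    x:     regressor vector for this sample
    y:     observed scalar output
    lam:   forgetting factor, 0 < lam <= 1 (illustrative assumption)
    """
    n = len(x)
    # Px = P @ x
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Scalar denominator lam + x^T P x
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    # Gain vector k = P x / (lam + x^T P x)
    k = [v / denom for v in Px]
    # A priori prediction error
    err = y - sum(theta[i] * x[i] for i in range(n))
    # Parameter update
    new_theta = [theta[i] + k[i] * err for i in range(n)]
    # Covariance update P <- (P - k x^T P) / lam, using the symmetry of P
    new_P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


# Usage: recover the slope and offset of y = 2*x1 + 3 from noise-free samples.
theta, P = [0.0, 0.0], [[1e6, 0.0], [0.0, 1e6]]
for x1 in [0.0, 1.0, 2.0, 3.0, 4.0]:
    theta, P = rls_update(theta, P, [x1, 1.0], 2.0 * x1 + 3.0)
```

The subtraction in the covariance update is the step that can lose symmetry and positive definiteness in finite precision, which is the kind of ill-conditioning that factorized variants (QR-based or UDU^T-based) are designed to avoid.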
42	Aprendizagem por Reforço e Programação Dinâmica Aproximada para Controle Ótimo: Uma Abordagem para o Projeto Online do Regulador Linear Quadrático Discreto com Programação Dinâmica Heurística Dependente de Estado e Ação. / Reinforcement and Programming Learning Approximate Dynamics for Optimal Control: An Approach to the Linear Regulator Online Project Discrete Quadratic with Heuristic Dynamic Programming Dependent on State and Action. RÊGO, Patrícia Helena Moraes 24 July 2014 Submitted by Maria Aparecida (cidazen@gmail.com) on 2017-08-30T15:33:12Z No. of bitstreams: 1 Patricia Helena.pdf: 11110405 bytes, checksum: ca1f067231658f897d84b86181dbf1b9 (MD5) / Made available in DSpace on 2017-08-30T15:33:12Z (GMT). No. of bitstreams: 1 Patricia Helena.pdf: 11110405 bytes, checksum: ca1f067231658f897d84b86181dbf1b9 (MD5) Previous issue date: 2014-07-24 / This thesis presents a proposal for a unified approach to dynamic programming, reinforcement learning and function approximation theories, aimed at the development of methods and algorithms for the design of optimal control systems. This approach is presented in the approximate dynamic programming context, which allows the optimal feedback solution to be approximated so as to reduce the computational complexity associated with conventional dynamic programming methods for optimal control of multivariable systems. Specifically, in the state and action dependent heuristic dynamic programming framework, this proposal is oriented toward the development of numerically stable online approximate solutions of the Riccati-type Hamilton-Jacobi-Bellman equation associated with the discrete linear quadratic regulator problem, based on a formulation that combines value function estimates by means of an RLS (Recursive Least-Squares) structure, temporal differences and policy improvements.
The development of the proposed methodologies in this work focuses mainly on the UDU^T factorization, which is inserted in this framework to improve the RLS estimation process of optimal decision policies of the discrete linear quadratic regulator by circumventing convergence and numerical stability problems related to the ill-conditioning of the covariance matrix in the RLS approach. / Apresenta-se nesta tese uma proposta de uma abordagem unificada de teorias de programação dinâmica, aprendizagem por reforço e aproximação de função que tem por objetivo o desenvolvimento de métodos e algoritmos para projeto online de sistemas de controle ótimo. Esta abordagem é apresentada no contexto de programação dinâmica aproximada que permite aproximar a solução de realimentação ótima de modo a reduzir a complexidade computacional associada com métodos convencionais de programação dinâmica para controle ótimo de sistemas multivariáveis. Especificamente, no quadro de programação dinâmica heurística e programação dinâmica heurística dependente de ação, esta proposta é orientada para o desenvolvimento de soluções aproximadas online, numericamente estáveis, da equação de Hamilton-Jacobi-Bellman do tipo Riccati associada ao problema do regulador linear quadrático discreto que tem por base uma formulação que combina estimativas da função valor por meio de uma estrutura RLS (do inglês Recursive Least-Squares), diferenças temporais e melhorias de política. O desenvolvimento das metodologias propostas, neste trabalho, tem seu foco principal voltado para a fatoração UDU^T que é inserida neste quadro para melhorar o processo de estimação RLS de políticas de decisão ótimas do regulador linear quadrático discreto, contornando-se problemas de convergência e estabilidade numérica relacionados com o mal condicionamento da matriz de covariância da abordagem RLS.
43	Χωροχρονικές τεχνικές επεξεργασίας σήματος σε ασύρματα τηλεπικοινωνιακά δίκτυα / Space-Time signal processing techniques for wireless communication networks Κεκάτος, Βασίλειος 25 October 2007 Τα τελευταία χρόνια χαρακτηρίζονται από μια αλματώδη ανάπτυξη των προϊόντων και υπηρεσιών που βασίζονται στα δίκτυα ασύρματης επικοινωνίας, ενώ προκύπτουν σημαντικές ερευνητικές προκλήσεις. Τα συστήματα πολλαπλών κεραιών στον πομπό και στο δέκτη, γνωστά και ως συστήματα MIMO (multi-input multi-output), καθώς και η τεχνολογία πολλαπλής προσπέλασης με χρήση κωδικών (code division multiple access, CDMA) αποτελούν δύο από τα βασικά μέτωπα ανάπτυξης των ασύρματων τηλεπικοινωνιών. Στα πλαίσια της παρούσας διδακτορικής διατριβής, ασχοληθήκαμε με την ανάπτυξη και μελέτη αλγορίθμων επεξεργασίας σήματος για τα δύο παραπάνω συστήματα, όπως περιγράφεται αναλυτικά παρακάτω. Σχετικά με τα συστήματα MIMO, η πρωτοποριακή έρευνα που πραγματοποιήθηκε στα Bell Labs γύρω στα 1996, όπου αναπτύχθηκε η αρχιτεκτονική BLAST (Bell Labs Layered Space-Time), απέδειξε ότι η χρήση πολλαπλών κεραιών μπορεί να οδηγήσει σε σημαντική αύξηση της χωρητικότητας των ασύρματων συστημάτων. Προκειμένου να αξιοποιηθούν οι παραπάνω δυνατότητες, απαιτείται η σχεδίαση σύνθετων δεκτών MIMO. Προς αυτήν την κατεύθυνση, έχει προταθεί ένας μεγάλος αριθμός μεθόδων ισοστάθμισης του καναλιού. Ωστόσο, οι περισσότερες από αυτές υποθέτουν ότι το ασύρματο κανάλι είναι: 1) χρονικά σταθερό, 2) συχνοτικά επίπεδο (δεν εισάγει διασυμβολική παρεμβολή), και κυρίως 3) ότι είναι γνωστό στο δέκτη. Δεδομένου ότι σε ευρυζωνικά συστήματα μονής φέρουσας οι παραπάνω υποθέσεις είναι δύσκολο να ικανοποιηθούν, στραφήκαμε προς τις προσαρμοστικές μεθόδους ισοστάθμισης. Συγκεκριμένα, αναπτύξαμε τρεις βασικούς αλγορίθμους. Ο πρώτος αλγόριθμος αποτελεί έναν προσαρμοστικό ισοσταθμιστή ανάδρασης αποφάσεων (decision feedback equalizer, DFE) για συχνοτικά επίπεδα κανάλια ΜΙΜΟ.
Ο προτεινόμενος MIMO DFE ακολουθεί την αρχιτεκτονική BLAST, και ανανεώνεται με βάση τον αλγόριθμο αναδρομικών ελαχίστων τετραγώνων (RLS) τετραγωνικής ρίζας. Ο ισοσταθμιστής μπορεί να παρακολουθήσει ένα χρονικά μεταβαλλόμενο κανάλι, και, από όσο γνωρίζουμε, έχει τη χαμηλότερη πολυπλοκότητα από όλους τους δέκτες BLAST που έχουν προταθεί έως σήμερα. Ο δεύτερος αλγόριθμος αποτελεί την επέκταση του προηγούμενου σε συχνοτικά επιλεκτικά κανάλια. Μέσω κατάλληλης μοντελοποίησης του προβλήματος ισοστάθμισης, οδηγηθήκαμε σε έναν αποδοτικό DFE για ευρυζωνικά κανάλια MIMO. Τότε, η διαδικασία της ισοστάθμισης εμφανίζει προβλήματα αριθμητικής ευστάθειας, που λόγω της υλοποίησης RLS τετραγωνικής ρίζας αντιμετωπίστηκαν επιτυχώς. Κινούμενοι προς την κατεύθυνση περαιτέρω μείωσης της πολυπλοκότητας, προτείναμε έναν προσαρμοστικό MIMO DFE που ανανεώνεται με βάση τον αλγόριθμο ελαχίστων μέσων τετραγώνων (LMS) υλοποιημένο εξ ολοκλήρου στο πεδίο της συχνότητας. Με χρήση του ταχύ μετασχηματισμού Fourier (FFT), μειώνεται η απαιτούμενη πολυπλοκότητα. Παράλληλα, η μετάβαση στο πεδίο των συχνοτήτων έχει ως αποτέλεσμα την προσεγγιστική διαγωνοποίηση του συστήματος, προσφέροντας ανεξάρτητη ανανέωση των φίλτρων ανά συχνοτική συνιστώσα και επιτάχυνση της σύγκλισης του αλγορίθμου. Ο προτεινόμενος ισοσταθμιστής πετυχαίνει μια καλή ανταλλαγή μεταξύ απόδοσης και πολυπλοκότητας. Παράλληλα με τα παραπάνω, ασχοληθήκαμε με την εκτίμηση του ασύρματου καναλιού σε ένα ασύγχρονο σύστημα CDMA. Το βασικό σενάριο είναι ότι ο σταθμός βάσης γνωρίζει ήδη τους ενεργούς χρήστες, και καλείται να εκτιμήσει τις παραμέτρους του καναλιού ανερχόμενης ζεύξης ενός νέου χρήστη που εισέρχεται στο σύστημα. Το πρόβλημα περιγράφεται από μια συνάρτηση ελαχίστων τετραγώνων, η οποία είναι γραμμική ως προς τα κέρδη του καναλιού, και μη γραμμική ως προς τις καθυστερήσεις του. Αποδείξαμε ότι το πρόβλημα έχει μια προσεγγιστικά διαχωρίσιμη μορφή, και προτείναμε μια επαναληπτική μέθοδο υπολογισμού των παραμέτρων. 
Ο προτεινόμενος αλγόριθμος δεν απαιτεί κάποια ειδική ακολουθία διάχυσης και λειτουργεί αποδοτικά ακόμη και για περιορισμένη ακολουθία εκπαίδευσης. Είναι εύρωστος στην παρεμβολή πολλαπλών χρηστών και περισσότερο ακριβής από μια υπάρχουσα μέθοδο εις βάρος μιας ασήμαντης αύξησης στην υπολογιστική πολυπλοκότητα. / Over the last decades, dramatic progress in products and services based on wireless communication networks has been observed, while, at the same time, new research challenges have arisen. Systems employing multiple antennas at the transmitter and the receiver, known as MIMO (multi-input multi-output) systems, as well as code division multiple access (CDMA) systems, are two of the main technologies driving the evolution of wireless communications. In this PhD thesis, we worked on the design and analysis of signal processing algorithms for these two systems, as described in detail next. Concerning MIMO systems, the pioneering work performed at Bell Labs around 1996, where the BLAST (Bell Labs Layered Space-Time) architecture was developed, proved that using multiple antennas can lead to a significant increase in wireless system capacity. To exploit this potential, sophisticated MIMO receivers must be designed. To this end, a large number of channel equalizers have been proposed. However, most of these methods assume that the wireless channel is: 1) static, 2) frequency flat (no intersymbol interference is introduced), and, most importantly, 3) perfectly known at the receiver. Since these assumptions are difficult to meet in high-rate single-carrier systems, we focused our attention on adaptive equalization methods. More specifically, three basic algorithms have been developed. The first algorithm is an adaptive decision feedback equalizer (DFE) for frequency flat MIMO channels.
The proposed MIMO DFE implements the BLAST architecture and is updated by the recursive least squares (RLS) algorithm in its square root form. The new equalizer can track time-varying channels and, to the best of our knowledge, has the lowest computational complexity among the BLAST receivers proposed to date. The second algorithm is an extension of the previous one to the frequency selective channel case. By proper modeling of the equalization problem, we arrived at an efficient DFE for wideband MIMO channels. In this case, the equalization process encounters numerical instability problems, which were successfully treated by the square root RLS implementation employed. To further reduce complexity, we proposed an adaptive MIMO DFE that is updated by the least mean square (LMS) algorithm, fully implemented in the frequency domain. By using the fast Fourier transform (FFT), the required complexity is considerably reduced. Moreover, the frequency domain implementation leads to an approximate decoupling of the equalization problem at each frequency bin. Thus, an independent update of the filters at each frequency bin allows for faster convergence of the algorithm. The proposed equalizer offers a good performance-complexity tradeoff. Furthermore, we worked on channel estimation for an asynchronous CDMA system. The assumed scenario is that the base station has already acquired all the active users, while the uplink channel parameters of a new user entering the system must be estimated. The problem can be described via a least squares cost function, which is linear with respect to the channel gains and nonlinear with respect to the channel delays. We proved that the problem is approximately decoupled, and proposed a new iterative parameter estimation method. The suggested method does not require any specific spreading sequence and performs well even for a short training sequence.
It is robust to multiple access interference and more accurate compared to an existing method, at the expense of an insignificant increase in computational complexity. Ισοστάθμιση καναλιού Εκτίμηση καναλιού Παραγοντοποίηση Cholesky Δέκτης 621.382 2 Channel equalization Multiple antennas systems Channel estimation Recursive least squares Least mean squares Cholesky factorization Intersymbol interference Receiver Code division multiple access CDMA MIMO RLS LMS Decision feedback equalizer DFE
44	[en] TS-TARX: TREE STRUCTURED - THRESHOLD AUTOREGRESSION WITH EXTERNAL VARIABLES / [pt] TS-TARX: UM MODELO DE REGRESSÃO COM LIMIARES BASEADO EM ÁRVORE DE DECISÃO CHRISTIAN NUNES ARANHA 28 January 2002 [pt] Este trabalho propõe um novo modelo linear por partes para a extração de regras de conhecimento de banco de dados. O modelo é uma heurística baseada em análise de árvore de regressão, como introduzido por Friedman (1979) e discutido em detalhe por Breiman (1984). A motivação desta pesquisa é trazer uma nova abordagem combinando técnicas estatísticas de modelagem e um algoritmo de busca por quebras eficiente. A decisão de quebra usada no algoritmo de busca leva em consideração informações do ajuste de equações lineares e foi implementado tendo por inspiração o trabalho de Tsay (1989). Neste, ele sugere um procedimento para construção de um modelo para a análise de séries temporais chamado TAR (threshold autoregressive model), introduzido por Tong (1978) e discutido em detalhes por Tong e Lim (1980) e Tong (1983). O modelo TAR é um modelo linear por partes cuja idéia central é alterar os parâmetros do modelo linear autoregressivo de acordo com o valor de uma variável observada, chamada de variável limiar. No trabalho de Tsay, a identificação do número e localização do potencial limiar era baseada na análise de gráficos. A idéia foi então criar um novo algoritmo todo automatizado. Este processo é um algoritmo que preserva o método de regressão por mínimos quadrados recursivo (MQR) usado no trabalho de Tsay. Esta talvez seja uma das grandes vantagens da metodologia introduzida neste trabalho, visto que Cooper (1998) em seu trabalho de análise de múltiplos regimes afirma não ser possível testar cada quebra. Da combinação da árvore de decisão com a técnica de regressão (MQR), o modelo se tornou o TS-TARX (Tree Structured - Threshold AutoRegression with eXternal variables).
O procedimento consiste numa busca em árvore binária calculando a estatística F para a seleção das variáveis e o critério de informação BIC para a seleção dos modelos. Ao final, o algoritmo gera como resposta uma árvore de decisão (por meio de regras) e as equações de regressão estimadas para cada regime da partição. A principal característica deste tipo de resposta é sua fácil interpretação. O trabalho conclui com algumas aplicações em bases de dados padrões encontradas na literatura e outras que auxiliarão o entendimento do processo implementado. / [en] This research work proposes a new piecewise linear model to extract knowledge rules from databases. The model is a heuristic based on the analysis of regression trees, introduced by Friedman (1979) and discussed in detail by Breiman (1984). The motivation of this research is to offer a new approach combining statistical modeling techniques with an efficient split search algorithm. The split decision used in the search algorithm takes into account information from fitted linear equations and was implemented with inspiration from the work of Tsay (1989). In that work, he suggests a model-building procedure for a nonlinear time series model called TAR (threshold autoregressive model), first proposed by Tong (1978) and discussed in detail by Tong and Lim (1980) and Tong (1983). The TAR model is a piecewise linear model whose main idea is to set the coefficients of a linear autoregressive process according to the value of an observed variable, called the threshold variable. Tsay's identification of the number and location of the potential thresholds was based on supplementary graphical devices. The idea here was to create a new, fully automated algorithm. This algorithm preserves the recursive least squares (RLS) regression method used in Tsay's work. This regression method allowed all possible data splits to be tested.
This is perhaps the main advantage of the methodology introduced in this work, given that Cooper, S. (1998) states that it is not possible to test each break. Thus, by combining decision tree methodology with a regression technique (RLS), the model became TS-TARX (Tree Structured - Threshold AutoRegression with eXternal variables). It searches a binary tree, computing F statistics for variable selection and the BIC information criterion for model selection. In the end, the algorithm produces a decision tree and a regression equation fitted to each regime of the partition defined by the decision tree. Its major advantage is ease of interpretation. This research work concludes with some applications to benchmark databases from the literature and others that help in understanding the algorithm. [pt] MINERACAO DE DADOS [en] DATA MINING [pt] REGRESSAO [en] REGRESSION [pt] REGRESSAO NAO LINEAR [en] NONLINEAR REGRESSION [pt] LINEAR POR PARTES [en] PIECEWISE MODEL [pt] ARVORE DE DECISAO [en] DECISION TREE [pt] ARVORE DE REGRESSAO [en] REGRESSION TREE [pt] MINIMOS QUADRADOS RECURSIVO [en] RECURSIVE LEAST SQUARES [pt] FACIL INTERPRETACAO [en] EASY INTERPRETATION
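The piecewise linear idea behind a threshold model, as described in the abstract above, can be illustrated with a much simplified sketch. It assumes a single variable that serves as both regressor and threshold variable, fits each regime with batch ordinary least squares, and picks the split with the smallest total squared error. The thesis itself uses recursive least squares, F statistics and the BIC criterion, none of which are reproduced here; the names fit_line and best_threshold and the min_size parameter are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_threshold(xs, ys, min_size=3):
    """Exhaustively test every split of the sorted data into two regimes.

    Returns (threshold, total_sse) for the split with the smallest summed
    squared error, or None if no admissible split exists.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal values
        left, right = pairs[:i], pairs[i:]
        sse = fit_line(*zip(*left))[2] + fit_line(*zip(*right))[2]
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[1]:
            best = (thr, sse)
    return best


# Usage: two linear regimes that switch between x = 4 and x = 5.
xs = [float(i) for i in range(10)]
ys = [x if x < 5 else 10.0 + 2.0 * x for x in xs]
result = best_threshold(xs, ys)
```

Applying the same search recursively to each resulting regime would yield a binary tree of thresholds with one fitted equation per leaf, which is the general shape of output the abstract describes.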
45	Odhad Letových Parametrů Malého Letounu / Light Airplane Flight Parameters Estimation Dittrich, Petr Unknown Date This thesis focuses on estimating the flight parameters of a light airplane, specifically the Evektor SportStar RTC. The flight parameters are estimated using the Equation Error Method, the Output Error Method, and recursive least squares methods. The work examines the characteristics of the aerodynamic parameters of longitudinal motion and verifies whether the flight parameters estimated in this way match the measured data and thus provide a basis for building a sufficiently accurate aircraft model. The estimated flight parameters are further compared with a priori values obtained using the programs Tornado and AVL and the software version of the Datcom compendium. The differences between the a priori values and the estimated flight parameters are compared with corrections published for subsonic flight conditions of the F-18 Hornet aircraft model.
