  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1181

Gestão de projetos de P&D no IPEN: diagnóstico e sugestões ao Escritório de Projetos (PMO) / Project management of R&D in IPEN - Diagnosis and Suggestions to the Project Management Office (PMO)

Hannes, Egon Martins 12 March 2015 (has links)
O presente trabalho pretende entender a dinâmica do gerenciamento de projetos no IPEN. Para tal, decidiu-se pela pesquisa, junto à literatura acadêmica, de modelos que pudessem servir de base e que, após modificações e ajustes, pudessem refletir a realidade dos projetos de Institutos Públicos de Pesquisa & Desenvolvimento. Após o tratamento estatístico dos dados, algumas hipóteses foram validadas e demonstraram sua influência positiva no desempenho do gerenciamento do projeto, tais como a influência das pessoas que compõem as equipes e o efeito da liderança, dentre outras. O modelo, inclusive, mostrou-se válido para explicar quais fatores são relevantes para o sucesso dos projetos. Um dos principais objetivos foi exatamente o uso de um modelo de avaliação de gestão de projetos que fosse passível de validação estatística, em vez de um dos disponíveis no mercado, tais como P3M3 e OPM3, para que houvesse controle e confirmação estatística dos resultados. Outro objetivo foi utilizar um modelo cujas assertivas refletissem a natureza dos projetos de Pesquisa & Desenvolvimento gerenciados pelos pesquisadores do IPEN. As referidas assertivas foram formuladas, enviadas via pesquisa web e respondidas por praticamente uma centena de profissionais do IPEN envolvidos com projetos de P&D. A presente dissertação, acrescida das recomendações ao final, tem como proposta servir de contribuição para os trabalhos desenvolvidos pelo Escritório de Projetos do IPEN. O modelo de avaliação contido neste trabalho pode ser aplicado em outras instituições de P&D brasileiras, para que avaliem a maneira como gerenciam os seus respectivos projetos. / This paper aims to understand the dynamics of project management at IPEN.
To reach this goal, the academic literature was searched for models that could serve as a basis and that, after modifications and adjustments, could reflect the reality of projects at public research & development institutes. After statistical treatment of the data, some hypotheses were validated and showed a positive influence on project management performance, such as the influence of the people who make up the teams and the leadership effect, among others. In fact, the model was found to be valid in explaining which factors are relevant for the success of the projects. One of the main goals was precisely to use a project management evaluation model amenable to statistical validation, rather than one available on the market such as P3M3 or OPM3, so that the results could be statistically controlled and confirmed. Another goal was to use a model whose statements reflected the nature of the research & development projects managed by researchers at IPEN. The aforementioned statements were formulated, sent via a web survey, and answered by almost one hundred IPEN professionals who work on R&D projects. This dissertation, together with the recommendations at the end, is intended to serve as a contribution to the work developed by the IPEN Project Management Office. The evaluation model included in this work can be applied in other Brazilian R&D institutions so that they can evaluate the way they manage their own projects.
1182

Algoritmos recursivos e não-recursivos aplicados à estimação fasorial em sistemas elétricos de potência / Recursive and non-recursive algorithms applied to power systems phasor estimation

Rocha, Rodolfo Varraschim 12 May 2016 (has links)
Este trabalho apresenta uma análise de algoritmos computacionais aplicados à estimação de fasores elétricos em SEPs. A medição dos fasores é realizada por meio da alocação de Unidades de Medição Fasorial nesses sistemas e encontra diversas aplicações nas áreas de operação, controle, proteção e planejamento. Para que os fasores possam ser aplicados, são definidos padrões de medição, sincronização e comunicação, por meio da norma IEEE C37.118.1. A norma apresenta os padrões de mensagens, timetag, fasores e sistema de sincronização, e define testes para avaliar a estimação. Apesar de abranger todos esses critérios, a diretriz não define um algoritmo de estimação padrão, abrindo espaço para o uso de diversos métodos, desde que a precisão seja atendida. Nesse contexto, o presente trabalho analisa alguns algoritmos de estimação de fasores definidos na literatura, avaliando o comportamento deles em determinados casos. Foram considerados, dessa forma, os métodos Transformada Discreta de Fourier, Mínimos Quadrados e Transformada Wavelet Discreta, nas versões recursivas e não-recursivas. Esses métodos foram submetidos a sinais sintéticos, a fim de verificar o comportamento diante dos testes propostos pela norma, avaliando o Total Vector Error, o tempo de resposta, o atraso e o overshoot. Os algoritmos também foram embarcados em um hardware, denominado PC104, e avaliados de acordo com os sinais medidos pelo equipamento na saída analógica de um simulador em tempo real (Real Time Digital Simulator). / This work presents an analysis of computational algorithms applied to phasor estimation in electrical power systems. The phasor estimation process uses the allocation of Phasor Measurement Units in the system, and the measurements can be used in many control, operation, planning and protection applications. Therefore, power system phasors are very useful, especially if they have a common time reference, allowing the determination of the system's condition at a given time.
The procedures necessary for power system phasor estimation and application are defined by the IEEE C37.118.1 standard. The standard defines the requirements for phasor estimation, presenting tests and a methodology to evaluate the algorithms' performance. It also defines the time tag and data patterns, some synchronization methods, and message examples, simplifying the communication requirements. Despite defining all these parts, the standard does not state which estimation algorithm should be used, making room for the use of various methods, provided the standard's accuracy requirements are met. In this context, this work analyzes some phasor estimation algorithms defined in the literature, evaluating their behavior in selected cases. The recursive and non-recursive versions of the following methods were adopted: Discrete Fourier Transform, Least Squares, and Discrete Wavelet Transform. They were subjected to the standard's test signals, evaluating the Total Vector Error, time delays, and overshoots. The algorithms were also embedded in hardware (named PC104) and evaluated with real-time simulated signals, measured by the PC104 from the analog outputs of a Real Time Digital Simulator.
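The full-cycle DFT mentioned in the entry above turns one cycle of samples into a complex phasor. A minimal sketch of that estimator (illustrative Python, not code from the dissertation; the 16-samples-per-cycle rate, amplitude, and phase below are arbitrary choices):

```python
import cmath
import math

def dft_phasor(samples):
    """Full-cycle, non-recursive DFT phasor estimate from one cycle of N
    samples: the returned complex number has magnitude equal to the peak
    amplitude and angle equal to the phase of the cosine signal."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples))
    return 2.0 * acc / n

# synthetic test signal: amplitude 10, phase 30 degrees, 16 samples per cycle
amp, phase = 10.0, math.radians(30.0)
samples = [amp * math.cos(2.0 * math.pi * k / 16 + phase) for k in range(16)]
ph = dft_phasor(samples)  # complex phasor, approximately 10 at 30 degrees
```

A recursive variant would update `acc` sample by sample instead of recomputing the whole sum each window, which is the efficiency trade-off the thesis compares.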
1183

Uma abordagem estatística para o modelo do preço spot da energia elétrica no submercado sudeste/centro-oeste brasileiro / A statistical approach to model the spot price of electric energy: evidence from the Brazilian Southeast/Middle-West subsystem

Ramalho, Guilherme Matiussi 20 March 2014 (has links)
O objetivo deste trabalho é o desenvolvimento de uma ferramenta estatística que sirva de base para o estudo do preço spot da energia elétrica do subsistema Sudeste/Centro-Oeste do Sistema Interligado Nacional, utilizando a estimação por regressão linear e o teste de razão de verossimilhança como instrumentos para o desenvolvimento e a avaliação dos modelos. Na análise dos resultados estatísticos descritivos dos modelos, diferentemente do que é observado na literatura, a primeira conclusão é a verificação de que as variáveis sazonais, quando analisadas isoladamente, apresentam resultados pouco aderentes ao preço spot PLD. Após a análise da componente sazonal, é verificada a influência da energia fornecida e da energia demandada como variáveis de entrada, com o que se conclui que especificamente a energia armazenada e a produção de energia termelétrica são as variáveis que mais influenciam os preços spot no subsistema estudado. Entre os modelos testados, o que particularmente ofereceu os melhores resultados foi um modelo misto criado a partir da escolha das melhores variáveis de entrada dos modelos testados preliminarmente, alcançando um coeficiente de determinação R² de 0.825, resultado esse que pode ser considerado aderente ao preço spot. No último capítulo é apresentada uma introdução ao modelo de predição do preço spot, possibilitando dessa forma a análise do comportamento do preço a partir da alteração das variáveis de entrada. / The objective of this work is the development of a statistical method to study the spot prices of electrical energy in the Southeast/Middle-West (SE-CO) subsystem of the Brazilian National Interconnected System, using least squares estimation and the likelihood ratio test as tools to build and evaluate the models.
Verifying the descriptive statistical results of the models, differently from what is observed in the literature, the first observation is that the seasonal component, when analyzed alone, presents results only loosely adherent to the PLD spot price. The influence of energy supply and energy demand as input variables is then evaluated, showing that specifically the stored energy and the thermoelectric power production are the variables that most influence the spot prices in the studied subsystem. Among the models, the one that offered the best results was a mixed model created from the selection of the best input variables of the preliminarily tested models, achieving a coefficient of determination R² of 0.825, a result that can be considered adherent to the spot price. The last part of the work presents an introduction to the spot price prediction model, allowing the analysis of price behavior as the input variables change.
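The least-squares-regression-plus-R² workflow this abstract relies on fits in a few lines of plain Python. A sketch under invented toy data (this is not the thesis's PLD data; the coefficient of determination is computed exactly as the abstract's R²):

```python
def ols_fit(xs, ys):
    """Ordinary least squares for y = a + b*x, plus the coefficient of
    determination R^2 used to judge how well the model adheres to the data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # residual sum of squares
    ss_tot = sum((y - my) ** 2 for y in ys)                       # total sum of squares
    return a, b, 1.0 - ss_res / ss_tot

# toy data lying exactly on y = 3 + 2x, so R^2 must come out as 1
a, b, r2 = ols_fit([0, 1, 2, 3, 4], [3, 5, 7, 9, 11])
```

The thesis's mixed model is a multivariate version of the same idea: several input variables, one fitted coefficient each, and the R² of 0.825 summarizing adherence.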
1184

Estimação de estado: a interpretação geométrica aplicada ao processamento de erros grosseiros em medidas / State estimation: the geometrical interpretation applied to the processing of gross errors in measurements

Carvalho, Breno Elias Bretas de 22 March 2013 (has links)
Este trabalho foi proposto com o objetivo de implementar um programa computacional para estimar os estados (tensões complexas nodais) de um sistema elétrico de potência (SEP) e aplicar métodos alternativos para o processamento de erros grosseiros (EGs), baseados na interpretação geométrica dos erros e no conceito de inovação das medidas. Através da interpretação geométrica, BRETAS et al. (2009), BRETAS; PIERETI (2010), BRETAS; BRETAS; PIERETI (2011) e BRETAS et al. (2013) demonstraram matematicamente que o erro da medida se compõe de componentes detectáveis e não detectáveis, e ainda que a componente detectável do erro é exatamente o resíduo da medida. As metodologias até então utilizadas, para o processamento de EGs, consideram apenas a componente detectável do erro, e como consequência, podem falhar. Na tentativa de contornar essa limitação, e baseadas nos trabalhos citados previamente, foram estudadas e implementadas duas metodologias alternativas para processar as medidas portadoras de EGs. A primeira, é baseada na análise direta das componentes dos erros das medidas; a segunda, de forma similar às metodologias tradicionais, é baseada na análise dos resíduos das medidas. Entretanto, o diferencial da segunda metodologia proposta reside no fato de não considerarmos um valor limiar fixo para a detecção de medidas com EGs. Neste caso, adotamos um novo valor limiar (TV, do inglês: Threshold Value), característico de cada medida, como apresentado no trabalho de PIERETI (2011). Além disso, com o intuito de reforçar essa teoria, é proposta uma forma alternativa para o cálculo destes valores limiares, através da análise da geometria da função densidade de probabilidade da distribuição normal multivariável, referente aos resíduos das medidas. 
/ This work was proposed with the objective of implementing a computer program to estimate the states (complex nodal voltages) of an electrical power system (EPS) and applying alternative methods for processing gross errors (GEs), based on the geometrical interpretation of measurement errors and on the innovation concept. Through the geometrical interpretation, BRETAS et al. (2009), BRETAS; PIERETI (2010), BRETAS; BRETAS; PIERETI (2011) and BRETAS et al. (2013) proved mathematically that the measurement error is composed of detectable and undetectable components, and also showed that the detectable component of the error is exactly the residual of the measurement. The methods hitherto used for processing GEs consider only the detectable component of the error and, as a consequence, may fail. In an attempt to overcome this limitation, and based on the works cited previously, two alternative methodologies for processing measurements with GEs were studied and implemented. The first is based on the direct analysis of the components of the measurement errors; the second, in a similar way to the traditional methods, is based on the analysis of the measurement residuals. However, the differential of the second proposed methodology lies in the fact that it does not use a fixed threshold value for detecting measurements with GEs. In this case, we adopted a new threshold value (TV), characteristic of each measurement, as presented in the work of PIERETI (2011). Furthermore, in order to reinforce this theory, we propose an alternative way to calculate these thresholds, by analyzing the geometry of the probability density function of the multivariate normal distribution of the measurement residuals.
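The residual test with a per-measurement threshold value (TV) can be illustrated on a deliberately tiny toy model. The sketch below is not the thesis's estimator: it uses the degenerate measurement model "every meter reads the same state", for which the least-squares estimate is the mean, and all numbers and threshold values are invented:

```python
def residual_test(z, tv):
    """Least-squares estimate for the toy model where every meter measures the
    same scalar state (H is a column of ones, so the LS estimate is the mean),
    then flag measurement i as a gross-error suspect when |r_i| > tv[i].
    The residual r_i is the detectable component of the measurement error."""
    xhat = sum(z) / len(z)
    r = [zi - xhat for zi in z]
    flags = [abs(ri) > ti for ri, ti in zip(r, tv)]
    return xhat, r, flags

# four meters, the last one carrying a gross error; per-measurement TV of 0.3
xhat, r, flags = residual_test([1.00, 1.02, 0.98, 2.00], [0.3, 0.3, 0.3, 0.3])
```

In the full WLS setting the residual is the projection of the error onto the orthogonal complement of the Jacobian's column space, which is exactly the "detectable component" the geometrical interpretation identifies.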
1185

Sur un problème inverse en pressage de matériaux biologiques à structure cellulaire / On an inverse problem in pressing of biological materials with cellular structure

Ahmed Bacha, Rekia Meriem 19 October 2018 (has links)
Cette thèse, proposée dans le cadre du projet W2P1-DECOL (SAS PIVERT) et financée par le ministère de l’Enseignement supérieur, est consacrée à l’étude d’un problème inverse de pressage des matériaux biologiques à structure cellulaire. Le but est d’identifier, connaissant les mesures du flux d’huile sortant, le coefficient de consolidation du gâteau de pressage et l’inverse du temps caractéristique de consolidation sur deux niveaux : au niveau de la graine de colza et au niveau du gâteau de pressage. Dans un premier temps, nous présentons un système d’équations paraboliques modélisant le problème de pressage des matériaux biologiques à structure cellulaire ; il découle de l’équation de continuité, de la loi de Darcy et d’autres hypothèses simplificatrices. Puis l’analyse théorique et numérique du modèle direct est faite dans le cas linéaire. Enfin, la méthode des différences finies est utilisée pour le discrétiser. Dans un second temps, nous introduisons le problème inverse du pressage, où l’étude de l’identifiabilité de ce problème est résolue par une méthode spectrale. Par la suite, nous nous intéressons à l’étude de la stabilité lipschitzienne locale et globale. De plus, une estimation de stabilité lipschitzienne globale, pour le problème inverse de paramètres, dans le cas du système d’équations paraboliques, à partir des mesures sur ]0,T[, est établie. Enfin, l’identification des paramètres est résolue par deux méthodes : l’une basée sur l’adaptation de la méthode algébrique et l’autre formulée comme la minimisation au sens des moindres carrés d’une fonctionnelle évaluant l’écart entre les mesures et les résultats du modèle direct. La résolution de ce problème inverse se fait en utilisant un algorithme itératif BFGS ; l’algorithme est validé puis testé numériquement dans le cas des graines de colza, en utilisant des mesures synthétiques. Il donne des résultats très satisfaisants, malgré les difficultés rencontrées pour manipuler et exploiter les données expérimentales.
/ This thesis, proposed in the framework of the W2P1-DECOL project (SAS PIVERT) and funded by the Ministry of Higher Education, is devoted to the study of an inverse problem in the pressing of biological materials with a cellular structure. The aim is to identify, knowing the measurements of the outgoing oil flow, the consolidation coefficient of the pressing cake and the inverse of the characteristic consolidation time on two levels: at the level of the rapeseed and at the level of the pressing cake. First, we present a system of parabolic equations modeling the pressing problem of biological materials with cellular structure; it follows from the continuity equation, Darcy's law and other simplifying hypotheses. Then a theoretical and numerical analysis of the direct model is made in the linear case. Finally, the finite difference method is used to discretize it. In a second step, we introduce the inverse pressing problem, whose identifiability is established by a spectral method. We then study local and global Lipschitz stability. Moreover, a global Lipschitz stability estimate for the inverse parameter problem, in the case of the system of parabolic equations, is established from measurements on ]0,T[. Finally, the identification of the parameters is solved by two methods: one based on an adaptation of the algebraic method, and the other formulated as the least-squares minimization of a functional evaluating the difference between the measurements and the results of the direct model. This inverse problem is solved using an iterative BFGS algorithm; the algorithm is validated and then tested numerically in the case of rapeseeds, using synthetic measurements. It gives very satisfactory results, despite the difficulties encountered in handling and exploiting the experimental data.
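The abstract mentions a parabolic system discretized by finite differences. A generic one-dimensional explicit step for the model problem u_t = alpha·u_xx gives the flavor (a sketch only; the thesis's consolidation model has its own coefficients and boundary conditions, and the grid values below are invented):

```python
def explicit_step(u, r):
    """One explicit finite-difference step for u_t = alpha * u_xx on a uniform
    grid, with r = alpha*dt/dx**2 (stable when r <= 0.5) and fixed (Dirichlet)
    boundary values kept at the ends."""
    inner = [u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]

# a unit spike diffuses symmetrically in a single step when r = 0.25
u1 = explicit_step([0.0, 0.0, 1.0, 0.0, 0.0], 0.25)
```

The direct solver in the inverse-problem loop repeats such steps for each candidate parameter value; BFGS then adjusts the parameters to shrink the least-squares mismatch with the measured flow.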
1186

Méthodes isogéométriques pour les équations aux dérivées partielles hyperboliques / Isogeometric methods for hyperbolic partial differential equations

Gdhami, Asma 17 December 2018 (has links)
L’Analyse isogéométrique (AIG) est une méthode innovante de résolution numérique des équations différentielles, proposée à l’origine par Thomas Hughes, Austin Cottrell et Yuri Bazilevs en 2005. Cette technique de discrétisation est une généralisation de l’analyse par éléments finis classique (AEF), conçue pour intégrer la conception assistée par ordinateur (CAO), afin de combler l’écart entre la description géométrique et l’analyse des problèmes d’ingénierie. Ceci est réalisé en utilisant des B-splines ou des B-splines rationnelles non uniformes (NURBS), pour la description des géométries ainsi que pour la représentation de champs de solutions inconnus. L’objet de cette thèse est d’étudier la méthode isogéométrique dans le contexte des problèmes hyperboliques en utilisant les fonctions B-splines comme fonctions de base. Nous proposons également une méthode combinant l’AIG avec la méthode de Galerkin discontinue (GD) pour résoudre les problèmes hyperboliques. Plus précisément, la méthodologie de GD est adoptée à travers les interfaces de patches, tandis que l’AIG traditionnelle est utilisée dans chaque patch. Notre méthode tire parti de la méthode de l’AIG et de la méthode de GD. Les résultats numériques sont présentés jusqu’à l’ordre polynomial p = 4, à la fois pour une méthode de Galerkin continue et discontinue. Ces résultats numériques sont comparés pour un ensemble de problèmes de complexité croissante en 1D et 2D. / Isogeometric Analysis (IGA) is a modern strategy for numerical solution of partial differential equations, originally proposed by Thomas Hughes, Austin Cottrell and Yuri Bazilevs in 2005. This discretization technique is a generalization of classical finite element analysis (FEA), designed to integrate Computer Aided Design (CAD) and FEA, to close the gap between the geometrical description and the analysis of engineering problems.
This is achieved by using B-splines or non-uniform rational B-splines (NURBS), for the description of geometries as well as for the representation of unknown solution fields. The purpose of this thesis is to study isogeometric methods in the context of hyperbolic problems using B-splines as basis functions. We also propose a method that combines IGA with the discontinuous Galerkin (DG) method for solving hyperbolic problems. More precisely, DG methodology is adopted across the patch interfaces, while the traditional IGA is employed within each patch. The proposed method takes advantage of both IGA and the DG method. Numerical results are presented up to polynomial order p = 4, both for continuous and discontinuous Galerkin methods. These numerical results are compared for a range of problems of increasing complexity, in 1D and 2D.
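The B-spline basis functions at the heart of IGA can be evaluated with the Cox-de Boor recursion. A compact, unoptimized sketch (the open knot vector and evaluation point below are arbitrary illustrative choices, not data from the thesis):

```python
def bspline_basis(i, p, u, knots):
    """Cox-de Boor recursion: value of the i-th B-spline basis function of
    degree p at parameter u, for the given non-decreasing knot vector.
    Zero-width spans are skipped to avoid division by zero."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] > knots[i]:
        left = (u - knots[i]) / (knots[i + p] - knots[i]) * bspline_basis(i, p - 1, u, knots)
    if knots[i + p + 1] > knots[i + 1]:
        right = ((knots[i + p + 1] - u) / (knots[i + p + 1] - knots[i + 1])
                 * bspline_basis(i + 1, p - 1, u, knots))
    return left + right

# open knot vector, degree 2: the 5 basis functions form a partition of unity
knots = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 3.0]
total = sum(bspline_basis(i, 2, 1.5, knots) for i in range(5))
```

The partition-of-unity property checked here is what lets the same basis represent both the NURBS geometry and the unknown solution field.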
1187

三向資料的主成分分析 / 3-way data principal component analysis

趙湘琪, Chao, Hsiang Chi Unknown Date (has links)
Traditional principal component analysis handles only 2-mode 2-way data; for 3-mode 3-way data, or data of even higher order, other approaches are needed, for example averaging over one way of the data before analyzing the rest. Although feasible, this ignores correlations that may be hidden among the three ways of the data. As social-science research grows more complex, three-way data appear more and more often, and the relationships among the three ways may themselves be of interest. For this reason, in the 1960s and 1970s researchers began extending the principal component model to one suitable for three-way data. Besides introducing the Tucker3 model used in three-way principal component analysis and its parameter estimation method, this thesis takes 28 listed companies as a case study, examining how the factors influencing capital structure changed across different groups of companies over a five-year period (1989-1993).
1188

以最小平方法處理有限離散型條件分配相容性問題 / Addressing the compatibility issues of finite discrete conditionals by the least squares approach

李宛靜, Lee, Wan Ching Unknown Date (has links)
給定兩個有限離散型條件分配，我們可以去探討有關相容性及唯一性的問題。Tian et al.(2009)提出一個統合的方法，將相容性的問題轉換成具限制條件的線性方程系統(以邊際機率為未知數)，並藉由 l_2-距離測量解之誤差，進而求出最佳解來。他們也提出了電腦數值計算法在檢驗相容性及唯一性時的準則。 由於 Tian et al.(2009)的方法是把邊際機率和為 1 的條件放置在線性方程系統中，從理論的觀點來看，我們認為該條件在此種做法下未必會滿足。因此，本文中將邊際機率和為 1 的條件從線性方程系統中抽離出來，放入限制條件中，再對修正後的問題求最佳解。 我們提出了兩個解決問題的方法：(一) LRG 法；(二) 干擾參數法。LRG 法是先不管機率值在 0 與 1 之間的限制，在邊際機率和為 1 的條件下，利用 Lagrange 乘數法導出解的公式，之後再利用 Rao-Ghangurde 法進行修正，使解滿足機率值在 0 與 1 之間的要求。干擾參數法是在 Lagrange 乘數法公式解中有關廣義逆矩陣的計算部份引進了微量干擾值，使近似的逆矩陣及解可快速求得。理論證明，引進干擾參數所增加的誤差不超過所選定的干擾值，易言之，由干擾參數法所求出的解幾近最佳解。故干擾參數法在處理相容性問題上，是非常實用、有效的方法。從進一步分析Lagrange 乘數法公式解的過程中，我們也發現了檢驗條件分配"理論"相容的充分條件。 最後，為了驗證 LRG 法與干擾參數法的可行性，我們利用 MATLAB 設計了程式來處理求解過程中的運算，並以 Tian et al.(2009)文中四個可涵蓋各種情況的範例來解釋說明處理的流程，同時將所獲得的結果和 Tian et al. 的結果做比較。 / Given two finite discrete conditional distributions, we can study the compatibility and uniqueness issues. Tian et al. (2009) proposed a unified method that converts the compatibility problem into a system of linear equations with constraints, in which the marginal probability values are the unknowns, and locates the optimum solution by minimizing the l_2-discrepancy. They also provided criteria for determining compatibility and uniqueness. Because the condition that the marginal probability values sum to one sits inside Tian et al.'s linear system, it might not be fulfilled by the optimum solution. By separating this condition from the linear system and adding it to the constraints, we look for the optimum solution of the modified problem. We propose two new methods: (1) the LRG method and (2) the perturbation method. The LRG method initially ignores the requirement that the probability values lie between zero and one; it uses the Lagrange multipliers method to derive the solution of a quadratic optimization problem subject to the marginal probability values summing to one, and then applies the Rao-Ghangurde method to adjust the computed values to meet that requirement.
The perturbation method introduces a tiny perturbation parameter into the computation of the generalized inverse used in the Lagrange-multiplier solution, so that an approximate inverse and solution can be obtained quickly. It can be shown that the added error is less than the perturbation value introduced; in other words, the solution given by the perturbation method is nearly optimal, making it a practical and effective method for dealing with compatibility issues. From further analysis of the Lagrange-multiplier solution, we also find sufficient conditions for checking the theoretical compatibility of conditional distributions. To show the feasibility of the LRG and perturbation methods, we wrote a MATLAB program implementing them. Several numerical examples raised by Tian et al. (2009), covering a variety of situations, are used to illustrate our methods, and comparisons with their method are also presented.
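The separated sum-to-one constraint has a simple closed form in the special case of moving a vector onto the "probabilities sum to one" hyperplane. The sketch below shows only that Lagrange-multiplier step with invented numbers; it is not the full LRG or perturbation method, and the clipping that the Rao-Ghangurde correction would add is omitted:

```python
def project_sum_to_one(p):
    """Lagrange-multiplier solution of  min ||x - p||^2  s.t.  sum(x) = 1:
    setting the gradient of ||x - p||^2 + lam*(sum(x) - 1) to zero gives
    x_i = p_i + (1 - sum(p)) / n. A Rao-Ghangurde-style correction would
    additionally force each x_i into [0, 1]; that step is omitted here."""
    shift = (1.0 - sum(p)) / len(p)
    return [pi + shift for pi in p]

x = project_sum_to_one([0.2, 0.3, 0.4])  # candidate marginals summing to 0.9
```

In the thesis's setting the objective also contains the linear system from the two conditional distributions, so the closed form is replaced by a generalized-inverse computation — which is exactly where the perturbation parameter enters.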
1189

粒子群最佳化演算法於估測基礎矩陣之應用 / Particle swarm optimization algorithms for fundamental matrix estimation

劉恭良, Liu, Kung Liang Unknown Date (has links)
基礎矩陣在影像處理是非常重要的參數，舉凡不同影像間對應點之計算、座標系統轉換、乃至重建物體三維模型等問題，都有賴於基礎矩陣之精確與否。本論文中，我們提出一個機制，透過粒子群最佳化的觀念來求取基礎矩陣，我們的方法不但能提高基礎矩陣的精確度，同時能降低計算成本。 我們從多視角影像出發，以SIFT取得大量對應點資料後，從中選取8點進行粒子群最佳化。取樣時，我們透過分群與隨機挑選以避免選取共平面之點。然後利用最小平方中值法來估算初始評估值，並遵循粒子群最佳化演算法，以最小疊代次數為收斂準則，計算出最佳之基礎矩陣。 實作中我們以不同的物體模型為標的，以粒子群最佳化與最小平方中值法兩者結果比較。實驗結果顯示，疊代次數相同的實驗，粒子群最佳化演算法估測基礎矩陣所需的時間，約為最小平方中值法來估測所需時間的八分之一，同時粒子群最佳化演算法估測出來的基礎矩陣之平均誤差值也優於最小平方中值法所估測出來的結果。 / The fundamental matrix is a very important parameter in image processing. Corresponding point determination, coordinate system conversion, and three-dimensional model reconstruction all depend on the accuracy of the fundamental matrix, so obtaining an accurate fundamental matrix is one of the most important issues in image processing. In this paper, we present a mechanism that uses the concept of Particle Swarm Optimization (PSO) to find the fundamental matrix. Our approach not only improves the accuracy of the fundamental matrix but also reduces computation costs. After using the Scale-Invariant Feature Transform (SIFT) to obtain a large number of corresponding points from multi-view images, we choose a set of eight corresponding points, using grouping together with random sampling to avoid coplanar points, as our initial starting points for PSO. The Least Median of Squares (LMedS) is used to estimate the initial fitness value, and a minimal number of iterations serves as the convergence criterion in computing the optimal fundamental matrix with the PSO algorithm. We use different object models to illustrate our mechanism and compare the results obtained by PSO and by LMedS. The experimental results show that, for the same number of iterations, the PSO method takes about one-eighth of the time required by the LMedS method, and the fundamental matrix estimated by PSO also has a smaller average error than that estimated by LMedS.
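The PSO update itself is compact. The sketch below minimizes a simple test function rather than the epipolar-geometry cost used in the thesis, and the parameter values (inertia 0.7, acceleration coefficients 1.5, swarm size, iteration count) are common textbook choices, not necessarily the authors':

```python
import random

def pso(f, dim=2, n_particles=30, iters=200, bounds=(-5.0, 5.0), seed=0):
    """Minimal global-best PSO: each particle is pulled toward its personal
    best and the swarm's global best, with inertia damping the velocity."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

sphere = lambda p: sum(x * x for x in p)  # stand-in for the epipolar cost
best, best_val = pso(sphere)
```

In the thesis's application, `f` would score a candidate fundamental matrix against the SIFT correspondences (e.g., via the epipolar constraint), with LMedS supplying the initial fitness value.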
1190

Itération sur les Politiques Optimiste et Apprentissage du Jeu de Tetris / Optimistic Policy Iteration and Learning the Game of Tetris

Thiery, Christophe 25 November 2010 (has links) (PDF)
This thesis studies policy iteration methods for reinforcement learning in large state spaces with linear approximation of the value function. We first propose a unification of the main algorithms of stochastic optimal control, prove the convergence of this unified version to the optimal value function in the tabular case, and give a performance guarantee when the value function is estimated only approximately. We then extend the state of the art of second-order linear-approximation algorithms with a generalization of Least-Squares Policy Iteration (LSPI) (Lagoudakis and Parr, 2003). Our new algorithm, Least-Squares λ Policy Iteration (LSλPI), adds to LSPI a concept from λ-Policy Iteration (Bertsekas and Ioffe, 1996): damped (or optimistic) evaluation of the value function, which reduces the variance of the estimate and improves sample efficiency. LSλPI thus offers an adjustable bias-variance trade-off that can improve both the value-function estimate and the quality of the resulting policy. In a second part, we study the game of Tetris in detail, an application addressed by several works in the literature. Tetris is a difficult problem because of its structure and its large state space. We provide the first complete review of the literature, covering reinforcement learning approaches as well as evolutionary techniques that search the policy space directly and hand-tuned algorithms. We observe that reinforcement learning approaches currently perform worse on this problem than direct policy search techniques such as the cross-entropy method (Szita and Lőrincz, 2006).
Finally, we describe how we built a Tetris player that outperforms the best previously known algorithms and with which we won the Tetris event of the 2008 Reinforcement Learning Competition.
