  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Optimization of Virtual Power Plant in Nordic Electricity Market

Desu, Jwalith January 2019 (has links)
As the world's most powerful economies commit to the 1.5-degree scenario, renewable energy technology providers have received a much-needed push. The share of energy production from renewables has increased, and the share of fossil-based production in overall energy production has decreased. As a result, a large share of the system's inertia is lost, and the energy sector faces a major challenge in providing the flexibility needed to compensate for the variability of renewable sources. The Virtual Power Plant is a novel concept that addresses this flexibility challenge and can offer other benefits such as competitiveness, reliability and accessibility. In this thesis, a commercial virtual power plant is studied by developing a mixed-integer linear model that emulates trading in short-term markets with risk measures in a Nordic electricity framework. The model is implemented in Julia, a relatively new programming language. It is applied to a hypothetical portfolio consisting of a dispatchable unit, a battery system and a wind farm in the SE3 bidding zone of Sweden. The variation of imbalance costs in three different modes is also investigated, to demonstrate the advantage of the virtual power plant concept in reducing imbalance costs.
32

Optimal Portfolio Selection under Value-at-Risk Control (控制風險值下的最適投資組合)

洪幸資 Unknown Date (has links)
In contrast to the role of variance in the traditional Mean-Variance framework, this thesis introduces Value-at-Risk (VaR) as a shortfall constraint in the portfolio selection decision. Measuring downside risk with VaR rather than standard deviation better reflects investors' attitude toward risk and matches the current practice of financial institutions, most of which use VaR as an internal control tool. Beyond measuring risk after the fact, the thesis aims to control portfolio VaR ex ante and obtain the optimal asset weights. Mathematically, however, VaR has serious limitations that make the portfolio selection problem difficult to solve to optimality. To apply VaR to the ex ante portfolio decision, the closely related and tractable risk measure Conditional Value-at-Risk (CVaR) is used as a proxy for finding efficient portfolios. The linear programming formulation of Rockafellar and Uryasev (2000) is used to construct a Mean-CVaR efficient frontier. The VaR of the portfolios on this frontier is then reduced further by a simple heuristic procedure, yielding an empirical Mean-VaR efficient frontier that approximates the true one. Finally, the Campbell, Huisman and Koedijk (2001) model is applied to this frontier to find the optimal portfolio under VaR control. The empirical study uses three stocks listed in Taiwan. After tests show that the asset returns are not normally distributed, VaR is estimated by historical simulation from the actual return distribution, and the results confirm that the minimum-CVaR portfolios combined with the heuristic adequately approximate the true Mean-VaR efficient frontier. The Mean-VaR efficient frontiers obtained at different confidence levels, under different asset return distribution assumptions and with different ways of generating the portfolio weights are then compared and the results analyzed.
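As an illustration of the two risk measures discussed in this abstract, VaR and CVaR can be estimated directly from a sample of past returns (historical simulation). The following is a minimal sketch under stated assumptions, not the thesis's own code: the function names, the loss convention (loss = minus return) and the toy inputs are all hypothetical, and the CVaR estimate uses the Rockafellar and Uryasev form (VaR plus the scaled mean excess over VaR).

```python
import math


def portfolio_losses(weights, scenario_returns):
    """Loss of the portfolio in each scenario (loss = minus the portfolio return)."""
    return [-sum(w * r for w, r in zip(weights, returns))
            for returns in scenario_returns]


def historical_var_cvar(losses, alpha=0.95):
    """Empirical VaR and CVaR of a loss sample at confidence level alpha."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    # VaR: smallest sample loss such that at least a fraction alpha of the
    # sample is less than or equal to it (small guard against float round-off).
    k = max(math.ceil(alpha * n - 1e-12), 1)
    var = ordered[k - 1]
    # CVaR: VaR plus the mean excess over VaR, scaled by 1 / (1 - alpha).
    excess = sum(max(loss - var, 0.0) for loss in ordered) / n
    cvar = var + excess / (1.0 - alpha)
    return var, cvar
```

Given a matrix of historical asset returns (one row per period) and a weight vector, `historical_var_cvar(portfolio_losses(weights, returns), alpha)` returns the pair of estimates for that portfolio. CVaR is never smaller than VaR at the same level, which is why a minimum-CVaR portfolio is a natural starting point for a VaR-reducing heuristic.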
33

[en] FINANCIAL OPTIMIZATION OF A WIND FARM IN THE BRAZILIAN ENERGY MARKET / [pt] OTIMIZAÇÃO FINANCEIRA DE PARQUE EÓLICO NO MERCADO DE ENERGIA DO BRASIL

FERNANDO ORMONDE TEIXEIRA 23 March 2017 (has links)
We investigate econometric models capable of producing monthly wind forecasts for a wind farm located in the state of Ceará, Brazil. Models of the ARMA family are tested for their ability to capture the seasonality inherent to the movement of air masses and to benefit firms operating wind farms in Brazil and in the region. To that end, the wind forecast is converted into a power generation forecast. A methodology is then developed to find the strategy that maximizes the firm's profit, subject to Value at Risk (VaR) and Conditional Value at Risk (CVaR) constraints. The possible power generation outcomes are simulated jointly with simulated settlement prices (PLD).
34

Long-term asset allocation based on stochastic multistage multi-objective portfolio optimization

Chagas, Guido Marcelo Borma 19 August 2016 (has links)
Multi-Period Stochastic Programming (MSP) offers an appealing approach to identifying optimal portfolios, particularly over longer investment horizons, because it is inherently suited to handling uncertainty. Moreover, it provides the flexibility to accommodate coherent risk measures, market frictions and, most importantly, major stylized facts such as volatility clustering, heavy tails, leverage effects and tail co-dependence. However, to achieve satisfactory results an MSP model relies on representative and arbitrage-free scenarios of the pertaining multivariate financial series. Only after constructing such scenarios can we exploit them using suitable risk measures to achieve robust portfolio allocations. In this thesis, we discuss a comprehensive framework to accomplish that. First, we construct joint scenarios based on a combined GJR-GARCH + EVT-GPD + t-Copula approach. Then, we reduce the original scenario tree and remove arbitrage opportunities using a method based on Optimal Discretization and Process Distances.
Lastly, using the approximated scenario tree, we perform a multi-period Mean-Variance-CVaR optimization that takes into account market frictions such as transaction costs and regulatory restrictions. The proposed framework is particularly valuable for real applications because it handles various key features of real markets that are often dismissed by more common optimization approaches.
35

Vícestupňové stochastické programování s CVaR: modely, algoritmy a robustnost / Multi-Stage Stochastic Programming with CVaR: Modeling, Algorithms and Robustness

Kozmík, Václav January 2015 (has links)
Abstract: We formulate a multi-stage stochastic linear program with three different risk measures based on CVaR and discuss their properties, such as time consistency. The stochastic dual dynamic programming algorithm is described and its drawbacks in the risk-averse setting are demonstrated. We present a new approach to evaluating policies in multi-stage risk-averse programs, which aims to eliminate the biggest drawback: the lack of a reasonable upper bound estimator. Our approach is based on an importance sampling scheme, which is thoroughly analyzed. A general variance reduction scheme for mean-risk sampling with CVaR is provided. In order to evaluate the robustness of the presented models, we extend the contamination technique to the case of large-scale programs, where a precise solution cannot be obtained. Our computational results are based on a simple multi-stage asset allocation model; they confirm the usefulness of the presented procedures and give additional insights into the behavior of more complex models.
Keywords: Multi-stage stochastic programming, stochastic dual dynamic programming, importance sampling, contamination, CVaR
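For orientation, the standard single-period definitions behind "mean-risk" objectives with CVaR can be written as below. This is the textbook Rockafellar and Uryasev representation, not necessarily the exact form or notation used in the thesis: the symbols $Z$ (a random loss), $\alpha$ (confidence level), $u$ (auxiliary threshold) and $\lambda$ (risk-aversion weight) are assumptions made here, and conventions for $\alpha$ differ between authors.

```latex
\operatorname{CVaR}_{\alpha}(Z)
  = \min_{u \in \mathbb{R}}
    \left\{ u + \frac{1}{1-\alpha}\, \mathbb{E}\big[(Z-u)_{+}\big] \right\},
\qquad
\rho_{\lambda}(Z)
  = (1-\lambda)\, \mathbb{E}[Z] + \lambda\, \operatorname{CVaR}_{\alpha}(Z),
\quad \lambda \in [0,1].
```

Because the inner minimization is over a single scalar $u$ and the objective is piecewise linear in $u$ for a finite scenario set, such risk measures fit into linear programs, which is what makes them usable inside multi-stage stochastic linear programming.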
36

[en] RISK ANALYSIS IN A PORTFOLIO OF COMMODITIES: A CASE STUDY / [pt] ANÁLISE DE RISCOS NUM PORTFÓLIO DE COMMODITIES: UM ESTUDO DE CASO

LUCIANA SCHMID BLATTER MOREIRA 23 March 2015 (has links)
One of the main challenges in the financial market is to simulate prices while preserving the correlation structure among the numerous assets of a portfolio. Principal Component Analysis emerges as a solution to this problem. In addition, given the uncertainty present in the markets for oil-derivative commodities, an investor wants to protect his or her assets from potential losses; to that end, the optimization of risk measures such as Value-at-Risk, Conditional Value-at-Risk and the Omega ratio is an important financial tool. Backtesting is also widely used to validate and analyze the performance of a proposed method. In this dissertation, we work with a portfolio of oil commodities. We combine these techniques and propose a new methodology that begins by (potentially) reducing the dimension of the portfolio.
The next step is to simulate the prices of the assets in the portfolio and then optimize the allocation of the portfolio of oil-derivative commodities. Finally, we use backtesting techniques to validate our method.
37

Portfolio selection and hedge funds : linearity, heteroscedasticity, autocorrelation and tail-risk

Bianchi, Robert John January 2007 (has links)
Portfolio selection has a long tradition in financial economics and plays an integral role in investment management. Portfolio selection provides the framework to determine optimal portfolio choice from a universe of available investments. However, the asset weightings from portfolio selection are optimal only if the empirical characteristics of asset returns do not violate the portfolio selection model assumptions. This thesis explores the empirical characteristics of traditional assets and hedge fund returns and examines their effects on the assumptions of linearity-in-the-mean testing and portfolio selection. The encompassing theme of this thesis is the empirical interplay between traditional assets and hedge fund returns. Despite the paucity of hedge fund research, pension funds continue to increase their portfolio allocations to global hedge funds in an effort to pursue higher risk-adjusted returns. This thesis presents three empirical studies which provide positive insights into the relationships between traditional assets and hedge fund returns. The first two empirical studies examine an emerging body of literature which suggests that the relationship between traditional assets and hedge fund returns is non-linear. For mean-variance investors, non-linear asset returns are problematic as they do not satisfy the assumption of linearity required for the covariance matrix in portfolio selection. To examine the linearity assumption as it relates to a mean-variance investor, a hypothesis test approach is employed which investigates the linearity-in-the-mean of traditional assets and hedge funds. The findings from the first two empirical studies reveal that conventional linearity-in-the-mean tests incorrectly conclude that asset returns are nonlinear. We demonstrate that the empirical characteristics of heteroscedasticity and autocorrelation in asset returns are the primary sources of test mis-specification in these linearity-in-the-mean hypothesis tests. 
To address this problem, an innovative approach is proposed to control heteroscedasticity and autocorrelation in the underlying tests and it is shown that traditional assets and hedge funds are indeed linear-in-the-mean. The third and final study of this thesis explores traditional assets and hedge funds in a portfolio selection framework. Following the theme of the previous two studies, the effects of heteroscedasticity and autocorrelation are examined in the portfolio selection context. The characteristics of serial correlation in bond and hedge fund returns are shown to cause a downward bias in the second sample moment. This thesis proposes two methods to control for this effect and it is shown that autocorrelation induces an overallocation to bonds and hedge funds. Whilst heteroscedasticity cannot be directly examined in portfolio selection, empirical evidence suggests that heteroscedastic events (such as those that occurred in August 1998) translate into the empirical feature known as tail-risk. The effects of tail-risk are examined by comparing the portfolio decisions of mean-variance analysis (MVA) versus mean-conditional value at risk (M-CVaR) investors. The findings reveal that the volatility of returns in a MVA portfolio decreases when hedge funds are included in the investment opportunity set. However, the reduction in the volatility of portfolio returns comes at a cost of undesirable third and fourth moments. Furthermore, it is shown that investors with M-CVaR preferences exhibit a decreasing demand for hedge funds as their aversion for tail-risk increases. The results of the thesis highlight the sensitivities of linearity tests and portfolio selection to the empirical features of heteroscedasticity, autocorrelation and tail-risk. This thesis contributes to the literature by providing refinements to these frameworks which allow improved inferences to be made when hedge funds are examined in linearity and portfolio selection settings.
38

[en] HEDGING RENEWABLE ENERGY SALES IN THE BRAZILIAN CONTRACT MARKET VIA ROBUST OPTIMIZATION / [pt] MODELO DE CONTRATAÇÃO PARA FONTES RENOVÁVEIS COM RUBUSTEZ AO PREÇO DE CURTO-PRAZO

BRUNO FANZERES DOS SANTOS 26 March 2018 (has links)
The energy spot price is characterized by high volatility and is difficult to predict, which represents a major risk for energy producers, especially those that rely on renewable generation.
The typical approach such companies use to determine their optimal mid- and long-term contracting strategy is to simulate a large set of paths for the uncertainty factors in order to characterize the probability distribution of future income, and then to optimize the company's portfolio by maximizing its certainty equivalent. In practice, however, modeling and simulating the spot price is a major challenge for agents in the electricity sector because it depends heavily on parameters that are difficult to predict over the medium and long term, such as GDP growth, demand variation, the entrance of new market players and regulatory changes. In this dissertation, we therefore use robust optimization to treat the uncertainty in the spot price distribution, while renewable production is accounted for by exogenously simulated scenarios, as is customary in stochastic programming. We show that this approach can be interpreted from two points of view: stress testing and ambiguity aversion. Regarding the latter, we provide a link between robust optimization and ambiguity theory, which was an open gap in decision theory. Moreover, the optimal portfolio model can include an energy call option contract (a thermal call option) to hedge the agent's portfolio against price spikes. A case study with realistic data from the Brazilian power system illustrates the applicability of the proposed methodology.
39

[en] STOCHASTIC ANALYSIS OF ECONOMIC VIABILITY OF PHOTOVOLTAIC PANELS INSTALLATION IN LARGE CONSUMERS / [pt] ANÁLISE ESTOCÁSTICA DA VIABILIDADE ECONÔMICA DA INSTALAÇÃO DE PAINÉIS FOTOVOLTAICOS EM GRANDES CONSUMIDORES

ANDRES MAURICIO CESPEDES GARAVITO 25 May 2018 (has links)
Distributed generation (GD) has been growing in Brazil in recent years, particularly photovoltaic generation, allowing small and large consumers to play an active role in the electric system by investing in their own generation. Regulated (captive) consumers can reduce not only their energy cost but also their demand cost, which is computed from a peak demand contract with the distribution utility that supplies them. When considering the installation of photovoltaic panels, the consumer's challenge is therefore to estimate as accurately as possible its energy consumption, the energy generated by the panels and its future peak demand, in order to determine the optimal number of panels as well as the peak demand contract with the utility. This dissertation proposes to solve the problem by simulating future scenarios of energy consumption and peak demand and correlating them with future scenarios of energy generation. A mixed-integer linear stochastic optimization model then computes the optimal number of panels and the peak demand to be contracted.
In the first part, Box and Jenkins modelling is used to estimate the parameters of the statistical model for energy consumption and peak demand, combined with the energy generation of the panels. In the second part, a stochastic optimization model uses a convex combination of Expected Value (VE) and Conditional Value-at-Risk (CVaR) as risk metrics to evaluate the optimal number of panels and the best peak demand contract. To illustrate the approach, a real case study is presented for a large consumer in the Green Tariff group A4 (Verde A4) of the Regulated Contracting Environment. The results show that photovoltaic panels can reduce a large consumer's annual energy cost by up to 20 percent compared with the amount actually billed.
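The idea of scoring a decision by a convex combination of expected value and CVaR can be sketched in a few lines. Everything below is an illustrative assumption rather than the dissertation's model: the cost function, the prices, the three scenarios and the function names are hypothetical, the demand contract is left out, and the panel count is found by plain enumeration instead of a mixed-integer linear program.

```python
import math


def cvar(costs, alpha):
    """Empirical CVaR of a cost sample (Rockafellar-Uryasev form)."""
    ordered = sorted(costs)
    n = len(ordered)
    k = max(math.ceil(alpha * n - 1e-12), 1)
    var = ordered[k - 1]
    excess = sum(max(c - var, 0.0) for c in ordered) / n
    return var + excess / (1.0 - alpha)


def risk_adjusted_cost(costs, lam, alpha):
    """Convex combination: lam * expected cost + (1 - lam) * CVaR."""
    expected = sum(costs) / len(costs)
    return lam * expected + (1.0 - lam) * cvar(costs, alpha)


def annual_cost(n_panels, scenario, panel_cost=100.0, energy_price=0.5):
    """Hypothetical cost: annualized panel cost plus energy still purchased."""
    consumption, generation_per_panel = scenario
    purchased = max(consumption - n_panels * generation_per_panel, 0.0)
    return panel_cost * n_panels + energy_price * purchased


def best_panel_count(candidates, scenarios, lam, alpha):
    """Enumerate candidate panel counts and keep the cheapest risk-adjusted one."""
    def objective(n_panels):
        costs = [annual_cost(n_panels, s) for s in scenarios]
        return risk_adjusted_cost(costs, lam, alpha)
    return min(candidates, key=objective)


# Toy scenarios: (annual consumption, annual generation per panel).
SCENARIOS = [(1000.0, 300.0), (1200.0, 250.0), (900.0, 350.0)]
```

With these toy numbers, `best_panel_count(range(5), SCENARIOS, lam=1.0, alpha=0.9)` (risk-neutral) selects 3 panels, while `lam=0.0` (pure CVaR) selects 4: a risk-averse consumer installs more panels to protect against the high-consumption, low-generation scenario.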
40

Contributions to Multi-Armed Bandits : Risk-Awareness and Sub-Sampling for Linear Contextual Bandits / Contributions aux bandits manchots : gestion du risque et sous-échantillonnage pour les bandits contextuels linéaires

Galichet, Nicolas 28 September 2015 (has links)
This thesis focuses on sequential decision making in unknown environments, and more particularly on the Multi-Armed Bandit (MAB) setting, defined by Lai and Robbins in the 50s. Since the 2000s, many theoretical and algorithmic studies have addressed the exploration vs exploitation tradeoff at the core of MABs: exploitation repeats as often as possible the options that have proved best so far, while exploration tries options that have rarely been visited, to verify that the truly best choices have been identified. MAB applications range from medicine (the choice of the best treatments) to e-commerce (recommendations, advertisements) and optimal policies (e.g., in the energy domain). The contributions presented in this dissertation tackle the exploration vs exploitation dilemma from two angles.
The first contribution is centered on risk avoidance. Exploration in an unknown environment can have adverse effects: for instance, the exploratory trajectories of a robot can damage the robot or its environment. We thus define the exploration vs exploitation vs safety (EES) tradeoff and propose three new algorithms addressing the EES dilemma. First, under strong assumptions, the MIN algorithm provides guarantees of logarithmic regret matching the state of the art, together with high robustness with respect to hyper-parameter settings (as opposed to, e.g., UCB (Auer et al. 2002)). Second, the MARAB algorithm aims at optimizing the cumulative 'Conditional Value at Risk' (CVaR) of the rewards, a criterion originating from the economics literature; it shows excellent empirical performance compared to (Sani et al. 2012), though without theoretical guarantees. Finally, the MARABOUT algorithm modifies the CVaR estimation and yields both theoretical guarantees and good empirical behavior.
The second contribution concerns the contextual bandit setting, where additional information is provided to support the decision making, such as the user details in the content recommendation domain or the patient history in the medical domain. The study focuses on how to choose between two arms that have been pulled a different number of times. Traditionally, a confidence region is derived for each arm from its samples, and the 'Optimism in front of the unknown' principle selects the arm with the maximal upper confidence bound. An alternative called BESA, pioneered by (Baransi et al. 2014), instead subsamples without replacement the larger sample set, which reduces the comparison to the case where all arms have been pulled the same number of times. In this framework, we designed a contextual bandit algorithm based on sub-sampling without replacement, relaxing the (unrealistic) assumption that all arm reward distributions rely on the same parameter. The resulting CL-BESA algorithm yields both theoretical guarantees of logarithmic regret and good empirical behavior.
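The sub-sampling idea attributed to BESA in this abstract can be sketched for the plain two-armed case (not the contextual CL-BESA variant). This is a minimal illustration under assumptions, not the authors' implementation: the function names, the handling of arms that have never been pulled, the tie-breaking rule in favor of the less-pulled arm and the Bernoulli simulation are all choices made here.

```python
import random


def besa_choose(rewards_a, rewards_b, rng=random):
    """Return 0 to pull arm a or 1 to pull arm b.

    Both reward histories are sub-sampled without replacement down to the
    size of the shorter one, and the arm with the higher sub-sample mean wins.
    """
    # Assumption: an arm that has never been pulled is tried first.
    if not rewards_a:
        return 0
    if not rewards_b:
        return 1
    n = min(len(rewards_a), len(rewards_b))
    mean_a = sum(rng.sample(rewards_a, n)) / n
    mean_b = sum(rng.sample(rewards_b, n)) / n
    if mean_a == mean_b:
        # Assumption: ties go to the arm that has been pulled less often.
        return 0 if len(rewards_a) <= len(rewards_b) else 1
    return 0 if mean_a > mean_b else 1


def simulate(p_a, p_b, horizon, seed=0):
    """Play two Bernoulli arms for `horizon` rounds; return pulls per arm."""
    rng = random.Random(seed)
    history = ([], [])
    probabilities = (p_a, p_b)
    for _ in range(horizon):
        arm = besa_choose(history[0], history[1], rng)
        reward = 1.0 if rng.random() < probabilities[arm] else 0.0
        history[arm].append(reward)
    return len(history[0]), len(history[1])
```

Calling `simulate(0.8, 0.2, 1000)` returns how often each arm was pulled. Unlike an upper-confidence-bound rule, the comparison needs no confidence-width parameter: the randomness of the sub-sample is what lets a rarely pulled arm occasionally win and be explored.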
