
Reinforcement Learning-based Human Operator Decision Support Agent for Highly Transient Industrial Processes

Jianqi Ruan (18066763) 03 March 2024
Most industrial processes are not fully automated. Although reference tracking can be handled by low-level controllers, initializing and adjusting the reference (setpoint) values is commonly a task assigned to human operators. A major challenge, though, is that control policies vary among operators, which in turn leads to inconsistencies in the final product. To guide operators toward better and more consistent performance, researchers have sought optimal control policies through a variety of approaches. Although the approaches differ across applications, they all depend critically on an accurate process model. For a highly transient process (e.g., the startup of a manufacturing process), however, modeling can be difficult and inaccurate, and approaches that rely heavily on a process model may not work well. One such example, which motivates this work, is startup in a twin-roll steel strip casting process.

In this dissertation, I propose three offline reinforcement learning (RL) algorithms that require the RL agent to learn a control policy from a fixed dataset pre-collected by human operators during operation of the twin-roll casting process. Compared to existing offline RL algorithms, the proposed algorithms focus on exploiting the best control policy already used by human operators rather than exploring new control policies constrained by the existing ones. Moreover, existing offline RL algorithms give little consideration to the imbalanced-dataset problem. In the second and third proposed algorithms, I leverage the idea of cost-sensitive learning to incentivize the RL agent to learn the most valuable control policy rather than the one most commonly represented in the dataset. Finally, since a process model is not available, I propose a performance metric that requires neither a process model nor a simulator for agent testing. The third proposed algorithm is compared with benchmark offline RL algorithms and achieves better and more consistent performance.
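The abstract does not give the algorithms themselves, but the cost-sensitive idea can be illustrated with a minimal sketch: a behavior-cloning-style objective over the fixed operator dataset in which each sample is weighted by the return of the episode it came from, so rare but high-value operator behavior outweighs merely frequent behavior. The network sizes, the softmax weighting scheme, and all names below are illustrative assumptions, not the dissertation's method.

```python
# Minimal sketch (assumed, not the dissertation's algorithm): cost-sensitive
# weighted behavior cloning on a fixed dataset of operator transitions.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Maps a process observation to a setpoint adjustment (sizes assumed)."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def cost_sensitive_bc_loss(policy, obs, operator_actions, episode_returns):
    """Weight each (state, action) pair by the quality of the episode it came
    from, so high-value behavior dominates the fit instead of the most common
    behavior in an imbalanced dataset. The softmax weighting is one assumed
    way to turn returns into non-negative weights with mean ~1."""
    weights = torch.softmax(episode_returns, dim=0) * len(episode_returns)
    per_sample = ((policy(obs) - operator_actions) ** 2).mean(dim=1)
    return (weights * per_sample).mean()

# Usage with synthetic stand-in data for one gradient step:
obs_dim, act_dim, batch = 8, 2, 32
policy = PolicyNet(obs_dim, act_dim)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
obs = torch.randn(batch, obs_dim)        # process observations
acts = torch.randn(batch, act_dim)       # operator setpoint adjustments
rets = torch.randn(batch)                # return of each sample's episode
loss = cost_sensitive_bc_loss(policy, obs, acts, rets)
opt.zero_grad()
loss.backward()
opt.step()
```

The key design point this sketch tries to capture is that the agent never interacts with the process: it only re-weights what operators already did, which matches the abstract's emphasis on exploiting the best recorded policy rather than exploring new ones.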
