Abstract
Achieving energy-maximizing control of a Wave Energy Converter (WEC) requires not only a comprehensive dynamic model of the system, including nonlinear hydrodynamic effects and the nonlinear characteristics of the Power Take-Off (PTO), but also an integrated treatment of the entire system as a cyber–physical system encompassing the WEC dynamics, the control strategy, and the communication interface. The resulting energy-maximizing optimization formulation leads to a non-quadratic, nonstandard cost function. This article compares (1) Nonlinear Model Predictive Control (NMPC) and (2) Reinforcement Learning (RL) techniques applied to a class of multiple-degree-of-freedom nonlinear WEC–PTO systems under both linear and nonlinear hydrodynamic conditions in simulation, using the WEC-Sim™ toolbox. The results show that, with an optimal choice of RL agent and hyperparameters, as well as suitable training conditions, the RL algorithm is more robust under stringent operating requirements for which the NMPC algorithm fails to converge. Furthermore, RL agents are computationally efficient on real-time target machines, with a significantly reduced Task Execution Time (TET).