This paper compares a reinforcement-learning (RL) based wave energy conversion controller against standard reactive damping and model predictive control (MPC) approaches in the presence of modeling errors. Wave energy converters (WECs) are subject to numerous nonlinear hydrodynamic forces, yet for ease and expediency it is common to formulate linear WEC models and control laws. Significant modeling errors can therefore be expected, which may degrade model-based control performance. Model-free RL approaches may offer a significant advantage in robustness to such errors, since the control policy is learned from experience rather than derived from an a priori model. It is shown that, for an annual average sea state, RL-based controllers can outperform the model-based approaches (reactive control and MPC) by 19% and 16%, respectively, when significant modeling error is present. Furthermore, compared to similar studies of RL-based control, the proposed method reduces training time from 8.4 hr to 1.5 hr.