A general method for predicting and evaluating the performance of three candidate cross-flow turbine power-maximizing controllers is presented using low-order dynamic simulation, scaled laboratory experiments, and full-scale field testing. For each testing mode and candidate controller, performance metrics quantifying energy capture (ability of a controller to maximize power), variation in torque and rotation rate (related to drive train fatigue), and variation in thrust loads (related to structural fatigue) are quantified for two purposes. First, for metrics that could be evaluated across all testing modes, we considered the accuracy with which simulation or laboratory experiments could predict performance at full scale. Second, we explored the utility of these metrics to contrast candidate controller performance. For these turbines and set of candidate controllers, energy capture was found to only differentiate controller performance in simulation, while the other explored metrics were able to predict performance of the full-scale turbine in the field with various degrees of success. Effects of scale between laboratory and full-scale testing are considered, along with recommendations for future improvements to dynamic simulations and controller evaluation.