Environmental contours of extreme sea states are often used in reliability-based offshore design. Many methods have been proposed to estimate these contours, including the traditional inverse first-order reliability method (I-FORM) and its subsequent modifications, copula methods, and Monte Carlo methods. These methods differ both in how the joint distribution of sea state parameters is defined and in how the environmental contour is constructed from that joint distribution. It is often difficult to compare the results of proposed methods to determine which method should be used for a particular application or geographical region. Comparing the predictions of various contour methods, both at a single site and across many sites, is important to making environmental contours of extreme sea states useful in practice. The goal of this paper is to develop a comparison framework for evaluating methods for developing environmental contours of extreme sea states. To that end, the paper develops generalized metrics for comparing the performance of contour methods to one another across a collection of study sites, and applies these metrics and methods to draw conclusions about trends in the wave resource across geographic locations, as demonstrated for a pilot dataset. The proposed metrics and methods are intended to judge the environmental contours themselves relative to those produced by other contour methods, and are thus agnostic to any specific device, structure, or field of application. The metrics developed and applied in this paper include measures of predictive accuracy, physical validity, and aggregated temporal performance, which can be used both to assess contour methods and to provide recommendations for the use of certain methods in various geographical regions.
The application and aggregation of the metrics proposed in this paper outline a comparison framework for environmental contour methods that can be applied to support design analysis workflows for offshore structures. This comparison framework could be extended in future work to include additional metrics of interest, potentially including metrics that address issues specific to an application area or analysis discipline, such as metrics related to structural response across contour methods or additional physics-based metrics based on wave dynamics.