This webinar series is hosted by the Marine Environmental Data and Information Network (MEDIN), a UK initiative dedicated to improving access to high-quality marine data. MEDIN works with organisations across sectors to promote best practices in marine data management, ensuring data is discoverable, accessible, and reusable for the long term.
Once a month throughout the year, this series will feature one-hour online sessions led by expert guest speakers, each focusing on a specialised topic not typically covered in MEDIN’s regular free online workshops. These webinars are designed to support better data stewardship and highlight emerging tools, standards, and approaches in marine data.
Sessions will be recorded and made available on the MEDIN YouTube channel, creating a lasting resource for the marine data community.
Whether you are new to marine data management or have years of experience, and no matter which sector you work in, these sessions are designed to be inclusive, informative, and accessible to all who are interested in improving the way marine data is managed and shared.
Webinar 5 - Ocean Data at Scale: Autonomous Data Management and High-Volume Archiving at the British Oceanographic Data Centre (BODC)
-
Autonomous Platform Data Management at BODC - Robyn Owen (BODC – NOC)
The use of autonomous platforms is increasing rapidly as technologies advance and science works towards becoming Net Zero. Autonomous platforms are now established as a key commodity in multidisciplinary oceanographic research, often able to reach locations inaccessible to traditional research vessels, whether because of remoteness or time of year. To account for this increase, data centres need to provide reliable data management systems that are operational all year round.
The British Oceanographic Data Centre (BODC), part of the Digital Ocean Division at the UK’s National Oceanography Centre (NOC), provides data management for three autonomous platform types: Argo floats, gliders and Autosub Long Range (ALR). BODC supports scientific and commercial deployments in both near-real time and delayed mode.
This talk will cover BODC data management practices for each of the three platform types, including data processing, data dissemination and how we work with the wider platform communities. Data processing has a strong focus on metadata, applying controlled vocabularies from the NERC Vocabulary Server and, for glider and ALR deployments, using a Semantic Sensor Network (SSN) database to register sensors and platforms. There are multiple data dissemination pathways across the platforms, with Argo float and glider data contributing to the Global Ocean Observing System (GOOS).
-
High Volume Data Archiving and Challenges - Monica Hanley & Danielle Wright (BODC – NOC)
Archiving data can be a challenge for both data originators and data repositories, and high-volume data can add an extra layer of complexity due to the size and format of the data. High volume in this instance means data above 100 GB. The British Oceanographic Data Centre handles several data types that regularly fall into this category, including model outputs, geophysical and geospatial data, and image and video data. All data, regardless of type, should be in a standard, non-proprietary format accompanied by rich metadata.
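As a rough illustration of the 100 GB threshold above, the following minimal sketch totals the size of a dataset directory and checks it against that limit. The function names are hypothetical, and the sketch assumes decimal gigabytes (1 GB = 10^9 bytes), which the abstract does not specify; it is not a BODC tool.

```python
from pathlib import Path

# Assumption: "100 GB" is read as decimal gigabytes (10**9 bytes each).
HIGH_VOLUME_THRESHOLD_BYTES = 100 * 10**9


def dataset_size_bytes(root):
    """Sum the sizes of all regular files under the directory `root`."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def is_high_volume(size_bytes, threshold=HIGH_VOLUME_THRESHOLD_BYTES):
    """Return True when the dataset is strictly above the threshold."""
    return size_bytes > threshold
```

A data originator could call `is_high_volume(dataset_size_bytes("path/to/dataset"))` before contacting a repository, since transfer arrangements may differ for larger submissions.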
Model outputs, if they are deemed appropriate for archiving, should be in CF-NetCDF files with global attributes, and the model code should be published somewhere accessible so that code and outputs can be linked more easily. These datasets vary widely in size, and file compression is recommended for larger outputs.
Geophysical data such as bathymetry and seismics can range from a few hundred MB to tens of TB in volume, and industry standard formats such as SEG-Y (seismic) and XYZ or TIF (bathymetry) are preferred. BODC also handles still imagery and video data, generally from camera surveys from the NOC fleet of remotely operated or autonomous underwater vehicles. Volumes can range from 1-10 TB per dive/mission, with anywhere from 5-15 dives/missions per cruise/campaign. Raw data are generally archived on offline tape servers, with useful re-usable formats made available online. Transferring and archiving such high volumes requires dedicated servers and collaboration with CEDA to use their high-volume JASMIN infrastructure, while ensuring adequate metadata is supplied according to community standards to enable discovery and access. Bathymetry data at BODC are also made available at the European level through the EMODnet Bathymetry project and at the international level through Seabed 2030 and GEBCO.
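To give a sense of scale, the per-dive and per-cruise figures quoted above can be combined into a rough range for a whole cruise or campaign. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it simply multiplies the lower and upper bounds from the abstract.

```python
def cruise_volume_range_tb(per_dive_tb=(1, 10), dives_per_cruise=(5, 15)):
    """Multiply the lower bounds and the upper bounds to get a TB range per cruise."""
    low = per_dive_tb[0] * dives_per_cruise[0]
    high = per_dive_tb[1] * dives_per_cruise[1]
    return low, high


low, high = cruise_volume_range_tb()
print(f"Estimated imagery volume: {low}-{high} TB per cruise")  # 5-150 TB
```

Under these bounds, a single cruise could produce anywhere from 5 TB to 150 TB of imagery and video.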
Past Events
- MEDIN Webinar Series 2025: Webinar 3 - Speaking the Same Language: How Controlled Vocabularies Facilitate Data Sharing and Interoperability, Online, 22 October 2025 06:00-07:00 PDT
- MEDIN Webinar Series 2025: Webinar 2 - Interoperability in Action: Data Standards and Marine Applications, Online, 17 September 2025 06:00-07:00 PDT
- MEDIN Webinar Series 2025: Webinar 1 - Navigating Marine Data: Planning, Stewardship, and the Value of MEDIN, Online, 20 August 2025 06:00-07:00 PDT