NetCDF and its Conversion

NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data. The Unidata Program Center supports and maintains netCDF programming interfaces for C, C++, Java, and Fortran. Programming interfaces are also available for Python, IDL, MATLAB, R, Ruby, and Perl.

網絡通用數據格式(英語:Network Common Data Form,NetCDF)是一種自描述、與機器無關、基於陣列的科學數據格式,同時也是支援建立、存取和共用這一數據格式的函數庫。該專案首頁位於美國大氣科學研究大學聯盟(UCAR)的Unidata規劃網站。它也是netCDF軟件、標準開發、更新等的主要來源。NetCDF格式是一種開放標準。NetCDF的經典格式和64位元偏移量格式是開放地理空間協會採用的國際標準。該專案開始於1989年,UCAR目前對其積極支援,在新發行版中改進效能、增加功能並修正缺陷,目前版本系列是netCDF-4,在編譯時也可以選擇只建造netCDF-3庫。

File Formats and Conversion

link > Availability: ncap2, nces, ncecat, ncflint, ncks, ncpdq, ncra, ncrcat, ncwa > Short options: ‘-3’, ‘-4’, ‘-5’, ‘-6’, ‘-7’ > Long options: ‘--3’, ‘--4’, ‘--5’, ‘--6’, ‘--64bit_offset’, ‘--7’, ‘--fl_fmt’, ‘--netcdf4’

All NCO operators support (read and write) all three (or four, depending on how one counts) file formats supported by netCDF4. The default output file format for all operators is the input file format. The operators listed under “Availability” above allow the user to specify the output file format independent of the input file format. These operators allow the user to convert between the various file formats. (The operators ncatted and ncrename do not support these switches so they always write the output netCDF file in the same format as the input netCDF file.)

File Formats

netCDF supports five types of files: CLASSIC, 64BIT_OFFSET, 64BIT_DATA, NETCDF4, and NETCDF4_CLASSIC. The CLASSIC (aka CDF1) format is the traditional 32-bit offset written by netCDF2 and netCDF3. As of 2005, nearly all netCDF datasets were in CLASSIC format. The 64BIT_OFFSET (originally called plain old 64BIT) (aka CDF2) format was added in Fall, 2004. As of 2010, many netCDF datasets were in 64BIT_OFFSET format. As of 2013, an increasing number of netCDF datasets were in NETCDF4_CLASSIC format. The 64BIT_DATA (aka CDF5 or PNETCDF) format was added to netCDF in January, 2016 and immediately supported by NCO. Support for Zarr and NCZarr backend storage formats was added to netCDF in 2021 and supported by NCO in 2022.

The NETCDF4 format uses HDF5 as the file storage layer. The files are (usually) created, accessed, and manipulated using the traditional netCDF3 API (with numerous extensions). The NETCDF4_CLASSIC format refers to netCDF4 files created with the NC_CLASSIC_MODEL mask. Such files use HDF5 as the back-end storage format (unlike netCDF3), though they incorporate only netCDF3 features. Hence NETCDF4_CLASSIC files are entirely readable by applications that use only the netCDF3 API (though the applications must be linked with the netCDF4 library). NCO must be built with netCDF4 to write files in the new NETCDF4 and NETCDF4_CLASSIC formats, and to read files in these formats. Datasets in the default CLASSIC or the newer 64BIT_OFFSET formats have maximum backwards-compatibility with older applications. NCO has deep support for NETCDF4 formats. If backwards compatibility is important, and your datasets are too large for netCDF3, use NETCDF4_CLASSIC instead of CLASSIC format files. NCO support for the NETCDF4 format is complete and many high-performance disk/RAM efficient workflows utilize this format.

As mentioned above, all operators write use the input file format for output files unless told otherwise. Toggling the short option ‘-6’ or the long option ‘--6’ or ‘--64bit_offset’ (or their key-value equivalent ‘--fl_fmt=64bit_offset’) produces the netCDF3 64-bit offset format named 64BIT_OFFSET. NCO must be built with netCDF 3.6 or higher to produce a 64BIT_OFFSET file. As of NCO version 4.6.9 (September, 2017), toggling the short option ‘-5’ or the long options ‘--5’, ‘--64bit_data’, ‘--cdf5’, or ‘--pnetcdf’ (or their key-value equivalent ‘--fl_fmt=64bit_data’) produces the netCDF3 64-bit data format named 64BIT_DATA. This format is widely used by MPI-enabled modeling codes because of its long association with PnetCDF. NCO must be built with netCDF 4.4 or higher to produce a 64BIT_DATA file.

Using the ‘-4’ switch (or its long option equivalents ‘--4’ or ‘--netcdf4’), or setting its key-value equivalent ‘--fl_fmt=netcdf4’ produces a NETCDF4 file (i.e., with all supported HDF5 features). Using the ‘-7’ switch (or its long option equivalent ‘--7’ 30, or setting its key-value equivalent ‘--fl_fmt=netcdf4_classic’ produces a NETCDF4_CLASSIC file (i.e., with all supported HDF5 features like compression and chunking but without groups or new atomic types). Operators given the ‘-3’ (or ‘--3’) switch without arguments will (attempt to) produce netCDF3 CLASSIC output, even from netCDF4 input files.

Note that NETCDF4 and NETCDF4_CLASSIC are the same binary format. The latter simply causes a writing application to fail if it attempts to write a NETCDF4 file that cannot be completely read by the netCDF3 library. Conversely, NETCDF4_CLASSIC indicates to a reading application that all of the file contents are readable with the netCDF3 library. NCO has supported reading/writing basic NETCDF4 and NETCDF4_CLASSIC files since October, 2005.

Determining File Format

First, examine the first line of global metadata output by ncks -M:

1
2
$ ncks -M foo_3.nc
Summary of foo_3.nc: filetype = NC_FORMAT_CLASSIC, 0 groups ...
ncks will also print the extended or underlying format of the input file
1
$ ncks -D 2 -M foo_4.nc
Second, query the file with ‘ncdump -k’:
1
2
$ ncdump -k foo_3.nc
classic/netCDF-4/cdf5/...

File Conversion

Let us demonstrate converting a file from any netCDF-supported input format into any netCDF output format (subject to limits of the output format). Here the input file in.nc may be in any of these formats: netCDF3 (classic, 64bit_offset, 64bit_data), netCDF4 (classic and extended), HDF4, HDF5, HDF-EOS (version 2 or 5), and DAP. The switch determines the output format written in the comment:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
ncks --fl_fmt=classic in.nc foo_3.nc # netCDF3 classic
ncks --fl_fmt=64bit_offset in.nc foo_6.nc # netCDF3 64bit-offset
ncks --fl_fmt=64bit_data in.nc foo_5.nc # netCDF3 64bit-data
ncks --fl_fmt=cdf5 in.nc foo_5.nc # netCDF3 64bit-data
ncks --fl_fmt=netcdf4_classic in.nc foo_7.nc # netCDF4 classic
ncks --fl_fmt=netcdf4 in.nc foo_4.nc # netCDF4
ncks -3 in.nc foo_3.nc # netCDF3 classic
ncks --3 in.nc foo_3.nc # netCDF3 classic
ncks -6 in.nc foo_6.nc # netCDF3 64bit-offset
ncks --64 in.nc foo_6.nc # netCDF3 64bit-offset
ncks -5 in.nc foo_5.nc # netCDF3 64bit-data
ncks --5 in.nc foo_5.nc # netCDF3 64bit-data
ncks -4 in.nc foo_4.nc # netCDF4
ncks --4 in.nc foo_4.nc # netCDF4
ncks -7 in.nc foo_7.nc # netCDF4 classic
ncks --7 in.nc foo_7.nc # netCDF4 classic

Of course since most operators support these switches, the “conversions” can be done at the output stage of arithmetic or metadata processing rather than requiring a separate step. Producing (netCDF3) CLASSIC or 64BIT_OFFSET or 64BIT_DATA files from NETCDF4_CLASSIC files always works.


NetCDF and its Conversion
https://waipangsze.github.io/2023/06/24/netcdf/
Author
wpsze
Posted on
June 24, 2023
Updated on
October 8, 2024
Licensed under