Describe the bug
When writing datasets with string/character coordinates (e.g., type coordinate in land variables), the current write() function has two issues:
- String coordinates become variables: Coordinates like type are written as data variables instead of remaining as coordinates
- Incorrect dtype: The dtype changes from |S11 (byte string) to <U11 (Unicode string), which is not CMIP6 compliant
Expected Behavior
String coordinates should:
- Remain as coordinates (appear in ds.coords, not ds.data_vars)
- Use CF-compliant character encoding with dtype='|S1' and a strlen dimension
- Match the encoding of standard CMIP6 datasets
Expected encoding:
{
'dtype': dtype('S1'),
'char_dim_name': 'type_strlen',
'original_shape': (11,)
}
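As a rough illustration of what this encoding means at the array level, the sketch below splits a fixed-width byte string into single characters along a trailing strlen axis. It uses NumPy only. The helper name to_char_array is hypothetical and not part of write(), and ASCII content is assumed when the input is a Unicode array.

```python
import numpy as np


def to_char_array(values):
    """Split fixed-width strings into an S1 array with a trailing strlen axis.

    Hypothetical helper for illustration only; not taken from write().
    """
    arr = np.asarray(values)
    if arr.dtype.kind == "U":
        # Unicode input (e.g. <U11) is encoded to bytes first; ASCII is assumed.
        arr = np.char.encode(arr, "ascii")
    if arr.dtype.kind != "S":
        raise TypeError(f"expected a string array, got dtype {arr.dtype}")
    strlen = arr.dtype.itemsize
    # Reinterpret each strlen-byte element as strlen one-byte elements.
    flat = np.ascontiguousarray(arr).reshape(-1).view("S1")
    return flat.reshape(arr.shape + (strlen,))


# Scalar coordinate as it looks before writing: dtype |S11, no dimensions.
type_coord = np.array(b"bare_ground", dtype="S11")
chars = to_char_array(type_coord)
print(chars.dtype, chars.shape)
```

The resulting shape (11,) corresponds to the original_shape entry above, and the trailing axis is what would be stored as the type_strlen dimension.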
Actual Behavior (Before Fix)
Before writing:
ds.coords:
type |S11 11B b'bare_ground' # Coordinate
After writing and re-reading:
ds.data_vars:
type <U11 'bare_ground' # ❌ Now a variable, wrong dtype
ds.coords:
# ❌ type is missing here
Root Cause
The write() function uses the netCDF4 library directly but doesn't handle string coordinates properly:
- Doesn't distinguish between string coordinates and regular variables
- Doesn't apply CF-compliant character array encoding (S1 + strlen dimension)
- Doesn't add string coordinates to the main variable's coordinates attribute (required for auxiliary/scalar coordinates per CF conventions)