dsbuilder.dataset_util module

Utilities for creating xarray dataset variables in specified forms

class dsbuilder.dataset_util.DatasetUtil

Bases: object

Class to provide utilities for generating standard xarray DataArrays and Variables

static add_encoding(variable, dtype, scale_factor=1.0, offset=0.0, fill_value=None, chunksizes=None)

Add encoding to xarray Variable to apply when writing netCDF files

Parameters
  • variable (Variable) – data variable

  • dtype (numpy.typecodes) – numpy data type

  • scale_factor (Optional[float]) – variable scale factor

  • offset (Optional[float]) – variable offset value

  • fill_value (Union[int, float, None]) – fill value

  • chunksizes (Optional[float]) – chunk sizes
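
As a rough illustration of what such an encoding can contain, the sketch below builds a dictionary with the key names xarray uses for netCDF encodings (dtype, scale_factor, add_offset, _FillValue, chunksizes). The helper build_encoding is hypothetical and is not part of dsbuilder. That offset maps to add_offset and fill_value to _FillValue is an assumption, not something this page confirms. The pack and unpack helpers show the usual CF packing arithmetic.

```python
def build_encoding(dtype, scale_factor=1.0, offset=0.0, fill_value=None, chunksizes=None):
    """Hypothetical helper: assemble a netCDF-style encoding dictionary."""
    # Assumption: offset is stored under the CF attribute name "add_offset".
    encoding = {"dtype": dtype, "scale_factor": scale_factor, "add_offset": offset}
    if fill_value is not None:
        encoding["_FillValue"] = fill_value
    if chunksizes is not None:
        encoding["chunksizes"] = chunksizes
    return encoding


def pack(value, scale_factor, offset):
    """CF packing: stored = (value - add_offset) / scale_factor, rounded to an integer."""
    return round((value - offset) / scale_factor)


def unpack(stored, scale_factor, offset):
    """CF unpacking: value = stored * scale_factor + add_offset."""
    return stored * scale_factor + offset


encoding = build_encoding("uint16", scale_factor=0.01, offset=250.0, fill_value=65535)
stored = pack(273.15, encoding["scale_factor"], encoding["add_offset"])
restored = unpack(stored, encoding["scale_factor"], encoding["add_offset"])
```

A scale factor of 0.01 with an unsigned 16-bit integer gives a resolution of 0.01 over a range of roughly 655 units above the offset, which is why the fill value is placed at the top of the integer range here.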

static create_default_array(dim_sizes, dtype, dim_names=None, fill_value=None)

Return default empty xarray DataArray

Parameters
  • dim_sizes (List[int]) – dimension sizes, i.e. [dim1_size, dim2_size, dim3_size] (e.g. [2,3,5])

  • dtype (numpy.typecodes) – numpy data type

  • dim_names (Optional[List[str]]) – dimension names, i.e. ["dim1_name", "dim2_name", "dim3_name"]

  • fill_value (Union[int, float, None]) – fill value (if None CF compliant value used)

Return type

DataArray

Returns

Default empty array

static create_flags_variable(dim_sizes, meanings, dim_names=None, attributes=None)

Return default empty 1d xarray flag Variable

Parameters
  • dim_sizes (List[int]) – dimension sizes, i.e. [dim1_size, dim2_size, dim3_size] (e.g. [2,3,5])

  • meanings (List[str]) – flag meanings by bit

  • dim_names (Optional[List[str]]) – dimension names, i.e. ["dim1_name", "dim2_name", "dim3_name"]

  • attributes (Optional[Dict]) – dictionary of variable attributes, e.g. standard_name

Return type

Variable

Returns

Default empty flag vector variable
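
The phrase "flag meanings by bit" suggests the CF flag convention, in which each meaning is paired with a power-of-two mask. The sketch below shows that pairing. It assumes the attribute names flag_meanings and flag_masks from CF, which this page does not confirm, and flag_attributes is a hypothetical helper rather than a dsbuilder function.

```python
def flag_attributes(meanings):
    """Hypothetical helper: pair each meaning with one bit, lowest bit first."""
    masks = [2 ** bit for bit in range(len(meanings))]
    # CF stores flag_meanings as a single space-separated string.
    return {"flag_meanings": " ".join(meanings), "flag_masks": masks}


attrs = flag_attributes(["cloudy", "saturated", "bad_pixel"])

# A flag value with the first and third meanings set.
value = attrs["flag_masks"][0] | attrs["flag_masks"][2]

# Test a single bit with a bitwise AND against its mask.
is_saturated = bool(value & attrs["flag_masks"][1])
is_bad_pixel = bool(value & attrs["flag_masks"][2])
```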

static create_unc_variable(dim_sizes, dtype, dim_names, attributes=None, pdf_shape='gaussian', err_corr=None)

Return default empty 1d xarray uncertainty Variable

Parameters
  • dim_sizes (List[int]) – dimension sizes, i.e. [dim1_size, dim2_size, dim3_size] (e.g. [2,3,5])

  • dtype (numpy.typecodes) – data type

  • dim_names (List[str]) – dimension names, i.e. ["dim1_name", "dim2_name", "dim3_name"]

  • attributes (Optional[Dict]) – dictionary of variable attributes, e.g. standard_name

  • pdf_shape (str) – (default: “gaussian”) pdf shape of uncertainties

  • err_corr (Optional[List[Dict[str, Union[str, List]]]]) – uncertainty error-correlation structure definition, defined as below.

Return type

Variable

Returns

Default empty uncertainty variable

Each element of err_corr is a dictionary that defines the error-correlation along one or more dimensions, which should include the following entries:

  • dim (str/list) - name of the dimension(s), as a str or a list of str (i.e. from dim_names)

  • form (str) - error-correlation form name, functional form of error-correlation structure for dimension(s)

  • params (list) - (optional) parameters of the error-correlation structure defining function for dimension if required. The number of parameters required depends on the particular form.

  • units (list) - (optional) units of the error-correlation function parameters for dimension (ordered as the parameters)

For more information on the required form of these entries, see the uncertainties section of the user guide.

Note

If the error-correlation structure is not defined along a particular dimension (i.e. it is not included in err_corr), the error-correlation is assumed random. Variable attributes are populated to the effect of this assumption.
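
To make the structure concrete, here is a minimal sketch of an err_corr list and a check for which dimensions it leaves undefined (and which would therefore be treated as random). The form name "example_form", the parameter values, the units, and the helper uncovered_dims are all hypothetical. The valid form names are described in the user guide, not on this page.

```python
dim_names = ["x", "y", "time"]

err_corr = [
    # One dimension, given as a str. "example_form" is a placeholder form name.
    {"dim": "time", "form": "example_form", "params": [5.0], "units": ["s"]},
    # One or more dimensions, given as a list of str. params and units are optional.
    {"dim": ["x"], "form": "example_form"},
]


def uncovered_dims(dim_names, err_corr):
    """Hypothetical helper: list dimensions with no error-correlation entry."""
    covered = set()
    for entry in err_corr:
        missing = {"dim", "form"} - entry.keys()
        if missing:
            raise ValueError(f"err_corr entry is missing keys: {sorted(missing)}")
        dims = entry["dim"]
        # Normalise the str form to a one-element list.
        covered.update([dims] if isinstance(dims, str) else dims)
    return [name for name in dim_names if name not in covered]


assumed_random = uncovered_dims(dim_names, err_corr)
```

In this example "y" does not appear in any entry, so it is the only dimension that falls back to the random assumption.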

static create_variable(dim_sizes, dtype, dim_names=None, attributes=None, fill_value=None)

Return default empty xarray Variable

Parameters
  • dim_sizes (List[int]) – dimension sizes, i.e. [dim1_size, dim2_size, dim3_size] (e.g. [2,3,5])

  • dtype (numpy.typecodes) – numpy data type

  • dim_names (Optional[List[str]]) – dimension names as strings, i.e. ["dim1_name", "dim2_name", "dim3_name"]

  • attributes (Optional[Dict]) – dictionary of variable attributes, e.g. standard_name

  • fill_value (Union[int, float, None]) – fill value (if None CF compliant value used)

Return type

Variable

Returns

Default empty variable

static get_default_fill_value(dtype)

Return default fill value for the given data type

Parameters

dtype (numpy.typecodes) – numpy data type

Return type

Union[int, float]

Returns

CF-conforming fill value
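
This page does not list the fill values themselves. As one plausible convention, the sketch below uses the netCDF library's default fill values (the NC_FILL_* constants). It is an assumption that dsbuilder follows these, so the library may return different values. The helper default_fill_value is hypothetical, and data types are given by name to keep the example dependency-free.

```python
# netCDF library default fill values, keyed by data type name.
NETCDF_DEFAULT_FILL = {
    "int8": -127,
    "uint8": 255,
    "int16": -32767,
    "uint16": 65535,
    "int32": -2147483647,
    "uint32": 4294967295,
    "float32": 9.969209968386869e36,
    "float64": 9.969209968386869e36,
}


def default_fill_value(dtype_name):
    """Hypothetical helper: look up a default fill value by data type name."""
    try:
        return NETCDF_DEFAULT_FILL[dtype_name]
    except KeyError:
        raise ValueError(f"no default fill value known for {dtype_name!r}") from None


int16_fill = default_fill_value("int16")
float32_fill = default_fill_value("float32")
```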

static return_flags_dtype(n_masks)

Return required flags array data type

Parameters

n_masks (int) – number of masks required in flag array

Return type

numpy.typecodes

Returns

data type
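
One way to pick a flags data type is to take the smallest integer width that has at least one bit per mask. The sketch below illustrates that idea only. Whether dsbuilder uses unsigned types or these exact thresholds is an assumption, and flags_dtype_name is a hypothetical helper that returns a data type name rather than a numpy object.

```python
def flags_dtype_name(n_masks):
    """Hypothetical helper: smallest unsigned integer type with n_masks bits."""
    if n_masks < 1:
        raise ValueError("n_masks must be at least 1")
    for bits in (8, 16, 32, 64):
        if n_masks <= bits:
            return f"uint{bits}"
    raise ValueError("more than 64 masks do not fit in a single integer")


small = flags_dtype_name(3)
medium = flags_dtype_name(9)
large = flags_dtype_name(33)
```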