Getting Started
This section will get you up and running with dsbuilder in about 10 minutes. For more information, check out the User Guide.
Setting up
dsbuilder can be installed using pip:
pip install dsbuilder
And then imported into your work:
import dsbuilder
Defining a template dataset
Dataset specifications are described in Python dictionaries. The most important of these is the variables dictionary, which defines the dataset's variable structure. Each entry in this dictionary defines one variable: the key is the variable name, and the value is a further dictionary that defines the variable with the following entries:
dim - list of variable dimension names.
dtype - variable data type, generally a numpy.dtype, though for some special variables particular values may be required.
attributes - dictionary of variable metadata; for some special variables particular entries may be required.
encoding - (optional) variable encoding.
Therefore, a variables dictionary for a dataset containing red, green and blue radiance band variables may look as follows:
import numpy as np

# Each key is a variable name; each value defines that variable.
variables_dict = {
    "band_red": {
        "dim": ["x", "y"],
        "dtype": np.float32,
        "attributes": {"units": "W m-2 sr-1 m-1"},
    },
    "band_green": {
        "dim": ["x", "y"],
        "dtype": np.float32,
        "attributes": {"units": "W m-2 sr-1 m-1"},
    },
    "band_blue": {
        "dim": ["x", "y"],
        "dtype": np.float32,
        "attributes": {"units": "W m-2 sr-1 m-1"},
    },
}
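When several variables share the same definition, as the three bands above do, the dictionary can also be generated rather than written out by hand. The following is a minimal sketch; make_band_variable is a hypothetical helper written for this example and is not part of dsbuilder.

```python
import numpy as np


def make_band_variable(units="W m-2 sr-1 m-1", dims=("x", "y"), dtype=np.float32):
    """Hypothetical helper: return one variable definition with its own
    list and dictionary objects, so entries never share mutable state."""
    return {
        "dim": list(dims),
        "dtype": dtype,
        "attributes": {"units": units},
    }


# Build the same three-band variables dictionary as above.
variables_dict = {
    f"band_{colour}": make_band_variable() for colour in ("red", "green", "blue")
}
```

Creating a fresh definition per variable means that later editing one band's attributes does not silently change the others.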
Creating a template dataset
With the variables dictionary prepared, only two more specifications are required to build a template dataset. The first is a dictionary that defines the size of every dimension used in the variables dictionary, e.g.:
dim_sizes_dict = {"x": 1000, "y": 2000}
The second is a dictionary of dataset global metadata, e.g.:
metadata = {"dataset_name": "my cool image"}
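Because every dimension named in the variables dictionary needs a size, it can be worth checking the two dictionaries against each other before building the template. The sketch below uses only plain Python; find_missing_dims is a hypothetical helper written for this example, not a dsbuilder function, and the variable definitions are trimmed to their "dim" entries for brevity.

```python
def find_missing_dims(variables_dict, dim_sizes_dict):
    """Hypothetical check: map each variable name to the dimensions it
    uses that have no entry in dim_sizes_dict."""
    missing = {}
    for name, definition in variables_dict.items():
        absent = [dim for dim in definition["dim"] if dim not in dim_sizes_dict]
        if absent:
            missing[name] = absent
    return missing


# Only the "dim" entries matter for this check; other entries are omitted.
variables_dict = {
    "band_red": {"dim": ["x", "y"]},
    "acquisition_time": {"dim": ["x", "time"]},  # "time" deliberately has no size
}
dim_sizes_dict = {"x": 1000, "y": 2000}

missing = find_missing_dims(variables_dict, dim_sizes_dict)
print(missing)  # {'acquisition_time': ['time']}
```

An empty result means every dimension has a size.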
Combining these, a template dataset can be created as follows:
ds = dsbuilder.create_template_dataset(
    variables_dict,
    dim_sizes_dict,
    metadata
)
Here, ds is an empty xarray.Dataset whose variables are defined by the template definition. Fill values for the empty arrays are chosen according to the CF conventions.
Populating and writing the dataset
Populating and writing the dataset can be achieved using xarray’s built-in functionality. Here’s a dummy example:
ds["band_red"] = ... # populate variable with red image array
ds["band_green"] = ... # populate variable with green image array
ds["band_blue"] = ... # populate variable with blue image array
ds.to_netcdf("path/to/file.nc")
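For a more concrete picture of the populating step, the sketch below fills each band with a constant array. It builds a small stand-in dataset directly with xarray so that it runs without dsbuilder; the stand-in's NaN fill values and its 4 x 5 shape are assumptions made for this example and may not match what create_template_dataset produces.

```python
import numpy as np
import xarray as xr

units = {"units": "W m-2 sr-1 m-1"}
shape = (4, 5)  # small stand-in for the (x, y) dimension sizes

# Stand-in for the template dataset (assumed layout, NaN used as fill).
ds = xr.Dataset(
    {
        name: (("x", "y"), np.full(shape, np.nan, dtype=np.float32), dict(units))
        for name in ("band_red", "band_green", "band_blue")
    },
    attrs={"dataset_name": "my cool image"},
)

# Replace the values in place; the variable's dimensions and attributes
# defined by the template are kept.
ds["band_red"].values = np.full(shape, 1.0, dtype=np.float32)
ds["band_green"].values = np.full(shape, 2.0, dtype=np.float32)
ds["band_blue"].values = np.full(shape, 3.0, dtype=np.float32)
```

Setting .values on an existing variable keeps its metadata, whereas assigning a bare array or a (dims, data) tuple to ds["band_red"] replaces the variable and drops its attributes. The populated dataset can then be written with ds.to_netcdf as shown above.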