carpet_concentrations.input4MIPs.dataset
Input4MIPsDataset and associated metadata
Input4MIPsMetadata
- class Input4MIPsMetadata(activity_id, contact, Conventions, dataset_category, frequency, further_info_url, grid_label, institution, institution_id, mip_era, nominal_resolution, realm, source_version, source_id, source, target_mip, title)[source]
Bases: object
Input4MIPs metadata
These are all required fields.
Notes
variable_id is not included here because it should be derived from the data (which is combined with the metadata elsewhere).
Input4MIPsMetadataOptional
- class Input4MIPsMetadataOptional(comment=None, data_specs_version=None, external_variables=None, grid=None, history=None, product=None, references=None, region=None, release_year=None, source_description=None, source_type=None, table_id=None, table_info=None, license=None)[source]
Bases: object
Input4MIPs optional metadata
These are all optional fields.
Notes
This is currently written such that no fields outside of these can be provided. We don’t fully understand the input4MIPs rules, so this could easily be the wrong choice. Refactoring should be relatively straightforward if needed. Locking these fields may make sense as a way to avoid clashes with the compulsory metadata, but this is unconfirmed.
- comment: str | None
Comment on the dataset
- data_specs_version: str | None
Data specs version used when creating the dataset
- external_variables: str | None
Variables relevant to the dataset that aren’t included in the dataset itself
For example, cell area variables like ‘areacella’
- grid: str | None
Human-readable version of the grid on which the dataset applies
- history: str | None
File modification history
- license: str | None
License information
- product: str | None
Product the data represents
- references: str | None
References related to the dataset
- region: str | None
Region to which the dataset applies
- release_year: str | None
Release year of the dataset
- source_description: str | None
Description of the dataset’s source
- source_type: str | None
Description of the type of the dataset’s source
- table_id: str | None
Purpose unclear; possibly the CMOR table used to write the dataset
- table_info: str | None
Purpose unclear; possibly information about the CMOR table used to write the dataset
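The note above says that no fields outside the listed ones can be provided. The following is a minimal sketch of that behaviour using a standard-library dataclass as a stand-in. `OptionalMetadataSketch` and `my_custom_field` are hypothetical names, only three of the documented fields are reproduced, and the real class may be implemented differently.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class OptionalMetadataSketch:
    """Hypothetical stand-in with a fixed set of optional fields."""

    comment: Optional[str] = None
    license: Optional[str] = None
    region: Optional[str] = None


# Any subset of the declared fields can be supplied.
ok = OptionalMetadataSketch(comment="Example comment")

# A field outside the declared set is rejected at construction time.
try:
    OptionalMetadataSketch(my_custom_field="not allowed")
    rejected = False
except TypeError:
    rejected = True

# Keep only the fields that were actually set,
# e.g. before merging them into a dataset's attributes.
set_fields = {key: value for key, value in asdict(ok).items() if value is not None}
print(rejected, set_fields)
```

Fixing the set of accepted names in the class definition is one way to guarantee that optional metadata can never shadow a required field name.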
Input4MIPsDataset
- class Input4MIPsDataset(ds, directory_template='{activity_id}/{mip_era}/{target_mip}/{institution_id}/{source_id}/{realm}/{frequency}/{variable_id}/{grid_label}/v{version}', filename_template='{variable_id}_{activity_id}_{dataset_category}_{target_mip}_{source_id}_{grid_label}_{start_date}_{end_date}.nc')[source]
Bases: object
Input4MIPs dataset
Holds input4MIPs data and also helps write them to disk in a way that conforms to input4MIPs standards
- ds: xarray.core.dataset.Dataset
Dataset
- classmethod from_metadata_autoadd_bounds_to_dimensions(ds, dimensions, metadata, metadata_optional=None, time_dimension='time', monthly_time_bounds=True, copy=True, **kwargs)[source]
Create instance from metadata and an unbounded dataset
For the given dimensions, bounds are checked and added if needed. The metadata is then used to fill out ds’s metadata before initialising.
- Parameters
ds (xr.Dataset) – Dataset
dimensions (tuple[str, ...]) – Dimensions of the dataset, these are checked for appropriate bounds.
metadata (Input4MIPsMetadata) – Metadata (required)
metadata_optional (Input4MIPsMetadataOptional | None) – Optional metadata
time_dimension (str) – The name of the time dimension. This is provided to give the user full control over how monthly_time_bounds is applied.
monthly_time_bounds (bool) – Should added time bounds cover each month? This is needed for data on a monthly timestep because months differ in length, so bounds inferred from the midpoints between consecutive time points do not fall on the start and end of each month.
copy (bool) – Should a copy of the dataset be made? If not, the data is modified in place, which can cause unexpected changes if references are not appropriately managed.
**kwargs (Any) – Other initialisation arguments for the instance. They are passed directly to the constructor.
- Returns
Input4MIPsDataset – Prepared instance
- Raises
AssertionError – ds.attrs is already set or there is more than one variable in ds
- get_filepath(ds_disk, root_data_dir)[source]
Get filepath
- Parameters
ds_disk (xarray.core.dataset.Dataset) – Disk ready dataset
root_data_dir (pathlib.Path) – Root directory in which to generate the filepath
- Returns
pathlib.Path – Filepath
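To illustrate how the default directory_template and filename_template shown in the class signature could expand into a path, here is a minimal sketch. It assumes the templates are filled with `str.format`-style substitution, which the brace placeholders suggest but the documentation does not confirm. `build_filepath` is a hypothetical helper and every metadata value below is made up for illustration.

```python
from pathlib import Path

# Default templates, copied from the Input4MIPsDataset signature.
DIRECTORY_TEMPLATE = (
    "{activity_id}/{mip_era}/{target_mip}/{institution_id}/{source_id}"
    "/{realm}/{frequency}/{variable_id}/{grid_label}/v{version}"
)
FILENAME_TEMPLATE = (
    "{variable_id}_{activity_id}_{dataset_category}_{target_mip}"
    "_{source_id}_{grid_label}_{start_date}_{end_date}.nc"
)


def build_filepath(root_data_dir: Path, attrs: dict) -> Path:
    """Hypothetical helper: expand both templates and join them under the root."""
    directory = DIRECTORY_TEMPLATE.format(**attrs)
    filename = FILENAME_TEMPLATE.format(**attrs)
    return root_data_dir / directory / filename


# Illustrative values only; real values come from the dataset's metadata.
example_attrs = {
    "activity_id": "input4MIPs",
    "mip_era": "CMIP6",
    "target_mip": "CMIP",
    "institution_id": "EXAMPLE",
    "source_id": "EXAMPLE-1-0-0",
    "realm": "atmos",
    "frequency": "mon",
    "variable_id": "mole_fraction_of_carbon_dioxide_in_air",
    "grid_label": "gn",
    "dataset_category": "GHGConcentrations",
    "version": "20230101",
    "start_date": "200001",
    "end_date": "201012",
}

example_path = build_filepath(Path("data"), example_attrs)
print(example_path)
```

Because every placeholder is a metadata field, a missing field surfaces as a `KeyError` during formatting rather than as a silently malformed path.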
format_date
get_version
add_time_bounds
- add_time_bounds(ds, monthly_time_bounds=False, output_dim='bounds')[source]
Add time bounds to a dataset
This should probably be pushed upstream to cf-xarray at some point.
- Parameters
ds (xarray.core.dataset.Dataset) – Dataset to which to add time bounds
monthly_time_bounds (bool) – Are we looking at monthly data, i.e. should the time bounds run from the start of one month to the next (which isn’t regular spacing but is most often what is desired/required)
- Returns
xarray.core.dataset.Dataset – Dataset with time bounds
Notes
There is no copy here, ds is modified in place (call xarray.Dataset.copy() before passing if you don’t want this).
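The idea behind monthly time bounds can be sketched with the standard library alone. This is not the implementation used by add_time_bounds, which works on xarray datasets. `monthly_bounds` is a hypothetical helper and the time points are made up. It shows bounds running from the start of one month to the start of the next, which gives irregular spacing because months differ in length.

```python
import datetime as dt


def monthly_bounds(time_point: dt.datetime) -> tuple:
    """Hypothetical helper: return (start of month, start of next month)."""
    start = dt.datetime(time_point.year, time_point.month, 1)
    if time_point.month == 12:
        # December rolls over into January of the following year.
        end = dt.datetime(time_point.year + 1, 1, 1)
    else:
        end = dt.datetime(time_point.year, time_point.month + 1, 1)
    return start, end


# Illustrative mid-month time points for a non-leap year.
mid_month_points = [
    dt.datetime(2001, 1, 16, 12),
    dt.datetime(2001, 2, 15),
    dt.datetime(2001, 3, 16, 12),
]

bounds = [monthly_bounds(point) for point in mid_month_points]

# Width of each interval in days; the spacing is not regular.
widths = [(end - start).days for start, end in bounds]
print(widths)
```

Each interval ends exactly where the next one begins, so the bounds tile the time axis without gaps or overlaps.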
verify_disk_ready
- verify_disk_ready(ds)[source]
Verify that a dataset is disk ready
- Parameters
ds (xarray.core.dataset.Dataset) – Dataset to check
- Return type
Notes
Very rough, doesn’t really do anything right now