API documentation for the readers module

The readers module provides the function load_to_dataarray(), which is used to load a set of data variables from a file and convert them into a dictionary of DataArray objects. The DataArray values can then be added to a Data instance for use in a Virtual Ecosystem simulation.

The module also supports registering different reader functions, each of which converts files in a particular storage format into DataArray objects. The load_to_dataarray() function automatically selects an appropriate reader based on the file suffix.

The FILE_FORMAT_REGISTRY

The FILE_FORMAT_REGISTRY is used to register a set of known file formats for use in load_to_dataarray(). This registry is extendable, so that new functions that implement data loading for a given file format can be added.

New file format readers are made available using the register_file_format_loader() decorator. The decorator takes the supported file formats as a tuple of file suffixes and is applied to a function that returns the loaded data as DataArray objects, which can then be added to a Data instance and validated using validate_dataarray(). For example:

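from virtual_ecosystem.core.readers import register_file_format_loader

@register_file_format_loader(('.tif', '.tiff'))
def new_function_to_load_tif_data(file, var_names):
    # code to turn tif file into a dictionary of DataArray objects
    ...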
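
As a rough illustration of how a decorator of this kind can work, the sketch below registers one function under each suffix in the tuple. It is not the module's own implementation: SKETCH_REGISTRY, register_sketch_loader and load_tif_sketch are hypothetical stand-ins, and the placeholder loader returns empty lists rather than DataArray objects.

```python
from pathlib import Path
from typing import Callable

# Stand-in registry with one entry per file suffix (hypothetical, not the
# library's FILE_FORMAT_REGISTRY).
SKETCH_REGISTRY: dict[str, Callable] = {}


def register_sketch_loader(file_types: tuple[str, ...]) -> Callable:
    """Return a decorator that registers a loader under each suffix in file_types."""

    def decorator(func: Callable) -> Callable:
        for suffix in file_types:
            SKETCH_REGISTRY[suffix] = func
        # Return the function unchanged so it can still be called directly.
        return func

    return decorator


@register_sketch_loader((".tif", ".tiff"))
def load_tif_sketch(file: Path, var_names: list[str]) -> dict[str, list[float]]:
    # Placeholder body: a real loader would read the file and build DataArray objects.
    return {name: [] for name in var_names}
```

After the decorated definition runs, both ".tif" and ".tiff" in the stand-in registry point at the same function, which matches a registry keyed by individual suffixes.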

Data:

FILE_FORMAT_REGISTRY

A registry for different file format loaders

Functions:

load_csv(file, var_names)

Loads a DataArray from a csv file.

load_excel(file, var_names)

Loads a DataArray from an excel file.

load_netcdf(file, var_names)

Loads a DataArray from a NetCDF file.

load_to_dataarray(file, var_names)

Loads data from a file into a DataArray.

register_file_format_loader(file_types)

Adds a data loader function to the data loader registry.

virtual_ecosystem.core.readers.FILE_FORMAT_REGISTRY: dict[str, Callable] = {'.csv': <function load_csv>, '.nc': <function load_netcdf>, '.xlsx': <function load_excel>}

A registry for different file format loaders

This dictionary maps file format suffixes onto the functions that load data from files of that format. Each loader function should coerce the data into xarray DataArray objects.

Users can register their own functions to load from a particular file format using the register_file_format_loader() decorator. The function itself should have the following signature:

func(file: Path, var_names: list[str]) -> dict[str, DataArray]

virtual_ecosystem.core.readers.load_csv(file: Path, var_names: list[str]) -> dict[str, DataArray]

Loads a DataArray from a csv file.

Parameters:
  • file – A Path for a csv file containing the variables to load.

  • var_names – A list of strings providing the names of the variables to be loaded from the file.

Raises:
  • FileNotFoundError – with bad file path names.

  • ParserError – if the csv data is not readable.

virtual_ecosystem.core.readers.load_excel(file: Path, var_names: list[str]) -> dict[str, DataArray]

Loads a DataArray from an excel file.

Parameters:
  • file – A Path for an excel file containing the variables to load.

  • var_names – A list of strings providing the names of the variables to be loaded from the file.

Raises:
  • FileNotFoundError – with bad file path names.

  • BadZipFile – if the excel file is corrupted.

  • Exception – catches other exceptions from openpyxl.

Note

BadZipFile is the most common error raised by openpyxl for corrupted excel files, because it processes these files internally as zip archives. The general exception is included to cover the various other failure modes of openpyxl.
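
The link between unreadable zip data and BadZipFile can be seen using the standard library alone. The sketch below does not call openpyxl or load_excel(); it only shows that opening a non-zip file with zipfile raises zipfile.BadZipFile. The helper is_readable_zip and the file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def is_readable_zip(file: Path) -> bool:
    """Return True if the file opens as a zip archive, False on BadZipFile."""
    try:
        with zipfile.ZipFile(file):
            return True
    except zipfile.BadZipFile:
        return False


with tempfile.TemporaryDirectory() as tmp:
    # A file with an .xlsx suffix whose bytes are not a zip archive.
    corrupted = Path(tmp) / "corrupted.xlsx"
    corrupted.write_bytes(b"this is not a zip archive")
    corrupted_ok = is_readable_zip(corrupted)

    # A minimal valid zip archive (not a real workbook), for comparison.
    valid = Path(tmp) / "valid.zip"
    with zipfile.ZipFile(valid, "w") as archive:
        archive.writestr("content.txt", "hello")
    valid_ok = is_readable_zip(valid)
```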

virtual_ecosystem.core.readers.load_netcdf(file: Path, var_names: list[str]) -> dict[str, DataArray]

Loads a DataArray from a NetCDF file.

Parameters:
  • file – A Path for a NetCDF file containing the variable to load.

  • var_names – A list of strings providing the names of the variables to be loaded from the file.

Raises:

virtual_ecosystem.core.readers.load_to_dataarray(file: Path, var_names: list[str]) -> dict[str, DataArray]

Loads data from a file into a DataArray.

The function takes a path to a file in a format supported by the FILE_FORMAT_REGISTRY and a list of variable names that are asserted to be stored in the file. It uses the appropriate data loader function to load the data and convert each variable to a DataArray, ready for insertion into a Data instance.

Parameters:
  • file – A Path for the file containing the variable to load.

  • var_names – A list of strings providing the names of variables in the file.

Raises:
  • ValueError – if there is no loader provided for the file format.
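
The suffix-based lookup and the ValueError for unknown formats can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual code: DEMO_LOADERS, load_txt_sketch and load_sketch are hypothetical names, and the stand-in loader returns strings rather than DataArray objects.

```python
from pathlib import Path
from typing import Callable


def load_txt_sketch(file: Path, var_names: list[str]) -> dict[str, str]:
    # Hypothetical loader: returns a placeholder string per variable name.
    return {name: f"{name} from {file.name}" for name in var_names}


# Stand-in registry keyed by suffix (hypothetical; not FILE_FORMAT_REGISTRY itself).
DEMO_LOADERS: dict[str, Callable] = {".txt": load_txt_sketch}


def load_sketch(file: Path, var_names: list[str]) -> dict[str, str]:
    """Pick a loader from the stand-in registry using the file suffix."""
    loader = DEMO_LOADERS.get(file.suffix)
    if loader is None:
        # No loader is registered for this suffix.
        raise ValueError(f"No file format loader provided for {file.suffix}")
    return loader(file, var_names)
```

Calling load_sketch(Path("data.txt"), ["temp"]) dispatches to load_txt_sketch, while a path with an unregistered suffix such as "data.tif" raises ValueError.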

virtual_ecosystem.core.readers.register_file_format_loader(file_types: tuple[str, ...]) -> Callable

Adds a data loader function to the data loader registry.

This decorator is used to register a function that loads data from a given file type and coerces it to a DataArray.

Parameters:
  • file_types – A tuple of strings giving the file types that the function can load. The strings should match the expected file suffixes for those file types.