API documentation for the utils module
The core.utils module contains functions that are used across the
Virtual Ecosystem, but which don’t have a natural home in a specific module. Adding
functions here can be a good way to reduce the amount of boilerplate code generated for
tasks that are repeated across modules.
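Functions:
- check_outfile(merge_file_path): Check that final output file is not already in the output folder.
- confirm_variables_form_data_frame(var_arrays): Check that a set of arrays forms a data frame.
- split_arrays_by_grouping_variable(var_arrays, group_by): Split a data frame by a grouping variable.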
- virtual_ecosystem.core.utils.check_outfile(merge_file_path: Path) -> None
Check that final output file is not already in the output folder.
- Parameters:
merge_file_path – Path to save merged config file to (i.e. folder location + file name)
- Raises:
ConfigurationError – If the path is invalid or the final output file already exists.
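The check described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: `check_outfile_sketch` and the local `ConfigurationError` class are hypothetical stand-ins, and the sketch assumes that an "invalid" path means the parent folder does not exist.

```python
import tempfile
from pathlib import Path


class ConfigurationError(Exception):
    """Hypothetical stand-in for the project's ConfigurationError."""


def check_outfile_sketch(merge_file_path: Path) -> None:
    """Raise if the output folder is missing or the output file already exists."""
    parent = merge_file_path.parent
    # Assumption: "invalid path" is interpreted as a missing parent folder.
    if not parent.is_dir():
        raise ConfigurationError(f"Output folder {parent} does not exist.")
    # Refuse to overwrite an existing merged config file.
    if merge_file_path.exists():
        raise ConfigurationError(f"Output file {merge_file_path} already exists.")


# Usage: a fresh path in an existing folder passes, an existing file does not.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "merged_config.toml"
    check_outfile_sketch(target)  # no error: folder exists, file does not
    target.write_text("placeholder")
    try:
        check_outfile_sketch(target)
    except ConfigurationError as err:
        print(f"Rejected: {err}")
```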
- virtual_ecosystem.core.utils.confirm_variables_form_data_frame(var_arrays: dict[str, ndarray[tuple[Any, ...], dtype[_ScalarT]]]) -> None
Check that a set of arrays forms a data frame.
This is a utility method to check if a set of arrays form a data frame: a set of equal length, one dimensional arrays, providing consistent tuples of values across the variables.
Note
This function and split_arrays_by_grouping_variable() could be methods of the Data class, but then would only be usable for arrays stored within a Data instance. At present, they are provided within the utils module so that they can be used independently.
- Parameters:
var_arrays – A dictionary of arrays keyed by variable name.
- Raises:
ValueError – The input values do not form a data frame.
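The data frame condition (one-dimensional arrays of equal length) can be illustrated with a short sketch. The function name `confirm_data_frame_sketch` is hypothetical and the error messages are invented for illustration; the real function may perform its checks differently.

```python
import numpy as np


def confirm_data_frame_sketch(var_arrays: dict[str, np.ndarray]) -> None:
    """Raise ValueError unless all arrays are one-dimensional and equally long."""
    # Every variable must be a one-dimensional array.
    if any(arr.ndim != 1 for arr in var_arrays.values()):
        raise ValueError("All arrays must be one-dimensional.")
    # Every variable must provide the same number of values.
    if len({arr.size for arr in var_arrays.values()}) > 1:
        raise ValueError("All arrays must have the same length.")


# Usage: equal-length 1D arrays pass silently, mismatched lengths raise.
confirm_data_frame_sketch(
    {"cohort": np.array([1, 2, 1]), "mass": np.array([0.5, 1.5, 2.5])}
)
try:
    confirm_data_frame_sketch(
        {"cohort": np.array([1, 2, 1]), "mass": np.array([0.5, 1.5])}
    )
except ValueError as err:
    print(f"Rejected: {err}")
```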
- virtual_ecosystem.core.utils.split_arrays_by_grouping_variable(var_arrays: dict[str, ndarray[tuple[Any, ...], dtype[_ScalarT]]], group_by: str) -> dict[Any, dict[str, ndarray[tuple[Any, ...], dtype[_ScalarT]]]]
Split a data frame by a grouping variable.
This function takes a set of one dimensional arrays of equal length - forming a data frame - and splits the values into lists of subarrays by a grouping variable. It sorts the arrays by the grouping variable before splitting the data.
Note
This function and confirm_variables_form_data_frame() could be methods of the Data class, but then would only be usable for arrays stored within a Data instance. At present, they are provided within the utils module so that they can be used independently.
- Parameters:
var_arrays – A dictionary of arrays keyed by variable name.
group_by – The variable name to be used to split the arrays.
- Returns:
A dictionary of lists of subarrays for each group, keyed by unique values in the grouping variable.
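The sort-then-split approach described for this function can be sketched with NumPy alone. This is an illustration under assumptions rather than the module's code: `split_by_group_sketch` is a hypothetical name, the return shape follows the type signature (a dictionary of per-variable subarrays for each group), and the sketch keeps the grouping variable itself in each group, which the real function may or may not do.

```python
import numpy as np


def split_by_group_sketch(
    var_arrays: dict[str, np.ndarray], group_by: str
) -> dict:
    """Split equal-length 1D arrays into per-group subarrays."""
    group_values = var_arrays[group_by]
    # Sort by the grouping variable; a stable sort keeps within-group order.
    order = np.argsort(group_values, kind="stable")
    # Find the unique group values and where each group starts once sorted.
    unique_values, starts = np.unique(group_values[order], return_index=True)
    keys = unique_values.tolist()  # plain Python values as dictionary keys
    result: dict = {key: {} for key in keys}
    for name, arr in var_arrays.items():
        # Split the sorted array at the start of every group except the first.
        pieces = np.split(arr[order], starts[1:])
        for key, piece in zip(keys, pieces):
            result[key][name] = piece
    return result


# Usage: two cohorts interleaved in the input are separated in the output.
data = {
    "cohort": np.array([2, 1, 2, 1]),
    "mass": np.array([10.0, 20.0, 30.0, 40.0]),
}
groups = split_by_group_sketch(data, "cohort")
print(groups[1]["mass"], groups[2]["mass"])
```

Sorting first means each group occupies one contiguous block, so a single `np.split` per variable is enough to separate all groups.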