API documentation for the config_builder module#
The config_builder module provides tools to load a set of
TOML formatted configuration dictionaries, either from files or from strings. String
inputs are primarily intended for configuring models in tests, where it is
more convenient to simply provide a string.
The main class ConfigurationLoader handles the loading of configuration data
and compiling multiple sources into a single dictionary of configuration data.
The generate_configuration() function then:
takes a compiled dictionary of configuration settings,
assembles a pydantic validation model class using the configuration validators for each of the requested science modules, and
passes the data through the validator to return a validated configuration model for the simulation.
Canonical usage patterns for the module would be:
config_data = ConfigurationLoader(...)
config_object = generate_configuration(config_data.data)
Classes:
ConfigurationLoader | Configuration loading.
Functions:
_resolve_config_paths | Resolve paths in a configuration file.
build_configuration_model | Build a configuration model for a simulation.
compile_configuration_data | Compile a combined configuration from multiple configuration dictionaries.
generate_configuration | Generate a configuration model from configuration data.
merge_configuration_dicts | Recursively merge two configuration dictionaries.
- class virtual_ecosystem.core.config_builder.ConfigurationLoader(cfg_paths: str | Path | Sequence[str | Path] = [], cfg_strings: str | list[str] = [], cli_config: dict[str, Any] | None = None, autoload: bool = True)[source]#
Configuration loading.
The ConfigurationLoader class is used to load and compile configuration data for a Virtual Ecosystem simulation. Configuration data can be passed in as one of:
a list of paths to individual TOML configuration files or directories of TOML files (the cfg_paths argument), or
a list of TOML strings providing configuration data (the cfg_strings argument).
In both cases, there is initial input validation of the argument values and then two data handling steps are run.
Data loading#
The _load_data() method handles the parsing of the TOML inputs. For configuration data passed as strings, this is largely checking that the data is valid TOML.
For configuration data passed as paths, the following steps occur:
The _collect_config_paths() method is used to compile a complete list of the individual TOML files to be used to build the configuration from the provided paths.
The _load_config_toml() method is then used to parse the TOML content of each file, verifying that it is valid TOML, and then store the parsed contents.
The _resolve_config_file_paths() method is then used to update file paths in configuration inputs to resolve them to absolute file paths. This is so that the file paths in the final compiled configuration data are all mutually resolvable, as the input files may use relative paths and do not necessarily all live in the same directory.
At the end of this step, the toml_contents attribute will have been populated with individual parsed dictionaries of configuration data from each file or input string.
Data compilation#
The _compile_data() method is then run to compile the different individual dictionaries into a single configuration document. This method checks that configuration settings are uniquely set across the various configuration data sources. The data attribute then contains the complete compiled set of configuration data from the provided sources.
- param cfg_paths:
A string, Path or list of strings or Paths giving configuration file or directory paths.
- param cfg_strings:
A string or list of strings containing TOML formatted configuration data.
- param cli_config:
Configuration settings provided by the user at the command line, used to override configuration settings in files.
- param autoload:
A boolean flag that can be used to turn off automatic data loading and compilation.
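Methods:
_collect_config_paths() | Collect TOML config files from provided paths.
_compile_data() | Compile configuration data.
_load_config_toml() | Load the contents of resolved configuration files.
_load_config_toml_string() | Load the contents of a config provided as a string.
_load_data() | Load configuration data.
_resolve_config_file_paths() | Resolve the locations of configured file paths.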
Attributes:
The configuration file paths, normalised from the cfg_paths argument.
A list of strings containing TOML content, provided by the cfg_strings argument.
An optional dictionary of configuration settings passed at the command line that can be used to override configuration data loaded from file.
Configuration errors, as a list of tuples of key path and error details.
A dictionary of the compiled configuration data from the provided data sources.
A boolean flag indicating whether paths or strings were used to create the instance.
A list of configuration keys duplicated across configuration files.
A dictionary of the model classes specified in the configuration, keyed by model name.
A dictionary of the parsed TOML contents of config files or strings, keyed by file path or string index.
A list of TOML file paths resolved from the initial config paths.
- _collect_config_paths() None[source]#
Collect TOML config files from provided paths.
The ConfigurationLoader class is initialised with a list of paths to either individual TOML config files or directories containing possibly multiple config files. This method examines that list to collect all the individual TOML config files in the provided locations and then populates the toml_files attribute.
- Raises:
ConfigurationError – raised if any of the paths do not exist, are directories that do not contain TOML files, or are not TOML files, or if the resolved files contain duplicate entries.
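The checks described above can be sketched with the standard library alone. This is only an illustration, not the module's implementation: the function name collect_toml_files is hypothetical, ValueError stands in for ConfigurationError, and searching only the top level of each directory is an assumption.

```python
from pathlib import Path


def collect_toml_files(cfg_paths: list[Path]) -> list[Path]:
    """Expand files and directories into a list of TOML files (illustrative only)."""
    errors: list[str] = []
    collected: list[Path] = []

    for path in cfg_paths:
        if not path.exists():
            errors.append(f"Path does not exist: {path}")
        elif path.is_dir():
            # Assumption: a directory contributes the TOML files directly inside it.
            found = sorted(path.glob("*.toml"))
            if found:
                collected.extend(found)
            else:
                errors.append(f"Directory contains no TOML files: {path}")
        elif path.suffix != ".toml":
            errors.append(f"Not a TOML file: {path}")
        else:
            collected.append(path)

    # A file reached twice (for example directly and via its directory) is an error.
    resolved = [p.resolve() for p in collected]
    if len(set(resolved)) != len(resolved):
        errors.append("Duplicate TOML files in provided paths")

    if errors:
        # The documented method raises ConfigurationError; ValueError stands in here.
        raise ValueError("; ".join(errors))

    return resolved
```

Collecting all problems before raising, rather than stopping at the first, lets a user fix every bad path in one pass.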
- _compile_data()[source]#
Compile configuration data.
This method compiles loaded configuration data into a single data dictionary, warning of conflicting or repeated settings across the sources.
- _load_config_toml() None[source]#
Load the contents of resolved configuration files.
This method populates the toml_contents dictionary with the contents of the configuration files set in toml_files.
- Raises:
ConfigurationError – Invalid TOML content in config files.
- _load_config_toml_string() None[source]#
Load the contents of a config provided as a string.
This method populates the toml_contents dictionary with the parsed contents of a provided TOML formatted string.
- Raises:
ConfigurationError – Invalid TOML string.
- _load_data()[source]#
Load configuration data.
This method loads configuration data from the sources set when the class instance was created.
- _resolve_config_file_paths() None[source]#
Resolve the locations of configured file paths.
Configuration files can contain paths to other resources, such as the paths to files containing input data variables. These paths can be absolute, but may also be relative to the location of the configuration file itself. This method is used to resolve the location of files to the common root of the provided set of configuration files, typically the path where a simulation is started.
- cfg_strings: list[str]#
A list of strings containing TOML content, provided by the cfg_strings argument.
- cli_config: dict[str, Any] | None#
An optional dictionary of configuration settings passed at the command line that can be used to override configuration data loaded from file.
- config_errors: list[tuple[str, Any]]#
Configuration errors, as a list of tuples of key path and error details.
- data: dict[str, Any]#
A dictionary of the compiled configuration data from the provided data sources.
- from_cfg_strings: bool#
A boolean flag indicating whether paths or strings were used to create the instance.
- model_classes: dict[str, Any]#
A dictionary of the model classes specified in the configuration, keyed by model name.
- virtual_ecosystem.core.config_builder._resolve_config_paths(config_dir: Path, config_dict: dict[str, Any]) None[source]#
Resolve paths in a configuration file.
Configuration files may contain keys providing file paths for data and other settings: these paths may be absolute but also could be relative to the specific configuration file. This becomes a problem when configurations are compiled across multiple configuration files, possibly in different locations, so this function searches the configuration dictionary loaded from a single file and updates configured relative paths to their absolute paths.
At present, the configuration schema does not have an explicit mechanism to type a configuration option as being a path, so we currently use the _path suffix to indicate configuration options setting a path. This function therefore recursively searches a configuration file payload for values stored under keys ending in _path and resolves the paths.
It does not attempt to resolve paths when the value starts with $ as these are taken to be marker values used in path substitution.
- Parameters:
config_dir – A folder containing a configuration file.
config_dict – A dictionary of contents of the configuration file, which may contain file paths to resolve.
- Raises:
ValueError – if a key ending in _path has a non-string value.
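The behaviour described above can be sketched as a small recursive function. This is an illustration under stated assumptions rather than the module's code: the name resolve_paths is hypothetical, and descending into lists of tables and storing the resolved path back as a string are both assumptions.

```python
from pathlib import Path
from typing import Any


def resolve_paths(config_dir: Path, config_dict: dict[str, Any]) -> None:
    """Update relative paths under keys ending in _path, in place."""
    for key, value in config_dict.items():
        if key.endswith("_path"):
            if not isinstance(value, str):
                raise ValueError(f"Non-string value under path key: {key}")
            if value.startswith("$"):
                # Marker values for path substitution are left untouched.
                continue
            path = Path(value)
            if not path.is_absolute():
                path = config_dir / path
            config_dict[key] = str(path.resolve())
        elif isinstance(value, dict):
            resolve_paths(config_dir, value)
        elif isinstance(value, list):
            # Assumption: arrays of tables may also contain path settings.
            for item in value:
                if isinstance(item, dict):
                    resolve_paths(config_dir, item)
```

Relative paths are joined to the directory of the configuration file that declared them, which is what makes paths from files in different directories mutually resolvable after compilation.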
- virtual_ecosystem.core.config_builder.build_configuration_model(requested_modules: list[str], requested_disturbances: list[str]) type[CompiledConfiguration][source]#
Build a configuration model for a simulation.
This function identifies the modules to be configured from the top-level configuration keys in a compiled configuration dictionary. It then registers the required modules to populate the module registry and to access the BaseModel and root configuration models for each requested model.
The configuration models are then combined dynamically to give a single combined pydantic base model for the model elements requested for a given simulation. This is returned and can then be used to validate the data provided in the configuration files.
The returned model class also provides the class variable _model_classes, which holds a dictionary of the requested modules and their BaseModel instances.
- virtual_ecosystem.core.config_builder.compile_configuration_data(data: list[dict]) tuple[dict, set[str]][source]#
Compile a combined configuration from multiple configuration dictionaries.
This function sequentially merges configuration dictionaries, such as those loaded from multiple individual configuration files, into a single configuration dictionary. It returns the merged dictionary and a set of keys that have duplicated definitions in the input files.
- virtual_ecosystem.core.config_builder.generate_configuration(data: dict[str, Any] = {}, context: Any | None = None) CompiledConfiguration[source]#
Generate a configuration model from configuration data.
This function takes a dictionary of configuration data and tries to build a validated configuration model. The input data is typically loaded and compiled using the ConfigurationLoader class.
The first step is to take the root sections in the configuration data, which indicate the various science models requested for a simulation, and use those to build a composite configuration validator class.
The provided data is then passed into the validator. If validation is successful then a validated configuration object is returned, otherwise the specific validation errors are written to the log and the function raises a ConfigurationError.
The pydantic validation process allows validation context to be passed to a validator object and this context is shared with daughter validators. At the moment, this is only used to pass path substitutions to validation.
- Parameters:
data – A dictionary of unvalidated configuration data.
context – Additional context to be passed to validation.
- virtual_ecosystem.core.config_builder.merge_configuration_dicts(dest: dict, source: dict, **kwargs) tuple[dict, set[str]][source]#
Recursively merge two configuration dictionaries.
This function returns a copy of the input dest dictionary that has been extended recursively with the entries from the input source dictionary.
The merging process looks for duplicated settings. In general, if two input dictionaries share complete key paths (that is, a set of nested dictionary keys leading to a value) then that indicates a duplicated setting. The values might be identical, but the configuration files should not duplicate settings. When duplicated key paths are found, the value from the source dictionary is used and the function extends the returned conflicts set with the duplicated key path.
However, an exception is where both entries are lists, for example, resulting from a TOML array of tables (https://toml.io/en/v1.0.0#array-of-tables). In this case, it is reasonable to append the source values to the destination values. The motivating example here is [[core.data.variable]] entries, which can quite reasonably be split across configuration sources. Note that no attempt is made to check that the combined values are congruent: this is deferred to error handling when the configuration data is loaded.
- Parameters:
dest – A dictionary to extend
source – A dictionary of key value pairs to extend dest
**kwargs – Additional arguments used in recursion
- Returns:
A copy of dest, extended recursively with values from source, and a set of duplicate key paths.
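The merging rules described for this function can be illustrated with a minimal sketch. It is not the module's implementation: the name merge_dicts and the path argument are hypothetical (the real function passes recursion state through **kwargs), and reporting conflicts as dot-separated key paths is an assumption.

```python
from copy import deepcopy


def merge_dicts(
    dest: dict, source: dict, path: tuple[str, ...] = ()
) -> tuple[dict, set[str]]:
    """Return a merged copy of dest and the set of duplicated key paths."""
    merged = deepcopy(dest)
    conflicts: set[str] = set()

    for key, src_val in source.items():
        key_path = (*path, key)
        if key not in merged:
            merged[key] = deepcopy(src_val)
        elif isinstance(merged[key], dict) and isinstance(src_val, dict):
            # Shared sections are merged recursively.
            merged[key], sub_conflicts = merge_dicts(merged[key], src_val, key_path)
            conflicts |= sub_conflicts
        elif isinstance(merged[key], list) and isinstance(src_val, list):
            # Lists (such as arrays of tables) are appended, not flagged.
            merged[key] = merged[key] + deepcopy(src_val)
        else:
            # Duplicated setting: the source value wins and the path is recorded.
            merged[key] = deepcopy(src_val)
            conflicts.add(".".join(key_path))

    return merged, conflicts


# Illustrative inputs; the setting names below are made up for the example.
first = {"core": {"grid": {"nx": 10}, "data": {"variable": [{"name": "temp"}]}}}
second = {"core": {"grid": {"nx": 20, "ny": 5}, "data": {"variable": [{"name": "rain"}]}}}
merged, conflicts = merge_dicts(first, second)
```

Here conflicts contains only the key path core.grid.nx, the two variable entries are combined into one list, and first is left unmodified because the merge works on a deep copy.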