Code quality and static typing#

We use:

  • pre-commit to ensure common code standards and style, and

  • mypy to provide static typing of the virtual_ecosystem codebase.

Using pre-commit#

As described in the developer overview, pre-commit is installed by poetry as part of the virtual_ecosystem developer dependencies. At this point, it just needs to be set up to run using:

poetry run pre-commit install
poetry run pre-commit run --all-files

This can take a while on the first run, and again whenever the configuration is updated, because the tool needs to install or update all the hooks that are applied to changes within a commit. The hooks usually run only on the files changed by a particular git commit, but pre-commit run --all-files scans the entire codebase and is a commonly used check that all is well.

The pre-commit configuration#

The project root includes a configuration file for pre-commit that sets the hooks that will be run on each commit. The contents of the file are shown below, followed by a short description of the role of each hook.

The .pre-commit-config.yaml file
default_language_version:
    # force all unspecified python hooks to run python3
    python: python3.12
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v6.0.0
    hooks:
      - id: check-merge-conflict
      - id: debug-statements
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.15.15
    hooks:
      - id: ruff-check   #  Run the linter.
        args: [--fix, --exit-non-zero-on-fix, --target-version, py312]
      - id: ruff-format  #  Run the formatter.
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: "v2.1.0"
    hooks:
      - id: mypy
        additional_dependencies: [types-jsonschema, xarray, types-tabulate, numpy]
  - repo: https://github.com/igorshubovych/markdownlint-cli
    rev: v0.48.0
    hooks:
    - id: markdownlint
  - repo: https://github.com/mwouts/jupytext
    rev: v1.19.3
    hooks:
    - id: jupytext
      args: [--pipe, black]
      files: docs/source
      exclude: |
        (?x)(
            ^docs/source/conf.py|
            ^docs/source/genindex.md|
            ^docs/source/modindex.md|
            ^docs/source/using_the_ve/[a-z_]+.py|
            ^docs/source/using_the_ve/variables/[a-z_]+.py
        )
      additional_dependencies:
        - black==24.4.2 # Matches hook
  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.2
    hooks:
    - id: codespell
      args: ["--toml", "pyproject.toml"]
      additional_dependencies:
        - tomli
pre-commit-hooks

We use these basic hooks to check for leftover git merge conflict markers in code files (check-merge-conflict hook) and for debugger imports and breakpoint() calls (debug-statements hook), neither of which should end up in code in the repository.
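To make the debug-statements check concrete, the following is a minimal sketch of the kind of inspection such a hook performs. It is not the hook's actual implementation: the function name find_debug_statements and the DEBUGGER_MODULES set are hypothetical names chosen for illustration, and the real hook maintains its own list of debugger modules.

```python
import ast

# Hypothetical list for illustration only; the real hook has its own list.
DEBUGGER_MODULES = {"pdb", "ipdb", "pudb"}


def find_debug_statements(source: str) -> list[tuple[int, str]]:
    """Return (line number, description) pairs for debugger usage in source."""
    found: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Plain imports such as "import pdb"
            for alias in node.names:
                if alias.name in DEBUGGER_MODULES:
                    found.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            # Imports such as "from pdb import set_trace"
            if node.module in DEBUGGER_MODULES:
                found.append((node.lineno, f"from {node.module} import"))
        elif isinstance(node, ast.Call):
            # Direct calls to the built-in breakpoint()
            if isinstance(node.func, ast.Name) and node.func.id == "breakpoint":
                found.append((node.lineno, "breakpoint()"))
    return sorted(found)


example = "import pdb\nx = 1\nbreakpoint()\n"
problems = find_debug_statements(example)
```

Here problems lists the debugger import on line 1 and the breakpoint() call on line 3, which are the kinds of statements the hook is intended to keep out of the repository.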

ruff-pre-commit

This tool wraps the ruff code linter and formatter, and we use both the linting (ruff-check) and formatting (ruff-format) hooks.

mypy

We use a hook here to run the mypy static typing checks on newly committed code. See below for more information.

markdownlint

Checks all markdown files for common formatting issues.

jupytext

This tool is used to pass all Python code within notebooks through code formatting. At present, this still uses the black code formatter and not ruff-format as above.

codespell

This tool checks files for common misspellings of words. If codespell complains about a word that you think is correct, you can add it to .codespellignore.txt.

Output and configuration#

When pre-commit runs, you may see some lines about package installation and update, but the key information is the output below, which shows the status of the checks set up by each hook:

check for merge conflicts............................................Passed
debug statements (python)............................................Passed
ruff.................................................................Passed
ruff-format..........................................................Passed
mypy.................................................................Passed
markdownlint.........................................................Passed
jupytext.............................................................Passed
codespell............................................................Passed

Updating pre-commit#

The hooks used by pre-commit are constantly being updated to provide new features or to update code to deal with changes in the implementation. You can update the hooks manually using pre-commit autoupdate, but the configuration is regularly updated through the pre-commit.ci service.

Typing with mypy#

Unlike many programming languages, Python does not require variables to be declared as being of a particular type. For example, in C++, this code creates a variable that is explicitly an integer and a function that explicitly requires an integer and returns an integer value. This is called typing.

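#include <cstdio>  // provides printf

// A variable explicitly declared as an integer
int my_integer = 15;

// A function that requires an integer argument and returns an integer
int fun(int num) {
  printf("num = %d \n", num);
  return 0;
}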

Python does not require explicit typing. That can be very useful, but it can also make it very difficult to be clear about what kinds of variables are being used. The virtual_ecosystem project requires static typing of the source code: the syntax for this started with PEP 484, and a set of quality assurance tools has been developed to help support clear and consistent typing. We use mypy to check static typing. It does take a bit of getting used to but is a key tool in maintaining clear code and variable structures.
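As an illustration, the C++ example above could be written in Python with PEP 484 style annotations roughly as follows. This is a sketch for comparison only and is not code from the virtual_ecosystem codebase.

```python
def fun(num: int) -> int:
    """Print an integer and return a status code, mirroring the C++ example."""
    print(f"num = {num}")
    return 0


# The annotation declares the intended type of the variable.
my_integer: int = 15
status = fun(my_integer)

# Python itself would run the call below without complaint, but mypy reports
# an incompatible argument type because "15" is a str and not an int:
# fun("15")
```

The annotations have no effect when the code runs. They are only checked when mypy analyses the source, which is why the check is described as static.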

Suppressing checking#

The pre-commit tools sometimes complain about things that we do not want to change. Almost all of the tools can be told to suppress checking, using comments with a set format to tell the tool what to do.

This should not be done lightly: we are using these QA tools for a reason.

  • Code linting issues identified by ruff can be ignored by adding a # noqa comment that names the rule code, for example # noqa: E501, to ignore that issue on that line.

  • Code formatting changes suggested by ruff-format can be suppressed by adding the # fmt: skip tag at the end of a specific line or by wrapping a section in # fmt: off and then # fmt: on.

  • mypy uses # type: ignore comments to suppress warnings. virtual_ecosystem requires that you provide the specific mypy error code to be ignored, to avoid missing other issues on the same line: # type: ignore[operator].

  • markdownlint catches issues in Markdown files and uses a range of HTML comment tags to suppress format warnings. An example is <!-- markdownlint-disable-line MD001 --> but see the full list of the rule codes for details.
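The Python suppression comments described above might look like this in practice. This is a contrived sketch: the names identity, message and add_one are hypothetical, and in real code it would usually be better to fix the underlying issue than to suppress the check.

```python
# fmt: off
# The formatter leaves this hand-aligned matrix untouched.
identity = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# fmt: on

# A deliberately long line: the trailing comment silences only the E501 rule.
message = "This hypothetical message is deliberately written to run past the configured maximum line length."  # noqa: E501


def add_one(value: object) -> int:
    """Add one to a value that is annotated too loosely for mypy to accept."""
    # mypy rejects "+" between object and int; only that error code is ignored.
    return value + 1  # type: ignore[operator]


total = add_one(3)
```

Naming the specific rule or error code keeps each suppression narrow, so any other problem on the same line is still reported.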