Feature: Test suite partitioning #449

@frenchy64

CI platforms like GitHub Actions and CircleCI support matrix builds, which can fan out a test suite across a number of parallel jobs.

For this to result in faster builds, the test runner must be able to partition a test suite.

An Actions build might look like this:

jobs:
  test:
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        id: [0,1,2,3,4]
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}

This would cover the entire test suite by running:

./bin/kaocha --partition-index 0 --partitions 5
./bin/kaocha --partition-index 1 --partitions 5
./bin/kaocha --partition-index 2 --partitions 5
./bin/kaocha --partition-index 3 --partitions 5
./bin/kaocha --partition-index 4 --partitions 5

You could imagine different strategies for partitioning:

  • split by test namespace
    • each job only loads the namespaces it actually runs
  • load all namespaces, split by deftest
    • share fixtures?
  • use timing results from prior runs to load-balance tests
  • have Kaocha tell CI how many partitions are needed to finish the build within a target time, e.g.,
jobs:
  setup:
    runs-on: ubuntu-22.04
    outputs:
      partitions: ${{steps.partitions.outputs.partitions}}
    steps:
      - uses: actions/cache/restore@v4
        with:
          path: timings.edn
      - id: partitions
        run: echo "partitions=$(./bin/kaocha --print-partitions --target-time 5m --prior-timings timings.edn | bb -e '(-> *input* range json/encode println)')" >> "$GITHUB_OUTPUT"

  test:
    runs-on: ubuntu-22.04
    needs: setup
    strategy:
      matrix:
        id: ${{ fromJSON(needs.setup.outputs.partitions)}}
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}
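The timing-based ideas in the list above could be sketched roughly as follows. This is only an illustration in Python, not Kaocha code: the `timings` map (test name to seconds from a prior run) and the function names `partitions_needed` and `balance` are hypothetical, and the estimate is a lower bound that ignores per-job startup cost and any single test longer than the target.

```python
import heapq
import math


def partitions_needed(timings, target_seconds):
    """Lower-bound estimate of the partitions needed to stay near the target time."""
    total = sum(timings.values())
    return max(1, math.ceil(total / target_seconds))


def balance(timings, partitions):
    """Greedy longest-first split: each test goes to the currently lightest partition."""
    # Heap entries are (accumulated seconds, partition index).
    heap = [(0.0, i) for i in range(partitions)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(partitions)]
    # Sort by duration descending, then by name, so ties resolve the same way in every job.
    for name, seconds in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        assignment[i].append(name)
        heapq.heappush(heap, (load + seconds, i))
    return assignment


if __name__ == "__main__":
    prior = {"a-test": 120, "b-test": 90, "c-test": 60, "d-test": 30}
    n = partitions_needed(prior, target_seconds=150)
    print(n, balance(prior, n))
```

Because every job computes the same assignment from the same timings file, each job can pick its own slice by index without any coordination.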

The partitioning algorithm must be deterministic and reproducible, and every test must be run by some partition. It should assume that every --partition-index is run; ensuring that is the user's responsibility (or could be packaged in a reusable Action). The simplest algorithm might be to sort tests by name before partitioning. Test order could still be randomized by using the current git SHA as a seed, since every job in a build shares it.
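A minimal sketch of that simplest algorithm, written in Python for illustration only. The function name `partition_tests` and its parameters are hypothetical and do not reflect anything Kaocha provides; the seed is assumed to be a string such as the current git SHA.

```python
import hashlib
import random


def partition_tests(test_names, partitions, index, seed=None):
    """Return the tests assigned to partition `index` out of `partitions`."""
    if not 0 <= index < partitions:
        raise ValueError("index must satisfy 0 <= index < partitions")
    # Sort (and de-duplicate) first so every job starts from the same order.
    ordered = sorted(set(test_names))
    if seed is not None:
        # Derive an integer seed from the string (e.g. a git SHA) so that
        # all jobs in one build shuffle identically.
        digest = hashlib.sha256(seed.encode("utf-8")).digest()
        random.Random(int.from_bytes(digest[:8], "big")).shuffle(ordered)
    # Round-robin: the test at position i goes to partition i mod partitions.
    return ordered[index::partitions]


if __name__ == "__main__":
    tests = ["b/t2", "a/t1", "c/t3", "a/t0", "d/t4", "b/t5", "e/t6"]
    for i in range(3):
        print(i, partition_tests(tests, 3, i))
```

Taken over all indices, the slices are disjoint and together cover the sorted list, which is the coverage property the issue asks for, provided every index is actually run.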
