ci: Add GPU benchmarks and configure with just script (#790)
* Upgrade bench runner: 8 vCPUs/32GB RAM -> 32/64

* Add justfile and GPU benchmarks

* Add configurable GPU benchmarks

* fix: checkout & other details (#8)

* fix: name & checkout

* fix just version

* Refactor job triggers

---------

Co-authored-by: François Garillot <4142+huitseeker@users.noreply.github.com>
samuelburnham and huitseeker authored Oct 31, 2023
1 parent 233bf4b commit 2771b3d
Showing 9 changed files with 297 additions and 114 deletions.
59 changes: 59 additions & 0 deletions .github/workflows/bench-deploy.yml
@@ -0,0 +1,59 @@
+name: GPU benchmark on `master`
+on:
+  push:
+    branches:
+      - master
+
+jobs:
+  # TODO: Account for different `justfile` and `bench.env` files
+  # One option is to upload them to gh-pages for qualitative comparison
+  # TODO: Fall back to a default if `justfile`/`bench.env` not present
+  benchmark:
+    name: Bench and deploy
+    runs-on: [self-hosted, gpu-bench, gh-pages]
+    steps:
+      # Install deps
+      - uses: actions/checkout@v4
+      - uses: actions-rs/toolchain@v1
+      - uses: Swatinem/rust-cache@v2
+      - uses: taiki-e/install-action@v2
+        with:
+          tool: just@1.15.0
+      # Set up GPU
+      # Check we have access to the machine's Nvidia drivers
+      - run: nvidia-smi
+      # Check that CUDA is installed with a driver-compatible version
+      # This must also be compatible with the GPU architecture, see above link
+      - run: nvcc --version
+      # Run benchmarks and deploy
+      - name: Get old benchmarks
+        uses: actions/checkout@v4
+        with:
+          ref: gh-pages
+          path: gh-pages
+      - run: mkdir -p target; cp -r gh-pages/benchmarks/criterion target;
+      - name: Install criterion
+        run: cargo install cargo-criterion
+      - name: Run benchmarks
+        run: just --dotenv-filename bench.env gpu-bench fibonacci_lem
+      # TODO: Prettify labels for easier viewing
+      # Compress the benchmark file and metadata for later analysis
+      - name: Compress artifacts
+        run: |
+          echo $LABELS > labels.md
+          tar -cvzf ${{ github.sha }}.tar.gz Cargo.lock ${{ github.sha }}.json labels.md
+      - name: Deploy latest benchmark report
+        uses: peaceiris/actions-gh-pages@v3
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./target/criterion
+          destination_dir: benchmarks/criterion
+      - name: Copy benchmark json to history
+        run: mkdir history; cp ${{ github.sha }}.tar.gz history/
+      - name: Deploy benchmark history
+        uses: peaceiris/actions-gh-pages@v3
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: history/
+          destination_dir: benchmarks/history
+          keep_files: true
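The `Compress artifacts` step above bundles the lockfile, the benchmark JSON and the labels into one archive named after the commit. The following is a minimal sketch of that packaging which can be run outside CI. The function name `pack_bench` and its two arguments are hypothetical stand-ins for `${{ github.sha }}` and `$LABELS`; they are not part of the workflow.

```shell
# Hypothetical helper mirroring the `Compress artifacts` step.
# $1 = commit SHA (stand-in for ${{ github.sha }})
# $2 = labels string (stand-in for $LABELS)
pack_bench() {
  sha="$1"
  labels="$2"
  # Quote the variable so labels containing spaces or glob characters
  # are written verbatim.
  printf '%s\n' "$labels" > labels.md
  # Same file list as the workflow: lockfile, benchmark JSON, labels.
  tar -czf "${sha}.tar.gz" Cargo.lock "${sha}.json" labels.md
}
```

Calling `pack_bench abc123 "some labels"` in a directory that contains `Cargo.lock` and `abc123.json` should leave `abc123.tar.gz` next to them; `tar -tzf abc123.tar.gz` lists the contents.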
@@ -12,9 +12,9 @@ concurrency:
   cancel-in-progress: true

 jobs:
-  run-benchmark:
+  cpu-benchmark:
     name: run end2end benchmark
-    runs-on: ubuntu-benchmark-runner
+    runs-on: buildjet-32vcpu-ubuntu-2204
     if:
       github.event.issue.pull_request
       && github.event.issue.state == 'open'
@@ -35,34 +35,56 @@ jobs:
       - uses: boa-dev/criterion-compare-action@v3
         with:
           # Optional. Compare only this benchmark target
-          benchName: "end2end"
+          benchName: "fibonacci_lem"
           # Needed. The name of the branch to compare with
           branchName: ${{ github.ref_name }}

+  # TODO: Check it works with forked PRs when running
+  # `gh pr checkout {{ github.event.issue.number}}` with `env: GH_TOKEN`
   gpu-benchmark:
     name: run fibonacci benchmark on GPU
     runs-on: [self-hosted, gpu-bench]
     if:
       github.event.issue.pull_request
       && github.event.issue.state == 'open'
-      && contains(github.event.comment.body, '!benchmark')
+      && contains(github.event.comment.body, '!gpu-benchmark')
       && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER')
     steps:
+      # Set up GPU
+      # Check we have access to the machine's Nvidia drivers
+      - run: nvidia-smi
+      # The `compute`/`sm` number corresponds to the Nvidia GPU architecture
+      # In this case, the self-hosted machine uses the Ampere architecture, but we want this to be configurable
+      # See https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
+      # Writes env vars to `bench.env` to be read by `just` command
+      - name: Set env for CUDA compute
+        run: echo "CUDA_ARCH=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | sed 's/\.//g')" >> bench.env
+      - name: set env for EC_GPU
+        run: echo 'EC_GPU_CUDA_NVCC_ARGS=--fatbin --gpu-architecture=sm_${{ env.CUDA_ARCH }} --generate-code=arch=compute_${{ env.CUDA_ARCH }},code=sm_${{ env.CUDA_ARCH }}' >> bench.env
+      # Check that CUDA is installed with a driver-compatible version
+      # This must also be compatible with the GPU architecture, see above link
+      - run: nvcc --version
+
       - uses: xt0rted/pull-request-comment-branch@v2
         id: comment-branch

       - uses: actions/checkout@v4
         if: success()
         with:
           ref: ${{ steps.comment-branch.outputs.head_ref }}
       # Set the Rust env vars
       - uses: actions-rs/toolchain@v1
       - uses: Swatinem/rust-cache@v2
+      # Strict load => panic if .env file not found
+      - name: Load env vars
+        uses: xom9ikk/dotenv@v2
+        with:
+          path: bench.env
+          load-mode: strict
+
       - uses: boa-dev/criterion-compare-action@v3
         with:
           # Optional. Compare only this benchmark target
-          benchName: "fibonacci"
+          benchName: "fibonacci_lem"
           # Optional. Features activated in the benchmark
-          features: "cuda,opencl"
+          features: "cuda"
           # Needed. The name of the branch to compare with
           branchName: ${{ github.ref_name }}
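The `Set env for CUDA compute` and `set env for EC_GPU` steps derive a digits-only architecture number from the GPU's compute capability and splice it into the `nvcc` arguments. Below is a minimal sketch of that transformation, runnable without a GPU. The helper names `cuda_arch_from_cap` and `nvcc_args_for_arch` are hypothetical, and `8.6` is only an example capability value, not one taken from the commit.

```shell
# Hypothetical helper: turn a compute capability such as "8.6" into the
# digits-only form that the workflow stores in CUDA_ARCH.
cuda_arch_from_cap() {
  printf '%s\n' "$1" | sed 's/\.//g'
}

# Hypothetical helper: build the nvcc argument string for a given arch,
# following the template used by the `set env for EC_GPU` step.
nvcc_args_for_arch() {
  arch="$1"
  printf '%s\n' "--fatbin --gpu-architecture=sm_${arch} --generate-code=arch=compute_${arch},code=sm_${arch}"
}

# On a machine with an Nvidia driver the capability would come from:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader
# A literal example value is used here so the sketch runs anywhere.
arch=$(cuda_arch_from_cap "8.6")
echo "CUDA_ARCH=${arch}"
echo "EC_GPU_CUDA_NVCC_ARGS=$(nvcc_args_for_arch "$arch")"
```

The sketch computes the value once in the shell and reuses it. In the workflow, `CUDA_ARCH` is appended to `bench.env`, while `${{ env.CUDA_ARCH }}` reads the Actions `env` context; whether that context is already populated at that step depends on configuration outside this hunk.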
37 changes: 0 additions & 37 deletions .github/workflows/benchmark.yml

This file was deleted.

9 changes: 6 additions & 3 deletions .github/workflows/gpu.yml → .github/workflows/gpu-ci.yml
@@ -2,9 +2,10 @@
 name: GPU tests

 on:
-  push:
-    branches:
-      - master
+  pull_request:
+    types: [opened, synchronize, reopened, ready_for_review]
+    branches: [master]
+  merge_group:

 env:
   CARGO_TERM_COLOR: always
@@ -36,6 +37,7 @@ concurrency:
 jobs:
   cuda:
     name: Rust tests on CUDA
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
     runs-on: [self-hosted, gpu-ci]
     env:
       NVIDIA_VISIBLE_DEVICES: all
@@ -68,6 +70,7 @@ jobs:
   opencl:
     name: Rust tests on OpenCL
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
     runs-on: [self-hosted, gpu-ci]
     env:
       NVIDIA_VISIBLE_DEVICES: all
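The `if:` guard added to both jobs decides whether a job runs or is reported as skipped. As a rough illustration, the same predicate can be written as a shell function and evaluated for a few event/action pairs. The function name `should_run` is hypothetical; `checks_requested` is the activity type GitHub uses for `merge_group` events.

```shell
# Hypothetical helper mirroring the job-level condition
#   github.event_name != 'pull_request' || github.event.action == 'enqueued'
# Returns 0 when the job would run, 1 when it would be skipped.
should_run() {
  event_name="$1"
  action="$2"
  [ "$event_name" != "pull_request" ] || [ "$action" = "enqueued" ]
}

# Evaluate the predicate for the events this workflow subscribes to.
for pair in "pull_request opened" "pull_request synchronize" "merge_group checks_requested"; do
  set -- $pair
  if should_run "$1" "$2"; then
    echo "$1/$2: run"
  else
    echo "$1/$2: skipped"
  fi
done
```

Under this reading, ordinary pull request activity leaves the GPU jobs skipped, and they only execute once the PR enters the merge queue.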
128 changes: 128 additions & 0 deletions .github/workflows/merge-tests.yml
@@ -0,0 +1,128 @@
+# Run final tests only when attempting to merge, shown as skipped status checks beforehand
+name: Merge group tests
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened, ready_for_review]
+    branches: [master]
+  merge_group:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  linux-ignored:
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
+    runs-on: buildjet-16vcpu-ubuntu-2204
+    env:
+      RUSTFLAGS: -D warnings
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+      - uses: actions-rs/toolchain@v1
+      - uses: taiki-e/install-action@nextest
+      - uses: Swatinem/rust-cache@v2
+      - name: Linux Tests
+        run: |
+          cargo nextest run --profile ci --workspace --cargo-profile dev-ci --run-ignored ignored-only -E 'all() - test(groth16::tests::outer_prove_recursion) - test(test_make_fcomm_examples) - test(test_functional_commitments_demo) - test(test_chained_functional_commitments_demo)'
+  linux-arm:
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
+    runs-on: buildjet-16vcpu-ubuntu-2204-arm
+    env:
+      RUSTFLAGS: -D warnings
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+      - uses: actions-rs/toolchain@v1
+      - uses: taiki-e/install-action@nextest
+      - uses: Swatinem/rust-cache@v2
+      - name: Linux Tests
+        run: |
+          cargo nextest run --profile ci --workspace --cargo-profile dev-ci
+      - name: Linux Gadget Tests w/o debug assertions
+        run: |
+          cargo nextest run --profile ci --workspace --cargo-profile dev-no-assertions -E 'test(circuit::gadgets)'
+  mac-m1:
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
+    runs-on: macos-latest-xlarge
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+      - uses: actions-rs/toolchain@v1
+      - uses: taiki-e/install-action@nextest
+      - uses: Swatinem/rust-cache@v2
+      - name: Linux Tests
+        run: |
+          cargo nextest run --profile ci --workspace --cargo-profile dev-ci
+      - name: Linux Gadget Tests w/o debug assertions
+        run: |
+          cargo nextest run --profile ci --workspace --cargo-profile dev-no-assertions -E 'test(circuit::gadgets)'
+  # TODO: Make this a required status check
+  # Run comparative benchmark against master, reject on regression
+  gpu-benchmark:
+    if: github.event_name != 'pull_request' || github.event.action == 'enqueued'
+    name: Run fibonacci bench on GPU
+    runs-on: [self-hosted, gpu-bench]
+    steps:
+      # TODO: Factor out GPU setup into an action or into justfile, it's used in 4 places
+      # Set up GPU
+      # Check we have access to the machine's Nvidia drivers
+      - run: nvidia-smi
+      # Check that CUDA is installed with a driver-compatible version
+      # This must also be compatible with the GPU architecture, see above link
+      - run: nvcc --version
+      - uses: actions/checkout@v4
+      # Install dependencies
+      - uses: actions-rs/toolchain@v1
+      - uses: Swatinem/rust-cache@v2
+      - uses: taiki-e/install-action@v2
+        with:
+          tool: just@1.15
+      - name: Install criterion
+        run: |
+          cargo install cargo-criterion
+          cargo install criterion-table
+      # Checkout base branch for comparative bench
+      - uses: actions/checkout@v4
+        with:
+          ref: master
+          path: master
+      # Copy the script so the base can bench with the same parameters
+      - name: Copy source script to base branch
+        run: cd benches && cp justfile bench.env ../master/benches
+      - name: Set base ref variable
+        run: cd master && echo "BASE_REF=$(git rev-parse HEAD)" >> $GITHUB_ENV
+      - run: echo ${{ env.BASE_REF }}
+      - name: Run GPU bench on base branch
+        run: cd master/benches && just --dotenv-filename bench.env gpu-bench fibonacci_lem
+      - name: Copy bench output to PR branch
+        run: cp master/${{ env.BASE_REF }}.json .
+      - name: Run GPU bench on PR branch
+        run: cd benches && just --dotenv-filename bench.env gpu-bench fibonacci_lem
+      # Create a `criterion-table` and write in commit comment
+      - name: Run `criterion-table`
+        run: cat ${{ github.sha }}.json | criterion-table > BENCHMARKS.md
+      - name: Write bench on commit comment
+        uses: peter-evans/commit-comment@v3
+        with:
+          body-path: BENCHMARKS.md
+      # TODO: Use jq for JSON parsing if needed
+      # Check for benchmark regression based on Criterion's configured noise threshold
+      - name: Performance regression check
+        id: check-regression
+        run: |
+          echo "regress_count=$(grep -c 'Regressed' ${{ github.sha }}.json)" >> $GITHUB_OUTPUT
+      # Fail job if regression found
+      # The comparison must sit inside the expression; with `> 0` outside `${{ }}` the condition is a non-empty string and always passes
+      - uses: actions/github-script@v6
+        if: ${{ steps.check-regression.outputs.regress_count > 0 }}
+        with:
+          script: |
+            core.setFailed('Fibonacci bench regression detected')
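The regression gate counts lines containing `Regressed` in the benchmark JSON and fails the job when the count is positive. A minimal local sketch of the same check follows. The function names `count_regressions` and `check_regression` are hypothetical, and the sketch assumes the JSON file exists and holds one record per line.

```shell
# Hypothetical helper: count the lines that mention 'Regressed'.
count_regressions() {
  # grep -c prints the count but exits 1 when the count is 0;
  # `|| true` keeps the function usable under `set -e`.
  grep -c 'Regressed' "$1" || true
}

# Hypothetical gate: return non-zero when at least one line matched.
check_regression() {
  count=$(count_regressions "$1")
  if [ "$count" -gt 0 ]; then
    echo "Fibonacci bench regression detected ($count)" >&2
    return 1
  fi
  echo "no regression"
}
```

Note that `grep -c` counts matching lines rather than individual occurrences, so two regressions reported on one line would be counted once.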
65 changes: 0 additions & 65 deletions .github/workflows/merge_group.yml

This file was deleted.
