Feature: general purpose data distribution testing to support non-uniform data #129
Open
jprorama wants to merge 12 commits into hpc-io:master from jprorama:feat-data-dist
Conversation
Copy h5bench_write_normal_dist.c as baseline.
Created a new benchmark test name and added selection support into h5bench.py test wrapper.
Added parsing of DATA_DIST_PATH into the params struct to record the data distribution based on an input file.
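The exact h5bench configuration syntax is not shown in this PR, but the idea of pointing a test at a per-rank distribution file might look something like the following sketch (the section and key names here are illustrative assumptions, not the PR's actual schema):

```
# hypothetical h5bench test configuration fragment; key names are illustrative
[TEST]
BENCHMARK = write_var_data_dist
DATA_DIST_PATH = /path/to/dist.txt   # one particle count per rank, one per line
```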
Takes the provided DATA_DIST_PATH file and reads the data sizes per rank from the file, feeding them into the holder array.
Create make targets and add file to install list.
Remove the rank 0 output summary limit so all processes report performance rather than accepting only rank 0's results. This is especially important for non-uniform data distributions, but should be considered for uniform ones as well, since no summary statistics are computed. Recommend --tag-output to track per-rank stdout. Remove the naive total-size computation and limit it to a per-rank size value.
Add a scaling parameter to tests to scale the particle count, in order to create memory footprints that more accurately reflect a data distribution. Data distribution inputs are in particle counts. Particles are 32-byte structures, so a data distribution measured in bytes needs to be scaled down so that the particles instantiated match the actual data footprint, in multiples of 32-byte particles.
Change the deployed binary to "write_var_data_dist" so it matches the configured test reference in h5bench. The binary name needs to match the test name so that it can be called by the h5bench wrapper. This follows the convention of "write_var_normal_dist". Update the code to log the correct benchmark name.
Change actions/upload-artifact from v2 to v4 to remove the dependence on the deprecated v2. The v4 syntax for upload-artifact remains the same, so a simple update of the version number should be sufficient. See the blog post for details: https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/
Update to the latest version of the container to see if it avoids the missing distutils dependency reported with the @0.11 version.
…ama/h5bench into feat-data-dist. Grab the GitHub Actions fixes to ensure the current feature branch's validation tests run successfully.
This PR adds a feature to support data distributions that express non-uniform workloads for HDF5 performance testing. It builds on the h5bench_write_normal_dist.c approach and extends it to allow per-rank data footprint specification via a data distribution configuration file.
It also extends h5bench to allow selection and configuration of the new h5bench_data_dist.c test that encapsulates this feature.