arm: Implement SVE2 str2int #95

Merged on Oct 14, 2024 (1 commit)

Contributor

@supermartian supermartian commented Sep 11, 2024

This patch implements an SVE2 version of str2int, which improves number decoding on ARM CPUs that support SVE2.

The algorithm uses MATCH (the `svmatch` intrinsic) to count the valid digits in the string and SDOT to calculate the value. It replaces the naive byte-by-byte method previously used on ARM. Note that MATCH is available only with SVE2, so this code path requires SVE2 hardware.

To enable it, pass the CMake option `-DENABLE_SVE2_128=ON` when building.
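
The two steps described above (counting digits, then computing the value as a dot product) can be illustrated with a portable scalar sketch. This is not the code from the patch and uses no SVE2 intrinsics: the function names, the group size of four, and the weight table are assumptions chosen for illustration only. An SVE2 version would do the digit test on a whole vector of bytes at once and the multiply-accumulate with SDOT.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Step 1 (scalar stand-in for the MATCH step): count the leading bytes
// that are ASCII digits. Hypothetical helper, not part of the patch.
inline std::size_t count_leading_digits(std::string_view s) {
  std::size_t n = 0;
  while (n < s.size() && s[n] >= '0' && s[n] <= '9') ++n;
  return n;
}

// Step 2 (scalar stand-in for the SDOT step): combine the first n digits
// as a dot product of digit values with powers of ten, four digits per group.
// Assumes n <= 19 so the result fits in 64 bits; overflow is not checked.
inline std::uint64_t digits_to_u64(std::string_view s, std::size_t n) {
  static constexpr std::uint64_t kWeights[4] = {1000, 100, 10, 1};
  std::uint64_t value = 0;
  std::size_t i = 0;
  // Full groups of four digits: one 4-element dot product per group.
  for (; i + 4 <= n; i += 4) {
    std::uint64_t group = 0;
    for (std::size_t k = 0; k < 4; ++k) {
      group += static_cast<std::uint64_t>(s[i + k] - '0') * kWeights[k];
    }
    value = value * 10000 + group;
  }
  // Remaining one to three digits are folded in one at a time.
  for (; i < n; ++i) {
    value = value * 10 + static_cast<std::uint64_t>(s[i] - '0');
  }
  return value;
}

// Parse the unsigned integer prefix of s and report how many bytes were used.
inline std::uint64_t parse_leading_uint(std::string_view s, std::size_t* consumed) {
  const std::size_t n = count_leading_digits(s);
  *consumed = n;
  return digits_to_u64(s, n);
}
```

Splitting the work this way means the digit count is known before any arithmetic starts, which is what lets a vector implementation pick the weights for a whole block of digits up front instead of branching on every byte.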

The numbers below were measured on NVIDIA Grace.

| Benchmark                      | Original     | SVE2       | Improvement |
|--------------------------------|--------------|------------|-------------|
| testdata/gsoc-2018.json        | 3406307500   | 3407847500 | 0.05%       |
| testdata/twitter.json          | 2040162500   | 2039502500 | -0.03%      |
| testdata/fgo.json              | 992470750    | 988730000  | -0.38%      |
| testdata/citm_catalog.json     | 2209855000   | 2205997500 | -0.17%      |
| testdata/twitterescaped.json   | 1767812500   | 1814302500 | 2.63%       |
| testdata/github_events.json    | 2142690000   | 2147452500 | 0.22%       |
| testdata/lottie.json           | 625985250    | 602068500  | -3.82%      |
| testdata/poet.json             | 2867745000   | 2880562500 | 0.45%       |
| testdata/otfcc.json            | 685481500    | 683760750  | -0.25%      |
| testdata/book.json             | 687309000    | 674716000  | -1.83%      |
| testdata/canada.json           | 775475000    | 1051945000 | 35.65%      |

This PR is contributed by NVIDIA

@supermartian supermartian force-pushed the sve2-str2int branch 2 times, most recently from 895c590 to 5b2d104, on September 11, 2024 16:49
@xiegx94
Collaborator

xiegx94 commented Oct 8, 2024

@supermartian format codes with clang-format

Signed-off-by: Yuzhong Wen <yuwen@nvidia.com>
@supermartian
Contributor Author

> @supermartian format codes with clang-format

Thanks for the review! Formatting done.

@athosServer

Does the v1.0.1 release version support ARM architecture CPUs?

@xiegx94
Collaborator

xiegx94 commented Oct 14, 2024

> Does the v1.0.1 release version support ARM architecture CPUs?

yes

Collaborator

@xiegx94 xiegx94 left a comment

LGTM

@xiegx94 xiegx94 merged commit d43066f into bytedance:master Oct 14, 2024
23 checks passed
3 participants