Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for zstd compression #1247

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

EdgarModesto23
Copy link

@EdgarModesto23 EdgarModesto23 commented Oct 6, 2024

Following #1200

This patch adds support for zstd compression and it updates build scripts and the main README to include the new -DPISTACHE_USE_CONTENT_ENCODING_ZSTD flag. I mostly followed the style suggested by @kiplingw on #1177 and #1178.

It uses version 1.5.6 of the C implementation of zstd library.

@kiplingw kiplingw added enhancement fix in progress dependencies Pull requests that update a dependency file experimental labels Oct 6, 2024
// Allocate a smart buffer to contain compressed data...
std::unique_ptr compressedData = std::make_unique<std::byte[]>(estimated_size);

// Compress data using compresion_level = 5: https://facebook.github.io/zstd/zstd_manual.html#Chapter5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently don't have a public API for the user to specify the compression level (each compressor can use a different range). Until we figure out an elegant way to do that, it might be best to set the compression level consistent with whatever we're using on the others (I think max).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe max is really max on zstd. We could also set compression level dynamically, or partially dynamically, taking into account available CPU bandwidth, available network bandwidth, number of cores (zstd is good with multithreading) etc.

But maybe a middling value is wisest until we do a well-thought-through API/algorithm for compression level?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgreatwood @kiplingw I agree. Using a middle value makes the most sense to me. At least until we find a better solution as both of you guys mentioned.

version.txt Outdated Show resolved Hide resolved
@@ -77,7 +81,8 @@ foreach test_name : pistache_test_files
gmock_dep,
curl_dep,
cpp_httplib_dep,
brotli_dep
brotli_dep,
zstd_dep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that libzstd-dev also needs to be added to Build-Depends in debian/control. That's in a separate branch and so you'll need a separate PR for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @EdgarModesto23 -

I see a number of the github workflows (https://github.com/pistacheio/pistache/actions) are failing, e.g. with:
tests/meson.build:71:1: ERROR: Unknown variable "zstd_dep".
Some of that may be related to the need for an update to debian/control. Certainly the way brotli is dealt with should be a good model.

The workflows linuxflibev, linux and macOS libevent should all pass (occasionally the debian:testing tests under linux workflow will fail if debian:testing is in a funny state, but in fact it seems OK at present). The conventional-commits and abidiff are less important (just make sure you bump version number if abidiff changes).

Copy link
Contributor

@dgreatwood dgreatwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding http_server_test.cpp -
There are a lot of formatting changes that amount to indenting code that lies within encapsulating (scope limiting) curly braces, like:

{ // encapsulate
    // Indent this line by 4 more spaces
    // Etc.
}

The extra indentation that has been added is 100% logical, however it was not added before so that the encapsulating braces can be added or removed without causing a "diff" for each of the enclosed lines.

Would it be practical to revert the formatting changes from http_server_test.cpp, leaving only the substantive changes to be reviewed?

Copy link

codecov bot commented Oct 12, 2024

Codecov Report

Attention: Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 76.26%. Comparing base (e07dc0b) to head (acb30d4).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
src/common/http_header.cc 33.33% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1247      +/-   ##
==========================================
- Coverage   76.29%   76.26%   -0.04%     
==========================================
  Files          58       58              
  Lines       10027     9698     -329     
==========================================
- Hits         7650     7396     -254     
+ Misses       2377     2302      -75     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kiplingw
Copy link
Member

@Tachi107, your feedback is appreciated.

Copy link
Member

@Tachi107 Tachi107 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some feedback in a few different comments, check it out!

Also, the 1.5.6 release of zstd seems to have specific improvements to usage in web contexts. You might want to see what changed and evaluate if tweaking your implementation is worthwhile.

include/pistache/http.h Outdated Show resolved Hide resolved
include/pistache/http.h Outdated Show resolved Hide resolved
src/common/http.cc Outdated Show resolved Hide resolved
meson_options.txt Outdated Show resolved Hide resolved
Comment on lines 935 to 936
auto compress_size = ZSTD_compress((void*)compressedData.get(), estimated_size,
data, size, 11);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use reinterpret_cast instead of C-style casts. Also, what's this "11"? You've added a contentEncodingZstdLevel variable, why don't you use it?

Also, in any case, I believe that it should be initialized to the value returned by ZSTD_defaultCLevel() (or ZSTD_CLEVEL_DEFAULT). Alternatively, you might be able to specify 0, as by reading zstd's manual it seems that a value of 0 for the compression level means default - I'm not sure though!

@@ -1297,6 +1301,12 @@ struct ContentEncodingHandler : public Http::Handler
switch (encoding)
{

#ifdef PISTACHE_USE_CONTENT_ENCODING_ZSTD
case Http::Header::Encoding::Zstd:
writer.setCompressionZstdLevel(ZSTD_btultra2);
Copy link
Member

@Tachi107 Tachi107 Oct 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that ZSTD_btultra2 is doing what you expect here? That enum value is part of ZSTD_strategy, which as far as I can tell has a different meaning compared to the raw "compression level". You might want to use ZSTD_maxCLevel() instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the ZSTD_lazy2 issue above. I got a bit confused with the documentation. I changed it to be ZSTD_maxCLevel() as you suggested.

version.txt Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file enhancement experimental fix in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants