Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exposed window buffer mode #3017

Open
wants to merge 6 commits into
base: dev
Choose a base branch
from
Open

Add exposed window buffer mode #3017

wants to merge 6 commits into from

Conversation

fgenesis
Copy link

@fgenesis fgenesis commented Jan 20, 2022

Related to and discussed in #3004

This change renames ZSTD_d_stableOutBuffer to ZSTD_d_outBufferMode and adds a 3rd setting:

typedef enum {
    ZSTD_bufmode_buffered = 0,
    ZSTD_bufmode_stable = 1,
    ZSTD_bufmode_expose = 2, /* <-- new */
} ZSTD_bufmode_e;

The change is ABI-compatible.
Temporary source compatibility could be done by re-adding the original

#define ZSTD_d_stableOutBuffer ZSTD_d_experimentalParam2

to zstd.h until the parameter is no longer considered experimental.

--- What this does ---

ZSTD_bufmode_expose gives a user direct read access into the LZ-window instead of flushing it with a memcpy.

Instead of fiddling with an external buffer like the usual code:

ZSTD_inBuffer in = { /* compressed data */ };
for(...) {
    char buf[SOME_SIZE];
    ZSTD_outBuffer out = { &buf[0], sizeof(buf), 0 };
    size_t r = ZSTD_decompressStream(dctx, &out, &in)
    .... handle error ....
    fwrite(out.dst, 1, out.pos, file);
}

we can now do this, completely avoiding an extra copy:

ZSTD_DCtx_setParameter(dctx, ZSTD_d_outBufferMode, ZSTD_bufmode_expose)
ZSTD_inBuffer in = { /* compressed data */ };
ZSTD_outBuffer out = { 0, 0, 0 };
for(...) {
    size_t r = ZSTD_decompressStream(dctx, &out, &in)
    .... handle error ....
    if(out.size) { /* stays 0 until there is output */
        fwrite(out.dst, 1, out.size, file);
        out.size = 0; out.dst = NULL; /* acknowledge */
    }
}

This is very nice for a streaming scenario where the consumer does not care about the exact size of the decompressed data and where the buffer is located in memory, as described in the ryg blog.
I guess avoiding the extra copy should also cause less cache thrashing but i didn't try to measure that.

Things left to discuss:
ZSTD_bufmode_expose is currently only implemented for decompression.
Since compression is quite resource-hungry this would not benefit from an exposed window. But it'd be nice to have the same API available for compression as well.
What about renaming ZSTD_c_stable{In|Out}Buffer to ZSTD_c_outBufferMode for consistency? (Didn't touch that so far because i wanted to avoid cluttering up the PR more that it is already.)
Anything else left to do?

EDIT: Except for the C90 compat. Clang and vs19 are happy so i didn't see this. Oh well.
EDIT2: Fixed derp in the external buffer example block above, that was supposed to fwrite using pos, not size. In the exposed buffer block below, pos is always == 0. I considered setting output pos=size to ease a transition to this type of buffer handling but it caused some annoying side effects in the code that checks/handles pos so i didn't go there (mainly the pos<=size invariant, didn't want to touch that)

(cc @terrelln)

@terrelln
Copy link
Contributor

This is an interesting feature. I haven't looked at the code yet, but I don't see anything obviously wrong with the API you're describing. I like the requirement to acknowledge that you've consumed the buffer by setting it to 0.

I'm going to want to think about the API implications a bit with @Cyan4973, but I think this feature has a strong chance of being accepted.

@felixhandte
Copy link
Contributor

This is a really interesting idea, and it would be great to give users a tool to avoid this copy.

My fundamental concern is that the ZSTD_outBuffer struct keeps a non-const pointer to the buffer. This is necessary for user-managed buffers, but we don't want to expose the internal output buffer as writable to the user. Which is unfortunate, because that means this won't work through overloading the outBuffer semantics.

So this is probably not a solution we want to ship. But I would very much be open to discussing other possible solutions.

@fgenesis
Copy link
Author

fgenesis commented Feb 8, 2022

Why is this a problem? It's an advanced option to be enabled by someone who wants this feature and knows the implications.
The assembly generated for a const vs non-const ptr is the same, and nobody can stop users from casting a const away from any pointer and going to town. Then it's a classic case of RTFM =)

But ok, if there's a better solution please go for it.
A possible solution would be to add an extra function that fills a small struct similar to ZSTD_outBuffer with a const pointer + size to the window, and make that exposed bugmode flag just skip the memcpy() and get out.
The user then has to call this function to get the window, do whatever with the pointer, and call the decompressor again to refill the window with new data.

Pros:

  • Don't have to set to 0 to achnowledge. That was mainly to prevent old pointers being stuck in the struct, giving the user the idea that something was decompressed while there really was not

Cons:

  • Does the function return the same pointer everytime? Probably not in case the window reallocates?
  • It's an extra function

Thoughts?

@felixhandte
Copy link
Contributor

Why is this a problem?

It's certainly something we could disclaim in documentation, but I would very strongly prefer not to have the code lie about its contract with the user. Most people don't read the documentation, and the code itself needs as much as possible to communicate and enforce expectations about its use.

A new API is probably the right way to proceed. Especially since we also need some way to coordinate with zstd how much of the buffer has been consumed. I guess in this existing mode, it's implicit that the user consumes the whole window that's being returned, but I don't see that the code actually updates what will be returned on subsequent calls based on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants