add support for optional progress updates channel #661

Open: wants to merge 1 commit into base: main

@deitch deitch commented Dec 21, 2023

If CopyOptions.UpdateChannel is non-nil, then with each block copied during a Copy*(), it sends an update down the channel. The update includes the amount copied in this update, and the descriptor of what is being copied. This descriptor lets the consumer determine what blob it applies to, and what the expected total size is (which can be used, e.g. to calculate percentages or create a progress bar).

It does not include relative sizes of the whole root desc or tag in the Copy(), because we have no way of knowing that; you would have to walk the entire tree first. Granted, you can get it from manifests first, and only then the more expensive other elements, but that would require a complete restructure. This is more than good enough for now.

Discussed in Slack with @Wwwsylvia

@deitch
Contributor Author

deitch commented Dec 21, 2023

Sylvia did raise the following questions. Adding here for follow-on discussion:

  1. It is not clear to me how the caller reads the channel. Is the channel generated internally or is it passed in by the caller? Is it per-blob?
  2. How do we report progress for the skipped (existing) blobs? What if the caller (e.g. ORAS CLI) wants to show 100% for the skipped blobs (as shown in the screenshot)?
  3. What if the caller wants to track the progress of PushBytes? Is there a way to reuse this mechanism?

Basically, the idea is that we want to make sure the APIs of oras-go are generic and extensible.

@codecov-commenter

codecov-commenter commented Dec 21, 2023

Codecov Report

Attention: Patch coverage is 6.25000% with 15 lines in your changes missing coverage. Please review.

Project coverage is 75.27%. Comparing base (0b78aa6) to head (cd02b49).
Report is 34 commits behind head on main.

Files with missing lines Patch % Lines
progress_reader.go 0.00% 8 Missing ⚠️
copy.go 12.50% 5 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #661      +/-   ##
==========================================
- Coverage   75.46%   75.27%   -0.19%     
==========================================
  Files          59       60       +1     
  Lines        5640     5654      +14     
==========================================
  Hits         4256     4256              
- Misses       1019     1032      +13     
- Partials      365      366       +1     


@deitch
Contributor Author

deitch commented Dec 21, 2023

Some responses:

It is not clear to me how the caller reads the channel. Is the channel generated internally or is it passed in by the caller?

Passed in by the caller, CopyOptions.UpdateChannel. If it is nil, we do nothing different; if it is non-nil, we send updates to it with each copy.

Is it per-blob?

No, one channel, but each update includes the descriptor. See the extended comment on UpdateChannel.

How do we report progress for the skipped (existing) blobs?

I hadn’t thought about that. I think I would send a single update of 100% on existing blobs. But that can be subject for discussion. I think I would do this first, then do that in a follow-on. One improvement at a time.

What if the caller wants to track the progress of PushBytes? Is there a way to reuse this mechanism?

I don’t see why not. All this is, is a mechanism for sending updates in a channel. The issue here is that there are no “opts” on which to add it

@deitch
Contributor Author

deitch commented Dec 21, 2023

Rebased

Contributor

@shizhMSFT shizhMSFT left a comment

A complete and sound progress updating framework is non-trivial for both oras-go v1 and v2. Thanks for the initial attempt and let's see if we can iterate it to a better design.

// Updates are sent each time a block is copied. The number of bytes copied
// depends upon io.Copy, which, by default, is 32KB. As of now, this cannot
// be changed. We may provide that capability in a future update.
UpdateChannel chan<- CopyUpdate

Contributor

Channels are blocking. If someone sets opts.UpdateChannel = make(chan<- CopyUpdate) and never consumes the updates, the Copy() will be blocked forever.

Contributor Author

Yeah, that is a good point. I was thinking about that, wasn't sure quite how to handle it. Here are some possibilities:

  • send to the channel only from within a goroutine. That could lead to a lot of goroutines if it blocks. See below.
  • instead of passing a channel, pass a function call. Each such call would be in a goroutine, but again, same issue. See below.

For either of the above, maybe we have one goroutine and one channel we own. Main routine publishes to our channel (we control it, so we can ensure no blocking issues), which then either publishes to the passed channel (option 1 above) or calls the passed function (option 2). At least we don't block our main routine, although if their function blocks, our goroutine still is blocked.

Is a func call better than a goroutine? In theory, they could use a channel in there. Or have a slow func call.
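
The "one goroutine and one channel we own" idea can be sketched as follows. This is an illustration under stated assumptions, not a proposal for the final API: `notifier`, `Update`, `newNotifier`, `Publish`, and `Close` are hypothetical names, and the sketch chooses to drop updates when the internal buffer is full so that the copy path never blocks on a slow callback.

```go
package main

import (
	"fmt"
	"sync"
)

// Update is a hypothetical progress message.
type Update struct {
	Digest string
	Copied int64
}

// notifier owns a buffered channel and a single goroutine that invokes the
// caller-supplied callback. The copy path only ever talks to Publish.
type notifier struct {
	ch chan Update
	wg sync.WaitGroup
}

func newNotifier(buffer int, fn func(Update)) *notifier {
	n := &notifier{ch: make(chan Update, buffer)}
	n.wg.Add(1)
	go func() {
		defer n.wg.Done()
		for u := range n.ch {
			fn(u) // a slow callback stalls only this goroutine
		}
	}()
	return n
}

// Publish never blocks. It returns false when the buffer is full and the
// update was dropped.
func (n *notifier) Publish(u Update) bool {
	select {
	case n.ch <- u:
		return true
	default:
		return false
	}
}

// Close stops accepting updates and waits until the callback has seen
// everything that was buffered.
func (n *notifier) Close() {
	close(n.ch)
	n.wg.Wait()
}

func main() {
	n := newNotifier(16, func(u Update) {
		fmt.Printf("%s: +%d bytes\n", u.Digest, u.Copied)
	})
	for i := 0; i < 3; i++ {
		if !n.Publish(Update{Digest: "sha256:example", Copied: 32 * 1024}) {
			fmt.Println("update dropped")
		}
	}
	n.Close()
}
```

The trade-off is the one raised above: the main routine is protected, but a callback that blocks forever still leaves the notifier goroutine stuck, and Close would then wait indefinitely.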

Comment on lines +116 to +127
// UpdateChannel is an optional channel to receive progress updates.
// Each update will include the number of bytes copied for a particular blob
// or manifest, the expected total size, and the descriptor of the blob or
// manifest. It is up to the consumer of the channel to differentiate
// between updates among different blobs and manifests; no mechanism is
// provided for distinguishing between them, other than the descriptor
// passed with each update. The total size of downloads of all blobs and
// manifests is not provided, as it is not known. You can calculate the
// percentage downloaded for a particular blob in an individual update
// based on the total size of that blob, which is provided in the
// descriptor, and the number of bytes copied, which is provided in the
// update.

Contributor

Presenting a progress bar of an operation isn't an easy job. Eventually, we will end up with an MVC pattern if we continue iterating.

Therefore, we need to define data models first for oras operations like Copy, PushBytes, etc.

Contributor

In fact, the oras CLI has a data model, as does buildkit.

Contributor

/cc @qweeah

Contributor Author

Presenting a progress bar of an operation isn't an easy job

True, but I am not concerned with that here. oras CLI might, but the Copy() and such library calls should not care. They only should worry about some library-centric way of publishing updates. A CLI (like oras) or other consumer can do with them what they want.

we need to define data models

The data model is simple: a stream of a defined struct (or interface, if you prefer), each of which contains the number of bytes transferred and the descriptor to which it applies. Anything higher level would be outside the scope of this type of option, I think.

copy.go Outdated
Comment on lines 128 to 130
// Updates are sent each time a block is copied. The number of bytes copied
// depends upon io.Copy, which, by default, is 32KB. As of now, this cannot
// be changed. We may provide that capability in a future update.

Contributor

Not really. The copy operation does not necessarily depend on io.Copy. It is purely based on the implementation of the destination Target. For instance, copying from any target to memory / oci targets does not call io.Copy.

Contributor Author

Fair enough. We pass the target an io.ReadCloser, so something is calling Read() multiple times. We don't really care what it is, as long as it calls Read() (which it does). I can update the comment here.

Contributor Author

I updated this comment; take a look.

Comment on lines +291 to +298
if ch != nil {
rc = &progressReader{
c: ch,
r: rc,
}
}

Contributor

Wrapping a ReadCloser may carry a performance penalty, as some types implement io.WriterTo, which triggers an optimization in the built-in io.Copy() function.

Contributor Author

Are you saying that wrapping the Read() call itself has a performance impact? Or that if the original ReadCloser implements WriterTo and the wrapper does not, we lose the ability to WriteTo? I think the second of these.

What do you suggest? That we check for WriterTo, like io.Copy() does, and if the original implements it, implement WriteTo() on the wrapper?

How would we then capture the updates?

}

func (p *progressReader) Close() error {
close(p.c)

Contributor

Closing the channel is dangerous as the channel is shared across all nodes. It panics if other nodes get updates.

In other words, the caller has the responsibility to close the channel.
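
The failure mode can be reproduced in isolation. In this small sketch, `trySend` is a hypothetical helper that attempts a send and uses recover to report whether the send panicked; closing the channel stands in for the first blob's reader calling Close while other blobs are still in flight.

```go
package main

import "fmt"

// trySend attempts a send and reports whether it panicked, which is what
// happens when the channel has already been closed.
func trySend(ch chan<- int, v int) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch <- v
	return false
}

func main() {
	ch := make(chan int, 2)
	fmt.Println("send before close panicked:", trySend(ch, 1))

	// Simulates one blob's reader closing the shared channel.
	close(ch)

	fmt.Println("send after close panicked:", trySend(ch, 2))
}
```

Recovering from the panic is shown only to make the behavior observable. The cleaner rule is the one stated above: only the caller that created the channel closes it, after Copy has returned.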

Contributor Author

Yeah, very good point. I will change that and add the comment.

Contributor Author

Updated

@deitch deitch force-pushed the progress-updates branch 2 times, most recently from ebc7070 to 7fa4d37 Compare December 21, 2023 15:34
@shizhMSFT shizhMSFT added the experimental Issues or pull requests depending on WIP specs label Dec 26, 2023
@deitch
Contributor Author

deitch commented Dec 27, 2023

Going to rebase to get it back in line with main; any thoughts on the feedback @shizhMSFT ?

@deitch
Contributor Author

deitch commented Dec 27, 2023

Rebased

@shizhMSFT
Contributor

Going to rebase to get it back in line with main; any thoughts on the feedback @shizhMSFT ?

@deitch I think this PR is a good brainstorming idea and is worth an issue or discussion so that we can design it holistically with production-level quality.

@deitch
Contributor Author

deitch commented Dec 28, 2023

PR is a good brainstorming idea and is worth an issue or discussion so that we can design it holistically with production-level quality

I have no idea what that means. If the overall direction is good, let's fix this and get it in. If not, let's redo it.

@shizhMSFT
Contributor

It means that we need an issue or discussion for a systematic design, as this PR is a feature request proposal. The reason behind it is that oras-go/v2 is now at a stable state and we should introduce a new feature (i.e. merge to main) with the best design we can achieve, to avoid future breaking changes as much as we can. (Note: breaking changes require us to introduce oras-go/v3.)

Since issues and discussions are good places to propose a feature request, would you like to open one with details like the motivation of the feature request (i.e. what would you like to be added? why is this needed for oras?)? We should have an issue template for oras-go but it is still in progress (tracked by #641; the oras repo has one). With an open issue / discussion, more maintainers like @sajayantony, @TerryHowe, and @sabre1041 can join the discussion and give input so that this conversation is not just between you and me or @Wwwsylvia. Of course, you may reference this PR in the new issue / discussion as an implementation candidate (as may others). #650 and #651 by @ktarplee are good examples.

Meanwhile, I would like to continue the discussions on the aspects of API design, concurrency (goroutine safety), performance, and security so that we can iterate on your idea and make oras-go a better library.

@deitch
Contributor Author

deitch commented Dec 28, 2023

Sure. This all started with an extended discussion we had on Slack with @Wwwsylvia . Part of the issue is that this capability existed (in a different way, but still) in v1. The move to v2, with all of its many benefits, lost this capability. This is trying to get it back, so that downstream dependencies that want to switch from v1 to v2 can do so without loss of ability.

BTW, if this PR changes an API, and there is a way to do it without changing it (i.e. backwards-compatible), then by all means. If I recall correctly, that was the main driver behind the variadic options in the main calls in v1. We knew we could add features going forward without changing the API.

@sajayantony
Contributor

sajayantony commented Jan 3, 2024

I think this is general goodness and so I'm in support of landing something that is feasible and generally stable.
Can someone give a code example of how this API might be consumed, with maybe a warning of how it can be abused as well?

@deitch maybe a short hackmd doc or something that we can use for API review from the consumer aspect?

@deitch
Contributor Author

deitch commented Jan 4, 2024

Can someone give a code example of how this API might be consumed
maybe a short hackmd doc or something

I will put something in a comment here. It looks like there is a question about whether or not this API is the right one, or even about putting in the changes. Once we have general agreement on adding it and the approach, I am happy to put a formal example and doc anywhere we want.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
@deitch
Contributor Author

deitch commented Jan 4, 2024

Also rebased while I was at it

@deitch
Contributor Author

deitch commented Jan 4, 2024

Simple example:

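ch := make(chan oras.CopyUpdate, 1000)
opts := oras.CopyOptions{
    UpdateChannel: ch,
}
// Start the consumer before calling Copy so the buffer is drained while the copy runs.
go func(ch <-chan oras.CopyUpdate) {
    for msg := range ch {
        fmt.Printf("copied %d bytes out of %d total for %s\n", msg.Copied, msg.Descriptor.Size, msg.Descriptor.Digest)
    }
}(ch)
desc, err := oras.Copy(ctx, src, tagName, dst, tagName, opts)
// The caller owns the channel and closes it once Copy has returned,
// which also ends the consumer loop above.
close(ch)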

I realize that we probably would need to extend the channel type so there is a way to send "complete" or "error". But I am waiting until the approach is agreed upon.

As for abusing, it is pretty straightforward. Have a channel, do not buffer it, or do not read from it until the buffer is full.

We could make the channel send (in this PR) non-blocking. That means that it would risk lost messages, but I find that an acceptable risk, much more so than, "user abuses it and the whole thing blocks."
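
A non-blocking send can be sketched with select and a default case. One way to soften the lost-message risk, offered here as an assumption and not something the PR does, is to send cumulative totals instead of per-read deltas, so that the next delivered update supersedes any dropped ones. `Update` and `sender` are hypothetical names.

```go
package main

import "fmt"

// Update carries the cumulative bytes copied so far for one blob
// (assumption: cumulative totals rather than per-read deltas).
type Update struct {
	Digest string
	Total  int64
}

// sender tracks the running total for one blob and counts dropped sends.
type sender struct {
	ch      chan<- Update
	digest  string
	total   int64
	dropped int
}

// add records n more bytes and attempts a non-blocking send of the new
// running total. If the channel is full, the update is dropped.
func (s *sender) add(n int64) {
	s.total += n
	select {
	case s.ch <- Update{Digest: s.digest, Total: s.total}:
	default:
		s.dropped++
	}
}

func main() {
	ch := make(chan Update, 1) // deliberately tiny buffer, no consumer yet
	s := &sender{ch: ch, digest: "sha256:example"}

	s.add(10) // fills the buffer
	s.add(20) // dropped
	s.add(30) // dropped
	fmt.Println("dropped:", s.dropped)

	fmt.Println("first delivered total:", (<-ch).Total)
	s.add(5) // buffer has room again
	fmt.Println("next delivered total:", (<-ch).Total)
}
```

With deltas, the two dropped updates would leave the consumer permanently short. With cumulative totals, the consumer catches up as soon as any later update gets through, at the cost of a final update that may itself be dropped unless completion is signalled separately.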


This PR is stale because it has been open for 45 days with no activity. Remove the stale label or comment to prevent it from being closed in 30 days.

@github-actions github-actions bot added the stale Inactive issues or pull requests label Oct 15, 2024
@deitch
Contributor Author

deitch commented Oct 15, 2024

Remove stale

@deitch
Contributor Author

deitch commented Oct 15, 2024

@sajayantony how can we bring this back to life?

@github-actions github-actions bot removed the stale Inactive issues or pull requests label Oct 16, 2024
@shizhMSFT
Contributor

According to the discussions on Slack, further topics to cover include:

  • What does the view model look like?
  • What's the model of an update? How to determine the frequency of the update?
  • How to emit the update?
  • How do we ensure API flexibility? Not everyone likes channels.
  • How do we ensure performance is not degraded?
  • ...

@deitch will open a new feature request issue to track properly.

@deitch
Contributor Author

deitch commented Oct 21, 2024

Opened #839 ; I am sure I missed something, but that should be good as a start
