Question: behavior for large messages and/or histories #198

Open
sebastianburckhardt opened this issue Nov 3, 2022 · 4 comments
Labels: documentation (Improvements or additions to documentation), question (Further information is requested)

Comments

@sebastianburckhardt
Member

Netherite does not impose any hard limits on the size of messages or histories. But of course, the question remains what happens as messages or histories get very large, i.e. what breaks first. I created this issue to track discussion, testing, and documentation around this question.

Some thoughts on this:

  • Everything in the system (i.e. not only the specific orchestrations that contain large messages or histories) is likely to slow down substantially when storage bandwidth or inter-partition bandwidth (e.g. Event Hubs throughput) becomes a bottleneck. The system should keep working under such circumstances but may be too slow to serve its intended purpose.

  • All in-flight messages are kept in memory (in the outboxes and the session buffers on the source and destination partitions, respectively), so we may run out of memory when large messages accumulate faster than they are processed.

  • Netherite keeps the in-memory instance cache within the configured cache limits. If workers need to process histories or messages that exceed the memory limits of the cache, the result is thrashing, which makes the system perform very poorly. It is therefore important to increase the cache size when trying to handle such situations (a host.json sketch follows below).

  • Page blobs have a maximum size of 1 TB, and all of the data in the task hub partition, including all instance states, histories, and in-flight messages, has to fit into that. Also, the FASTER log can be quite a bit larger than the data it stores because it may contain multiple versions of orchestration states. FASTER does run compaction periodically, but it remains to be determined what expansion factors are typical; I would guess something like 3x.

A lot of this is just guesswork; we need experiments to validate these statements.
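
As a concrete illustration of the cache-size point above, here is a minimal host.json sketch for raising the Netherite instance cache limit. This is only a sketch: the setting name InstanceCacheSizeMB and the 4096 MB value are assumptions, so check the Netherite settings reference for the exact names and defaults.

```json
{
  "extensions": {
    "durableTask": {
      "storageProvider": {
        "type": "Netherite",
        "StorageConnectionName": "AzureWebJobsStorage",
        "EventHubsConnectionName": "EventHubsConnection",
        "InstanceCacheSizeMB": 4096
      }
    }
  }
}
```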

@sebastianburckhardt added the documentation and question labels on Nov 3, 2022
@ericleigh007

It is worth noting that v1.4.0 now has a much more efficient algorithm for handling large messages, as I'm sure @sebastianburckhardt will attest.

@gha-zund
Copy link

Does the Event Hubs message size limit of 1 MB play a role for large inputs (entity states)?
In our application, it's not uncommon for the state to grow close to (or even beyond) 1 MB in size...

@ericleigh007

@sebastianburckhardt is correct in saying that large messages (the data that is serialized between the event hubs and the orchestrators and activities) certainly do affect latency.

In my tests with an actual application (not a benchmark, but real-world work), we trigger off changes in Cosmos DB, but for expediency we then have to merge several documents together, so our messages can balloon to many megabytes.
We then have a choice: either slow down because of the Event Hubs transit time, or go faster by using the blob "lookaside" storage, which in turn increases the load on the Durable Functions Netherite storage account that contains the task hub.

We also have some experience with querying large histories, and there we have found that status and purge-history queries can take a good deal of time. To combat this, we had to move such queries outside the critical time windows in our system. We were also concerned about what a large status, history, or purge query would do to the utilization of our task hub storage account, and we had some scant evidence suggesting that these operations interfered with the "real-time" work of running orchestrators and activities.
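
To illustrate the "outside the critical window" approach, here is a minimal sketch of how one might run the purge from a timer-triggered function during a quiet hour. It assumes the in-process Durable Functions C# model; the schedule and the 14-day retention window are just examples.

```csharp
using System;
using System.Threading.Tasks;
using DurableTask.Core;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class NightlyPurge
{
    // Runs at 03:00 UTC, outside our critical processing window (schedule is an example).
    [FunctionName("NightlyPurge")]
    public static async Task Run(
        [TimerTrigger("0 0 3 * * *")] TimerInfo timer,
        [DurableClient] IDurableOrchestrationClient client,
        ILogger log)
    {
        // Purge completed, failed, and terminated instances older than 14 days.
        var result = await client.PurgeInstanceHistoryAsync(
            DateTime.MinValue,                 // createdTimeFrom
            DateTime.UtcNow.AddDays(-14),      // createdTimeTo
            new[]                              // runtimeStatus
            {
                OrchestrationStatus.Completed,
                OrchestrationStatus.Failed,
                OrchestrationStatus.Terminated
            });

        log.LogInformation("Purged {Count} instance histories", result.InstancesDeleted);
    }
}
```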

@cticevans

What's the easiest way to see the size of a history? And is there any notion of what "large" means: KB, MB, GB?
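
Not an authoritative answer, but one rough way to inspect an individual instance's history size from a client function is to fetch the status with the history included and measure the serialized payload. A sketch assuming the in-process C# SDK (the route and the HistorySizeProbe name are made up for illustration):

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Newtonsoft.Json;

public static class HistorySizeProbe
{
    [FunctionName("HistorySizeProbe")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "get", Route = "history-size/{instanceId}")] HttpRequest req,
        string instanceId,
        [DurableClient] IDurableOrchestrationClient client)
    {
        var status = await client.GetStatusAsync(
            instanceId, showHistory: true, showHistoryOutput: true, showInput: true);
        if (status == null)
        {
            return new NotFoundResult();
        }

        // status.History is a JArray of history events; its serialized length is a
        // rough lower bound on the stored history size (before any compression).
        int approxBytes = status.History?.ToString(Formatting.None).Length ?? 0;
        return new OkObjectResult(new { instanceId, approxBytes });
    }
}
```

This measures a single instance; a fleet-wide view would require aggregating over a status query, which, as noted above, is itself an expensive operation on large task hubs.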
