Ensure all state recovery changes are serialized #213

Open
wants to merge 1 commit into
base: main

Conversation

glopesdev
Contributor

@glopesdev glopesdev commented May 15, 2024

Summary

The original implementation of StateRecoverySubject serialized state changes only on Dispose. This PR changes those semantics so that every change is serialized as soon as the subject receives it.

Motivation

The purpose of StateRecoverySubject is to persist environment logic state when the workflow needs to be stopped for maintenance or when an unhandled exception occurs. To minimize file IO, the previous implementation serialized the state only on subject disposal, which under normal circumstances always happens on either successful or exceptional termination of the workflow.

However, this does not account for situations where the process itself terminates abnormally in a non-recoverable way, e.g. a BSOD, forceful termination of the process by either the OS or the user, or a stack overflow exception.

Proposed Design

This PR modifies the behavior of StateRecoverySubject by ensuring that all changes to the persistent state are serialized to disk immediately upon reception of the notification.
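
The difference between the two behaviors can be sketched as follows. This is a language-agnostic illustration in Python, not the project's code: the class names `DisposeOnlyRecorder` and `ImmediateRecorder`, the JSON file format, and the `on_next`/`close` methods are all hypothetical stand-ins for the subject's notification and disposal paths.

```python
import json
import os
import tempfile


class DisposeOnlyRecorder:
    """Hypothetical stand-in for the old behavior: write only when closed."""

    def __init__(self, path):
        self.path = path
        self.latest = None

    def on_next(self, state):
        self.latest = state  # kept in memory; nothing reaches disk yet

    def close(self):
        if self.latest is not None:
            with open(self.path, "w", encoding="utf-8") as f:
                json.dump(self.latest, f)


class ImmediateRecorder:
    """Hypothetical stand-in for the new behavior: write on every change."""

    def __init__(self, path):
        self.path = path

    def on_next(self, state):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # blocks until the OS has written the data out

    def close(self):
        pass  # nothing left to write


def demo():
    with tempfile.TemporaryDirectory() as d:
        lazy = DisposeOnlyRecorder(os.path.join(d, "lazy.json"))
        eager = ImmediateRecorder(os.path.join(d, "eager.json"))
        for recorder in (lazy, eager):
            recorder.on_next({"trial": 1})
            recorder.on_next({"trial": 2})
        # Simulate abnormal termination: close() is never called.
        lazy_saved = os.path.exists(lazy.path)
        with open(eager.path, encoding="utf-8") as f:
            eager_saved = json.load(f)
        return lazy_saved, eager_saved
```

In `demo()`, the dispose-only recorder leaves no file behind when `close()` is skipped, while the immediate recorder has already persisted the last state it received.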

Drawbacks

This proposal will significantly increase disk pressure for high-frequency state changes. The current implementation is not optimized for streaming, so files are deleted, overwritten and flushed for every new value. There is also no scheduling mechanism, so writes are synchronous with notifications: the stream blocks until the state is fully flushed to disk, which might also slow down the sequence that is pushing the changes.

State can also still be corrupted if the process is forcefully terminated during one of these disk writes, because the file being written might itself get corrupted. Again, this is more likely for high-frequency streams, but could happen with any state change.
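
One commonly used way to narrow this window, which is not part of this PR, is to write each new state to a temporary file and then rename it over the previous one. The sketch below is in Python purely for illustration; `write_state_atomically` is a hypothetical helper and the JSON format is an assumption. It relies on `os.replace`, which Python documents as an atomic rename where the platform provides one (a POSIX requirement).

```python
import json
import os
import tempfile


def write_state_atomically(path, state):
    """Write `state` to a temp file, then swap it into place (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the new contents are on disk first
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        # Clean up the partial temp file; the previous state file is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach a termination in the middle of a write would leave a stray temporary file rather than a damaged state file, at the cost of an extra rename per change.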

Unresolved Questions

  • Will we ever want to support state recovery for high-frequency state changes?
  • Might there be alternative mechanisms whereby we could ensure that state persists on termination while minimizing file IO?
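
As one possible direction for the second question, writes could be coalesced so that at most one happens per time interval, with the latest pending value written on close. This is only a sketch of the trade-off, in Python for illustration: `ThrottledWriter`, its `write` callback, and the `min_interval_s` parameter are hypothetical and not part of this PR.

```python
import time


class ThrottledWriter:
    """Hypothetical wrapper that forwards at most one write per interval."""

    def __init__(self, write, min_interval_s, clock=time.monotonic):
        self._write = write                  # callback that persists a state
        self._min_interval_s = min_interval_s
        self._clock = clock                  # injectable for testing
        self._last_write = None
        self._pending = None
        self._dirty = False

    def on_next(self, state):
        now = self._clock()
        due = (self._last_write is None
               or now - self._last_write >= self._min_interval_s)
        if due:
            self._write(state)
            self._last_write = now
            self._dirty = False
        else:
            # Too soon after the last write: remember only the newest value.
            self._pending = state
            self._dirty = True

    def close(self):
        # On orderly shutdown, persist whatever was held back.
        if self._dirty:
            self._write(self._pending)
            self._dirty = False
```

This reduces file IO but reintroduces a bounded version of the original problem: a change held back inside the interval is lost if the process terminates abnormally before the next write. Because the sketch has no timer, a held-back value is also only written on the next notification or on close.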

@glopesdev added the feature (New planned feature) and proposal (Request for a new feature) labels on May 15, 2024
Labels
feature (New planned feature), proposal (Request for a new feature)
2 participants