PoC: config write guard #2360
Conversation
@schlimmchen I have not looked at the changes in detail yet, but it's definitely an improvement. There are more synchronization issues between the WebApi and the main loop though, like here (if I'm not mistaken): OpenDTU/src/WebApi_inverter.cpp Line 144 in dc5eb96
Looks great! I am just curious if we would ever have to write the config in the context of the main loop (as it will generate a deadlock). Currently I cannot imagine any use case.
Thanks for your feedback!
Yes, I saw that, too. If one inadvertently tried to do this, the deadlock would happen when testing the change for the first time, and the config would not be saved at all, so it should be noticed early. And if we ever need to do that, we can adjust the WriteGuard c'tor to check whether the current task is the main loop task and then only lock the mutex, not wait for the signal. Let me polish and double-check this, then I will un-draft this PR (next week).
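A minimal sketch of the adjustment described here, not the actual OpenDTU code: remember which thread runs the main loop, and let the write guard skip waiting for the main loop's signal when the caller already is the main loop. Thread ids stand in for FreeRTOS task handles, and all names (ConfigSync, writeWindowOpen) are illustrative.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative stand-in for the synchronization state owned by the
// configuration class.
struct ConfigSync {
    std::mutex mutex;
    std::condition_variable signal;
    bool writeWindowOpen = false;   // set by the main loop when writing is safe
    std::thread::id mainLoopId;     // recorded once at startup
};

struct WriteGuard {
    explicit WriteGuard(ConfigSync& sync) : _lock(sync.mutex) {
        if (std::this_thread::get_id() != sync.mainLoopId) {
            // other tasks must wait until the main loop opens a write window
            sync.signal.wait(_lock, [&] { return sync.writeWindowOpen; });
        }
        // the main loop itself only takes the mutex and never waits for its
        // own signal, so it cannot deadlock
    }
    std::unique_lock<std::mutex> _lock;
};
```

Acquiring the guard from the main loop's own thread then degenerates to a plain mutex lock, which avoids the self-deadlock.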
@tbnobody are you passionate about
versus
or whether it's consistent across the code base?
the configuration write guard is now required when the configuration struct shall be mutated. the write guard locks multiple writers against each other and also, more importantly, makes writes synchronous to the main loop. all code running in the main loop can now be sure that (1) reads from the configuration struct are non-preemptive and (2) the configuration struct as a whole is in a consistent state when reading from it. NOTE that acquiring a write guard from within the main loop's task will immediately cause a deadlock and the watchdog will trigger a reset. if writing from inside the main loop should ever become necessary, the write guard must be updated to only lock the mutex but not wait for a signal.
force-pushed from 9f4f4fa to b1edb13
I am still happy with this. I double-checked that generating a DTU serial works as expected using an erased ESP32.
bool read();
bool write();
void migrate();
CONFIG_T& get();
CONFIG_T const& get();
this getter could become problematic: how does it ensure that reads going through it are protected by the mutex?
What if another task / core modifies the config while another one calls this method to access a value?
I was trying to tackle application of changed Dynamic Power Limiter settings in my feature branch implementing support for multiple inverters managed by the DPL. Then I noticed that a bunch of nasty problems would go away if I knew that I could trust the config being consistent, i.e., if the config was only written synchronously with the main loop.
The latter is currently not true: the config is usually manipulated in the context of the async webserver task, while it is read in the context of the main loop.
I wanted to keep the overhead when reading the config to a minimum, so locking a mutex each time the config is read is out of the question, IMHO.
Writing the config happens rather seldom. So it should be okay if that takes a couple of milliseconds longer -- in the worst case.
My approach: Writing the config is only possible after acquiring a WriteGuard, which locks the "config write mutex" and waits for the signal (from the main loop) to start writing. The signal is needed to make writing the config synchronous with the main loop, and the mutex is needed to use the condition_variable, but also to lock multiple writers against each other (which is less of a concern, to be honest).
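The mechanism can be sketched roughly like this, using std::mutex and std::condition_variable in place of whatever FreeRTOS primitives the firmware actually uses; all names (ConfigClass, writerAllowed, loop) are illustrative, not the PR's real API.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Stand-in for the configuration class; `value` takes the place of CONFIG_T.
struct ConfigClass {
    std::mutex writeMutex;
    std::condition_variable writeSignal;
    bool writerAllowed = false; // set by the main loop when a writer may run
    int value = 0;

    // called once per main loop iteration: let one pending writer proceed
    void loop() {
        {
            std::lock_guard<std::mutex> lock(writeMutex);
            writerAllowed = true;
        }
        writeSignal.notify_one();
    }

    // RAII guard: the only way to obtain mutable access to the config.
    class WriteGuard {
    public:
        explicit WriteGuard(ConfigClass& cfg)
            : _cfg(cfg), _lock(cfg.writeMutex) {
            // block until the main loop signals that writing is safe
            _cfg.writeSignal.wait(_lock, [&] { return _cfg.writerAllowed; });
        }
        ~WriteGuard() {
            // flag is reset while the lock is still held; the next writer
            // must again wait for the main loop
            _cfg.writerAllowed = false;
        }
        int& get() { return _cfg.value; }
    private:
        ConfigClass& _cfg;
        std::unique_lock<std::mutex> _lock;
    };
};
```

A writer task (e.g. a WebApi handler) constructs a WriteGuard, mutates the config through it, and the guard's destructor releases the mutex, so writes are serialized against each other and only ever happen while the main loop is parked at the signaling point.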
I like this approach, because I like RAII. However, I am not sure it is the best approach. Another idea would be to enqueue a callback which is then executed within the context of the main loop. We would need to copy and bind a significant amount of data to that callback, which seems off-putting. Also, access to the queue of callbacks would then also need to be guarded by a mutex.
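For comparison, the callback-queue alternative might look roughly like this sketch; ConfigTaskQueue and the Config struct are hypothetical names, and the mutex on the queue is exactly the locking mentioned above, since producers run in other tasks (e.g. the async webserver).

```cpp
#include <deque>
#include <functional>
#include <mutex>

// Stand-in for CONFIG_T.
struct Config { int value = 0; };

class ConfigTaskQueue {
public:
    // called from any task: defer a mutation to the main loop's context
    void enqueue(std::function<void(Config&)> cb) {
        std::lock_guard<std::mutex> lock(_mutex);
        _queue.push_back(std::move(cb));
    }

    // called from the main loop: drain and execute pending mutations
    void loop(Config& cfg) {
        std::deque<std::function<void(Config&)>> pending;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            std::swap(pending, _queue);
        }
        for (auto& cb : pending) { cb(cfg); }
    }

private:
    std::mutex _mutex;
    std::deque<std::function<void(Config&)>> _queue;
};
```

Here the config is only ever touched by the main loop, so no guard is needed on the struct itself, but every mutation has to capture its inputs by value in the closure, which is the copy-and-bind overhead the comment above finds off-putting.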
@tbnobody I would like to know your opinion on this. Do you like this approach?
@LennartF22 We talked about this briefly. What are your thoughts on this approach?