
tested against power failure? #68

Open
disaster123 opened this issue Jul 13, 2015 · 13 comments

@disaster123

How well is dm-writeboost tested against power failures? If it isn't, I would like to start doing so.

@akiradeveloper
Owner

First of all, I am doing tests on https://github.com/akiradeveloper/device-mapper-test-suite

Sudden power failure is very difficult to test, so I haven't.
But disk corruption can be emulated with dm-flakey. I had a test case for that locally, but it's not upstream because the code was too immature.
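
For reference, a rough sketch of what such an emulation could look like (the device name is a placeholder; this uses dm-flakey's plain error-injection mode, and the corrupt_bio_byte feature described in the dm-flakey kernel documentation can be added to inject bit corruption instead):

```sh
# make the cache SSD misbehave for 1 second out of every 10
# (flakey table: <dev> <offset> <up_interval> <down_interval>)
CACHE=/dev/sdc                          # placeholder for the cache SSD
sz=$(blockdev --getsz $CACHE)
dmsetup create flakey-cache --table "0 $sz flakey $CACHE 0 9 1"
# then build the writeboost device on top of /dev/mapper/flakey-cache
```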

Strictly speaking, dm-writeboost is sub-optimal with regard to the power failure issue; that is why I used the word "in theory". In the cache-hit case, data may be written back to the backing store, so the said principle is only partially kept.

But yes, I can fix it. The copy_bio_payload I implemented in 2.0.2 copies bio segments to a buffer; to follow the principle fully, I need to implement the reverse operation as well. The reason I keep it sub-optimal is that power failure is quite infrequent, while the code would become a bit more complicated.

I will implement it in the near future if you really start testing the power-failure case.

@akiradeveloper
Owner

Well, I misunderstood your question.

The writeback stuff is irrelevant to power failure.

dm-writeboost is designed to be robust against the power-failure issues described in the paper below:
https://www.usenix.org/conference/fast13/understanding-robustness-ssds-under-power-fault

Typically, partial writes and bit corruption should be considered, and the current (v2.0.3) implementation is good enough against these failures. But testing, as I said, has only been done with dm-flakey to emulate bit corruption.

dm-writeboost ignores logs whose checksum is inconsistent; partial writes and bit corruption due to power failure are the typical causes of that. So, please use the latest version for your test.

@disaster123
Author

Thanks, I will start doing some tests in the next weeks.

@akiradeveloper
Owner

Great. What kind of tests?

@disaster123
Author

We have an API to some external power-management systems, so I can stress the FS with bonnie++, automatically pull the power, and check the consistency of the FS afterwards. This can be done 100 or 1,000 times.
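
Roughly, the loop would look like this (`pdu-ctl`, the host name, and the paths are hypothetical placeholders for the power-management API and our test setup):

```sh
for i in $(seq 1 100); do
    # stress the filesystem on the test host in the background
    ssh testhost 'bonnie++ -d /mnt/test -u root' &
    sleep $((RANDOM % 120 + 30))        # let the workload run for a while
    # hard power-off and power-on via the (hypothetical) PDU API
    pdu-ctl --outlet testhost off
    sleep 10
    pdu-ctl --outlet testhost on
    # wait until the host is back up, then check filesystem consistency
    until ssh -o ConnectTimeout=5 testhost true; do sleep 5; done
    # assumes the writeboost device is recreated at boot and /mnt/test is
    # not yet mounted when fsck runs
    ssh testhost 'fsck -fn /dev/mapper/wbdev' || { echo "FS damaged in run $i"; break; }
done
```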

@akiradeveloper
Owner

Great!
Don't clear the caches, because it's not a successful shutdown. You need to replay the logs after reboot.
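
In other words, keep the cache SSD untouched after the power cut and just recreate the writeboost mapping on top of it. A minimal sketch, assuming the basic two-device table from the README (device names are placeholders; the exact table format depends on your dm-writeboost version, so please check the doc):

```sh
BACKING=/dev/sdb                        # placeholder for the backing device
CACHE=/dev/sdc                          # placeholder for the cache SSD
sz=$(blockdev --getsz $BACKING)
# do NOT zero or reformat $CACHE here: recreating the table makes
# writeboost scan the SSD and replay every log with a valid checksum
dmsetup create wbdev --table "0 $sz writeboost $BACKING $CACHE"
```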

@disaster123
Author

Yes, sure.

@zhouyuan

@akiradeveloper thanks for the detailed explanation. If I understand this correctly, under a power failure:

  • This will only affect the write-back policy
  • Since you're aggregating I/O requests in the RAM buffer (512 KB), the un-flushed data will be lost
  • The corresponding data blocks on the SSD (now out of date) can be reused when the system restarts

Is my understanding here right?

@akiradeveloper
Owner

@zhouyuan Yes to all. As for the second point, losing un-flushed data is OK because the client of a block device should submit a barrier request (a bio flagged with REQ_FLUSH) to ensure that the preceding data is persistent. Writeboost guarantees this, but it may lose the data written after the latest barrier.
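
To make that concrete, a small illustration (the paths are placeholders): dd's `conv=fsync` calls fsync(2) on the output file before exiting, which makes the filesystem issue a flush to the writeboost device, so the first write below must survive a power cut that happens afterwards, while the second one may not:

```sh
# written and flushed: guaranteed persistent once the flush completes
dd if=/dev/urandom of=/mnt/test/persistent.dat bs=4k count=1024 conv=fsync
# written but not yet flushed: may be lost on a sudden power-off
dd if=/dev/urandom of=/mnt/test/volatile.dat bs=4k count=1024
```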

Thank you for the questions.

@zhouyuan

@akiradeveloper thanks a lot!
One more question: if the application sends a flush request with each 4 KB write, would the flush job be at the per-write (4 KB) level or the 512 KB level?

@akiradeveloper
Owner

@zhouyuan That is the worst-case scenario: Writeboost may flush the log for each 4 KB write. But such a log is 8 KB (4 KB header + 4 KB data), not 512 KB.

I call this a partial log, and you can see the count in <nr_partial_flushed> of `dmsetup status` (please see the doc).

FYI, there is an optimization in the flush request handling. Because there are usually threads other than the application sending writes to the device, there is a chance to fill the 512 KB RAM buffer in the short moment after the flush request arrives, so Writeboost defers the ack to the flush request a bit to wait for other I/Os.

The overhead of partial logs can also be reduced if the cache device itself is responsive enough to flush requests. For example, some enterprise-level SSDs are equipped with a BBU or use non-volatile memory for the internal buffer. These types of SSDs don't need to flush the internal buffer to the NAND medium on every flush request and therefore respond to flush requests quickly.
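
As a quick way to observe the partial-log behaviour described above, a rough sketch (the device name and path are placeholders; the writeboost doc describes which status column is <nr_partial_flushed>):

```sh
dmsetup status wbdev        # note the current counters
# a flush-per-write workload: fio issues fsync after every 4 KB write
fio --name=syncwrite --filename=/mnt/test/fio.dat --size=64M \
    --bs=4k --rw=write --ioengine=sync --fsync=1
dmsetup status wbdev        # <nr_partial_flushed> should have grown,
                            # unless the flush-deferring optimization
                            # merged most of the partial logs
```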

@zhouyuan

@akiradeveloper Thanks for the detailed answer! It helps a lot.

@cHolzberger

Did those power-failure tests lead to any results?
