Very slow once cache completely full #239

Open
mocksoul opened this issue Nov 14, 2022 · 5 comments
mocksoul commented Nov 14, 2022

In my setup:

1 HDD ("slow"), which can write ~150 MB/s directly
another HDD ("fast"), which can read 250+ MB/s

I have an SSD cache (50 GiB) via dmwb in between

(writeback_threshold 100, nr_max_batched_writeback 32)

During a large file copy from the fast HDD to the slow one through dmwb:

  1. it sits around 150 MB/s writing to the slow HDD while the cache fills up (because the "source" HDD is much faster)
  2. once the cache fills completely, the speed drops to around 35-40 MB/s

If I temporarily stop the copy process, all data in the cache is written back at the full 150 MB/s. If the cache fills again (after the copy is resumed), it once again drops to ~1/3 of the HDD bandwidth.

I also tried setting the writeback threshold to 0. It fills the cache without any writes to the slow HDD (this is expected, I guess). But once the cache is completely full, it writes to the "slow" HDD at only ~1/3 (45 MiB/s) of its max speed.

So I guess something quite suboptimal happens when the cache is completely saturated with dirty data. Or maybe I'm doing something wrong? :)

This is the only bad case I've found so far. Great library!!

P.S. Suspending/resuming the wb dm device quickly (100 times/sec) so the cache never fills beyond 95% works -- the transfer rate sits around 120-130 MB/s in that case for me :-D

mocksoul changed the title from "Very slow once cache if completely full" to "Very slow once cache completely full" on Nov 14, 2022
akiradeveloper self-assigned this on Nov 14, 2022

mocksoul commented Nov 14, 2022

A follow-up:

Even though I always set writeback_threshold to 100 and nr_max_batched_writeback to 32, I see the following on the slow disk (iostat):

  1. 85..99% util during reads
  2. 70..72% util during sequential writes while the cache is filling
  3. 25..35% util once the cache has filled up (this is what the ticket itself describes)

Point 1 looks fine.

About point 2: initially I thought this was because the SSD itself was at 100% util, which is why simultaneous reads from it during writeback were quite slow. But later I saw that the writeback speed doesn't change at all if I stop filling the cache until it has been completely written back to the slow HDD. So I guess this is a code-optimisation issue.

BTW, the filesystem used in all "tests" is BTRFS with DUP metadata and single data profiles.

akiradeveloper (Owner) commented

@mocksoul

You are doing a sequential copy from HDD1 to dmwb and you see a very slow write to HDD2 when the cache is saturated.

graph TD
  subgraph  dmwb
  HDD2(HDD2 150MB/s)
  SSD
  SSD --> HDD2
  end
  HDD1(HDD1 250MB/s) --> SSD

This should be because of e3c98a6.

You can see that the number of segments in one writeback is not constant but is adaptively changed based on the situation.

static u32 calc_nr_writeback(struct wb_device *wb)
{
	u32 nr_writeback_candidates =
		atomic64_read(&wb->last_flushed_segment_id)
		- atomic64_read(&wb->last_writeback_segment_id);

	u32 nr_max_batch = read_once(wb->nr_max_batched_writeback);
	if (wb->nr_writeback_segs != nr_max_batch)
		try_alloc_writeback_ios(wb, nr_max_batch, GFP_NOIO | __GFP_NOWARN);

	return min3(nr_writeback_candidates, wb->nr_writeback_segs, wb->nr_empty_segs + 1);
}


mocksoul commented Dec 6, 2022

You are doing a sequential copy from HDD1 to dmwb and you see a very slow write to HDD2 when the cache is saturated.

yep, this is what I see.

This should be because of e3c98a6.

you want me to try dmwb without that commit?

but is adaptively changed based on the situation.

I'm not sure I understood correctly - you mean this is expected behaviour? And writes to HDD2 should stagger in this case?

akiradeveloper (Owner) commented

you want me to try dmwb without that commit?

No.

I'm not sure I understood correctly - you mean this is expected behaviour? And writes to HDD2 should stagger in this case?

Yes, it is expected behavior.

The use case of writeboost is not sequential writes, which are rather artificial. The intention of the commit is to throttle the writeback when there is not enough space in the cache device, because the cache device should allocate new empty segments as soon as possible under such saturation. If we didn't have this throttling, writes to the SSD would wait until all 32 segments had been written back, which may cause upper-layer timeouts.

For me, 1/3 of the max throughput in such a worst case sounds good enough.


mocksoul commented Dec 6, 2022

I see. That could probably at least be made tunable. For my scenario, "write to the SSD will wait" is completely fine, because it only happens when we write much faster than HDD2 can handle and saturate the cache completely.

My scenario is frequent small random writes plus occasional big, almost-sequential writes. The random writes do not fill the cache completely, they are sped up a lot, and DMWB shines here. But when big sequential writes happen, it is unusable for me in its current form.

Upper-layer timeouts are completely fine, because they are tunable in the Linux VFS.

Thanks for pointing out that piece of code, I'll try to hack on it and post the results here ;).
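For reference, a minimal sketch of what such a hack might look like (hypothetical, not part of dm-writeboost; the writeback_no_saturation_throttle field is made up): drop the empty-segment cap so a saturated cache still writes back full batches.

/*
 * Hypothetical variant of calc_nr_writeback(): when a made-up tunable
 * wb->writeback_no_saturation_throttle is set, the batch is no longer
 * capped by the number of empty segments, so a saturated cache still
 * writes back up to nr_max_batched_writeback segments per round.
 */
static u32 calc_nr_writeback_tunable(struct wb_device *wb)
{
	u32 nr_writeback_candidates =
		atomic64_read(&wb->last_flushed_segment_id)
		- atomic64_read(&wb->last_writeback_segment_id);

	u32 nr_max_batch = read_once(wb->nr_max_batched_writeback);
	if (wb->nr_writeback_segs != nr_max_batch)
		try_alloc_writeback_ios(wb, nr_max_batch, GFP_NOIO | __GFP_NOWARN);

	if (read_once(wb->writeback_no_saturation_throttle)) /* hypothetical tunable */
		return min(nr_writeback_candidates, wb->nr_writeback_segs);

	return min3(nr_writeback_candidates, wb->nr_writeback_segs, wb->nr_empty_segs + 1);
}

The trade-off is the one described above: with the cap removed, new writes to the SSD may stall until the whole batch has been written back.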
