Don't count Done
bytes for backpressure
#1208
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Stemming from this comment, I investigated why #1192 decreased write speeds (from > 700 MiB/s → 550ish for 1 MiB random writes).
It turns out there was a problem with our byte-based backpressure implementation: we were counting all un-retired bytes, including bytes in jobs that were
Done
.This is nonsensical: the point of backpressure is to normalize write time (as experienced by the Guest) with the actual time that a write takes. Jobs remain in the
Done
bucket until a flush, but that's unrelated to how long the write itself took (i.e. in our current implementation, changing the flush timeout would change backpressure, which is silly).In other words, backpressure looks like a sawtooth graph: it drops to a minimal value after each flush, then climbs up as more jobs are issued, completed, and not retired (until the next flush).
#1192 makes this worse because it increases the backpressure gain, so the slope of the sawtooth is steeper. In other words, our backpressure was always unnecessarily high, but this change made the average higher. Here's the aforementioned sawtooth under slow and fast builds:
This PR changes the byte-based backpressure to only track bytes in flight. This matches the existing job-based backpressure, which uses the sum of
self.io_state_count.new + self.io_state_count.in_progress
(note the lack ofio_state_count.done
in that summation!).Sure enough, this improves performance back to > 700 MiB/s.
It's not perfect – doing fine-grained sampling of backpressure, I see that it looks slightly unstable:
Still, this is a noticeable performance improvement and seems philosophically correct.