
Don't count Done bytes for backpressure #1208

Merged (1 commit), Mar 15, 2024

Conversation

@mkeeter (Contributor) commented on Mar 15, 2024

Stemming from this comment, I investigated why #1192 decreased write speeds (from > 700 MiB/s to roughly 550 MiB/s for 1 MiB random writes).

It turns out there was a problem with our byte-based backpressure implementation: we were counting all un-retired bytes, including bytes in jobs that were Done.

This is nonsensical: the point of backpressure is to normalize write time (as experienced by the Guest) with the actual time that a write takes. Jobs remain in the Done bucket until a flush, but that's unrelated to how long the write itself took (i.e. in our current implementation, changing the flush timeout would change backpressure, which is silly).

In other words, backpressure looks like a sawtooth graph: it drops to a minimal value after each flush, then climbs up as more jobs are issued, completed, and not retired (until the next flush).

#1192 makes this worse because it increases the backpressure gain, so the slope of the sawtooth is steeper. In other words, our backpressure was always unnecessarily high, but this change made the average higher. Here's the aforementioned sawtooth under slow and fast builds:

[Screenshot (2024-03-14): backpressure sawtooth under slow and fast builds]

This PR changes the byte-based backpressure to only track bytes in flight. This matches the existing job-based backpressure, which uses the sum of self.io_state_count.new + self.io_state_count.in_progress (note the lack of io_state_count.done in that summation!).
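As an illustration, here is a minimal sketch of the accounting change, using hypothetical types rather than the actual Crucible `io_state_count` structures:

```rust
// Minimal sketch (hypothetical types, not the actual Crucible code) of the
// byte accounting before and after this PR.
struct IoByteCount {
    new: u64,         // bytes in jobs not yet sent to a Downstairs
    in_progress: u64, // bytes in jobs currently being processed
    done: u64,        // bytes in completed jobs awaiting retirement at the next flush
}

impl IoByteCount {
    /// Before: `done` bytes were included, so backpressure kept climbing
    /// between flushes (the sawtooth above) and depended on the flush timeout.
    fn backpressure_bytes_old(&self) -> u64 {
        self.new + self.in_progress + self.done
    }

    /// After: only bytes in flight count, matching the job-based calculation
    /// of `new + in_progress`.
    fn backpressure_bytes_new(&self) -> u64 {
        self.new + self.in_progress
    }
}
```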

Sure enough, this improves performance back to > 700 MiB/s.

It's not perfect: doing fine-grained sampling of backpressure, I see that it still looks slightly unstable:

[Screenshot (2024-03-15): fine-grained backpressure sampling showing slight instability]

Still, this is a noticeable performance improvement and seems philosophically correct.

@mkeeter force-pushed the fix-byte-backpressure branch from 9ec3c2c to ad94bd2 on March 15, 2024 at 22:16
@leftwo (Contributor) left a comment

So, I think this is fine.

We do have to have some way of capping the total amount of memory an upstairs will require, but trying to use guest backpressure as a way to do that might not be the right approach, especially since the amount of memory we need depends more on how quickly we are flushing ACK'd jobs, and that is opaque to the guest.

Maybe in the future we adjust the flush timeout to accomplish that, but that's not for this PR.

@mkeeter merged commit 952c7d6 into oxidecomputer:main on Mar 15, 2024; 18 checks passed.
@mkeeter deleted the fix-byte-backpressure branch on March 16, 2024 at 01:38.
mkeeter added a commit that referenced this pull request Mar 29, 2024
In #1208, we changed backpressure to stop counting jobs once they had been completed, rather than retired. This was to accurately reflect jobs in flight, rather than "how long has it been since a flush".

However, we messed up the accounting, leading to #1236:

- Once a job was completed (or skipped) by all 3x Downstairs, its contribution to backpressure was subtracted out
- If a job was replayed, its contribution to backpressure **was not** added back
- A replayed job could be completed **again**. If it had previously been completed, the second completion would cause the backpressure counter to underflow, which we detect and panic

This PR adds a `backpressure_bytes: Option<u64>` member to the `DownstairsIO` struct to make this accounting more foolproof. This member tracks whether a job currently counts toward backpressure, meaning we can make completion and requeueing idempotent.

I also add a fail-safe check at job retirement, which prints an error message if we messed something up.
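For illustration, here is a minimal sketch of the idempotent accounting described above, assuming a much-simplified `DownstairsIO` and a standalone counter (hypothetical shapes; the real Crucible types differ):

```rust
// Minimal sketch (hypothetical, much-simplified types) of the idempotent
// accounting: each job remembers whether its bytes are currently counted.
struct DownstairsIO {
    /// `Some(n)` while this job contributes `n` bytes to backpressure
    backpressure_bytes: Option<u64>,
    write_bytes: u64,
}

struct BackpressureCounter {
    bytes: u64,
}

impl BackpressureCounter {
    /// Called when a job is queued or requeued; safe to call repeatedly.
    fn add(&mut self, job: &mut DownstairsIO) {
        if job.backpressure_bytes.is_none() {
            job.backpressure_bytes = Some(job.write_bytes);
            self.bytes += job.write_bytes;
        }
    }

    /// Called when a job is completed or skipped; also safe to call
    /// repeatedly, so a replayed-then-recompleted job cannot underflow
    /// the counter.
    fn remove(&mut self, job: &mut DownstairsIO) {
        if let Some(n) = job.backpressure_bytes.take() {
            self.bytes -= n;
        }
    }
}
```

Checking the `Option` on insertion and taking it on removal is what makes both operations idempotent in this sketch.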