Eliminate some scans + store jobs in BTree #738
Conversation
The `block_contexts` table has a primary key of all the columns, but for the unencrypted case only three have values, and this doesn't seem to reject duplicate unencrypted contexts. Add a separate UNIQUE INDEX for unencrypted block context rows (namely, no nonce and tag), and change the INSERT to INSERT OR IGNORE to elide duplicate rows.
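A minimal sketch of that idea, assuming a rusqlite connection and a simplified `block_contexts` schema (the real downstairs table has more columns; the names here are illustrative only):

```rust
use rusqlite::{params, Connection, Result};

// Sketch only: a simplified block_contexts table. SQLite allows NULLs in a
// composite primary key (for a rowid table), and NULLs never compare equal,
// so the PK alone does not reject duplicate unencrypted rows.
fn setup(conn: &Connection) -> Result<()> {
    conn.execute_batch(
        "CREATE TABLE IF NOT EXISTS block_contexts (
             block INTEGER NOT NULL,
             hash INTEGER NOT NULL,
             on_disk_hash INTEGER NOT NULL,
             nonce BLOB,
             tag BLOB,
             PRIMARY KEY (block, hash, on_disk_hash, nonce, tag)
         );
         -- Partial unique index covering only the unencrypted rows
         -- (those with no nonce and no tag).
         CREATE UNIQUE INDEX IF NOT EXISTS block_contexts_unencrypted
             ON block_contexts (block, hash, on_disk_hash)
             WHERE nonce IS NULL AND tag IS NULL;",
    )
}

fn insert_unencrypted(
    conn: &Connection,
    block: i64,
    hash: i64,
    on_disk_hash: i64,
) -> Result<()> {
    // INSERT OR IGNORE silently skips rows that would violate the unique
    // index, eliding duplicate unencrypted contexts instead of erroring.
    conn.execute(
        "INSERT OR IGNORE INTO block_contexts (block, hash, on_disk_hash)
         VALUES (?1, ?2, ?3)",
        params![block, hash, on_disk_hash],
    )?;
    Ok(())
}
```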
Submit different block data as part of the IO loop. Add an option to submit all zeroes instead if that workload is still desired.
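A rough sketch of what that option could look like in the test's write path (the function name, flag, and RNG usage are illustrative, not the actual test code):

```rust
use rand::RngCore;

/// Build the data for one write in the IO loop. Each call produces different
/// bytes unless `all_zeroes` is set, which restores the old all-zero workload.
fn write_payload(block_size: usize, all_zeroes: bool) -> Vec<u8> {
    let mut buf = vec![0u8; block_size];
    if !all_zeroes {
        rand::thread_rng().fill_bytes(&mut buf);
    }
    buf
}
```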
Love the improvements, a marked difference. I did hit a new panic, though. The situation arises when the upstairs has decided that a downstairs has been "gone too long". When this same situation happens on the perf bits, I'm hitting this panic:
The panic is in
Now, I can re-create this panic pretty easily, so I'll just dig a little further and try to understand where exactly we hit it.
Ah, I think what is missing is having something like this:
Adding that, I no longer get the panic when a faulted downstairs attempts to rejoin the region set.
Here is the diff that will fix the panic I found, plus an improvement on the panic message.
I was able to reproduce this by pstop'ing a downstairs while running.
Overall I'm good with this PR (IOPS GO BRRRR), except:
Doing this revealed several problems: there were three other places that `ds_new` could become out of sync with `ds_active`; those have now all been fixed.
I'm anxious about the validity of our current tests, and about future cases of drift here. All tests pass on this PR, but 68 tests fail if I add an assertion that `ds_new` doesn't contain the job we're trying to mark `InProgress`:
```diff
diff --git a/upstairs/src/lib.rs b/upstairs/src/lib.rs
index 99df635..124946e 100644
--- a/upstairs/src/lib.rs
+++ b/upstairs/src/lib.rs
@@ -2884,6 +2885,7 @@ impl Downstairs {
         let new_state = IOState::InProgress;
         let old_state = job.state.insert(client_id, new_state.clone()).unwrap();
         assert_eq!(old_state, IOState::New);
+        assert!(!self.ds_new[client_id as usize].contains(&ds_id));
         self.io_state_count.decr(&old_state, client_id);
         self.io_state_count.incr(&new_state, client_id);
```
The tests that fail call `ds.in_progress` directly instead of getting `io_send` to do it, which properly takes care of keeping `ds_new` correct. If I instead assert that the `ds_id` is contained in `ds_new`, then I see the two new dummy downstairs tests fail (where `io_send` has removed the job id from `ds_new` before calling `in_progress`). I don't think that assertion is correct or should be added, but I do think more checks should be added. I've suggested a few in the comments, keeping them as `debug_assert` so that they are (I assume!) compiled out in release builds to preserve the performance gains seen here.
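For reference, a small illustration (not code from this PR) of why `debug_assert!` checks cost nothing in a normal release build:

```rust
/// Illustrative only: debug_assert! expands to an assert! that is compiled
/// in only when debug_assertions are enabled (the default for dev and test
/// builds), so the check disappears entirely under --release.
fn check_not_queued(ds_new: &[u64], ds_id: u64) {
    debug_assert!(
        !ds_new.contains(&ds_id),
        "job {} is still queued in ds_new",
        ds_id
    );

    // Equivalent hand-written form of the same guard:
    if cfg!(debug_assertions) {
        assert!(!ds_new.contains(&ds_id));
    }
}
```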
(not blocking this PR) I wonder if it's worth it to move `ds_active`, `ds_new`, `io_state_count`, and possibly more into one new struct, where correctness can be maintained through the functions of that struct.
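A very rough sketch of that idea, assuming `ds_active` is a `BTreeMap<u64, DownstairsIO>` and `ds_new` is a per-client `Vec<u64>` (all names here are stand-ins, not the real upstairs types); the point is that job ids only enter or leave `ds_new` through methods that also see `ds_active`:

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the real upstairs job type.
struct DownstairsIO {
    // job payload, per-client IOState, etc.
}

/// Sketch: one owner for the per-downstairs queues, so the invariants
/// between ds_active and ds_new are enforced in a single place.
struct Work {
    /// All jobs keyed by id; a BTreeMap iterates in ascending id order.
    ds_active: BTreeMap<u64, DownstairsIO>,
    /// Job ids in state New (not yet sent) for each of the three clients.
    ds_new: [Vec<u64>; 3],
}

impl Work {
    /// Enqueue a new job: the only place a job id enters ds_new.
    fn enqueue(&mut self, ds_id: u64, io: DownstairsIO) {
        self.ds_active.insert(ds_id, io);
        for queue in self.ds_new.iter_mut() {
            queue.push(ds_id);
        }
    }

    /// Take everything a client should send next; ids leave ds_new here
    /// (and only here), so they cannot drift out of sync with ds_active.
    fn take_new_work(&mut self, client_id: u8) -> Vec<u64> {
        std::mem::take(&mut self.ds_new[client_id as usize])
    }
}
```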
```rust
 * [`new_work`], presumably due to flow control.
 */
fn requeue_work(&mut self, client_id: u8, work: &[u64]) {
    self.ds_new[client_id as usize].extend_from_slice(work);
```
```diff
diff --git a/upstairs/src/lib.rs b/upstairs/src/lib.rs
index 99df635..80cf58f 100644
--- a/upstairs/src/lib.rs
+++ b/upstairs/src/lib.rs
@@ -3351,6 +3368,10 @@ impl Downstairs {
      * [`new_work`], presumably due to flow control.
      */
     fn requeue_work(&mut self, client_id: u8, work: &[u64]) {
+        for job_id in work {
+            let job = self.ds_active.get_mut(job_id).unwrap();
+            debug_assert_eq!(*job.state.get(&client_id).unwrap(), IOState::New);
+        }
         self.ds_new[client_id as usize].extend_from_slice(work);
     }
```
I just have the one concern about guaranteeing that our `ds_active`, when iterating, will indeed always return keys from lowest to highest. I know we always allocate in order, and add to the end of the list in order. I'm just a little nervous that there is a path where we could put something back in out of order. The `LiveRepair` stuff does reserve some job IDs before they actually exist, and this might be a place where it could happen.
I'll dig a little further to see if I can find a situation like the one I'm concerned about.
Ah, yes, it's a b-tree now. Ignore my fears about this. |
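For completeness, this is the property being relied on: a `BTreeMap` always iterates in ascending key order, regardless of insertion order, so even a job id reserved early and inserted late comes back out in the right position. A quick illustration (the ids and labels here are made up):

```rust
use std::collections::BTreeMap;

fn main() {
    let mut ds_active: BTreeMap<u64, &str> = BTreeMap::new();

    // Insert out of order, as a LiveRepair-style reservation might.
    ds_active.insert(1003, "write");
    ds_active.insert(1001, "flush");
    ds_active.insert(1002, "repair");

    // Iteration is always lowest key to highest, regardless of insert order.
    let ids: Vec<u64> = ds_active.keys().copied().collect();
    assert_eq!(ids, vec![1001, 1002, 1003]);
}
```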
Ok, BTree, I'm all good to go here.
In addition to fixing #731, this also fixes #757, and pulls in #733.