Simplify block context for reads #1391
Conversation
A question inline about a match that returns None, when maybe it should be an error?
upstairs/src/deferred.rs (Outdated)
"expected encrypted but got unencrypted \ | ||
block context" | ||
); | ||
None |
If we return None here, and the data is all blank, will this result in returning Ok(Validation::Empty) from validate_encrypted_read_response()?
Good catch, fixed in 410a874 (returning an error)
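Roughly, the shape of that change is sketched below; this is only an illustration of the mismatch arm, and the exact error variant is a guess (the real change is in 410a874):

```rust
// Sketch only: instead of logging the mismatch and returning None (which an
// all-blank read could later fold into Ok(Validation::Empty)), surface it as
// a hard error so the caller fails the read.
ReadBlockContext::Unencrypted { .. } => {
    error!(
        log,
        "expected encrypted but got unencrypted block context"
    );
    return Err(CrucibleError::DecryptionError); // variant choice is illustrative
}
```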
upstairs/src/deferred.rs (Outdated)
"expected unencrypted but got encrypted \ | ||
block context" | ||
); | ||
None |
Same comment here as above about returning None when we found encrypted context.
same fix as above!
Isn't it necessary? Without it, the Upstairs can't distinguish between corruption in transit and a block where either the data or encryption context is corrupted. Those are two different scenarios where the Upstairs should take different actions (retry or bail).
I don't think adding the hash will help us to distinguish between these two cases. Right now, the Downstairs doesn't verify the hash before sending it over the wire, so corruption in transit versus on disk both look the same: the hash or encryption context doesn't match at the Upstairs. Sending the hash doesn't give us any extra information about where corruption happens, because we don't guarantee that the hash is valid before sending it. (I think this is intentional: having the Downstairs check the hash would add a bunch of overhead, and we've already got ZFS doing checksums under the hood for us)
I think it depends on whether or not we will ever receive corrupted data back from a ZFS* read instead of an error.

If it is possible for the Downstairs to receive corrupted data back from a read, then there isn't a meaningful difference between what the integrity hash allows the Upstairs to distinguish over having the encryption context: a corrupted read would mean both that the integrity hash doesn't match and that the encryption context won't decrypt the block, and the Upstairs cannot distinguish between these. Asking for a retry because the integrity hash is wrong (thinking that it was corrupted on the wire) won't fix it.

If it is not possible to receive corrupted data back from a read, then there is a difference: the Downstairs does validate the integrity hash before accepting writes, which (if validation succeeds) means that the data was not corrupted in transit, and if the data comes back at all from a read then it wasn't corrupted at rest either. Then the Downstairs doesn't have to validate the hash before sending back to the Upstairs, because the only other possible corruption of the data is in transit back to the Upstairs.

*: this doesn't apply to non-checksumming filesystems, but our production deployments always use ZFS :)
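To make that write-side check concrete, here's a minimal sketch (assuming the crate's existing BlockContext and integrity_hash; the helper name write_is_intact and the unencrypted hashing form are illustrative, while the encrypted form over nonce, tag, and data matches the code quoted later in this thread):

```rust
// Illustrative only: the Downstairs-side check that a write frame arrived
// intact. If the recomputed hash doesn't match the hash the Upstairs sent,
// the data (or context) was corrupted in transit and the write is rejected.
fn write_is_intact(ctx: &BlockContext, data: &[u8]) -> bool {
    let computed = match &ctx.encryption_context {
        // Encrypted blocks hash the nonce, tag, and ciphertext together.
        Some(ec) => integrity_hash(&[&ec.nonce[..], &ec.tag[..], data]),
        // Unencrypted blocks hash just the data.
        None => integrity_hash(&[data]),
    };
    computed == ctx.hash
}
```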
Yes, agreed – in a world where ZFS can give us corrupted data, we can't tell the difference between in-transit and on-disk corruption, so sending the hash doesn't give us extra information.
I also agree with this, but think it proves more than what you're saying – in fact, it proves that sending the hash is superfluous 😄 In a world where ZFS cannot give us corrupted data:
This analysis ignores cases where someone deliberately corrupts files on disk; in such a situation, the attacker has full control over whether the hash matches or not, so it still doesn't give us any extra information.
Ignoring in-transit corruption for a sec, if the Downstairs can receive corrupted data back from ZFS, then it can tell if the integrity hash is wrong and send an appropriate error to the Upstairs. We don't currently check the hash because we believe that ZFS can't return us corrupted data: I think this is a safe assumption to keep making, but if it isn't then we can invent a new error type as appropriate.
I still don't follow. Right now the behaviour of validate_encrypted_read_response() is:

```rust
// We'll start with decryption, ignoring the hash; we're using authenticated
// encryption, so the check is just as strong as the hash check.
//
// Note: decrypt_in_place does not overwrite the buffer if
// it fails, otherwise we would need to copy here. There's a
// unit test to validate this behaviour.
use aes_gcm_siv::{Nonce, Tag};
let decryption_result = encryption_context.decrypt_in_place(
    data,
    Nonce::from_slice(&block_encryption_ctx.nonce[..]),
    Tag::from_slice(&block_encryption_ctx.tag[..]),
);
if decryption_result.is_ok() {
    Ok(Validation::Encrypted(*block_encryption_ctx))
} else {
    // Validate integrity hash before decryption
    let computed_hash = integrity_hash(&[
        &block_encryption_ctx.nonce[..],
        &block_encryption_ctx.tag[..],
        data,
    ]);
    if computed_hash == context.hash {
        // No encryption context combination decrypted this block, but
        // one valid hash was found. This can occur if the decryption
        // key doesn't match the key that the data was encrypted with.
        error!(log, "Decryption failed with correct hash");
        Err(CrucibleError::DecryptionError)
    } else {
        error!(
            log,
            "No match for integrity hash\n\
             Expected: 0x{:x} != Computed: 0x{:x}",
            context.hash,
            computed_hash
        );
        Err(CrucibleError::HashMismatch)
    }
}
```

I agree with the ordering here: attempt authenticated decryption first, then only if it fails check the integrity hash. The authenticated decryption is a "stronger" check anyway. If the block was corrupted in transit, as it's written today this function will likely return HashMismatch instead of DecryptionError. Following where that error is checked though, both
```rust
Some(CrucibleError::DecryptionError) => {
    // We should always be able to decrypt the data. If we
    // can't, then we have the wrong key, or the data (or key)
    // is corrupted.
    error!(
        self.clients[client_id].log,
        "Authenticated decryption failed on job: {:?}", ds_id
    );
    panic!(
        "[{}] Authenticated decryption failed on job: {:?}",
        client_id, ds_id
    );
}
```

and

```rust
// If a read job fails, we sometimes need to panic.
IOop::Read { .. } => {
    // It's possible we get a read error if the
    // downstairs disconnects. However XXX, someone
    // should be told about this error.
    //
    // Some errors, we need to panic on.
    match e {
        CrucibleError::HashMismatch => {
            panic!(
                "[{}] {} read hash mismatch {:?} {:?}",
                self.client_id, ds_id, e, job
            );
        }
        CrucibleError::DecryptionError => {
            panic!(
                "[{}] {} read decrypt error {:?} {:?}",
                self.client_id, ds_id, e, job
            );
        }
```

We don't have anything where the Upstairs would re-request the read result or rerun the read job if it detects a hash mismatch like this. Given a network problem that corrupts packets, a whole bunch of Upstairs could panic, then on some sort of reboot (of the instance or whatnot) all go into reconciliation due to the panics leaving them in a state that requires it. The Upstairs could be smarter about not panicking on a hash mismatch, and somehow degrade gracefully instead.

Writing all this out, I think there's a broader problem to address, because such a network problem could potentially corrupt any frame that is sent between the Upstairs and Downstairs, probably leading to a whole bunch of
Yeah, I think I get it now. I had to draw this out, assuming that network corruption would manifest as bit flips:
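(The drawing itself didn't survive this capture; the table below is a reconstruction of the case analysis from the rest of this comment, assuming a bit flip lands in exactly one part of the frame, and is not the original.)

| Where the bit flip lands | Decryption | Hash check | What the Upstairs can conclude |
| --- | --- | --- | --- |
| nowhere (clean frame) | succeeds | matches | block is good |
| block data, nonce, or tag | fails | mismatch | corruption, but can't tell in-transit from on-disk |
| the hash field itself | succeeds | mismatch | in-transit corruption (the only case we can reliably pin on the network) |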
Having the hash doesn't help us, because when decryption fails there's a hash mismatch, and when decryption succeeds the hash doesn't matter. Sorry I was so confused about this. In fact I think the only part where we can reliably detect packet corruption is the case where the decryption succeeds and there's a hash mismatch, meaning we can say there was a bit flip in the integrity hash part of the frame.

I was hoping to come out of this with some ability of the Upstairs to tell the difference between a decryption failure it can somehow retry (by re-requesting the block or something), and a decryption failure it knows not to retry (due to on-disk corruption). If the Downstairs only validates the hash that is sent by the Upstairs before a write, and the write is successful, then corruption can occur via the disk or network and the Upstairs can't tell the difference. This makes me uncomfortable w.r.t. the potential for cascading reconciliations. I was (again) hoping that we could detect network corruption and not attempt reconciliation over what we can determine to be an unreliable network.

I'll copy what you said in a DM here for posterity: we either have to

where unfortunately both of those options require computing a full hash of the block data in the Downstairs.
This looks good, we can follow up with "the broader issue" separately. Some comments and questions:
upstairs/src/client.rs (Outdated)
// this block context is missing an encryption context!
return Err(CrucibleError::DecryptionError);
};

// We'll start with decryption, ignoring the hash; we're using authenticated
this comment needs updating, and perhaps an additional line or two about how packet corruption currently causes decryption failure.
Yup, done in 5ae6bd0
@@ -570,12 +573,19 @@ pub struct WriteHeader {
    pub contexts: Vec<BlockContext>,
}

#[derive(Debug, PartialEq, Copy, Clone, Serialize, Deserialize)]
pub enum ReadBlockContext {
I really like this stronger type over the Option
Me too 😀
// We should get back our old data, because block 0 was written.
assert_eq!(
    resp.blocks,
    vec![ReadBlockContext::Unencrypted { hash: prev_hash }]
It looks like the value being tested changed here - was this test wrong before? How did this previously pass?
Previously, it was assert_ne!(resp.blocks, vec![Some(block_context)]); i.e. checking that we don't read back the new value. Now, it's assert_eq! and it checks that we do read back the old value.
Cool cool
Right now, the ReadResponse sends back a Vec<Option<BlockContext>>, i.e. an Option<BlockContext> for each block. The BlockContext in turn contains a hash and an optional encryption context.

If encryption_context is populated, then sending hash to the Upstairs is unnecessary: we are using authenticated encryption, so we know whether the block data + context is valid based on whether it decrypted successfully.

This PR removes hash from the ReadResponse message in favor of a new enum:
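(The enum body didn't make it into this capture; the following is a sketch inferred from the variants referenced elsewhere in this PR. The exact variant and field names are assumptions, EncryptionContext is the existing nonce + tag struct from the protocol, and the actual declaration appears, truncated, in the diff excerpt quoted elsewhere on this page.)

```rust
use serde::{Deserialize, Serialize};

// Sketch only, not the verbatim declaration from the protocol crate.
#[derive(Debug, PartialEq, Copy, Clone, Serialize, Deserialize)]
pub enum ReadBlockContext {
    /// The block has never been written
    Empty,
    /// Encrypted block: validity is checked by authenticated decryption,
    /// so no hash needs to travel with it
    Encrypted { ctx: EncryptionContext },
    /// Unencrypted block: the hash is still needed to detect corruption
    Unencrypted { hash: u64 },
}
```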
This does not change the on-disk format or write message format, which both continue to use hashes:

- hash (in BlockContext) lets us detect corruption in transit. We can't use the encryption context here, because the Downstairs isn't allowed to have encryption keys
- on_disk_hash is used to figure out which context slot is valid
- on_disk_hash is never sent over the network¹

This PR is a step towards RFD 490 § Metadata reduction, which proposes to not store block hashes for encrypted data on disk. If in the future we don't store block hashes for encrypted data, we would not be able to send them over the network; this PR removes that future hurdle.
However, the PR stands alone as a small optimization (39 → 32 bytes per block) that simplifies program behavior (no need to think about what happens if encryption fails but the hash matches, or vice versa).
¹ on_disk_hash is also technically superfluous if we already have hash (see #1161), but this PR doesn't change it.