-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving spawn and exec syscalls #4785
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase onto develop
.
Just a note that a significant portion of the changes here, is due to the fact that the PR is now applied directly to the develop branch. Per CKB-VM's release schedule, CKB will for now stick to |
script/src/syscalls/exec_v2.rs
Outdated
machine.set_register(A0, Mac::REG::from_u8(SLICE_OUT_OF_BOUND)); | ||
return Ok(true); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should ExecV2#ecall
do when length == 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will try to load an empty elf file, which will result in elf parsing errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we handle the check length == 0
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please forgive me, I made a mistake earlier, when length == 0, data[offset...] will be read instead of empty bytes;
The current code behavior is correct.
script/src/syscalls/exec_v2.rs
Outdated
return Ok(true); | ||
} | ||
if length > 0 { | ||
let end = offset.checked_add(length).ok_or(VMError::MemOutOfBound)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both offset
and length
are <= u32::MAX
, so offset.checked_add(length)
will be always a Some
So I think L92->96 can be omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both offset and length are <= u32::MAX, so offset.checked_add(length) will be always a Some.
This description is correct, and we can probably remove this redundant check.
L92->L96 is used to determine whether the data offset exceeds the total length of the data. Why can it be omitted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L92->L96 is used to determine whether the data offset exceeds the total length of the data. Why can it be omitted?
I think
if offset >= full_length {
machine.set_register(A0, Mac::REG::from_u8(SLICE_OUT_OF_BOUND));
return Ok(true);
}
if length > 0 {
let end = offset.checked_add(length).ok_or(VMError::MemOutOfBound)?;
if end > full_length {
machine.set_register(A0, Mac::REG::from_u8(SLICE_OUT_OF_BOUND));
return Ok(true);
}
}
is equal to:
let end = offset.checked_add(length).ok_or(VMError::MemOutOfBound)?;
if end >= full_length {
machine.set_register(A0, Mac::REG::from_u8(SLICE_OUT_OF_BOUND));
return Ok(true);
}
Is above replacement correct?
And should ExecV2#ecall
check length == 0
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, we can combine the two checks and don't need to check the case where length == 0.
We can't combine it. Considering offset == 0 and length == data.len(), we should expect it to succeed, but in the code you gave, it will return SLICE_OUT_OF_BOUND error,
script/src/syscalls/exec_v2.rs
Outdated
let argc = machine.registers()[A4].clone(); | ||
let argv = machine.registers()[A5].clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L53->L55 is:
let index = machine.registers()[A0].to_u64();
let mut source = machine.registers()[A1].to_u64();
let place = machine.registers()[A2].to_u64();
So how about use .to_u64()
too
let argc = machine.registers()[A4].clone(); | |
let argv = machine.registers()[A5].clone(); | |
let argc = machine.registers()[A4].to_u64(); | |
let argv = machine.registers()[A5].to_u64(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we don't have to clone the data here.
data_piece_id: &DataPieceId, | ||
offset: u64, | ||
length: u64, | ||
args: Option<(u64, u64, u64)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about creating a new struct:
struct VmArgs {
vm_id: u64,
argc: u64,
argv: u64,
}
to replace the anonymouse turple (u64, u64, u64)
? This would improve code readability and make the parameters more self-explanatory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous internal review, Xuejie suggested that I use Option<(u64, u64, u64)>
. I am not sure whether I should make a struct for them now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did re-check offline communication logs, previously this is represented using the following structure:
pub enum BootArgsType {
Static,
Stream { vm_id: u64, argc: u64, argp: u64 },
}
Since the enum only had 2 variant: one that has 3 arguments, while the other is empty. I just believe that this 2-value enum, could in fact simply be an Option
.
I have no opinion whether it is Option<(u64, u64, u64)>
or Option<VmArgs>
, both feel the same to me. I never say (u64, u64, u64)
is better in readability to VmArgs
or ArgvPointer
or another name for the structure, both feel the same to me but I do understand a different opinion might occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor comments on the necessity of following old code behaviors
script/src/scheduler.rs
Outdated
args.location.length, | ||
) { | ||
Ok(val) => val, | ||
Err(Error::SnapshotDataLoadError) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed this, for a V2 clean design, can we just return the snapshot data load error here? I think there is no need to mimic v1 behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should return a VMError
script/src/syscalls/exec_v2.rs
Outdated
let index = machine.registers()[A0].to_u64(); | ||
let mut source = machine.registers()[A1].to_u64(); | ||
let place = machine.registers()[A2].to_u64(); | ||
// To keep compatible with the old behavior. When Source is wrong, a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do want to ask here: if old behavior is not a concern, what a proper solution here should be?
Same question goes for Place
parsing below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest removing this part of the compatibility code.
@eval-exec I have no idea why |
Ok, That error occurred before, and it seems to be related to ckb-rocksdb. Investigating... |
Latest unit test failed:
https://github.com/nervosnetwork/ckb/actions/runs/12885771772/job/35970732805?pr=4785 |
The incorrect test case has been fixed, please re-execute ci |
What problem does this PR solve?
Problem Summary:
The spawn and exec system calls have some usage issues. The overall operation is too complicated and requires additional checks when reading args from the vm. We found that these checks are unnecessary and can be avoided by other means.
What is changed and how it works?
What's Changed:
FlattenedArgsReader
to simplify reading args in spawn and exec syscalls.Related changes
owner/repo
:Check List
Tests
Side effects
Release note