-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHD: improve ctrl-c handling #634
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
hawkw
reviewed
Jan 31, 2024
hawkw
reviewed
Feb 1, 2024
bd828df
to
230a53a
Compare
Merged
3baee13
to
9200778
Compare
hawkw
approved these changes
Feb 12, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good to me! i have one very small style nit to pick, but it's not as blocker.
Install a signal handler in the PHD runner that catches SIGINT and relays it via `tokio::sync::watch` to any tasks that are currently executing tests. This allows the test tasks to exit at the next available `await` point. Since these tasks are themselves awaited before the runner exits, this ensures that all VMs and disks created by a test can be dropped and cleaned up before the runner goes away. Ctrl-C characters sent to a shell will typically cause SIGINT to be sent to all processes in the foreground process group. This is bad for PHD because any Propolis servers the runner spawned will be killed before `TestVm::drop` gets a chance to gracefully stop the VMs inside. Since dropping a `PropolisServer` kills the process in any case, get around this by configuring PHD-spawned Propolis servers to ignore SIGINT. Tested with local Debian 11 runs, observing in particular that interrupting a run with Ctrl-C does not leak any VMMs or temporary disk objects.
13a27d1
to
c655dfb
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
(N.B. Depends on #633.)
Install a SIGINT handler in the PHD runner and use it to broadcast to all test execution tasks that this signal was received. Add a
select!
to the tasks to preempt further execution of test cases when this happens. Dropping the test case futures also drops all the VMs and disks they create, giving the runner a chance to clean up kernel VMMs and disk backing files before exiting. (If one SIGINT is handled and a second is received, exit immediately to allow the runner to be killed even if a test case hasn't reached anawait
point.)To ensure that Propolis processes get a chance to stop gracefully, add a pre-
exec
hook so that the framework's Propolis servers ignore SIGINT. This is needed so that hitting Ctrl-C to interrupt a foreground runner doesn't take out the Propolis servers in the same process group beforeTestVm::drop
gets a chance to clean them up.Tested with runs of Debian 11 guests, both running to completion (all tests still pass) and interrupted at various points midway (verified that kernel VMMs and disks get cleaned up as expected when they didn't before).