Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHD: improve ctrl-c handling #634

Merged
merged 4 commits into from
Feb 14, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
reduce scope of unsafe
  • Loading branch information
gjcolombo committed Feb 14, 2024

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
commit 7f54723708fc45b10813b340d49f796a74772561
79 changes: 36 additions & 43 deletions phd-tests/framework/src/test_vm/server.rs
Original file line number Diff line number Diff line change
@@ -62,52 +62,45 @@ impl PropolisServer {
let (server_stdout, server_stderr) =
log_mode.get_handles(&data_dir, vm_name)?;

// In the normal course of events, dropping a `TestVm` will spawn a task
// that takes ownership of its `PropolisServer` and then sends Propolis
// client calls to gracefully stop the VM running in that server (if
// there is one). This ensures (modulo any bugs in Propolis itself) that
// any Propolis servers hosting a running VM will have a chance to
// destroy their kernel VMMs before the servers themselves go away.
let mut server_cmd = std::process::Command::new("pfexec");
server_cmd
.args([
server_path.as_str(),
"run",
config_toml_path.as_str(),
server_addr.to_string().as_str(),
vnc_addr.to_string().as_str(),
])
.stdout(server_stdout)
.stderr(server_stderr);

// Gracefully shutting down a Propolis server requires PHD to send an
// instance stop request to the server before it is actually terminated.
// This ensures that the server has a chance to clean up kernel VMM
// resources. It's desirable for the server to do this and not PHD
// because a failure to clean up VMMs on stop is a Propolis bug.
//
// The PHD runner tries to give VMs a chance to tear down gracefully on
// Ctrl-C by installing a custom SIGINT handler and using it to tell
// running test tasks to shut down instead of completing their tests.
// This allows the tests' VMs to be dropped gracefully. The trouble is
// that with no additional intervention, pressing Ctrl-C to interrupt a
// PHD runner process will typically cause SIGINTs to be delivered to
// every process in the foreground process group, including both the
// runner and all its Propolis server children. This breaks the `TestVm`
// cleanup logic: by the time the VM is cleaned up, its server process
// is already gone (due to the signal), so its VMM can't be cleaned up.
// The PHD runner sets up a SIGINT handler that tries to give the
// framework an opportunity to issue these requests before the runner
// exits. However, pressing Ctrl-C in a shell will typically broadcast
// SIGINT to all of the processes in the foreground process's group, not
// just to the foreground process itself. This means that a Ctrl-C press
// will usually kill all of PHD's Propolis servers before the cleanup
// logic can run.
//
// To get around this, spawn Propolis server processes with a pre-`exec`
// hook that directs them to ignore SIGINT. They will still be killed
// during `TestVm` cleanup, assuming that executes normally; if the
// runner is rudely terminated after that, the server process is still
// preserved for investigation (and can still be cleaned up by other
// means, e.g. SIGKILL).
//
// Note that it's undesirable for PHD to try to clean up leaked kernel
// VMMs itself: leaking a kernel VMM after gracefully stopping a
// Propolis server VM indicates a Propolis server bug.
// To avoid this problem, add a pre-`exec` hook that directs Propolis
// servers to ignore SIGINT. On Ctrl-C, the runner will drop all active
// `TestVm`s, and this drop path (if allowed to complete) will kill all
// these processes.
unsafe {
server_cmd.pre_exec(move || {
libc::signal(libc::SIGINT, libc::SIG_IGN);
Ok(())
});
}

let server = PropolisServer {
server: unsafe {
std::process::Command::new("pfexec")
.args([
server_path.as_str(),
"run",
config_toml_path.as_str(),
server_addr.to_string().as_str(),
vnc_addr.to_string().as_str(),
])
.stdout(server_stdout)
.stderr(server_stderr)
.pre_exec(move || {
libc::signal(libc::SIGINT, libc::SIG_IGN);
Ok(())
})
.spawn()?
},
server: server_cmd.spawn()?,
address: server_addr,
};

Loading