Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use production propolis-server flags in PHD CI builds #878

Merged
merged 7 commits into from
Mar 8, 2025
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion .github/buildomat/jobs/phd-build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,10 @@ rustc --version
# Build the Propolis server binary with 'dev' profile to enable assertions that
# should fire during tests.
banner build-propolis
ptime -m cargo build --verbose -p propolis-server

export PHD_BUILD="true"
ptime -m cargo build --verbose -p propolis-server \
--features omicron-build,failure-injection

# The PHD runner requires unwind-on-panic to catch certain test failures, so
# build it with the 'dev' profile which is so configured.
Expand Down
12 changes: 12 additions & 0 deletions .github/buildomat/phd-run-with-args.sh
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,18 @@ if [ ! -d "$tmpdir" ]; then
mkdir $tmpdir
fi

# We'll be using the reservoir, so set it to something higher than the default
# 0MiB. Most tests would only need 512MiB (the default PHD guest VM memory
# size), some tests want to run multiple VMs concurrently (particularly around
# migration). 4096MiB is an arbitrary number intended to support the above and
# that we might want to run those tests concurrently at some point.
#
# Currently the lab host these tests will run on is well-known and has much
# more memory than this. Hopefully we won't have Propolis CI running on a
# machine with ~4GiB of memory, but this number could be tuned down if the need
# arises.
pfexec /usr/lib/rsrvrctl -s 4096

banner 'Tests'

runner="$phddir/phd-runner"
Expand Down
6 changes: 6 additions & 0 deletions bin/propolis-server/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -83,3 +83,9 @@ omicron-build = ["propolis/omicron-build"]

# Falcon builds require corresponding bits turned on in the dependency libs
falcon = ["propolis/falcon"]

# Testing necessitates injecting failures which should hopefully be rare or even
# never occur on real otherwise-unperturbed systems. We conditionally compile
# code supporting failure injection to avoid the risk of somehow injecting
# failures into a real system not under test.
failure-injection = []
2 changes: 1 addition & 1 deletion bin/propolis-server/src/lib/initializer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -814,7 +814,7 @@ impl MachineInitializer<'_> {
Ok(())
}

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
pub fn initialize_test_devices(&mut self) {
use propolis::hw::testdev::{
MigrationFailureDevice, MigrationFailures,
Expand Down
4 changes: 2 additions & 2 deletions bin/propolis-server/src/lib/migrate/preamble.rs
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ impl Preamble {
};

match comp {
#[cfg(feature = "omicron-build")]
#[cfg(not(feature = "failure-injection"))]
ReplacementComponent::MigrationFailureInjector(_) => {
return Err(MigrateError::InstanceSpecsIncompatible(
format!(
Expand All @@ -62,7 +62,7 @@ impl Preamble {
));
}

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
ReplacementComponent::MigrationFailureInjector(comp) => {
let ComponentV0::MigrationFailureInjector(src) = to_amend
else {
Expand Down
12 changes: 6 additions & 6 deletions bin/propolis-server/src/lib/spec/api_spec_v0.rs
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ use super::{
StorageDevice,
};

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
use super::MigrationFailure;

#[cfg(feature = "falcon")]
Expand Down Expand Up @@ -65,7 +65,7 @@ impl From<Spec> for InstanceSpecV0 {
serial,
pci_pci_bridges,
pvpanic,
#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
migration_failure,
#[cfg(feature = "falcon")]
softnpu,
Expand Down Expand Up @@ -157,7 +157,7 @@ impl From<Spec> for InstanceSpecV0 {
);
}

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
if let Some(mig) = migration_failure {
insert_component(
&mut spec,
Expand Down Expand Up @@ -310,14 +310,14 @@ impl TryFrom<InstanceSpecV0> for Spec {
// apply it to the builder later.
boot_settings = Some((device_id, settings));
}
#[cfg(feature = "omicron-build")]
#[cfg(not(feature = "failure-injection"))]
ComponentV0::MigrationFailureInjector(_) => {
return Err(ApiSpecError::FeatureCompiledOut {
component: device_id,
feature: "omicron-build",
feature: "failure-injection",
});
}
#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
ComponentV0::MigrationFailureInjector(mig) => {
builder.add_migration_failure_device(MigrationFailure {
id: device_id,
Expand Down
6 changes: 3 additions & 3 deletions bin/propolis-server/src/lib/spec/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ use super::{
Board, BootOrderEntry, BootSettings, Disk, Nic, QemuPvpanic, SerialPort,
};

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
use super::MigrationFailure;

#[cfg(feature = "falcon")]
Expand All @@ -50,7 +50,7 @@ pub(crate) enum SpecBuilderError {
#[error("pvpanic device already specified")]
PvpanicInUse,

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
#[error("migration failure injection already enabled")]
MigrationFailureInjectionInUse,

Expand Down Expand Up @@ -269,7 +269,7 @@ impl SpecBuilder {
Ok(self)
}

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
pub fn add_migration_failure_device(
&mut self,
mig: MigrationFailure,
Expand Down
6 changes: 3 additions & 3 deletions bin/propolis-server/src/lib/spec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ use propolis_api_types::instance_spec::{
};
use thiserror::Error;

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
use propolis_api_types::instance_spec::components::devices::MigrationFailureInjector;

#[cfg(feature = "falcon")]
Expand Down Expand Up @@ -73,7 +73,7 @@ pub(crate) struct Spec {
pub pci_pci_bridges: BTreeMap<SpecKey, PciPciBridge>,
pub pvpanic: Option<QemuPvpanic>,

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
pub migration_failure: Option<MigrationFailure>,

#[cfg(feature = "falcon")]
Expand Down Expand Up @@ -285,7 +285,7 @@ pub struct QemuPvpanic {
pub spec: QemuPvpanicDesc,
}

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
#[derive(Clone, Debug)]
pub struct MigrationFailure {
pub id: SpecKey,
Expand Down
4 changes: 1 addition & 3 deletions bin/propolis-server/src/lib/vm/ensure.rs
Original file line number Diff line number Diff line change
Expand Up @@ -564,10 +564,8 @@ async fn initialize_vm_objects(
))?;
init.initialize_network_devices(&chipset).await?;

#[cfg(not(feature = "omicron-build"))]
#[cfg(feature = "failure-injection")]
init.initialize_test_devices();
#[cfg(feature = "omicron-build")]
info!(log, "`omicron-build` feature enabled, ignoring any test devices");

#[cfg(feature = "falcon")]
{
Expand Down
13 changes: 13 additions & 0 deletions bin/propolis-server/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -278,6 +278,19 @@ fn main() -> anyhow::Result<()> {
// Ensure proper setup of USDT probes
register_probes().unwrap();

#[cfg(all(
feature = "omicron-build",
any(feature = "failure-injection", feature = "falcon")
))]
if option_env!("PHD_BUILD") != Some("true") {
panic!(
"`omicron-build` is enabled alongside development features, \
this build is NOT SUITABLE for production. Set PHD_BUILD=true in \
the environment and rebuild propolis-server if you really need \
this to work."
);
}

// Command line arguments.
let args = Args::parse();

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -193,8 +193,8 @@ pub struct P9fs {
/// Describes a synthetic device that registers for VM lifecycle notifications
/// and returns errors during attempts to migrate.
///
/// This is only supported by Propolis servers compiled without the
/// `omicron-build` feature.
/// This is only supported by Propolis servers compiled with the
/// `failure-injection` feature.
#[derive(Clone, Deserialize, Serialize, Debug, JsonSchema)]
#[serde(deny_unknown_fields)]
pub struct MigrationFailureInjector {
Expand Down
2 changes: 1 addition & 1 deletion openapi/propolis-server.json
Original file line number Diff line number Diff line change
Expand Up @@ -1526,7 +1526,7 @@
]
},
"MigrationFailureInjector": {
"description": "Describes a synthetic device that registers for VM lifecycle notifications and returns errors during attempts to migrate.\n\nThis is only supported by Propolis servers compiled without the `omicron-build` feature.",
"description": "Describes a synthetic device that registers for VM lifecycle notifications and returns errors during attempts to migrate.\n\nThis is only supported by Propolis servers compiled with the `failure-injection` feature.",
"type": "object",
"properties": {
"fail_exports": {
Expand Down
38 changes: 33 additions & 5 deletions xtask/src/task_phd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

use anyhow::Context;
use camino::{Utf8Path, Utf8PathBuf};
use std::{fs, process::Command, time};
use std::{collections::HashMap, fs, process::Command, time};

macro_rules! cargo_log {
($tag:literal, $($arg:tt)*) => {
Expand Down Expand Up @@ -270,15 +270,15 @@ impl Cmd {
return Ok(());
}
Self::List { phd_args } => {
let phd_runner = build_bin("phd-runner", false)?;
let phd_runner = build_bin("phd-runner", false, None, None)?;
let status = run_exit_code(
phd_runner.command().arg("list").args(phd_args),
)?;
std::process::exit(status);
}

Self::RunnerHelp { phd_args } => {
let phd_runner = build_bin("phd-runner", false)?;
let phd_runner = build_bin("phd-runner", false, None, None)?;
let status = run_exit_code(
phd_runner.command().arg("help").args(phd_args),
)?;
Expand All @@ -292,7 +292,20 @@ impl Cmd {
cmd
}
None => {
let bin = build_bin("propolis-server", propolis_args.release)?;
let mut server_build_env = HashMap::new();
server_build_env
.insert("PHD_BUILD".to_string(), "true".to_string());
let bin = build_bin(
"propolis-server",
propolis_args.release,
// Some PHD tests specifically cover cases where a component
// in the system has encountered an error, so enable
// failure-injection. Do not enable `omicron-build` like we
// do in Buildomat because someone running `cargo xtask phd`
// is not building a propolis-server destined for Omicron.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, unless they are :) I think this was my use case earlier this week, and it may also come in handy if someone wants to try to reproduce a CI failure on a local machine.

Perhaps add an omicron_build switch to PropolisArgs to allow the user to opt into adding that feature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..... i suppose cargo xtask phd does build a replacement propolis-server doesn't it.

but yeah, a build to get omicron-build enabled makes a lot of sense. lemme do that.

Copy link
Member Author

@iximeow iximeow Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in writing 8affc65 i discovered that cargo xtask phd run and cargo xtask phd run --omicron-build ends up with artifacts in different directories. so if you run one, then the other, then the first one again expecting another propolis-server rebuild, it'll actually just run immediately. surprise!

but the test suite passes on my workstation with or without the flag (wherein i have a reservoir configured), everything fails (as expected!) when i set the reservoir too small.

Some("failure-injection"),
Some(server_build_env),
)?;
let path = bin
.path()
.try_into()
Expand Down Expand Up @@ -345,7 +358,7 @@ impl Cmd {
anyhow::bail!("Missing artifacts config `{artifacts_toml}`!");
}

let phd_runner = build_bin("phd-runner", false)?;
let phd_runner = build_bin("phd-runner", false, None, None)?;
let mut cmd = if cfg!(target_os = "illumos") {
let mut cmd = Command::new("pfexec");
cmd.arg(phd_runner.path());
Expand Down Expand Up @@ -418,15 +431,30 @@ impl BasePropolisArgs {
}
}

/// Build the binary `name` in debug or release with an optional build
/// environment variables and a list of Cargo features.
///
/// `features` is passed directly to Cargo, and so must be a space or
/// comma-separated list of features to activate.
fn build_bin(
name: impl AsRef<str>,
release: bool,
features: Option<&str>,
build_env: Option<HashMap<String, String>>,
) -> anyhow::Result<escargot::CargoRun> {
let name = name.as_ref();
cargo_log!("Compiling", "{name}");

let mut cmd =
escargot::CargoBuild::new().package(name).bin(name).current_target();
if let Some(features) = features {
cmd = cmd.features(features);
}
if let Some(env) = build_env {
for (k, v) in env {
cmd = cmd.env(k, v);
}
}
let profile = if release {
cmd = cmd.release();
"release [optimized]"
Expand Down
Loading