Skip to content

Commit 2efb9d5

Browse files
authored
[sled-agent] Clarify PathInRoot::zpool (#7428)
Fixing #7229 requires sled-agent to know, for any installed zone, what zpool was chosen for that zone's root. It looks like this is tracked by `InstalledZone::zonepath`: https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/illumos-utils/src/running_zone.rs#L917-L918 except the `ZpoolName` of `PathInPool` is optional: https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/illumos-utils/src/zpool.rs#L193 It's not obvious why the pool could be `None`, particularly when there's a comment on `PathInPool` that says we could derive the pool name from the path. Trying to make it non-optional reveals two spots where this can be set to none, both in `ZoneArgs::root()`, and for two very different reasons: https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/sled-agent/src/services.rs#L572-L584 The second arm is where we construct a `PathInPool` for the switch zone. That zone is not in a zpool at all, but is in the ramdisk, so there is truly no zpool involved. The first arm requires `pool` to be optional because `zone_config.zone.filesystem_pool` is optional, and in practice we do have values of `None` there (exactly what #7229 is trying to address!). This one seems dubious at best, because for any given zone we choose to install, we _do_ pick a pool for its filesystem, even if the zone config has `filesystem_pool: None`: https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/sled-agent/src/services.rs#L3707-L3720 This PR changes `PathInPool::pool` from `Option<ZpoolName>` to a new enum `ZpoolOrRamdisk`: https://github.com/oxidecomputer/omicron/blob/cd0a87f80301f86661c366979be07bc7ff34f421/illumos-utils/src/zpool.rs#L184-L188 From a Rust-typesystem point of view, this type is basically the same as `Option`, except by renaming the `None` case to `Ramdisk` it becomes obvious that `ZoneArgs::root()` is incorrect: we can't just change the first arm to return `ZpoolOrRamdisk::Ramdisk` in the case where the zone config `filesystem_pool` is `None`, because we're certainly not placing non-switch zones on the ramdisk. This led to removing `ZoneArgs::root()` altogether (the zone config alone is not enough to know the root of the zone, since we choose it randomly in some cases), and instead passing an extra argument into `initialize_zone` (the fully-populated `PathInPool` for the zone). This feels like the smallest change I could make to address the immediate hurdle to fixing #7229, but it might be worth spinning out some separate issues for longer term followup? 1. `PathInPool` is "denormalized" in that the zpool (if there is one) is repeated in the `path` it also stores. There is no enforcement that the two fields are consistent: one can construct or modify a `PathInPool` where the `path` is on some other zpool than the one in `pool`. 2. Should `PathInPool` be used to store paths that aren't in pools at all? (i.e., the switch zone in the ramdisk) 3. This is more tangentially related, but: the `PathInPool::path` value for a zone ends up stored in a sled-agent ledger as part of [`OmicronZoneConfigLocal`](https://github.com/oxidecomputer/omicron/blob/345e095ff9a906ab4f26f3bd623d21b4b4af862a/sled-agent/src/services.rs#L462-L483). The comment on `OmicronZoneConfigLocal` notes that "this struct is less necessary than it has been historically". Once #7229 is fixed, can we make the zone config `filesystem_pool` non-optional and then remove `OmicronZoneConfigLocal` altogether?
1 parent b7d7ba9 commit 2efb9d5

File tree

6 files changed

+53
-50
lines changed

6 files changed

+53
-50
lines changed

illumos-utils/src/running_zone.rs

+3-3
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ use crate::link::{Link, VnicAllocator};
1313
use crate::opte::{Port, PortTicket};
1414
use crate::svc::wait_for_service;
1515
use crate::zone::AddressRequest;
16-
use crate::zpool::{PathInPool, ZpoolName};
16+
use crate::zpool::{PathInPool, ZpoolOrRamdisk};
1717
use camino::{Utf8Path, Utf8PathBuf};
1818
use camino_tempfile::Utf8TempDir;
1919
use ipnetwork::IpNetwork;
@@ -358,8 +358,8 @@ impl RunningZone {
358358
}
359359

360360
/// Returns the zpool on which the filesystem path has been placed.
361-
pub fn root_zpool(&self) -> Option<&ZpoolName> {
362-
self.inner.zonepath.pool.as_ref()
361+
pub fn root_zpool(&self) -> &ZpoolOrRamdisk {
362+
&self.inner.zonepath.pool
363363
}
364364

365365
/// Return the name of a bootstrap VNIC in the zone, if any.

illumos-utils/src/zpool.rs

+7-1
Original file line numberDiff line numberDiff line change
@@ -181,6 +181,12 @@ impl FromStr for ZpoolInfo {
181181
/// Wraps commands for interacting with ZFS pools.
182182
pub struct Zpool {}
183183

184+
#[derive(Debug, Clone, Eq, PartialEq)]
185+
pub enum ZpoolOrRamdisk {
186+
Zpool(ZpoolName),
187+
Ramdisk,
188+
}
189+
184190
/// A path which exists within a pool.
185191
///
186192
/// By storing these types together, it's possible to answer
@@ -190,7 +196,7 @@ pub struct Zpool {}
190196
// Rather Not.
191197
#[derive(Debug, Clone, Eq, PartialEq)]
192198
pub struct PathInPool {
193-
pub pool: Option<ZpoolName>,
199+
pub pool: ZpoolOrRamdisk,
194200
pub path: Utf8PathBuf,
195201
}
196202

sled-agent/src/instance.rs

+5-1
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ use illumos_utils::opte::{DhcpCfg, PortCreateParams, PortManager};
2424
use illumos_utils::running_zone::{RunningZone, ZoneBuilderFactory};
2525
use illumos_utils::svc::wait_for_service;
2626
use illumos_utils::zone::PROPOLIS_ZONE_PREFIX;
27+
use illumos_utils::zpool::ZpoolOrRamdisk;
2728
use omicron_common::api::internal::nexus::{SledVmmState, VmmRuntimeState};
2829
use omicron_common::api::internal::shared::{
2930
NetworkInterface, ResolvedVpcFirewallRule, SledIdentifiers, SourceNatConfig,
@@ -1836,7 +1837,10 @@ impl InstanceRunner {
18361837
let Some(run_state) = &self.running_state else {
18371838
return None;
18381839
};
1839-
run_state.running_zone.root_zpool().map(|p| p.clone())
1840+
match run_state.running_zone.root_zpool() {
1841+
ZpoolOrRamdisk::Zpool(zpool_name) => Some(zpool_name.clone()),
1842+
ZpoolOrRamdisk::Ramdisk => None,
1843+
}
18401844
}
18411845

18421846
fn current_state(&self) -> SledVmmState {

sled-agent/src/probe_manager.rs

+11-4
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ use illumos_utils::link::VnicAllocator;
66
use illumos_utils::opte::{DhcpCfg, PortCreateParams, PortManager};
77
use illumos_utils::running_zone::{RunningZone, ZoneBuilderFactory};
88
use illumos_utils::zone::Zones;
9+
use illumos_utils::zpool::ZpoolOrRamdisk;
910
use nexus_client::types::{
1011
BackgroundTasksActivateRequest, ProbeExternalIp, ProbeInfo,
1112
};
@@ -126,10 +127,16 @@ impl ProbeManager {
126127
.zones
127128
.iter()
128129
.filter_map(|(id, probe)| {
129-
let Some(probe_pool) = probe.root_zpool() else {
130-
// No known pool for this probe
131-
info!(self.inner.log, "use_only_these_disks: Cannot read filesystem pool"; "id" => ?id);
132-
return None;
130+
let probe_pool = match probe.root_zpool() {
131+
ZpoolOrRamdisk::Zpool(zpool_name) => zpool_name,
132+
ZpoolOrRamdisk::Ramdisk => {
133+
info!(
134+
self.inner.log,
135+
"use_only_these_disks: removing probe on ramdisk";
136+
"id" => ?id,
137+
);
138+
return None;
139+
}
133140
};
134141

135142
if !u2_set.contains(probe_pool) {

sled-agent/src/services.rs

+25-39
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ use illumos_utils::running_zone::{
5555
use illumos_utils::smf_helper::SmfHelper;
5656
use illumos_utils::zfs::ZONE_ZFS_RAMDISK_DATASET_MOUNTPOINT;
5757
use illumos_utils::zone::AddressRequest;
58-
use illumos_utils::zpool::{PathInPool, ZpoolName};
58+
use illumos_utils::zpool::{PathInPool, ZpoolName, ZpoolOrRamdisk};
5959
use illumos_utils::{execute, PFEXEC};
6060
use internal_dns_resolver::Resolver;
6161
use internal_dns_types::names::BOUNDARY_NTP_DNS_NAME;
@@ -533,20 +533,11 @@ impl illumos_utils::smf_helper::Service for SwitchService {
533533
}
534534
}
535535

536-
/// Combines the generic `SwitchZoneConfig` with other locally-determined
537-
/// configuration
538-
///
539-
/// This is analogous to `OmicronZoneConfigLocal`, but for the switch zone.
540-
struct SwitchZoneConfigLocal {
541-
zone: SwitchZoneConfig,
542-
root: Utf8PathBuf,
543-
}
544-
545536
/// Describes either an Omicron-managed zone or the switch zone, used for
546537
/// functions that operate on either one or the other
547538
enum ZoneArgs<'a> {
548539
Omicron(&'a OmicronZoneConfigLocal),
549-
Switch(&'a SwitchZoneConfigLocal),
540+
Switch(&'a SwitchZoneConfig),
550541
}
551542

552543
impl<'a> ZoneArgs<'a> {
@@ -565,20 +556,7 @@ impl<'a> ZoneArgs<'a> {
565556
) -> Box<dyn Iterator<Item = &'a SwitchService> + 'a> {
566557
match self {
567558
ZoneArgs::Omicron(_) => Box::new(std::iter::empty()),
568-
ZoneArgs::Switch(request) => Box::new(request.zone.services.iter()),
569-
}
570-
}
571-
572-
/// Return the root filesystem path for this zone
573-
pub fn root(&self) -> PathInPool {
574-
match self {
575-
ZoneArgs::Omicron(zone_config) => PathInPool {
576-
pool: zone_config.zone.filesystem_pool.clone(),
577-
path: zone_config.root.clone(),
578-
},
579-
ZoneArgs::Switch(zone_request) => {
580-
PathInPool { pool: None, path: zone_request.root.clone() }
581-
}
559+
ZoneArgs::Switch(request) => Box::new(request.services.iter()),
582560
}
583561
}
584562
}
@@ -1444,6 +1422,7 @@ impl ServiceManager {
14441422
async fn initialize_zone(
14451423
&self,
14461424
request: ZoneArgs<'_>,
1425+
zone_root_path: PathInPool,
14471426
filesystems: &[zone::Fs],
14481427
data_links: &[String],
14491428
fake_install_dir: Option<&String>,
@@ -1527,7 +1506,7 @@ impl ServiceManager {
15271506
let installed_zone = zone_builder
15281507
.with_log(self.inner.log.clone())
15291508
.with_underlay_vnic_allocator(&self.inner.underlay_vnic_allocator)
1530-
.with_zone_root_path(request.root())
1509+
.with_zone_root_path(zone_root_path)
15311510
.with_zone_image_paths(zone_image_paths.as_slice())
15321511
.with_zone_type(zone_type_str)
15331512
.with_datasets(datasets.as_slice())
@@ -2455,10 +2434,7 @@ impl ServiceManager {
24552434
})?;
24562435
RunningZone::boot(installed_zone).await?
24572436
}
2458-
ZoneArgs::Switch(SwitchZoneConfigLocal {
2459-
zone: SwitchZoneConfig { id, services, addresses },
2460-
..
2461-
}) => {
2437+
ZoneArgs::Switch(SwitchZoneConfig { id, services, addresses }) => {
24622438
let info = self.inner.sled_info.get();
24632439

24642440
let gw_addr = match info {
@@ -3247,20 +3223,23 @@ impl ServiceManager {
32473223
}
32483224

32493225
// Ensure that this zone's storage is ready.
3250-
let root = self
3226+
let zone_root_path = self
32513227
.validate_storage_and_pick_mountpoint(
32523228
mount_config,
32533229
&zone,
32543230
&all_u2_pools,
32553231
)
32563232
.await?;
32573233

3258-
let config =
3259-
OmicronZoneConfigLocal { zone: zone.clone(), root: root.path };
3234+
let config = OmicronZoneConfigLocal {
3235+
zone: zone.clone(),
3236+
root: zone_root_path.path.clone(),
3237+
};
32603238

32613239
let runtime = self
32623240
.initialize_zone(
32633241
ZoneArgs::Omicron(&config),
3242+
zone_root_path,
32643243
// filesystems=
32653244
&[],
32663245
// data_links=
@@ -3732,7 +3711,8 @@ impl ServiceManager {
37323711
}
37333712
let path = filesystem_pool
37343713
.dataset_mountpoint(&mount_config.root, ZONE_DATASET);
3735-
Ok(PathInPool { pool: Some(filesystem_pool), path })
3714+
let pool = ZpoolOrRamdisk::Zpool(filesystem_pool);
3715+
Ok(PathInPool { pool, path })
37363716
}
37373717

37383718
pub async fn cockroachdb_initialize(&self) -> Result<(), Error> {
@@ -4647,12 +4627,18 @@ impl ServiceManager {
46474627
// we could not create the U.2 disks due to lack of encryption. To break
46484628
// the cycle we put the switch zone root fs on the ramdisk.
46494629
let root = Utf8PathBuf::from(ZONE_ZFS_RAMDISK_DATASET_MOUNTPOINT);
4650-
let zone_request =
4651-
SwitchZoneConfigLocal { root, zone: request.clone() };
4652-
let zone_args = ZoneArgs::Switch(&zone_request);
4653-
info!(self.inner.log, "Starting switch zone",);
4630+
let zone_root_path =
4631+
PathInPool { pool: ZpoolOrRamdisk::Ramdisk, path: root.clone() };
4632+
let zone_args = ZoneArgs::Switch(&request);
4633+
info!(self.inner.log, "Starting switch zone");
46544634
let zone = self
4655-
.initialize_zone(zone_args, filesystems, data_links, None)
4635+
.initialize_zone(
4636+
zone_args,
4637+
zone_root_path,
4638+
filesystems,
4639+
data_links,
4640+
None,
4641+
)
46564642
.await?;
46574643
*sled_zone =
46584644
SwitchZoneState::Running { request: request.clone(), zone };

sled-storage/src/resources.rs

+2-2
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ use crate::disk::{Disk, DiskError, RawDisk};
1010
use crate::error::Error;
1111
use camino::Utf8PathBuf;
1212
use cfg_if::cfg_if;
13-
use illumos_utils::zpool::{PathInPool, ZpoolName};
13+
use illumos_utils::zpool::{PathInPool, ZpoolName, ZpoolOrRamdisk};
1414
use key_manager::StorageKeyRequester;
1515
use omicron_common::api::external::Generation;
1616
use omicron_common::disk::{
@@ -134,7 +134,7 @@ impl AllDisks {
134134
.map(|pool| {
135135
let path =
136136
pool.dataset_mountpoint(&self.mount_config.root, dataset);
137-
PathInPool { pool: Some(pool), path }
137+
PathInPool { pool: ZpoolOrRamdisk::Zpool(pool), path }
138138
})
139139
.collect()
140140
}

0 commit comments

Comments
 (0)