Fixing #7229 requires sled-agent to know, for any installed zone, what
zpool was chosen for that zone's root. It looks like this is tracked by
`InstalledZone::zonepath`:
https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/illumos-utils/src/running_zone.rs#L917-L918
except the `ZpoolName` of `PathInPool` is optional:
https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/illumos-utils/src/zpool.rs#L193
It's not obvious why the pool could be `None`, particularly when there's
a comment on `PathInPool` that says we could derive the pool name from
the path. Trying to make it non-optional reveals two spots where this
can be set to `None`, both in `ZoneArgs::root()`, and for two very
different reasons:
https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/sled-agent/src/services.rs#L572-L584
The second arm is where we construct a `PathInPool` for the switch zone.
That zone is not in a zpool at all, but is in the ramdisk, so there is
truly no zpool involved.
The first arm requires `pool` to be optional because
`zone_config.zone.filesystem_pool` is optional, and in practice we do
have values of `None` there (exactly what #7229 is trying to address!).
This one seems dubious at best, because for any given zone we choose to
install, we _do_ pick a pool for its filesystem, even if the zone config
has `filesystem_pool: None`:
https://github.com/oxidecomputer/omicron/blob/5fa0d8e50fd6ab36c211f3f71bebeab3ccf1860c/sled-agent/src/services.rs#L3707-L3720
This PR changes `PathInPool::pool` from `Option<ZpoolName>` to a new
enum `ZpoolOrRamdisk`:
https://github.com/oxidecomputer/omicron/blob/cd0a87f80301f86661c366979be07bc7ff34f421/illumos-utils/src/zpool.rs#L184-L188
From a Rust type-system point of view, this type is basically the same as
`Option`, but renaming the `None` case to `Ramdisk` makes it
obvious that `ZoneArgs::root()` is incorrect: we can't just change the
first arm to return `ZpoolOrRamdisk::Ramdisk` in the case where the zone
config `filesystem_pool` is `None`, because we're certainly not placing
non-switch zones on the ramdisk. This led to removing `ZoneArgs::root()`
altogether (the zone config alone is not enough to know the root of the
zone, since we choose it randomly in some cases), and instead passing an
extra argument into `initialize_zone` (the fully-populated `PathInPool`
for the zone).
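As a rough illustration of the shape of this change (not the actual omicron code), the sketch below uses stand-in types. `ZpoolName` here is a plain string wrapper, the `Zpool` variant name and the path layouts are assumptions, and `resolve_zone_root` / `switch_zone_root` are hypothetical helpers standing in for the caller-side logic that now builds the fully-populated `PathInPool` before calling `initialize_zone`.

```rust
use std::path::PathBuf;

/// Stand-in for omicron's `ZpoolName` (assumption: the real type is richer).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ZpoolName(pub String);

/// Same shape as `Option<ZpoolName>`, but the "no pool" case is named
/// for what it means. The `Zpool` variant name is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ZpoolOrRamdisk {
    Zpool(ZpoolName),
    Ramdisk,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PathInPool {
    pub pool: ZpoolOrRamdisk,
    pub path: PathBuf,
}

/// Hypothetical helper: a non-switch zone always ends up on some zpool.
/// If the zone config has no `filesystem_pool`, the caller supplies the
/// pool it actually chose, so `Ramdisk` can never be produced here.
pub fn resolve_zone_root(
    configured: Option<ZpoolName>,
    chosen_fallback: ZpoolName,
) -> PathInPool {
    let pool = configured.unwrap_or(chosen_fallback);
    // Illustrative path layout only; not the real mountpoint scheme.
    let path = PathBuf::from(format!("/pool/ext/{}/crypt/zone", pool.0));
    PathInPool { pool: ZpoolOrRamdisk::Zpool(pool), path }
}

/// Hypothetical helper: the switch zone is the only ramdisk-rooted zone.
pub fn switch_zone_root() -> PathInPool {
    // Illustrative path only.
    PathInPool { pool: ZpoolOrRamdisk::Ramdisk, path: PathBuf::from("/zone") }
}
```

With the "no pool" case spelled `Ramdisk`, mapping a missing `filesystem_pool` to it reads as an obvious mistake at the call site, which is the property the rename is meant to provide.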
This feels like the smallest change I could make to address the
immediate hurdle to fixing #7229, but it might be worth spinning out
some separate issues for longer-term follow-up?
1. `PathInPool` is "denormalized" in that the zpool (if there is one) is
repeated in the `path` it also stores. There is no enforcement that the
two fields are consistent: one can construct or modify a `PathInPool`
where the `path` is on some other zpool than the one in `pool`.
2. Should `PathInPool` be used to store paths that aren't in pools at
all? (i.e., the switch zone in the ramdisk)
3. This is more tangentially related, but: the `PathInPool::path` value
for a zone ends up stored in a sled-agent ledger as part of
[`OmicronZoneConfigLocal`](https://github.com/oxidecomputer/omicron/blob/345e095ff9a906ab4f26f3bd623d21b4b4af862a/sled-agent/src/services.rs#L462-L483).
The comment on `OmicronZoneConfigLocal` notes that "this struct is less
necessary than it has been historically". Once #7229 is fixed, can we
make the zone config `filesystem_pool` non-optional and then remove
`OmicronZoneConfigLocal` altogether?
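For item 1, one possible direction (purely a sketch, not something this PR does) would be to make the fields private and validate consistency in a constructor. Everything below is hypothetical: the stand-in `ZpoolName`, its `mountpoint()` layout, the `Zpool` variant name, and the `PathInPool::new` signature are assumptions for illustration.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for omicron's `ZpoolName` (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ZpoolName(pub String);

impl ZpoolName {
    /// Assumed mountpoint layout, for illustration only.
    pub fn mountpoint(&self) -> PathBuf {
        Path::new("/pool/ext").join(&self.0)
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ZpoolOrRamdisk {
    Zpool(ZpoolName),
    Ramdisk,
}

/// Fields are private so callers must go through `new`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PathInPool {
    pool: ZpoolOrRamdisk,
    path: PathBuf,
}

impl PathInPool {
    /// Rejects a path that does not live under the named pool's mountpoint.
    /// Ramdisk paths are accepted unchecked (item 2 is left open here).
    pub fn new(pool: ZpoolOrRamdisk, path: PathBuf) -> Result<Self, String> {
        if let ZpoolOrRamdisk::Zpool(name) = &pool {
            let mountpoint = name.mountpoint();
            // `Path::starts_with` compares whole path components.
            if !path.starts_with(&mountpoint) {
                return Err(format!(
                    "path {} is not under {}",
                    path.display(),
                    mountpoint.display()
                ));
            }
        }
        Ok(Self { pool, path })
    }

    pub fn pool(&self) -> &ZpoolOrRamdisk {
        &self.pool
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}
```

An alternative would be to store only the pool plus a pool-relative path and derive the full path on demand, which removes the duplication instead of policing it.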