This directory contains the per-sled "agent", responsible for managing local hardware and instances.
This subdirectory contains implementations of both a "simulated" and "real" Sled Agent. The simulated agent allows testing state machine management in a low-overhead environment. Where reasonable, decision-making logic is shared between the two implementations.
- src/bin: Contains binaries for the sled agent (and simulated sled agent).
- src/bootstrap: Contains bootstrap-related services, operating on a distinct HTTP endpoint from typical sled operation.
- src/common: Shared state machine code between the simulated and real sled agent.
- src/sim: Library code responsible for operating a simulated sled agent.
- src/illumos: Illumos-specific helpers for accessing OS utilities to manage a sled.
Additionally, there are some noteworthy top-level files used by the sled agent:
- src/instance_manager.rs: Manages multiple instances on a sled.
- src/instance.rs: Manages a single instance.

- src/storage_manager.rs: Manages storage within a sled.
- src/services.rs: Manages services that are running within the sled.
- src/updates.rs: Manages packages installed on the sled.

- src/http_entrypoints.rs: The HTTP API exposed by the Sled Agent.
- src/params.rs: Parameters to the HTTP endpoints.
As well as some utilities:
- src/illumos/running_zone.rs: RAII wrapper around a running Zone owned by the Sled Agent.
- src/illumos/vnic.rs: RAII wrapper around VNICs owned by the Sled Agent.
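As an illustration of the RAII pattern these wrappers rely on, here is a minimal sketch. It is not the code in src/illumos/vnic.rs: `FakeDatalinks` and `VnicGuard` are hypothetical names, and the OS calls are replaced by an in-memory list so the example runs anywhere.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the OS calls that create and delete VNICs.
/// It only records names, so the sketch does not touch the real system.
#[derive(Default)]
pub struct FakeDatalinks {
    pub live: Vec<String>,
}

/// RAII guard: the (fake) VNIC exists for as long as this value is alive.
pub struct VnicGuard {
    name: String,
    os: Rc<RefCell<FakeDatalinks>>,
}

impl VnicGuard {
    /// "Creates" the VNIC and returns a guard that owns it.
    pub fn create(os: Rc<RefCell<FakeDatalinks>>, name: &str) -> Self {
        os.borrow_mut().live.push(name.to_string());
        VnicGuard { name: name.to_string(), os }
    }

    /// Name of the VNIC owned by this guard.
    pub fn name(&self) -> &str {
        &self.name
    }
}

impl Drop for VnicGuard {
    /// "Deletes" the VNIC when the guard goes out of scope.
    fn drop(&mut self) {
        self.os.borrow_mut().live.retain(|n| n != &self.name);
    }
}
```

Tying cleanup to `Drop` means an early return or error path cannot leave the resource behind, which is the usual motivation for wrapping OS resources this way.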
Note: This process is subject to change. What follows attempts to be an accurate description of the current implementation.
As a prerequisite for starting new instances, the Sled Agent takes the following steps on initialization to manage OS-local resources.
- … creates a ZFS filesystem for zones, called rpool/zone, mounted at /zone.
- … identifies all Oxide-controlled zones (with the prefix oxz_) and all Oxide-controlled VNICs (with the prefix ox), which are removed from the machine.
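The prefix-based identification in the second step can be sketched as follows. This is an assumption-laden illustration, not the agent's code: `oxide_controlled` is a hypothetical helper, and the real agent would obtain the names from the OS rather than from a slice.

```rust
/// Prefixes named in the text above.
pub const ZONE_PREFIX: &str = "oxz_";
pub const VNIC_PREFIX: &str = "ox";

/// Returns the names that start with `prefix`, i.e. the candidates for removal.
/// (Hypothetical helper; the real agent queries the OS for these names.)
pub fn oxide_controlled<'a>(names: &[&'a str], prefix: &str) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| name.starts_with(prefix))
        .collect()
}
```

For example, given the made-up zone names `global` and `oxz_example`, only `oxz_example` would be selected with `ZONE_PREFIX`.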
- A request arrives via the HTTP server (typically from Nexus), requesting an operation like instance_put.
- This request is routed to the InstanceManager through the ensure method. If the instance already exists, it is identified by UUID and updated. Otherwise, a new instance is created.
- A dedicated "control" VNIC is created by the Sled Agent, for use by the instance’s zone.
  - This requires identifying a physical data-link device, and creating a new VNIC on top of it.
- An omicron-branded zone is created for the instance. Within this zone…
  - … Propolis' executable files exist.
  - … The previously created "control" VNIC is attached as a network interface.
  - … Necessary devices for virtualization (/dev/{vmm,vmmctl,viona}) are added.
- The zone is then booted. Once the SMF network milestone has been reached…
  - … An IP address is allocated on the aforementioned VNIC.
  - … Propolis' HTTP server is initialized, using this IP address.
  - … And the original request to "create an instance" is passed to this Propolis server, where the actual VM initialization occurs.
- At this point, the instance is up and running, and the Sled Agent monitors it such that Nexus may be notified if the VM changes state.
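The "update if present, otherwise create" behavior of the ensure step can be sketched as below. This is a simplified model under stated assumptions: a `u128` stands in for the UUID, `Instance` carries a single made-up field, and `Outcome` is a hypothetical return type. The real InstanceManager also sets up the VNIC, zone, and Propolis server described above.

```rust
use std::collections::HashMap;

/// Stand-in for a UUID; real code would use a proper UUID type.
pub type InstanceId = u128;

/// Hypothetical, heavily reduced description of an instance.
#[derive(Debug, Clone, PartialEq)]
pub struct Instance {
    pub vcpus: u32,
}

/// Hypothetical result type reporting which branch `ensure` took.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Created,
    Updated,
}

#[derive(Default)]
pub struct InstanceManager {
    instances: HashMap<InstanceId, Instance>,
}

impl InstanceManager {
    /// Idempotent "ensure": update the instance if `id` is known, otherwise create it.
    pub fn ensure(&mut self, id: InstanceId, target: Instance) -> Outcome {
        match self.instances.insert(id, target) {
            Some(_previous) => Outcome::Updated,
            None => Outcome::Created,
        }
    }

    /// Looks up an instance by its identifier.
    pub fn get(&self, id: InstanceId) -> Option<&Instance> {
        self.instances.get(&id)
    }
}
```

Keying on the identifier makes repeated requests for the same instance safe to retry, since a second call updates rather than duplicates.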
We track issues on GitHub using the following labels:
However, the following are general areas of improvement for the Sled Agent:
- (Correctness) Plumbing of Nexus-customizable InstanceProperties, such as image, bootrom, memory, vcpus.
- (Correctness) Integration of OPTE to manage the network and apply policy changes as requested by Nexus.
- (Resilience) On initialization of the InstanceManager, the Sled Agent currently removes all Zones and VNICs on the system with known magic prefixes. Instead, these should be inspected and re-organized, especially to deal with a case where the Sled Agent reboots, but customer VMs continue to execute on a rack.
- (Performance) Minimize polling-based behavior, where possible. The Sled Agent is responsible for launching zones, SMF services, and HTTP servers. For all of these, the agent uses polling with timeouts to monitor progress between "starting" and "actually running and responding to requests". If possible, it would be preferable to replace these timeouts with event-triggered behavior, which would avoid unnecessary stalls.
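To make the polling versus event-triggered distinction concrete, here is a small sketch using only the Rust standard library. Both function names are hypothetical and neither reflects how the Sled Agent is actually implemented. A polling waiter can be late by up to one interval, while an event-based waiter wakes as soon as it is notified.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Polling: re-check `is_ready` every `interval` until it returns true
/// or `timeout` elapses. Returns whether readiness was observed.
pub fn poll_until(
    mut is_ready: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if is_ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // The waiter is idle here even if the service became ready meanwhile.
        thread::sleep(interval);
    }
}

/// Event-triggered: block on a channel and wake as soon as a notification
/// arrives, falling back to `false` on timeout or if the sender is gone.
pub fn wait_for_event(ready: &Receiver<()>, timeout: Duration) -> bool {
    match ready.recv_timeout(timeout) {
        Ok(()) => true,
        Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => false,
    }
}
```

The event-based variant still takes a timeout, so a service that never starts is detected either way; the difference is only in how quickly success is noticed.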