Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mock: attempt realistic state transitions #860

Merged
merged 4 commits into from
Feb 20, 2025
Merged

Conversation

hawkw
Copy link
Member

@hawkw hawkw commented Feb 18, 2025

Presently, when the mock propolis-server receives a put-state request
for Stopped, it sets self.state to Stopped and shuts down the mock
serial console, here:

api::InstanceStateRequested::Stop => {
self.state = api::InstanceState::Stopped;
self.serial_task.shutdown().await;
Ok(())
}

However, it never actually sets the state_watcher state, which is
what's used by the instance_state_monitor endpoint, to Stopping,
the way we do for other state transitions:

api::InstanceStateRequested::Run
| api::InstanceStateRequested::Reboot => {
self.generation += 1;
self.state = api::InstanceState::Running;
self.state_watcher_tx
.send(api::InstanceStateMonitorResponse {
gen: self.generation,
state: self.state,
migration: api::InstanceMigrateStatusResponse {
migration_in: None,
migration_out: None,
},
})
.map_err(|_| Error::TransitionSendFail)
}

If we want to actually use the mock server to test sled-agent behavior
around instance stop, we need to make it behave realistically here.

In real life, stopping an instance will cause it to go through multiple
state transitions: first to Stopping and then to Stopped. This also
Currently, the mock doesn't have a way to cause multiple state transitions
to be observed by the instance_state_monitor client. Therefore, I've
changed the implementation to support this, using a map of states by
generation number. Now, when the state monitor requests the next state
transition from a given generation, we will return the state at
gen + 1 in that map if one exists, or wait until more states are
added to the map. Transitions that cause the instance to go through
multiple states will now add all of those states to the queue of states
to simulate.

The state used by instance_get and for determining what state
transitions are updated is now represented by a variable tracking the
current state generation. This is updated only once we expose a new
state to the instance_state_monitor client, so the understanding of
the instance's state used to determine what requested transitions
are valid is kept in sync with what we've claimed to be from the state
monitor's perspective.

Testing: I've pointed the omicron repo's propolis-mock-server dep
at commit 28d81cb and run
cargo nextest run -p omicron-sled-agent.1 All the tests still pass.

Fixes #857

Footnotes

  1. I believe only the sled-agent test suite uses
    propolis-mock-server?

@hawkw hawkw requested a review from gjcolombo February 19, 2025 21:44
@hawkw hawkw marked this pull request as ready for review February 19, 2025 21:44
@hawkw hawkw merged commit 98d0823 into master Feb 20, 2025
11 checks passed
@hawkw hawkw deleted the eliza/more-realistic-mock branch February 20, 2025 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mock-server doesn't simulate instance stop
2 participants