Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deactivate the read only parent after a scrub #1180

Merged
merged 1 commit into from
Feb 26, 2024
Merged

Conversation

leftwo
Copy link
Contributor

@leftwo leftwo commented Feb 23, 2024

If the scrub on a read only parent finishes without error, then send a
deactivate to force a disconnect from all the downstairs.

This should fix #1090

If the scrub on a read only parent finishes without error, then
send a deactivate to force a disconnect from all the downstairs.
@leftwo leftwo requested review from mkeeter and jmpesp February 23, 2024 22:52
@leftwo
Copy link
Contributor Author

leftwo commented Feb 23, 2024

I tested this on a bench gimlet which had a ROP parent for the boot disk.

Here are some logs from propolis:

22:30:15.588Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 4781568/6291456 sp:4781568                                           
22:30:43.704Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 5033216/6291456 sp:5033216                                           
22:31:11.829Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 5284864/6291456 sp:5284864                                           
22:31:39.956Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 5536512/6291456 sp:5536512                                           
22:32:08.017Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 5788160/6291456 sp:5788160                                           
22:32:36.159Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 6039808/6291456 sp:6039808                                           
22:33:04.284Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub at offset 6291456/6291456 sp:6291456                                           
22:33:04.310Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub 794d08f2-972c-4254-b6ba-12d2bba2c4dc done in 711 seconds. Retries:0 scrub_size:
131072 size:6291456 pause_milli:25                                                                                                                                       
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): deactivate read only parent 794d08f2-972c-4254-b6ba-12d2bba2c4dc                     
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Request to deactivate this guest                                                     
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation                                                            
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] deactivate job 33783 not InProgress flush, NO                                    
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 0                                                     
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [1] deactivate job 33783 not InProgress flush, NO                                    
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 1                                                     
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [2] deactivate job 33783 not InProgress flush, NO                                    
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 2                                                     
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Active client = 0                               
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation                                                            
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] deactivate job 33783 not InProgress flush, NO                                    
22:33:04.311Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 0                                                     
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [1] deactivate job 33783 not InProgress flush, NO                                    
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 1                                                     
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [2] check deactivate YES
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ds_transition from Active to Deactivated client = 2
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [2] Transition from Active to Deactivated client = 2
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): deactivated client 2
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Active client = 0
22:33:04.312Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is sending Done(RequestedStop(Deactivated)) client = 2
22:33:04.312Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): downstairs task for 2 stopped due to RequestedStop(Deactivated)
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Gone missing, transition from Deactivated to Disconnected client = 2
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Disconnected -> New client = 2
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] deactivate job 33783 not InProgress flush, NO
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 0
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [1] deactivate job 33783 not InProgress flush, NO
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 1
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Active client = 0
22:33:04.312Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is exiting client = 2
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] deactivate job 33783 not InProgress flush, NO
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate client 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [1] check deactivate YES
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ds_transition from Active to Deactivated client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [1] Transition from Active to Deactivated client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): deactivated client 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Active client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] deactivate, no work so YES
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ds_transition from Active to Deactivated client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): [0] Transition from Active to Deactivated client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): deactivated client 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Deactivated client = 0
22:33:04.313Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is sending Done(RequestedStop(Deactivated)) client = 0
22:33:04.313Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): downstairs task for 0 stopped due to RequestedStop(Deactivated)
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Gone missing, transition from Deactivated to Disconnected client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Disconnected -> New client = 0
22:33:04.313Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is sending Done(RequestedStop(Deactivated)) client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ready to deactivate from state New client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): not ready to deactivate due to state Deactivated client = 1
22:33:04.313Z WARN propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): downstairs task for 1 stopped due to RequestedStop(Deactivated)
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Gone missing, transition from Deactivated to Disconnected client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Disconnected -> New client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): checking for deactivation
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ready to deactivate from state New client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ready to deactivate from state New client = 1
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): ready to deactivate from state New client = 2
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): All DS in the proper state! -> INIT
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Scrub of volume 794d08f2-972c-4254-b6ba-12d2bba2c4dc completed, remove parent
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is exiting client = 0
22:33:04.313Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): client task is exiting client = 1
22:33:04.472Z INFO propolis-server (crucible-794d08f2-972c-4254-b6ba-12d2bba2c4dc): Submitted removal for read only parent on 794d08f2-972c-4254-b6ba-12d2bba2c4dc

@mkeeter
Copy link
Contributor

mkeeter commented Feb 26, 2024

I did some digging through the deactivation state machine, and this seems reasonable.

Once all three Downstairs are deactivated, the upstairs transitions back to UpstairsState::Initializing and the downstairs tasks are reinitialized. This sounds bad, but the auto_promote flag is set to false, so the client task waits for activation (and does not try to connect immediately).

It would probably be cleaner to not restart the client task immediately, but that's a larger refactoring, and this PR certainly doesn't make the situation worse.

One question inspired by this PR: should Volume::scrub consume its argument (i.e. take self)? If no one should use the volume afterwards, this would let us enforce that constraint with the compiler.

@leftwo
Copy link
Contributor Author

leftwo commented Feb 26, 2024

One question inspired by this PR: should Volume::scrub consume its argument (i.e. take self)? If no one should use the volume afterwards, this would let us enforce that constraint with the compiler.

The volume is still used by the rest of Propolis to send IOs, so I don't think we can take self here.
The ROP is no longer used, but the rest of the Volume (the SubVolumes) I believe are still needed.

@leftwo leftwo merged commit 662b84e into main Feb 26, 2024
18 checks passed
@leftwo leftwo deleted the alan/give-up-rop branch February 26, 2024 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read only parents need to give up after scrub is done
3 participants