[AN-356] When a cluster fails to start up, don't detach persistent disk #4821
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #4821 +/- ##
===========================================
- Coverage 74.62% 74.62% -0.01%
===========================================
Files 166 166
Lines 14692 14690 -2
Branches 1135 1158 +23
===========================================
- Hits 10964 10962 -2
Misses 3728 3728
@@ -101,6 +101,7 @@ class BaseCloudServiceRuntimeMonitorSpec extends AnyFlatSpec with Matchers with
disk <- makePersistentDisk().save()
start <- IO.realTimeInstant
tid <- traceId.ask[TraceId]
implicit0(ec: ExecutionContext) = scala.concurrent.ExecutionContext.Implicits.global |
What does this do? Still wrapping my head around how to use implicits properly 😭
...honestly I'm not entirely sure myself, just that I needed the EC implicit to do the disk query
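A note on the question above, offered as a hedged sketch: `implicit0(...)` appears to be the better-monadic-for compiler plugin's syntax for binding an implicit value inside a for-comprehension (an assumption, since the build configuration is not shown here). The plain-Scala example below illustrates the underlying mechanism only: a method that declares an implicit `ExecutionContext` parameter compiles once such a value is in implicit scope. `countDisks` and `ImplicitEcSketch` are hypothetical names, not Leonardo code.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ImplicitEcSketch {
  // Hypothetical stand-in for a DB query that needs an ExecutionContext
  // to schedule its asynchronous work.
  def countDisks()(implicit ec: ExecutionContext): Future[Int] =
    Future(1)

  def run(): Int = {
    // With an implicit ExecutionContext in scope, the compiler supplies it
    // to countDisks() automatically; without it, the call does not compile.
    implicit val ec: ExecutionContext = ExecutionContext.Implicits.global
    Await.result(countDisks(), 5.seconds)
  }
}
```

Inside a for-comprehension there is no place to write `implicit val`, which is presumably why the test uses the `implicit0` binding instead.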
@@ -303,6 +311,47 @@ class BaseCloudServiceRuntimeMonitorSpec extends AnyFlatSpec with Matchers with
res.unsafeRunSync()(cats.effect.unsafe.IORuntime.global)
}

it should "detach Ready disk on failed runtime create" in isolatedDbTest { |
I'm wondering: should we also detach the PD when the runtime is in Deleting status, not just Deleted?
That's done as part of the deletedRuntime func in the GceRuntimeMonitor (complete deletion detaches the disk).
Jira ticket: https://broadworkbench.atlassian.net/browse/AN-356
Summary of changes
What
This PR moves the detach logic so that the disk is only detached when a runtime fails to create and the disk isn't in Creating or Failed status.
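The rule described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the code in this PR: `DiskStatus`, `RuntimeOutcome`, and `DetachPolicy.shouldDetachDisk` are hypothetical names, and the real monitor works with Leonardo's own types and DB queries.

```scala
// Hypothetical, simplified model of the statuses involved (not Leonardo's types).
sealed trait DiskStatus
object DiskStatus {
  case object Creating extends DiskStatus
  case object Ready extends DiskStatus
  case object Failed extends DiskStatus
}

sealed trait RuntimeOutcome
object RuntimeOutcome {
  case object FailedToCreate extends RuntimeOutcome
  case object FailedToStart extends RuntimeOutcome
}

object DetachPolicy {
  // Detach only when runtime creation failed and the disk is neither
  // Creating nor Failed. A failed startup leaves the disk attached.
  def shouldDetachDisk(outcome: RuntimeOutcome, diskStatus: DiskStatus): Boolean =
    outcome match {
      case RuntimeOutcome.FailedToCreate =>
        diskStatus match {
          case DiskStatus.Creating | DiskStatus.Failed => false
          case _                                       => true
        }
      case RuntimeOutcome.FailedToStart => false
    }
}
```

Keeping the disk attached after a failed startup is what preserves the disk id in RUNTIME_CONFIG, so the runtime can still find its disk afterwards.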
Why
When the startup script fails (due to a full disk, etc.), the persistent disk becomes ‘detached’ in the DB. (The disk id is removed from the RUNTIME_CONFIG.)
This means that a user with a full disk cannot even try to increase their disk size in the UI, because they will get a "persistent disk not found for runtime" error.
Testing these changes
What to test
df -Th to see how much space is available on sdb
fallocate a file to fill up the space
Who tested and where
jenkins retest or jenkins multi-test.