-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry decryption from a long-lived async task, instead of lots of detached ones #4715
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4715 +/- ##
=======================================
Coverage 86.14% 86.14%
=======================================
Files 291 292 +1
Lines 34308 34327 +19
=======================================
+ Hits 29553 29570 +17
- Misses 4755 4757 +2 ☔ View full report in Codecov by Sentry. |
This lines that codecov is complaining about missing coverage are pretty much all lifted-and-shifted from the their old location into |
@poljar is this something like what you intended in https://github.com/matrix-org/matrix-rust-sdk/pull/4644/files#r1963599148 ? |
|
||
// Spawn the long-running task, providing the receiver so we can listen for | ||
// decryption requests | ||
matrix_sdk::executor::spawn(decryption_task(state, room_data_provider, receiver)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we're still dropping the JoinHandle
though the task will get killed because the sender
gets dropped, which in turn means the receiver will get a None
.
That's a bit implicit compared to holding on to the JoinHandle
and cancelling in the drop()
call of `DecryptionRetryTask, but I guess there's nothing wrong with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I tried to join in drop()
but it's async, so I can't do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you wouldn't await the join in the drop()
call, you would just abort()
the task. But perhaps it's cleaner to let the task naturally die due to the Sender
being gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping the join handle and manually aborting shows a clear ownership of the task, and would be a tiny bit clearer, even in terms of discoverability: we'd see the JoinHandle
in the struct, thus know that this is a task's owner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I was thinking this as well. But then the task could stop in two ways:
- It ends because the receiver receives a
None
. - It gets cancelled.
And then I would need to think about cancel safety of all the things the task contains. Which made me conclude that it's fine to let the task die of natural causes the way it's done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair; my argument in terms of discoverability still holds IMO, can we store the task in the struct at least?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's probably a wise idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in f39a969
Yeah, I think that's fine, I explained things a bit more here: #4715 (review) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks for splitting this into many small commits, made the review easier. I left one nit, but I'll approve this in advance.
There's also a typo in one commit:
Adjust tsts that retry decryption to wait with
In any case, this should make it much easier to move the redecryption logic into the correct place, wherever we decide this to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to change my mind, could you please put the JoinHandle
into the DecryptionRetryTask
.
I shouldn't be used for anything, just to let people know that the task is owned by this struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the last minute change, we can merge this after you make the compiler happy with the right types.
ec3bc04
to
ddfe69e
Compare
…el, instead of spawning a task directly
…nto the async task
ddfe69e
to
9cb0630
Compare
Part of element-hq/element-meta#2697
In response to a comment on #4644 - avoid spawning a detached async task every time we have some things to decrypt because some Megolm sessions updated.
Instead, we create one async task and hold it inside a new struct called
RetryDecryptionTask
. This struct holds one end of channel, and sends retry requests to the async task, which stops when the channel is closed, which will happen when theRetryDecryptionTask
struct is dropped.So we keep the feature that retrying decryption does not block the main processing, but we lose the spawning of lots of async tasks that are not kept track of.
As written, the task is not cancellable, but this could be added if we think we need it.
I strongly suggest reviewing commit by commit, since this is structured as a set of refactorings that hopefully make sense individually.