Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry Canvas submission on timeout #7018

Merged
merged 2 commits into from
Feb 20, 2025
Merged

Retry Canvas submission on timeout #7018

merged 2 commits into from
Feb 20, 2025

Conversation

marcospri
Copy link
Member

@marcospri marcospri commented Feb 18, 2025

Canvas submissions are sent when the first annotations is made during a "session" on a gradable assignment.

We have some error handling and display error dialogs accordingly but due to the indirect nature of the action is tricky to handle transactional errors like timeouts.

This commits adds a "flag" on the response of API calls to our own frontend marking the request as "retryable" allowing the FE to retry them.

Testing

  • Apply a diff like
diff --git a/lms/views/api/grading.py b/lms/views/api/grading.py
index 7b072a36d..430d0f0cc 100644
--- a/lms/views/api/grading.py
+++ b/lms/views/api/grading.py
@@ -103,6 +103,9 @@ class GradingViews:
         # absence of a submission from an ungraded submission. Non-Canvas LMSes in
         # theory require a grade).
 
+        if not self.request.headers.get("Retry-Count"):
+            raise ExternalRequestError(retryable=True)
+
         lis_result_sourcedid = self.parsed_params["lis_result_sourcedid"]
         try:
             # If we already have a score, then we've already recorded this info

to simulate a retry-success scenario.

2025-02-19 11:14:15,976 DEBUG [lms.views.helpers._logging:13][MainThread] Request to /api/lti/submissions succeeded after 1 retries

on the server logs.

Canvas submissions are sent when the first annotations is made during a
"session" on a gradable assignment.

We have some error handling and display error dialogs accordingly but due to the indirect
nature of the action is tricky to handle transactional errors like timeouts.

This commits adds a "flag" on the response of API calls to our own frontend marking
the request as "retryable" and the frontend.
@marcospri marcospri changed the title Retry Canvas submission on retry Retry Canvas submission on timeout Feb 18, 2025
@@ -254,6 +254,7 @@ export default function BasicLTILaunchApp() {
authToken,
path: '/api/lti/submissions',
data: submissionParams,
maxRetries: 2,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default, 10, seems excessive

...extraHeaders,
['Retry-Count']: (retryCount + 1).toString(),
},
retryCount: retryCount + 1,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like duplicating the same value but I reckon the header and the retryCount are relatively independent, one is part of the request and the other is part of our own code logic.

But I might be missing a nice trick.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few lines above there's this block:

const headers: Record<string, string> = {
  Authorization: authToken,
};

You could update it and set the Retry-Count header there, if retryCount > 0.

const headers: Record<string, string> = {
  Authorization: authToken,
};
+if (retryCount > 0) {
+  headers['Retry-Count'] = `${retryCount}`;
+}

That would reduce the amount of nested object spreading, and the duplication you mentioned above.

@@ -145,6 +151,7 @@ def record_canvas_speedgrader_submission(self):
self.request.registry.notify(
LTIEvent.from_request(request=self.request, type_=LTIEvent.Type.SUBMISSION)
)
self.request.add_finished_callback(log_retries_callback)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only applying this and (err.retryable = True) on this view.

If we merge this approach I'll create an issue to track generalizing this approach for other views.

await delay(retryDelay);
return apiCall({ ...options, retryCount: retryCount + 1 });
return apiCall({
...options,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a lot of ...s

@marcospri marcospri requested a review from acelaya February 19, 2025 10:16
@acelaya
Copy link
Contributor

acelaya commented Feb 19, 2025

This commits adds a "flag" on the response of API calls to our own frontend marking the request as "retryable" and the frontend.

And the frontend... What? Is this a cliffhanger for season 2? 😂

...extraHeaders,
['Retry-Count']: (retryCount + 1).toString(),
},
retryCount: retryCount + 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few lines above there's this block:

const headers: Record<string, string> = {
  Authorization: authToken,
};

You could update it and set the Retry-Count header there, if retryCount > 0.

const headers: Record<string, string> = {
  Authorization: authToken,
};
+if (retryCount > 0) {
+  headers['Retry-Count'] = `${retryCount}`;
+}

That would reduce the amount of nested object spreading, and the duplication you mentioned above.

Comment on lines +121 to +124
if err.is_timeout:
# We'll inform the frontend that this is a retryable error for timeouts.
err.retryable = True
raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon none of the following conditions would match for a timeout, so you could just set err.retryable = err.is_timeout, but this is definitely more predictable.

...options,
extraHeaders: {
...extraHeaders,
['Retry-Count']: (retryCount + 1).toString(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being pedantic here, but I think the common practice is to prefix custom headers with X-, as in X-Retry-Count.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did out-pedantic myself before reading this:

https://www.rfc-editor.org/rfc/rfc6648.html#section-3

 3.  SHOULD NOT prefix their parameter names with "X-" or similar
       constructs.

I reckon the rationale there is that you start naming things X-, then become common usage but they endup stuck with the X-. I don't think there's are risk of that happening here so I don't have a strong opinion.

Copy link
Contributor

@acelaya acelaya Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wow!

I think the rational around adding the X- prefix is making sure you don't end up using a header that eventually becomes part of the standard, but I don't think there's a high risk that Retry-Count is added to the spec any time soon, without us noticing.

Let's go with Retry-Count then.

@marcospri marcospri merged commit 4c08f68 into main Feb 20, 2025
8 checks passed
@marcospri marcospri deleted the timeoutretry branch February 20, 2025 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants