status |
---|
accepted |
Each Flow Segment returned by the `/flows/<flow id>/segment` endpoint includes a `get_urls` property that may include (labelled) URLs for fetching media objects.
Calculating pre-signed URLs, for example, can add a not insignificant amount to the request time.
Clients do not need the `get_urls` if they don't intend to fetch the media objects.
This ADR proposes adding a query option to filter the URLs that need to be included by a TAMS.
It was found that calculating a pre-signed URL for fetching a media object takes around 0.35 milliseconds when using the Python boto3 library. That means a request to get 5000 Flow Segments will include around 1.75 seconds overhead to calculate the pre-signed URLs. This was observed in a request to a TAMS which took around 4 seconds to return Flow Segments with pre-signed URLs and only around 2 seconds to return the same Flow Segments without pre-signed URLs.
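The overhead estimate above is simple arithmetic. A minimal sketch using only the figures quoted in this section, and assuming one pre-signed URL per Flow Segment (the function name is illustrative, not part of TAMS):

```python
def presign_overhead_seconds(segment_count: int, per_url_ms: float) -> float:
    """Estimated time spent calculating one pre-signed URL per segment."""
    return segment_count * per_url_ms / 1000.0


# Figures quoted above: 5000 Flow Segments at roughly 0.35 ms per URL.
overhead = presign_overhead_seconds(5000, 0.35)
print(f"{overhead:.2f} s")  # prints "1.75 s"
```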
The boto3 method to generate pre-signed URLs could also be optimised to reduce the time taken. It was found that the older, deprecated boto library took around 0.14 milliseconds to calculate a pre-signed URL, which is 60% less time than boto3. There are likely even more optimisations that could be made.
A TAMS may use a highly optimised pre-signed URL implementation that avoids the large time overhead. This would still result in larger response sizes, because pre-signed URLs tend to be quite long as they include signatures, keys and tokens.
It is difficult to predict what the expected number of flow segments will be in practice. If it is low in all cases then including or not including pre-signed URLs doesn't make much difference. If it is unknown then having an option to avoid the pre-signed calculation overhead may be useful in some cases.
- Option 1: Always return all `get_urls`
- Option 2: Add a query option to return none or all `get_urls`
- Option 3: Add a simple query option to filter `get_urls` based on the `label`
- Option 4: Add more complex query option(s) to filter `get_urls`
Chosen option: "Option 3: Add a simple query option to filter `get_urls` based on the `label`", because this covers the requirement to disable calculation of pre-signed URLs and at the same time allows other URLs to be retained.
Implemented in PR #88.
This is the current state.
- Bad, because TAMS is doing work to calculate pre-signed URLs even though clients may not need them
This adds a boolean option named `include_get_urls` that, if set to `false`, results in no `get_urls` in the response, and therefore no pre-signed URLs are calculated by TAMS.
- Good, because it allows clients to indicate that `get_urls` are not required
- Bad, because some TAMS implementations may provide different types of URLs and a client does not have the option to filter out the ones that cause increased request time whilst retaining ones that don't
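As a rough illustration of the boolean option described above, the sketch below only calls the URL signer when the client has not opted out. The `build_segment` helper, the `presign` callable and the example values are hypothetical stand-ins, not the TAMS implementation or the boto3 API.

```python
from typing import Callable


def build_segment(
    segment: dict,
    presign: Callable[[str], str],
    include_get_urls: bool = True,
) -> dict:
    """Build the response form of a stored segment.

    `presign` stands in for whatever signer a TAMS uses; it is only
    called when get_urls are wanted.
    """
    response = {
        "object_id": segment["object_id"],
        "timerange": segment["timerange"],
    }
    if include_get_urls:
        response["get_urls"] = [{"url": presign(segment["object_id"])}]
    return response


calls = []


def fake_presign(object_id: str) -> str:
    # Hypothetical signer that records each call instead of signing.
    calls.append(object_id)
    return f"https://example.com/{object_id}?signature=placeholder"


stored = {"object_id": "obj-1", "timerange": "[0:0_10:0)"}
with_urls = build_segment(stored, fake_presign)
without_urls = build_segment(stored, fake_presign, include_get_urls=False)
```

After both calls, `fake_presign` has run once: only the request that kept `get_urls` paid the signing cost.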
This extends Option 2 to add a (comma-separated) list option named `accept_get_urls` that specifies the `label`s associated with URLs to include in the response.
Omitting `accept_get_urls` will result in all URLs in the response.
Setting `accept_get_urls` to an empty string will result in no URLs in the response.
Flow segment `get_urls` with no `label` cannot be filtered.
- Good, because it allows clients to indicate that `get_urls` are not required
- Good, because clients can retain URLs that don't cause increased request times
- Neutral, because clients may want more complex filters based on what is available
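The `accept_get_urls` semantics for this option can be sketched as below. This is a minimal illustration, not the TAMS implementation: the helper names, the `direct` and `presigned` labels and the example URLs are hypothetical. It also assumes that an unlabelled URL can never match a label and is therefore dropped whenever the option is present, which is one reading of "cannot be filtered".

```python
from typing import Optional
from urllib.parse import parse_qs


def parse_accept_get_urls(query_string: str) -> Optional[list]:
    """Return None if the option is omitted, else the accepted labels."""
    params = parse_qs(query_string, keep_blank_values=True)
    if "accept_get_urls" not in params:
        return None
    raw = params["accept_get_urls"][-1]
    # An empty string yields an empty list, i.e. accept nothing.
    return [label for label in raw.split(",") if label]


def filter_get_urls(get_urls: list, accepted: Optional[list]) -> list:
    """Keep only the URLs whose label is in the accepted list."""
    if accepted is None:
        return list(get_urls)  # option omitted: return everything
    # Assumption: URLs without a label never match, so they are dropped.
    return [u for u in get_urls if u.get("label") in accepted]


urls = [
    {"label": "direct", "url": "https://example.com/direct/obj-1"},
    {"label": "presigned", "url": "https://example.com/obj-1?signature=placeholder"},
    {"url": "https://example.com/unlabelled/obj-1"},
]

everything = filter_get_urls(urls, parse_accept_get_urls(""))
nothing = filter_get_urls(urls, parse_accept_get_urls("accept_get_urls="))
direct_only = filter_get_urls(urls, parse_accept_get_urls("accept_get_urls=direct"))
```

A server following this shape could apply the filter before signing, so that URLs the client did not ask for are never calculated.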
This changes Option 3 to allow more complex filtering based on what URLs are available. E.g. include a URL for direct access to the object if available and otherwise a pre-signed URL.
- Good, because it allows clients to indicate that `get_urls` are not required
- Good, because clients can retain URLs that don't cause increased request times
- Neutral, because clients can use more complex filters based on what is available
- Neutral, because it isn't clear whether there is a requirement for more complex filters
- Neutral, because 0021 thus far only proposes an App note for guidance on the use of `get_urls` labels on flow segments
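This ADR does not define a syntax for Option 4. Purely to illustrate the kind of fallback it mentions (a direct-access URL if available, otherwise a pre-signed URL), the sketch below picks the first URL matching an ordered list of preferred labels. The function name and the `direct` and `presigned` labels are hypothetical.

```python
from typing import Optional


def first_available(get_urls: list, preferred_labels: list) -> Optional[dict]:
    """Return the URL whose label comes earliest in preferred_labels."""
    by_label = {u["label"]: u for u in get_urls if "label" in u}
    for label in preferred_labels:
        if label in by_label:
            return by_label[label]
    return None  # none of the preferred labels is available


both = [
    {"label": "presigned", "url": "https://example.com/obj-1?signature=placeholder"},
    {"label": "direct", "url": "https://example.com/direct/obj-1"},
]
presigned_only = [both[0]]

preference = ["direct", "presigned"]
chosen_when_both = first_available(both, preference)
chosen_when_presigned_only = first_available(presigned_only, preference)
```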