I was reminded by Stephen Kanyerezi (Uganda CPHL) that when a user uploads more sequences than Nextclade Web can handle, it simply crashes without any clear indication of what went wrong.
I know the rough sequence limits before things go out of memory for various viruses (e.g. ~200 for mpox, ~3000 for SARS-CoV-2) and how to interpret crashes when I exceed them, but this is not something we should expect of users.
There are a few things we could do here to make this less confusing for users:
Have a configurable limit for the maximum number of sequences to process, with a default set in each dataset. The default should be chosen so that crashes are usually avoided, and users who upload more than the limit should be notified and told to analyze the sequences above the limit in separate batches.
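A minimal sketch of what such a check could look like on the web side, assuming a hypothetical per-dataset field (maxSequencesWeb) and a pre-computed sequence count; neither the field name nor the function is part of the current dataset schema or codebase:

```ts
// Hypothetical per-dataset limit; not an actual field of Nextclade datasets.
interface DatasetLimits {
  maxSequencesWeb: number; // e.g. ~200 for mpox, ~3000 for SARS-CoV-2
}

// Warn before the analysis starts, instead of letting the app run out of
// memory mid-run, and tell the user to split the input into batches.
function checkSequenceCount(numSequences: number, limits: DatasetLimits): string | undefined {
  if (numSequences > limits.maxSequencesWeb) {
    return (
      `You uploaded ${numSequences} sequences, but this dataset is configured for at most ` +
      `${limits.maxSequencesWeb} in the browser. Please analyze the remaining sequences in separate batches.`
    );
  }
  return undefined;
}
```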
Somehow detect how much memory is in use and stop before it goes too high, e.g. above 3 GB of RAM. The threshold could be configurable so that people can adjust it to their local situation.
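For the memory side, a minimal sketch assuming the app can reach the WebAssembly.Memory instance of the analysis module; the 3 GB default and the function name are illustrative only:

```ts
// Hypothetical check: how far has the wasm linear memory grown?
// Assumes access to the WebAssembly.Memory instance of the analysis module.
const DEFAULT_MEMORY_LIMIT_BYTES = 3 * 1024 ** 3; // ~3 GB, mirroring the limit above

function isMemoryAboveLimit(wasmMemory: WebAssembly.Memory, limitBytes = DEFAULT_MEMORY_LIMIT_BYTES): boolean {
  // buffer.byteLength reflects the current size of the wasm linear memory,
  // which only grows during a run.
  return wasmMemory.buffer.byteLength > limitBytes;
}
```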
Of course we could also try to reduce the amount of RAM the web app takes, but that would only shift the limit; it wouldn't change the fact that we crash without a clear message for users.
I'm curious about your thoughts @ivan-aksamentov
It's unfortunate, but I don't have an immediate good solution. We mention this problem in the docs, but I guess not everyone reads the docs. I don't plan any particular action, unless you really want us to try something and can convince Richard. I'm not sure it's worth abandoning our other projects for this right now.
My current random thoughts:
> Have a configurable limit for the maximum number of sequences to process
If we start parsing FASTA and counting sequences in JS, that will increase memory usage even more; right now the parsing runs in wasm. I am not entirely sure how exactly you imagine the process here. Probably not impossible, but...
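For what it's worth, here is a rough sketch of counting records without holding the whole file in JS memory, by streaming the uploaded File chunk by chunk; the function name is illustrative, and this does nothing about cancellation:

```ts
// Count FASTA records by streaming the uploaded File in chunks, so the whole
// file never has to sit in JS memory at once. Illustrative only.
async function countFastaRecords(file: File): Promise<number> {
  const reader = file.stream().getReader();
  const decoder = new TextDecoder();
  let count = 0;
  let atLineStart = true;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const ch of decoder.decode(value, { stream: true })) {
      if (atLineStart && ch === '>') count += 1;
      atLineStart = ch === '\n';
    }
  }
  return count;
}
```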
> stop before it goes too high
...even if we count the number of sequences or track memory, there's currently no way to stop the process, due to #91 (I should really dig into web worker cancellation at some point though). So the best we can do right now is detect the condition, show a warning, and then it crashes anyway.
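The bluntest mechanism available today is Worker.terminate(), which kills the worker outright and frees its memory but loses all in-flight state. A rough sketch, with illustrative names and no relation to the actual worker wiring:

```ts
// Bluntest possible cancellation: keep a handle to the analysis worker and
// terminate it when a limit is hit. All in-flight state is lost; graceful
// cancellation inside the worker would be needed to keep partial results.
// The worker URL is illustrative only.
const analysisWorker = new Worker(new URL('./analysis.worker.ts', import.meta.url));

function abortAnalysis(reason: string): void {
  analysisWorker.terminate(); // kills the worker immediately and frees its memory
  console.warn(`Analysis stopped: ${reason}`); // stand-in for a proper UI warning
}
```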
> Of course we could also try to reduce the amount of RAM the web app takes
I think there might be some good wins here. The interaction between JS and wasm has some hacks and is not very efficient - everything goes through JSON serialization. There may be better approaches nowadays (if we could drop older browsers), but investigating and implementing them will take time. There are also difficulties in measuring exact memory consumption - browsers are weird animals. And, as you mentioned, it doesn't really solve the problem.
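One such approach, sketched here purely as an illustration, is to pass raw bytes to the worker as a Transferable instead of JSON-serializing them; the message shape is made up:

```ts
// Pass the raw input bytes to the worker as a Transferable: the buffer is
// moved, not copied or serialized. Message shape is invented for illustration.
async function sendFastaToWorker(worker: Worker, file: File): Promise<void> {
  const bytes = await file.arrayBuffer();
  // Listing `bytes` as a transferable moves ownership to the worker and
  // leaves a detached buffer behind on the main thread.
  worker.postMessage({ kind: 'fasta', bytes }, [bytes]);
}
```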
At some point I was hopeful that we'd get Memory64 in browsers, so that the ~3 GB limit is lifted. It seems to have reached the point where the latest Firefox and Chrome support it, but older versions and other browsers either require a flag or don't support it at all. The Rust toolchain support is also still immature. Once the toolchain improves, we could add a conditional switch to a 64-bit build. I think that would be a relatively easy win.
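A sketch of what that conditional switch might look like, assuming the memory64() detector from the wasm-feature-detect package and purely illustrative bundle paths:

```ts
// Hypothetical conditional switch between 32-bit and 64-bit wasm bundles.
// Bundle paths are illustrative; memory64() comes from wasm-feature-detect.
import { memory64 } from 'wasm-feature-detect';

async function loadNextcladeWasm(): Promise<unknown> {
  const supports64 = await memory64();
  return supports64
    ? import('./nextclade-wasm64') // 64-bit build, not capped at ~4 GB of address space
    : import('./nextclade-wasm32'); // current 32-bit build
}
```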
P.S. If someone reads this and has the time and energy to dig into the problem, contributions and ideas are very welcome.