Do something other than just crashing when people upload too many sequences for Nextclade #1570

Open
corneliusroemer opened this issue Feb 21, 2025 · 1 comment
Labels
package: nextclade_web t:bug Type: bug, error, something isn't working

corneliusroemer commented Feb 21, 2025

I was reminded by Stephen Kanyerezi (Uganda CPHL) that when a user uploads more sequences than Nextclade Web can handle, it just crashes without any clear indication of what went wrong.

I know the rough sequence limits before things go OOM for various viruses, e.g. ~200 for mpox, ~3000 for SC2, etc., and I know how to interpret crashes when I go over that limit, but this is not something we should expect of users.

There are a few things we could do here to be less confusing for users:

  • Have a configurable limit on how many sequences to process at most, with a default set in each dataset. The default should be chosen so that crashes are usually avoided. If users upload too many sequences, we can notify them and tell them to split the input into batches to see the sequences above the limit.
  • Somehow detect how much memory is used and stop before it gets too high, e.g. above 3 GB of RAM. This could be configurable so that people can adjust it for their local situation.
  • Of course we could also try to reduce the amount of RAM Nextclade Web takes, but this would just shift the limit. It would not change the fact that we crash without a clear message for users.
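
To make the first option a bit more concrete, here is a minimal sketch of a limit check that produces a user-facing message. The `SequenceLimitConfig` shape, the `maxSequences` field and the `checkSequenceLimit` function are hypothetical names for illustration, not existing Nextclade settings or APIs, and the message wording is only a suggestion.

```typescript
// Hypothetical per-dataset setting; not an existing Nextclade dataset field.
interface SequenceLimitConfig {
  maxSequences: number
}

interface LimitCheck {
  withinLimit: boolean
  batchesNeeded: number
  message?: string
}

// Compares the number of uploaded sequences against the configured limit and,
// if the limit is exceeded, builds a message telling the user how to batch.
function checkSequenceLimit(numSequences: number, config: SequenceLimitConfig): LimitCheck {
  if (!Number.isInteger(config.maxSequences) || config.maxSequences <= 0) {
    throw new RangeError('maxSequences must be a positive integer')
  }
  // At least one batch, even for an empty upload.
  const batchesNeeded = Math.max(1, Math.ceil(numSequences / config.maxSequences))
  if (numSequences <= config.maxSequences) {
    return { withinLimit: true, batchesNeeded }
  }
  return {
    withinLimit: false,
    batchesNeeded,
    message:
      `You uploaded ${numSequences} sequences, but this dataset is limited to ` +
      `${config.maxSequences} per run. Split the input into ${batchesNeeded} batches ` +
      `to see the sequences above the limit.`,
  }
}
```

With the rough mpox figure from above, `checkSequenceLimit(450, { maxSequences: 200 })` would report that 3 batches are needed.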

I'm curious about your thoughts @ivan-aksamentov

@corneliusroemer corneliusroemer added needs triage Mark for review and label assignment package: nextclade_web t:bug Type: bug, error, something isn't working labels Feb 21, 2025
@ivan-aksamentov

Discussed in #447 and it's still tricky

It's a sad thing, but I don't have any immediate good solution. We mention this problem in the docs, but I guess not everyone reads the docs. I don't plan any particular action, unless you really, really want us to try something and can convince Richard. Not sure if it's worth abandoning our other projects for this thing right now.

My current random thoughts:

> Have a configurable limit for how many sequences at most to process

If we start parsing FASTA and counting sequences in JS, that will increase memory use even more. Right now the parsing runs in wasm. I am not entirely sure how exactly you imagine the process here. Probably not impossible, but...
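
One way the counting could be kept cheap, as a rough sketch only: scan the raw bytes chunk by chunk and count header lines, without building strings or retaining any sequence data. The `FastaRecordCounter` class is hypothetical, and how the chunks are obtained is left out. It assumes `\n` or `\r\n` line endings and still costs one extra pass over the input.

```typescript
const GT = 0x3e // '>'
const LF = 0x0a // '\n'

// Counts FASTA records by counting '>' bytes that appear at the start of a line.
// Only two small fields are kept between chunks, so memory use does not grow
// with the size of the input.
class FastaRecordCounter {
  private count = 0
  private atLineStart = true // the very first byte of the input starts a line

  push(chunk: Uint8Array): void {
    for (let i = 0; i < chunk.length; i++) {
      const byte = chunk[i]
      if (this.atLineStart && byte === GT) {
        this.count += 1
      }
      // Carried over to the next chunk, so headers split across chunks still count.
      this.atLineStart = byte === LF
    }
  }

  get total(): number {
    return this.count
  }
}
```

In a browser, the chunks could for example come from reading the uploaded file as a stream, and the total could be compared against a limit before any work is handed to wasm.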

> stop before we go too high

...even if we count a certain number of sequences or measure memory, there's currently no way to stop the process due to #91 (I should really dig into webworker cancellation at some point though). So the best we can do at this point is detect the condition and show the warning, and then it crashes anyway.
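
For illustration, here is a sketch of what "detect the condition and stop" could look like if the work were submitted in batches, with a check between batches. This does not solve #91: it does not cancel anything that is already running, it only declines to start the next batch. Everything here is hypothetical, including `runWithMemoryGuard`, the injected `processBatch` and `probeUsedBytes` callbacks, and the assumption that a usable memory reading is available at all.

```typescript
interface GuardedRunResult<R> {
  results: R[]
  processed: number
  stoppedEarly: boolean
  warning?: string
}

// Processes items batch by batch. Before each batch, asks the (hypothetical)
// probe how much memory is in use and stops early if it exceeds the limit.
function runWithMemoryGuard<T, R>(
  items: readonly T[],
  batchSize: number,
  processBatch: (batch: readonly T[]) => R[],
  probeUsedBytes: () => number,
  limitBytes: number,
): GuardedRunResult<R> {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer')
  }
  const results: R[] = []
  let processed = 0
  while (processed < items.length) {
    if (probeUsedBytes() > limitBytes) {
      return {
        results,
        processed,
        stoppedEarly: true,
        warning:
          `Stopped after ${processed} of ${items.length} sequences because memory use ` +
          `exceeded the configured limit. Try uploading the remaining sequences separately.`,
      }
    }
    const batch = items.slice(processed, processed + batchSize)
    for (const result of processBatch(batch)) {
      results.push(result)
    }
    processed += batch.length
  }
  return { results, processed, stoppedEarly: false }
}
```

The results collected before the stop are returned, so the user would at least see partial output together with the warning instead of a crash.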

> Of course we could also try to reduce the amount of RAM web takes

I think there might be some good wins here. The interaction between JS and wasm has some hacks and is not super efficient: everything goes through JSON serialization. There might be better ways nowadays (if we could drop older browsers). But that will take some time to investigate and implement. On top of that, it is difficult to measure the exact memory consumption: browsers are weird animals. Also, this does not really solve the problem, as you mentioned.


At one point I was hopeful that we would get Memory64 in browsers, so that the ~3GB limit is lifted. It seems to have reached the point where the latest Firefox and Chrome support it, but older versions and other browsers either require a flag or don't support it. The Rust toolchain is still immature in this area, though. Once things improve in the toolchain, we could add a conditional switch to a 64-bit version. I think that would be a relatively easy win.
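
A rough sketch of what such a conditional switch might look like: ask the engine to validate a tiny module that declares a 64-bit memory, and pick the build accordingly. The `supportsMemory64` and `pickWasmBuild` functions and both file names are hypothetical, and this assumes a 64-bit build exists, which depends on the toolchain situation described above.

```typescript
// Minimal wasm module equivalent to (module (memory i64 1)).
const MEMORY64_PROBE = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x05, 0x03, 0x01, 0x04, 0x01, // memory section: one memory, i64 index type, min 1 page
])

type WasmValidator = { validate(bytes: Uint8Array): boolean }

// Returns true if the engine accepts a module with a 64-bit memory.
function supportsMemory64(): boolean {
  // Looked up via globalThis so the sketch also works where WebAssembly is absent.
  const wasm = (globalThis as unknown as { WebAssembly?: WasmValidator }).WebAssembly
  if (!wasm) {
    return false
  }
  try {
    return wasm.validate(MEMORY64_PROBE)
  } catch {
    return false
  }
}

// Hypothetical file names, for illustration only.
function pickWasmBuild(hasMemory64: boolean): string {
  return hasMemory64 ? 'nextclade_wasm64.wasm' : 'nextclade_wasm32.wasm'
}
```

Browsers that need a flag or lack support would report `false` here and keep using the 32-bit build, so nothing would change for them.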


P.S. If someone reads this and has the time and energy to dig into the problem, then contributions and ideas are very welcome.

@ivan-aksamentov ivan-aksamentov removed the needs triage Mark for review and label assignment label Feb 21, 2025