
Memory leak while obtaining data from DataLoader #5

Closed
kzhang0718 opened this issue Sep 2, 2019 · 6 comments

@kzhang0718

Hello, thanks for the great work.

There seems to be a memory issue when acquiring tile images from the DataSet. Memory grows linearly with every iteration until it eventually eats up the entire RAM. From the code, it doesn't look like anything should be taking up that much memory. Any advice on how to get around this? It seems to have to do with openslide.
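
For anyone hitting the same wall, a minimal sketch of the access pattern being described, with hypothetical paths and tile-coordinate format rather than this repo's actual dataset code:

```python
# A sketch of the access pattern in question (hypothetical paths and
# tile-coordinate format). One long-lived OpenSlide handle is kept per
# WSI, and every read_region() call also populates OpenSlide's internal
# C-level tile cache, which is never released while the handle stays
# open; RSS therefore grows with the number of distinct regions read.
import openslide
import torch.utils.data


class TileDataset(torch.utils.data.Dataset):
    def __init__(self, slide_paths, coords, level=0, size=224):
        # coords: list of (slide_index, x, y) tuples (hypothetical format)
        self.slides = [openslide.OpenSlide(p) for p in slide_paths]
        self.coords = coords
        self.level = level
        self.size = size

    def __len__(self):
        return len(self.coords)

    def __getitem__(self, idx):
        i, x, y = self.coords[idx]
        img = self.slides[i].read_region((x, y), self.level, (self.size, self.size))
        return img.convert('RGB')
```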

@gabricampanella
Collaborator

gabricampanella commented Sep 2, 2019

@kzhang0718 your intuition is correct. Unfortunately, openslide was never optimized for such large-scale AI applications. See related threads:
openslide/openslide-python#24
openslide/openslide#38

One possible solution is to modify the openslide source code so that the cache is disabled. For example, GeertLitjens's (https://github.com/GeertLitjens) fix works well. You will then have to compile the code yourself.
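
On newer releases (OpenSlide 4.0.0 with openslide-python 1.2.0 and later, which postdate this thread), the tile cache can also be capped from Python without recompiling; a minimal sketch, assuming those versions:

```python
# Sketch assuming OpenSlide >= 4.0.0 with openslide-python >= 1.2.0,
# which expose the tile cache from Python; earlier versions need the
# source-level patch described above. The capacity is in bytes, and a
# capacity of 0 effectively disables caching.
import openslide

slide = openslide.OpenSlide('slide.svs')       # hypothetical path
slide.set_cache(openslide.OpenSlideCache(0))   # cap this handle's tile cache
tile = slide.read_region((0, 0), 0, (224, 224))
```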

Hope this was helpful.

@kzhang0718
Author

@gabricampanella Thanks, that's very informative. I figured it must have to do with the cache, but I had not thought about disabling it in the openslide source code. I modified the DataSet class so that WSI file handles are opened and closed on the fly; memory looked fine, but performance dropped significantly. I'll definitely try disabling the cache.
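
For reference, a sketch of that open-on-the-fly pattern (the coordinate bookkeeping is hypothetical):

```python
# Sketch of the open-on-the-fly workaround (hypothetical coordinate
# bookkeeping). Memory stays flat because each handle, and with it
# OpenSlide's per-handle cache, is freed after every tile; re-opening
# the WSI and re-parsing its headers on each access is what costs speed.
def __getitem__(self, idx):
    path, x, y = self.coords[idx]
    slide = openslide.OpenSlide(path)   # open per tile
    try:
        img = slide.read_region((x, y), self.level, (self.size, self.size))
    finally:
        slide.close()                   # cache is freed with the handle
    return img.convert('RGB')
```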

On a different note, the code doesn't look very scalable. I'm only working with about 200 WSIs, so it's not a big issue at the moment. But when I get to thousands or even tens of thousands of WSIs, training would take forever. I wonder how you managed to train on over 30k WSIs (or something like that) for your Nature Medicine work. Would appreciate it if you could share some experience.

@gabricampanella
Collaborator

@kzhang0718 You are right that, given this setup, scalability will become an issue as you hit the tens of thousands. Solving that caching issue will alleviate the problem a bit and allow you to hit that mark. To go beyond it, there are many things that can be done, both on the algorithm side (for example, being smarter about the inference stage) and on the framework side (custom-made data loading). To give you some context, 10k prostate core biopsies were trained on in about a week. The breast dataset, which is composed of much larger tissue samples, took one month.
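
As one illustration of the data-loading side, not the pipeline used in the paper: grouping tile reads by slide keeps only one handle open at a time, as sketched below.

```python
# Illustrative only: grouping tile reads by slide so each WSI is opened
# once per pass amortizes the open/close cost that the on-the-fly
# workaround above pays per tile. Coordinate format and tile size are
# hypothetical.
from collections import defaultdict

import openslide


def tiles_grouped_by_slide(coords, level=0, size=224):
    # coords: iterable of (slide_path, x, y) tuples
    by_slide = defaultdict(list)
    for path, x, y in coords:
        by_slide[path].append((x, y))
    for path, xys in by_slide.items():
        slide = openslide.OpenSlide(path)
        try:
            for x, y in xys:
                yield slide.read_region((x, y), level, (size, size))
        finally:
            slide.close()   # one cache's worth of memory freed per slide
```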

@kzhang0718
Author

@gabricampanella Great, thanks for the info! I'll see what I can do when it hits that mark.

Closing this thread.

@samkleeman1

Many thanks for sharing this exciting methodology. I am having the same problem. I was wondering if you could advise how to access GeertLitjens's fix. Is that what you used for the Nature Medicine paper?

@aamster

aamster commented Oct 4, 2022

I used tiffslide instead, which does not have this problem. Unfortunately, it currently cannot be used with multiprocessing: Bayer-Group/tiffslide#18
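
For anyone considering the same swap, a minimal sketch (hypothetical path); tiffslide mirrors the openslide-python API, so existing read_region() calls carry over:

```python
# Sketch of the tiffslide swap (hypothetical path). The TiffSlide class
# mirrors openslide.OpenSlide, including read_region() semantics.
import tiffslide

slide = tiffslide.TiffSlide('slide.svs')
tile = slide.read_region((0, 0), 0, (224, 224))  # PIL image, as with openslide
```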
