GPUs? #190
Benchmarking on M2s makes things complicated; I would benchmark on Intel/AMD. My two cents: the metric you actually want would require normalizing by the number of CPU cores and by the fraction of the GPU memory being used (given that GPUs can now be partitioned via MIG configurations). I always keep in mind the order-of-magnitude estimate that a modern GPU costs 100x a modern CPU. The NSF ACCESS calculator gives an example of this: on DARWIN, a GPU-hour costs 69x a CPU-hour.
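To make that normalization concrete, here is a minimal sketch of comparing runs by throughput per allocation cost, with CPU runs scaled by cores used and GPU runs by the fraction of the device times a site-specific GPU:CPU cost ratio. This is my own illustration, not an established metric; the function name, its arguments, and the example numbers are hypothetical.

```python
# Sketch of a cost-normalized benchmark metric (illustration only).
def cost_normalized_throughput(spectra_per_hour, cpu_cores=0.0,
                               gpu_fraction=0.0, gpu_to_cpu_cost=100.0):
    """Throughput per CPU-core-hour equivalent.

    cpu_cores:       CPU cores used by the run.
    gpu_fraction:    fraction of the GPU used (e.g. a MIG slice).
    gpu_to_cpu_cost: cost of a GPU-hour in CPU-core-hours
                     (~100x rule of thumb; 69x on DARWIN per NSF ACCESS).
    """
    cost_per_hour = cpu_cores + gpu_fraction * gpu_to_cpu_cost
    return spectra_per_hour / cost_per_hour

# Hypothetical example: a 16-core CPU run vs. a run on half of one GPU.
cpu_metric = cost_normalized_throughput(100.0, cpu_cores=16)
gpu_metric = cost_normalized_throughput(1000.0, gpu_fraction=0.5, gpu_to_cpu_cost=69.0)
```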
Just because I saw this today: at Harvard's Cannon cluster, an hour on an NVIDIA A100 costs 209.4x an hour on an Intel Cascade Lake core. https://docs.rc.fas.harvard.edu/kb/fairshare/
It has been suggested before that Korg might be sped up by putting the hot loop (line opacity calculation) on the GPU.
This paper presents an interesting implementation of that idea: https://ui.adsabs.harvard.edu/abs/2015ApJ...808..182G/abstract
Comparing their performance to ours is not straightforward. They take ~1 s to calculate the opacity from 10^6 lines in their performance tests. If I do a synthesis for a cool star with the Pokazatel water linelist, there are ~2*10^6 lines across ~50 layers -> ~10^8 line opacity calculations. This takes ~40 s. So naively, we are already doing great (0.4 µs per line for us vs their 1 µs). BUT:
(The voigt calculation could be more accurate; see #43.) The code for this (and for GPU-accelerated RT!) is on their GitHub: https://github.com/exoclime.
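For a sense of what "putting the hot loop on the GPU" could look like, here is a minimal sketch of a vectorized line-opacity accumulation in JAX. This is an illustration only, not Korg's implementation or the paper's code; a Gaussian profile stands in for the real Voigt profile, and all names and array shapes are hypothetical.

```python
# Minimal JAX sketch of a GPU-friendly line-opacity "hot loop" (illustration only).
import jax
import jax.numpy as jnp

def line_opacity(wavelengths, line_centers, widths, strengths):
    """Sum per-line absorption profiles into an opacity grid.

    wavelengths:       (n_wl,)
    line_centers:      (n_lines,)
    widths, strengths: (n_layers, n_lines)
    returns:           (n_layers, n_wl)
    """
    # Broadcast to (n_layers, n_lines, n_wl), then reduce over lines.
    dv = (wavelengths[None, None, :] - line_centers[None, :, None]) / widths[..., None]
    profile = jnp.exp(-0.5 * dv**2) / (widths[..., None] * jnp.sqrt(2 * jnp.pi))
    return jnp.sum(strengths[..., None] * profile, axis=1)

# jit-compiled; XLA runs this on a GPU if one is available.
line_opacity_gpu = jax.jit(line_opacity)
```

For ~2*10^6 lines the full (layers x lines x wavelengths) intermediate would not fit in GPU memory, so a real implementation would chunk over lines (or restrict each line to the wavelengths near its center) and accumulate into the output array.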