Inference Fails on 16GB GPU even in Low-VRAM Mode #32

Open
rachelkluu opened this issue Jan 21, 2025 · 3 comments

Comments

@rachelkluu

I'm running into issues trying to run run.py on the demo files. Command:

python run.py demo_files/examples/fish.png --output-dir output/ --low-vram-mode

Output:
RuntimeError: CUDA driver error: out of memory

CUDA Available: True
CUDA Version: 12.4
PyTorch Version: 2.5.1+cu124
GPU 0: Quadro RTX 5000
GPU 0 Memory: 16.11 GB

I already tried rebuilding the venv twice, but it's still not working. It works when I run it on my CPU, but ideally I'd like to get it running on my GPU. Thanks!
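
For context, the CUDA diagnostics quoted above can be reproduced with a short PyTorch snippet; this is just a minimal sketch, assuming only that torch is installed:

import torch

# Print the same CUDA / PyTorch diagnostics as in the report above.
print("CUDA Available:", torch.cuda.is_available())
print("CUDA Version:", torch.version.cuda)
print("PyTorch Version:", torch.__version__)
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}")
    print(f"GPU {i} Memory: {props.total_memory / 1024**3:.2f} GB")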

@rachelkluu changed the title from "Inference Fails on 16GB GPU in Low-VRAM Mode" to "Inference Fails on 16GB GPU even in Low-VRAM Mode" on Jan 21, 2025
@jammm
Collaborator

jammm commented Jan 23, 2025

Hmm, that's weird. It ran fine (without low VRAM mode) on my 4080 mobile, and that one has 12GB of VRAM. It should use ~10GB of memory without low VRAM mode and around 6GB with low VRAM mode.

Are you sure that there's nothing else running on the GPU when you run the model? What does nvidia-smi show?
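
(For reference, one way to watch memory usage over time while the model runs, assuming an nvidia-smi recent enough to support the loop flag, is:

nvidia-smi --loop=1

which re-prints the usual table once per second.)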

@rachelkluu
Author

Yeah, it's odd. It happens both with and without low VRAM mode.

Here's nvidia-smi when I run without low VRAM mode, and yes, it stays at ~10GB for some time:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.134                Driver Version: 553.35         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 4000                On  |   00000000:04:00.0 Off |                  N/A |
| 30%   32C    P8              1W /  125W |      15MiB /  8192MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 5000                On  |   00000000:65:00.0  On |                    0 |
| 33%   51C    P2            196W /  230W |  10477MiB / 15360MiB   |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A              16    G   /Xwayland                                  N/A   |
|    1   N/A  N/A              16    G   /Xwayland                                  N/A   |
+-----------------------------------------------------------------------------------------+

But then, all of a sudden, it'll spike to 15GB+ and crash:

[Screenshot attached showing the memory spike before the crash]

@jammm
Collaborator

jammm commented Jan 29, 2025

Can you try setting CUDA_VISIBLE_DEVICES=1 and running again?
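
Just as a sketch of what that could look like, assuming a POSIX shell (e.g. inside WSL) and the same invocation from the issue description:

CUDA_VISIBLE_DEVICES=1 python run.py demo_files/examples/fish.png --output-dir output/ --low-vram-mode

That should make only GPU 1 (the 16GB Quadro RTX 5000 from the nvidia-smi output above) visible to the process.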
