[WIP] Test out thunder.jit w/ NeMo models. #1694

Draft · wants to merge 3 commits into main
thunder/tests/test_networks.py (16 changes: 13 additions & 3 deletions)
@@ -406,7 +406,8 @@ def test_thunderfx_mistral_nemo_small():
     device = torch.device("cuda")
     model.to(device)
     model.train()
-    mdl = thunder.dynamo.thunderfx(model)
+    # mdl = thunder.dynamo.thunderfx(model)
+    mdl = thunder.jit(model)

     batch_size = 1
     iid_size = (batch_size, config.max_position_embeddings)
@@ -421,8 +422,16 @@ def test_thunderfx_mistral_nemo_small():
     assert mdl._backend.subgraph_infos, "Should have at least 1 subgraph"


+nemo_models: list[str] = [
+    "microsoft/Phi-3-mini-128k-instruct",
+    "bigcode/starcoder2-7b",
+    "Qwen/Qwen2.5-7B-Instruct",
+    "Qwen/Qwen2-7B",
+]
+
+
 @thunder.tests.framework.requiresCUDA
-@pytest.mark.parametrize("model_id", ["Qwen/Qwen2.5-7B-Instruct", "microsoft/Phi-3-mini-128k-instruct"])
+@pytest.mark.parametrize("model_id", nemo_models)
 def test_hf_for_nemo(model_id):
     from thunder.dynamo import thunderfx
     from transformers import AutoConfig, AutoModelForCausalLM
@@ -445,7 +454,8 @@ def test_hf_for_nemo(model_id):
     # fullgraph=True used to work with transformers 4.45.2, but it doesn't work
     # with 4.46.2 because of re.findall usage in the loss function
     fullgraph = False
-    compiled_model = thunderfx(model, fullgraph=fullgraph)
+    # compiled_model = thunderfx(model, fullgraph=fullgraph)
+    compiled_model = thunder.jit(model, fullgraph=fullgraph)
Collaborator:

Wasn't this failing due to an unsupported argument? I thought thunder.jit doesn't have a fullgraph argument. Or does **compile_options take keyword arguments, so there wouldn't be errors for unsupported args?


Collaborator Author:

Nope. One can see the logs by clicking through the CI links below:

=========================== short test summary info ============================
FAILED thunder/tests/test_networks.py::test_hf_for_nemo[bigcode/starcoder2-7b] - AssertionError
FAILED thunder/tests/test_networks.py::test_hf_for_nemo[microsoft/Phi-3-mini-128k-instruct] - AssertionError: expected tensor with (48,), cuda:0, torch.float32, requires_grad=False, got (1,), cuda:0, torch.bfloat16, False
FAILED thunder/tests/test_networks.py::test_thunderfx_mistral_nemo_small - AssertionError
============ 3 failed, 29 passed, 172 warnings in 172.63s (0:02:52) ============

So I guess your theory is correct: we do not error out on unsupported (kw)args.
I have a vague memory of us discussing doing that, though; maybe it's just a warning?

Collaborator:

That is one of the things I dislike about the options.
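
For readers following along: a minimal sketch of why a **compile_options catch-all accepts unknown keyword arguments without erroring, and how a warning could be emitted instead. The names here (jit, _KNOWN_OPTIONS) are hypothetical and do not reflect Thunder's actual implementation.

import warnings

_KNOWN_OPTIONS = {"executors", "cache_mode"}  # hypothetical option names

def jit(fn, **compile_options):
    """Sketch of a jit entry point whose **compile_options swallows any kwarg."""
    # Python never raises for extra kwargs captured by **compile_options,
    # so an explicit check is needed; warn rather than error, as discussed above.
    unknown = set(compile_options) - _KNOWN_OPTIONS
    if unknown:
        warnings.warn(f"ignoring unsupported compile options: {sorted(unknown)}")
    return fn  # real compilation elided in this sketch

compiled = jit(lambda x: x, fullgraph=False)  # warns about 'fullgraph', no error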


     input_ids = torch.randint(0, configuration.vocab_size, (1, configuration.max_position_embeddings), device="cuda")
     ref_output = model(input_ids=input_ids, labels=input_ids)