Provide workflow for using local uncommitted changes on Slurm #161

Open
ryantwolf opened this issue Feb 26, 2025 · 0 comments
Labels: documentation (Improvements or additions to documentation)

ryantwolf commented Feb 26, 2025

I tried using NeMo Run to launch a run.Partial on Slurm and ran into a few issues. They were solvable, but they should be documented explicitly in a guide.

  1. I had a script defined like this:
import nemo_run as run
from nemo_run.core.execution import SlurmExecutor

def my_function(a, b):
    pass

def my_slurm_executor() -> SlurmExecutor:

    return SlurmExecutor(
        ...
    )


def main():

    fn = run.Partial(
        my_function,
        a=1,
        b=2,
    )

    executor = my_slurm_executor()
    with run.Experiment("example_exp", executor=executor) as exp:
        exp.add(fn, tail_logs=True)
        exp.run(detach=False)

if __name__ == "__main__":
    main()

This failed because my_function is defined in the same file that I launched the job from.
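A likely explanation, offered as an assumption rather than confirmed NeMo Run behavior: a function defined in the launch script belongs to the `__main__` module, so a reference to it recorded as "module plus name" cannot be resolved by a process on the cluster that starts from a different entry point. The sketch below only illustrates that Python detail. The helpers `qualified_reference` and `defined_outside_main` are hypothetical names and are not part of NeMo Run.

```python
import json


def qualified_reference(fn) -> str:
    """Return the 'module.name' string that a by-reference serializer would record."""
    return f"{fn.__module__}.{fn.__qualname__}"


def defined_outside_main(fn) -> bool:
    """Return False when the function is owned by the launch script (__main__).

    Another process started from a different entry point has its own __main__,
    so it cannot look the function up under that module name.
    """
    return fn.__module__ != "__main__"


def my_function(a, b):
    pass


if __name__ == "__main__":
    # A library function lives in a real, importable module.
    print(qualified_reference(json.dumps), defined_outside_main(json.dumps))
    # A function defined in the script being run lives in __main__.
    print(qualified_reference(my_function), defined_outside_main(my_function))
```

Running the file directly shows the difference between the library function and the function defined in the script itself.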

  2. I was told to move my_function to a separate file in the repo. Because my changes were not committed (this was just a simple test), I also had to modify the packager to get it to work:
import os

import nemo_run as run
from nemo_run.core.execution import SlurmExecutor
from nemo_run.core.packaging import GitArchivePackager


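# Shown inline here; per the description above, this function has to live
# in a separate file in the repo.
def my_function(a, b):
    pass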

def my_slurm_executor() -> SlurmExecutor:

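    # Also package files matching this pattern so that uncommitted changes
    # in the current working directory are shipped with the job.
    packager = GitArchivePackager(
        include_pattern=os.path.join(os.getcwd(), "*"),
    )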
    return SlurmExecutor(
        packager=packager,
        ...
    )


def main():

    fn = run.Partial(
        my_function,
        a=1,
        b=2,
    )

    executor = my_slurm_executor()
    with run.Experiment("example_exp", executor=executor) as exp:
        exp.add(fn, tail_logs=True)
        exp.run(detach=False)

if __name__ == "__main__":
    main()
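Since this workaround exists because the changes were not committed, it can help to check for uncommitted files before launching. The following is a minimal sketch, assuming `git` is on the PATH and the script runs inside the repository. The helpers `parse_porcelain` and `uncommitted_paths` are hypothetical and not part of NeMo Run, and quoted paths with special characters are not handled.

```python
import subprocess
from pathlib import Path


def parse_porcelain(output: str) -> list[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is two status characters, a space, then the path.
    Renames appear as 'old -> new'; only the new path is kept.
    """
    paths = []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def uncommitted_paths(repo_dir: str = ".") -> list[str]:
    """Return modified, staged, or untracked paths in the repository."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=Path(repo_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)
```

A launch script could call `uncommitted_paths()` and print a warning when the list is non-empty, as a reminder that those files need something like the include_pattern setting above to reach the cluster.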

Ideally the first approach would be supported. Until it is, the documentation should state that it is not supported and describe the second approach.

ryantwolf added the documentation label Feb 26, 2025