snowanl fails on WCOSS2 with undefined mpi reference #3395

Closed
RussTreadon-NOAA opened this issue Feb 26, 2025 · 1 comment · Fixed by #3396
Labels: JEDI (Feature development to support JEDI-based DA), triage (Issues that are triage)

Comments

@RussTreadon-NOAA

What is wrong?

The C96C48_hybatmaerosnowDA jobs gdas_snowanl and enkfgdas_esnowanl fail on WCOSS2 with:

ImportError: /opt/cray/pe/lib64/libmpifort_intel.so.12: undefined symbol: MPIX_Enqueue_send
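
An undefined-symbol error of this kind is often easier to reason about once it is clear which copies of a library the dynamic loader can see. The following is a minimal diagnostic sketch, not part of global-workflow: `find_lib_in_ld_path` is a hypothetical helper that lists, in search order, every `LD_LIBRARY_PATH` directory containing a given library file. It does not account for RPATH/RUNPATH entries or the ld.so cache, so it shows only part of the picture.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of global-workflow or GDASApp).
# Prints each directory of a colon-separated search path that contains
# the named library, in search order.
#   $1 = library file name, e.g. libmpifort_intel.so.12
#   $2 = optional search path; defaults to the current LD_LIBRARY_PATH
find_lib_in_ld_path() {
  local lib_name="$1"
  local search_path="${2-${LD_LIBRARY_PATH:-}}"
  local dir
  local -a dirs

  # Split the search path on ':' into an array.
  IFS=':' read -r -a dirs <<< "${search_path}"

  for dir in "${dirs[@]}"; do
    # Skip empty entries; report directories that hold the library.
    if [[ -n "${dir}" && -e "${dir}/${lib_name}" ]]; then
      printf '%s\n' "${dir}"
    fi
  done
  return 0
}
```

For example, `find_lib_in_ld_path libmpifort_intel.so.12` could be run in the job environment just before the failing step. Each directory it reports could then be inspected with a tool such as `nm -D` to see which copy defines or references `MPIX_Enqueue_send`.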

What should have happened?

Jobs gdas_snowanl and enkfgdas_esnowanl should run to completion.

What machines are impacted?

WCOSS2

What global-workflow hash are you using?

4fd0ec1

Steps to reproduce

  1. Clone and install ClaraDraper-NOAA:feature/soilanal_det_clean on Cactus.
  2. Set up and launch C96C48_hybatmaerosnowDA.
  3. Cycle forward to the snowanl jobs in the 20211221 00Z cycle.
  4. The snowanl jobs fail with the undefined symbol error shown above.

Additional information

RFC 13644 (implemented 5-6 February 2025) was supposed to remove the need to add

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/opt/cray/pe/mpich/8.1.19/ofi/intel/19.0/lib"

to GDASApp jobs. It did for a while, but as of 26 February 2025 the snowanl jobs fail unless this line is present in the WCOSS2 section of ush/load_ufsda_modules.sh.

Do you have a proposed solution?

The snowanl jobs run to completion on Cactus with the following change to ush/load_ufsda_modules.sh:

@@ -41,6 +41,7 @@ case "${MACHINE_ID}" in
       # TODO: Add path to GDASApp libraries and cray-mpich as temporary patches
       # TODO: Remove LD_LIBRARY_PATH lines as soon as permanent solutions are available
       export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${HOMEgfs}/sorc/gdas.cd/build/lib"
+      export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/opt/cray/pe/mpich/8.1.19/ofi/intel/19.0/lib"
     fi
     module load "${MODS}/${MACHINE_ID}"
     ncdump=$( command -v ncdump )
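
The change above appends the cray-mpich directory unconditionally, so sourcing the script more than once would add the same entry repeatedly. As a sketch only, and not what the issue proposes, a guarded append could look like the following. `append_ld_library_path` is a hypothetical helper name. It also avoids a leading `:` when `LD_LIBRARY_PATH` starts out empty, since the loader treats an empty entry as the current working directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ush/load_ufsda_modules.sh).
# Appends a directory to LD_LIBRARY_PATH only if it is not already listed.
append_ld_library_path() {
  local new_dir="$1"
  case ":${LD_LIBRARY_PATH:-}:" in
    *":${new_dir}:"*)
      # Already present: leave LD_LIBRARY_PATH unchanged.
      ;;
    *)
      # Insert the ':' separator only when the variable is non-empty.
      export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${new_dir}"
      ;;
  esac
}
```

With such a helper, the added line would become `append_ld_library_path "/opt/cray/pe/mpich/8.1.19/ofi/intel/19.0/lib"`. Whether the extra indirection is worthwhile for a temporary patch is a judgment call.
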

RussTreadon-NOAA added the triage label on Feb 26, 2025
RussTreadon-NOAA self-assigned this on Feb 26, 2025

@RussTreadon-NOAA

Work for this issue will be done in RussTreadon-NOAA:bugfix/snowanl

RussTreadon-NOAA added the JEDI label on Feb 26, 2025
RussTreadon-NOAA added a commit to RussTreadon-NOAA/global-workflow that referenced this issue on Feb 26, 2025