-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[develop] Add smoke and dust verification #1174
base: develop
Are you sure you want to change the base?
Conversation
…tending to time out for 48-hour forecasts.
…sure PcpCombine operates only on those hours unique to the cycle, i.e. for those times starting from the initial time of the cycle to just before the initial time of the next cycle. For the PcpCombine_obs task for the last cycle, allow it to operate on all hours of that cycle's forecast. This ensures that the PcpCombine tasks for the various cycles do not clobber each other's output. Accordingly, change the dependencies of downstream tasks that depend on PcpCombine obs output to make sure they include all PcpCombine_obs tasks that cover the forecast period of the that downstream task's cycle.
…ossibly also get_obs_ndas by putting in sleep commands.
- There are lots of task-specific checks that always run regardless of task inclusion: add some checks there so that we don't have to include unnecessary variables like PREDEF_GRID_NAME in vx-only experiments - There were a few task-specific checks that DO check for task inclusion, but the checks were broken: fix those - Move dict_find from an inline function in setup.py to a proper external python function
task-dependent logic checks - Break out all FV3 namelist logic out into a new function, setup_fv3_namelist - Only call this new function if the run_fcst task is active - Delay exporting of variables further down the page (need to completely eliminated this eventually) - Replace some *_vrfy commands with their proper versions - Eliminate some unnecessary variables and block comments
…, need to create observation directories if they don't exist
…not specified, include correct valid VX_FIELDS for new variables
…w; more tasks to come!
- New metplus conf file - New J-job and exscript for new task - New task entry in wflow/verify_pre.yaml - New variables for obs filenames and ASCII2NC output filenames - New entries in various scripts for new task - ush/get_metplus_tool_name.sh - ush/setup.py - ush/set_vx_fhr_list.sh - Updating some comments - Stage test observations on disk for faster testing
- Add PM10 as a valid ob type - Update PcpCombine.conf template to allow obs other than CCPA, USER_DEFINED command - Fix task name for ASCII2NC - Add PCPCombine tasks for PM - Fix check of airnow ob file name in exregional_get_verif_obs.sh - ASCII2NC doesn't need beta version of MET - Update some comments in config_defaults.yaml - Pythonize ush/set_vx_fhr_list.sh with help from ChatGPT; this results in an insane speedup (100 seconds to check forecast files --> ~ 1 second)
importing the necessary METplus functions directly. This will need some attention before merging to ensure it is platform-independent, only working on Hera for now. But the smoke stuff is Hera-specific for now regardless.
don't get any matched pairs. However, it seems as if the example case has the same issue, so I'll need to figure out what's going on there. - Update vx_config_det.yaml for correct obs names - Update verify_det.yaml to make the PointStat metatask loop over obtypes, so we can combine NDAS with smoke vx - Add PM10 to ASCII2nc_obs - Remove verbose flag from set_vx_fhr_list.py call in exregional_check_post_output.sh so we get correct FHR results - Update exregional_run_met_gridstat_or_pointstat_vx.sh uses beta release, can handle smoke vx obtypes for PointStat - - Remove deleted script from ush/source_util_funcs.sh
files are unique! Also, make the metatask rules for ASCII2nc simpler
- Produce hourly nc obs files for AOD - Probably doesn't make a difference, but explicitly reference AOD as "AERONET_AOD" in POINT_STAT_MESSAGE_TYPE
…from RRFS is 550 nm. This gets us matched pairs!
…es user-specifiable.
- replace references to old source_config_for_task function with new yaml-based stuff - Rename old LOAD_MODULES_RUN_TASK_FP --> LOAD_MODULES_RUN_TASK in rocoto - remove "grid_params" from sections to reference in verification tasks, since this section may not be set and the variables are not needed anyway - Add back create_symlink_to_file import to create_symlink_to_file - Remove references to beta release: we going for real this time!
for the smoke VX. Now we don't have to hard-code to the beta version to get smoke working, but the downside is we can only use it on Hera for GNU compilers
ush/setup.py
Outdated
vx_metatasks_all_by_obtype["AERONET"] = ["task_get_obs_aeronet","metatask_ASCII2nc_obs"] | ||
|
||
vx_field_groups_all_by_obtype["AIRNOW"] = ["PM25","PM10"] | ||
vx_metatasks_all_by_obtype["AIRNOW"] = ["task_get_obs_airnow","metatask_ASCII2nc_obs","metatask_PcpCombine_fcst_PM_all_mems"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkavulich Looks like the first two sections are repeated in sections 3 and 4 except with everything on a single line. I guess remove the first two or last two stanzas/sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, this was just a bad merge resolution. Fixed.
@mkavulich I only had a couple of comments. Looks good. Approving now. |
@mkavulich I haven't gotten a chance to run that test case for checking the timing when pulling one of the new obs types. I think you said that is not urgent for the code slush, but let me know if you need me to do that. Thanks. |
The UFS_FIRE WE2E tests successfully passed:
as well as the AQM WE2E test:
|
Co-authored-by: Christina Holt <56881914+christinaholtNOAA@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christinaholtNOAA I believe I have resolved all your suggestions. Let me know if you have anything else!
@@ -633,7 +633,7 @@ Pre-Existing Directory Parameter | |||
|
|||
* **"delete":** The preexisting directory is deleted and a new directory (having the same name as the original preexisting directory) is created. | |||
|
|||
* **"rename":** The preexisting directory is renamed and a new directory (having the same name as the original pre-existing directory) is created. The new name of the preexisting directory consists of its original name and the suffix "_old###", where ``###`` is a 3-digit integer chosen to make the new name unique. | |||
* **"rename":** The preexisting directory is renamed and a new directory (having the same name as the original pre-existing directory) is created. The new name of the preexisting directory consists of its original name and the suffix "_old_YYYYMMDD_HHmmss", where ``YYYYMMDD_HHmmss`` is the full date and time of the rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"rename" in this case is a noun, so it is a full sentence (just missing a period). With an added period is this wording okay?
The "`AErosol RObotic NETwork <https://aeronet.gsfc.nasa.gov/>`_": A worldwide ground-based remote sensing aerosol networks established by NASA and PHOTONS. The SRW verification tasks can use "Level 1.5" (cloud-screened and quality-controlled) aerosol optical depth observations. | ||
|
||
AIRNOW | ||
A North American ground-level air quality measurement network. The SRW verification tasks can use PM2.5 and PM10 observations. More information available at https://www.airnow.gov/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadTheDocs automatically renders text urls as hyperlinks; this is done elsewhere in the Glossary so I assume it's okay.
@@ -0,0 +1,84 @@ | |||
#!/usr/bin/env bash | |||
|
|||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, also removed some unnecessary yaml sections.
var: | ||
mem: '{% if global.DO_ENSEMBLE %}{% for m in range(1, global.NUM_ENS_MEMBERS+1) %}{{ "%03d "%m }}{%- endfor -%} {% else %}{{ "000"|string }}{% endif %}' | ||
metatask_PointStat_SFC_UPA_mem#mem#: | ||
FIELD_GROUP: '{% for var in verification.VX_FIELD_GROUPS %}{% if var in ["SFC", "UPA", "AOD", "PM25", "PM10"] %}{{ "%s " % var }}{% endif %}{% endfor %}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this one, but the rest are "if/else" statements so I don't think this solution works well
@@ -0,0 +1,376 @@ | |||
#!/usr/bin/env bash | |||
|
|||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since no other tasks in verification have a doc block here, I'll decline to add if here for now, but something to add for the future.
#----------------------------------------------------------------------- | ||
# | ||
export METPLUS_CONF | ||
export LOGDIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOGDIR
is set at the rocoto task level, and METPLUS_CONF
comes from the config file. These variables are needed because they are referenced in the main METplus common.conf
file.
ush/setup.py
Outdated
vx_metatasks_all_by_obtype["AERONET"] = ["task_get_obs_aeronet","metatask_ASCII2nc_obs"] | ||
|
||
vx_field_groups_all_by_obtype["AIRNOW"] = ["PM25","PM10"] | ||
vx_metatasks_all_by_obtype["AIRNOW"] = ["task_get_obs_airnow","metatask_ASCII2nc_obs","metatask_PcpCombine_fcst_PM_all_mems"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, this was just a bad merge resolution. Fixed.
task: 'run_MET_PcpCombine_fcst_#FIELD_GROUP#_mem#mem#' | ||
taskdep: | ||
attrs: | ||
task: 'run_MET_PcpCombine_fcst_#FIELD_GROUP#_mem#mem#' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new dependency is needed for PM2.5 coming from the model output; it is split among two different fields, so those must be combined in order to get the correct results from PointStat (see the changes to PcpCombine.conf and exregional_run_met_pcpcombine.sh)
@gsketefian that's correct, if a bug fix is needed for those non-00z ob times then we can apply it to the release branch after the slush. But we should probably get to it in the next week to make sure we have enough time. |
@MichaelLueken I am seeing a failure after my last round of suggested changes; I will let you know when this is ready for testing. |
Okay @MichaelLueken @christinaholtNOAA @gsketefian I have now applied what I think are the final changes, and all verification and fundamental tests are passing on Hera. I think this PR is ready for final tests. |
Thanks, @mkavulich! Launching tests now. |
The Derecho, Gaea-C5, Hera Intel, and Hercules WE2E coverage tests have successfully passed via Jenkins automated testing. The Gaea-C6 runner is down, the Hera GNU tests were aborted for running over the eight hour time limit, and the skill-score test failed on Orion. I'm running the Gaea-C6 tests manually and will kick off the Hera GNU tests once again, but there is an issue that was missed for the skill-score test on Orion: Please update the
Once this line is updated, I will also relaunch the Orion coverage tests. |
@MichaelLueken I have made the suggested change. Which jobs hit the time limit on Hera GNU? If it's verification tests only, we have noticed that newer versions of MET run very slowly with GNU compilers; it may be worth considering swapping out some of the more compute-intensive verification tests to another coverage suite. |
The only test that runs long on Hera GNU is the Also, on Gaea-C6, the smoke and dust WE2E, |
The problem task is |
That makes sense. GenEnsProd is one of the most compute-intensive MET utilities, and UPA has the most data. I suspect moving that test (and any other ensemble vx tests) out of the GNU coverage suite will solve the issue. |
I agree, if you move the Unfortunately, the seg faulting of the
for both Gaea-C6 and Hera. Can you think of any of your changes that could potentially be adversely affecting the smoke and dust capability in the weather model? |
@MichaelLueken can you point me to the failing test run directory on Hera? And if possible, a successful test from some other PR? |
The failed test from your PR on Hera is available - /scratch1/NCEPDEV/stmp2/Michael.Lueken/ufs-srweather-app/expt_dirs/smoke_dust_grid_RRFS_CONUS_3km_suite_HRRR_gf Since the tests for PRs are only run on Gaea-C6, I'll kick off a test using the current develop on Hera and provide you the path once it has started. |
The current HEAD of develop has made it past the part that seg faults on Hera without issue - /scratch1/NCEPDEV/stmp2/Michael.Lueken/expt_dirs/smoke_dust_grid_RRFS_CONUS_3km_suite_HRRR_gf |
@MichaelLueken I think I have fixed the issue; when I merged in the latest develop I missed a line that did not update LEVP correctly in Side note, but this is a great example of a problem that would have been caught by the type of component-level tests I proposed to @benkozi that he documented in this discussion, and wouldn't have wasted so much time and resources having to re-run full tests and manually check them. Seeing that the namelists didn't match would have let me know ahead of time that there was a problem. |
I've updated my clone of your branch and relaunched the With respect to the Hera GNU
|
I'm testing again on Hera, but the Gaea-C6
while the value of |
@MichaelLueken I think I have spotted the problem: I pushed another change that should hopefully fix it. It was another merge-related problem related to vertical levels (this time the Strangely, I wasn't able to reproduce this problem by modifying tests like |
The automated tests have successfully passed on Orion following your update to the skill-score parm file:
|
DESCRIPTION OF CHANGES:
This PR adds verification of smoke and dust observations to the SRW verification workflow. These observations come from new data sources: AERONET (aerosol optical depth) and AIRNOW (particulate matter). This necessitated adding some new tasks to the verification section of the workflow, and modifying some existing tasks. A new test,
MET_verification_smoke_only_vx
, has been added to test out these new capabilities. Major updates include:New observation types
Two new sets of observations (AERONET and AIRNOW) are now included for ingestion by verification tasks, and all the proper logic has been included for retrieving these obs from HPSS if necessary. In addition, a new capability allows for retrieval of AIRNOW obs from AWS over the internet without needing HPSS access: this can theoretically be extended to other ob types as well but the proper logic will need to be included in
parm/data_locations.yml
By default, the new observation types also report all matched pairs in the output stat files.
New MET tool: ASCII2NC
This is a new MET tool for SRW, used for converting the ASCII-based AERONET and AIRNOW obs to NetCDF that can be processed by later MET tasks. This is a new task with a new J-Job and exscript, as well as a new METplus conf file template.
Generalizing some tasks and metatasks
Some metatasks were previously hard-coded to certain observation types or other variables that needed to become more generic:
metatask_PointStat_SFC_UPA_all_mems
now has an outer loop of metatasks over the observation type, with an inner loop for each ensemble member. This was needed in order to accommodate the new observation types for the PointStat tool.MET and METplus upgrade
These new verification capabilities necessitated an update to a newer MET version, and bugs in 11.1.1 required updating further to MET 12.0.1 and METplus 6.0.0. These have been installed in all the usual places, thanks @RatkoVasic-NOAA!
Additional updates
In addition, several minor updates are included:
run_WE2E_tests.py
when parsing an XML with broken/unsatisfied dependencies; now it will properly print an informational message if jobs aren't being submitted properly instead of silently hangingNUM_MISSING_OBS_FILES_MAX
from 2 to 0; we really shouldn't have missing files for any of our tests, users can bump this up if they need toln_vrfy
withcreate_symlink_to_file
, and update the latter with wildcard functionalityush/generate_FV3LAM_wflow.py
, separated out the setting of namelist variables into its own functionush/retrieve_data.py
now creates directories if they do not existdict_find()
fromsetup.py
toush/python_utils/misc.py
for more general useType of change
REMOVE_RAW_OBS_*
variables for different observation types are consolidated into a singleREMOVE_RAW_OBS_DIRS
variableTESTS CONDUCTED:
DEPENDENCIES:
None.
DOCUMENTATION:
Added documentation to the Users Guide for new options where appropriate.
ISSUE:
None
CHECKLIST
CONTRIBUTORS (optional):
Thanks to @RatkoVasic-NOAA, @ulmononian, @christinaholtNOAA, and @gsketefian for their help and contributions