Add GEFS regression test suite from EP5r2 configuration/case #2442
base: develop
…esn't run with namelist error
Yes, this divide-by-zero error occurs with Intel debug.
The only change for the debug test is a shorter forecast length (fhmax), since debug mode is slower.
I didn’t run into this problem. Since this is in debug mode, weird things may happen. In your case, if what you pass in is a zero-size array (e.g. i2<i1, j2<j1, or km=0), the loop is safe. I am not sure about the vector form, though (it should be fine, but who knows). Would you please check the size first?
Cheers
Weiyuan
From: Jun Wang ***@***.***>
Date: Friday, September 20, 2024 at 1:43 PM
To: ufs-community/ufs-weather-model ***@***.***>
Cc: Jiang, Weiyuan (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] ***@***.***>, Mention ***@***.***>
Subject: [EXTERNAL] [BULK] Re: [ufs-community/ufs-weather-model] Add GEFS regression test suite from EP5r2 configuration/case (PR #2442)
@lipan-NOAA @junwang-noaa Intel debug reliably fails in GOCART with a traceback to "Floating point exception: floating-point divide by zero" at GOCART/Process_Library/GOCART2G_Process.F90:1575 (https://github.com/GEOS-ESM/GOCART/blob/041422934cae1570f2f0e67239d5d89f11c6e1b7/Process_Library/GOCART2G_Process.F90#L1575) on n_atmsteps = 16: tau = vs/dz
I would imagine FV3 would complain first about any vanishing layer thickness (?), from say stochastic physics
Interestingly, a 3 hour forecast completes if I change this line to:
diff --git a/Process_Library/GOCART2G_Process.F90 b/Process_Library/GOCART2G_Process.F90
index cc0b599..123260c 100644
--- a/Process_Library/GOCART2G_Process.F90
+++ b/Process_Library/GOCART2G_Process.F90
@@ -1561,7 +1561,7 @@ CONTAINS
! local
- integer :: i, j, iit
+ integer :: i, j, iit, k
integer :: nSubSteps
real, dimension(i1:i2, j1:j2, km) :: tau
@@ -1571,8 +1571,13 @@ CONTAINS
real :: dt, dt_cfl
-
- tau = vs/dz
+ do k = 1,km
+ do j = j1, j2
+ do i = i1, i2
+ tau(i,j,k) = vs(i,j,k)/dz(i,j,k)
+ end do
+ end do
+ end do
Maybe there are halos or padding in the arrays leading to dz=0? I'm not sure where to go from here.
@NickSzapiro-NOAA So you are running the test in debug mode? It might be a dimension mismatch.
@weiyuan-jiang @tclune Have you run into this issue?
@weiyuan-jiang @tclune I added
There are dozens of tasks (out of 768) with small dz, with counts ranging from 1 up to 111. All have the same size. The curiosities continue as using the alternate loop with
fltng_pnt
lossless
pos_pert_fcst
12
@NickSzapiro-NOAA It seems to me the UPP control files "postxconfig-NT-gefs.txt" and "postxconfig-NT-gefs_FH00.txt" haven't been updated to the new format. I wonder if any grib2 files (inline post results) have been successfully generated from your new RT.
Thank you @WenMeng-NOAA. These postxconfig files are from the provided EP5r2 workflow and stopped working after the UPP update in #2326.
Do you know how to reformat these?
As a temporary fix, I'm using the gfs postxconfig instead. This choice happens in tests/fv3_conf/cpld_control_run.IN:
#inline post
if [ $WRITE_DOPOST = .true. ]; then
cp ${PATHRT}/parm/post_itag_gfs itag
cp ${PATHRT}/parm/postxconfig-NT-gfs.txt postxconfig-NT.txt
cp ${PATHRT}/parm/postxconfig-NT-gfs_FH00.txt postxconfig-NT_FH00.txt
cp ${PATHRT}/parm/params_grib2_tbl_new params_grib2_tbl_new
if [[ ${BMIC} == .true. ]]; then
cp ${PATHRT}/parm/post_itag_gefs itag
#copied "gefs" postxconfig files not working after UFS #2326
#cp ${PATHRT}/parm/postxconfig-NT-gefs.txt postxconfig-NT.txt
#cp ${PATHRT}/parm/postxconfig-NT-gefs_FH00.txt postxconfig-NT_FH00.txt
cp ${PATHRT}/parm/postxconfig-NT-gfs.txt postxconfig-NT.txt
cp ${PATHRT}/parm/postxconfig-NT-gfs_FH00.txt postxconfig-NT_FH00.txt
cp ${PATHRT}/parm/params_grib2_tbl_new params_grib2_tbl_new
else
cp ${PATHRT}/parm/post_itag_gfs itag
cp ${PATHRT}/parm/postxconfig-NT-gfs.txt postxconfig-NT.txt
cp ${PATHRT}/parm/postxconfig-NT-gfs_FH00.txt postxconfig-NT_FH00.txt
cp ${PATHRT}/parm/params_grib2_tbl_new params_grib2_tbl_new
fi
fi
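The branch logic above repeats the same four copies in each arm. A minimal sketch of one way to keep the temporary gfs fallback in a single place follows. It is not the code in this PR: the function names `post_config_suffix` and `copy_post_inputs` and the optional second argument are hypothetical, and it assumes the same `PATHRT` and `BMIC` variables and file names used above.

```shell
# Hypothetical helper: print "<itag suffix> <postxconfig suffix>".
# $1 is the BMIC value; $2 (default .false.) enables the gefs postxconfig
# files once they work again. Until then BMIC runs fall back to gfs.
post_config_suffix() {
  local bmic="$1" use_gefs_cfg="${2:-.false.}"
  if [ "${bmic}" = .true. ] && [ "${use_gefs_cfg}" = .true. ]; then
    echo "gefs gefs"
  elif [ "${bmic}" = .true. ]; then
    echo "gefs gfs"
  else
    echo "gfs gfs"
  fi
}

# Hypothetical helper: copy the inline post inputs into the current directory.
copy_post_inputs() {
  local suffixes itag_sfx cfg_sfx
  suffixes=$(post_config_suffix "$1" "${2:-.false.}")
  itag_sfx=${suffixes% *}   # text before the space
  cfg_sfx=${suffixes#* }    # text after the space
  cp "${PATHRT}/parm/post_itag_${itag_sfx}" itag &&
  cp "${PATHRT}/parm/postxconfig-NT-${cfg_sfx}.txt" postxconfig-NT.txt &&
  cp "${PATHRT}/parm/postxconfig-NT-${cfg_sfx}_FH00.txt" postxconfig-NT_FH00.txt &&
  cp "${PATHRT}/parm/params_grib2_tbl_new" params_grib2_tbl_new
}
```

With this shape, the block would reduce to calling `copy_post_inputs "${BMIC}"` when `WRITE_DOPOST` is `.true.`, and switching back to the gefs control files would be a one-argument change.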
@NickSzapiro-NOAA @lipan-NOAA If you provide me the UPP control files in xml format, I can generate the text files in new format for you.
@NickSzapiro-NOAA I have regenerated "postxconfig-NT-gefs.txt" with the xml file "postcntrl_gefs.xml" provided by @lipan-NOAA. Please let me know if you pick it up from Hera or other machines.
@WenMeng-NOAA Can you make a PR to this branch with the file changes? If not, I'm happy to bring them in from Hera.
@NickSzapiro-NOAA A PR was just submitted to your branch.
An example run directory using the updated postxconfig files is on Hera at:
/scratch1/NCEPDEV/nems/Nick.Szapiro/tasks/updateToEP5/uwm_gefs_upp/tests/run_dir/cpld_control_gefs_intel/
wgrib2 -v seems reasonable, but please let me know if we can verify contents are ok
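One possible way to go beyond eyeballing the inventory is to check that a list of expected variables is present. The sketch below is only an illustration: the function name `missing_fields` is hypothetical, it uses the short inventory (`wgrib2 -s`) rather than `-v`, and it assumes the variable name is the 4th colon-separated field of each inventory line.

```shell
# Hypothetical helper: read a "wgrib2 -s" inventory on stdin and print the
# expected variable names (given as arguments) that do not appear in it.
# Prints an empty line when nothing is missing.
missing_fields() {
  local inventory name missing=""
  # Collect the unique variable names (4th colon-separated field).
  inventory=$(awk -F: '{print $4}' | sort -u)
  for name in "$@"; do
    if ! printf '%s\n' "${inventory}" | grep -qxF "${name}"; then
      missing="${missing}${missing:+ }${name}"
    fi
  done
  printf '%s\n' "${missing}"
}
```

It could be used as `wgrib2 "${grib_file}" -s | missing_fields TMP UGRD VGRD`, where `grib_file` is a placeholder for one of the inline post outputs in the run directory. This only confirms that records exist, not that their values are reasonable.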
@NickSzapiro-NOAA Your test results look good to me, except for missing aerosol fields. I will provide you changes for generating these aerosol fields from the inline post.
I re-ran and see aerosol fields (same run_dir). Please let me know if there is anything more to resolve
@NickSzapiro-NOAA Your test results look good to me. @lipan-NOAA Can you also validate aerosol fields in grib2 files?
Update the inline post control files for gefs.
tests/parm/post_itag_gefs
Outdated
MODELNAME='GFS'
/
&NAMPGB
KPO=50,PO=1000.,975.,950.,925.,900.,875.,850.,825.,800.,775.,750.,725.,700.,675.,650.,625.,600.,575.,550.,525.,500.,475.,450.,425.,400.,375.,350.,325.,300.,275.,250.,225.,200.,175.,150.,125.,100.,70.,50.,40.,30.,20.,15.,10.,7.,5.,3.,2.,1.,0.4,
@NickSzapiro-NOAA To generate aerosol fields from the inline post, please make the following change:
from
KPO=50,PO=1000.,975.,950.,925.,900.,875.,850.,825.,800.,775.,750.,725.,700.,675.,650.,625.,600.,575.,550.,525.,500.,475.,450.,425.,400.,375.,350.,325.,300.,275.,250.,225.,200.,175.,150.,125.,100.,70.,50.,40.,30.,20.,15.,10.,7.,5.,3.,2.,1.,0.4,
into
KPO=50,PO=1000.,975.,950.,925.,900.,875.,850.,825.,800.,775.,750.,725.,700.,675.,650.,625.,600.,575.,550.,525.,500.,475.,450.,425.,400.,375.,350.,325.,300.,275.,250.,225.,200.,175.,150.,125.,100.,70.,50.,40.,30.,20.,15.,10.,7.,5.,3.,2.,1.,0.4,nasa_on=.true.,
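If the same change has to be applied to more than one itag file, it could be scripted. The following is a rough sketch under the assumption that the `KPO=` line ends with a trailing comma as shown above. The function name `enable_nasa_on` is hypothetical and not part of UPP or the regression test scripts.

```shell
# Hypothetical helper: append "nasa_on=.true.," to the KPO line of an itag
# namelist file. Leaves the file untouched if nasa_on is already mentioned.
enable_nasa_on() {
  local itag="$1" tmp
  if grep -q 'nasa_on' "${itag}"; then
    return 0
  fi
  tmp=$(mktemp) || return 1
  # Only lines that start with KPO= (optionally indented) are modified.
  sed '/^[[:space:]]*KPO=/ s/$/nasa_on=.true.,/' "${itag}" > "${tmp}" &&
    mv "${tmp}" "${itag}"
}
```

Running it twice on the same file is a no-op the second time, so it should be safe to call from a setup script.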
@NickSzapiro-NOAA Also copy all optics_luts_*_nasa.dat files from UPP/fix/chem to your run directory.
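A small sketch of that copy step, with an explicit failure if no lookup tables are found, might look like this. The function name `copy_optics_luts` and both arguments are placeholders; only the `fix/chem` location and the `optics_luts_*_nasa.dat` pattern come from the comment above.

```shell
# Hypothetical helper: copy the NASA optics lookup tables from a UPP
# checkout ($1) into a run directory ($2). Fails if none are found.
copy_optics_luts() {
  local upp_dir="$1" run_dir="$2" f found=0
  for f in "${upp_dir}"/fix/chem/optics_luts_*_nasa.dat; do
    [ -e "${f}" ] || continue   # the glob matched nothing
    cp "${f}" "${run_dir}/" || return 1
    found=1
  done
  if [ "${found}" -eq 0 ]; then
    echo "no optics_luts_*_nasa.dat files under ${upp_dir}/fix/chem" >&2
    return 1
  fi
}
```

Failing loudly here seems preferable to a silent no-op, since missing tables would presumably only show up later as missing aerosol fields.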
@NickSzapiro-NOAA Can you check whether dz is zero at any grid points? Since the error message is "Floating point exception: floating-point divide by zero", we may then need to track down where the zero dz values come from. Thanks
The GOCART dz=0 issue seems to have been resolved at the scripting level, with the test case now running the specified 3 hours on Hera (taking ~40 minutes of runtime). The fix is from a commit that moves where ExtData is symbolically linked (commit), e.g.,
While I understand that there are subtle filesystem differences for reading files in debug mode, I don't actually know why anything has changed. The good news is that the Intel debug GEFS test case runs now, at least on Hera.
@bbakernoaa Could you make these consistent with the monthly data?
@NickSzapiro-NOAA UPP develop recently had a commit (a6c1a38c) that includes aerosol fields in the UPP control files for GEFS. Could you add changes to link postxconfig-NT-gefs.txt and postxconfig-NT-gefs_FH00.txt to the files postxconfig-NT-gefs.txt and postxconfig-NT-gefs-f00.txt under parm/gefs in the UPP repository? Thanks!
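As a sketch of the requested linking, assuming it is run from the directory that holds the regression test copies and that `upp_dir` (a placeholder) points at the top of the UPP checkout, something like the following could work. The function name `link_gefs_postxconfig` is hypothetical.

```shell
# Hypothetical helper: link the GEFS inline post control files from a UPP
# checkout ($1) into the current directory. Note that the FH00 file is
# named with a "-f00" suffix on the UPP side.
link_gefs_postxconfig() {
  local upp_dir="$1"
  ln -sf "${upp_dir}/parm/gefs/postxconfig-NT-gefs.txt" postxconfig-NT-gefs.txt &&
  ln -sf "${upp_dir}/parm/gefs/postxconfig-NT-gefs-f00.txt" postxconfig-NT-gefs_FH00.txt
}
```

Whether a relative or absolute path is appropriate for `upp_dir` depends on how the test scripts resolve the links, which is not settled in this thread.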
@WenMeng-NOAA Should there be an fv3atm PR first to update the UPP hash?
@NickSzapiro-NOAA That's right.
@WenMeng-NOAA Can we reduce the number of times lines like these get logged?
They may be intended for development, and they double the size of the log file.
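To quantify which messages dominate a log before opening an issue, a generic count of repeated lines may help. This is only a sketch using standard tools. The function name `top_repeated_lines` is hypothetical and the log file path is whatever the run produced.

```shell
# Hypothetical helper: print the n (default 10) most frequently repeated
# lines of a log file ($1), each prefixed by its count.
top_repeated_lines() {
  local log="$1" n="${2:-10}"
  sort "${log}" | uniq -c | sort -rn | head -n "${n}"
}
```

Lines that differ only by a task number or timestamp would be counted separately, so the result is a lower bound on how repetitive the log is.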
@NickSzapiro-NOAA Could you open an issue at https://github.com/NOAA-EMC/UPP/issues, so we can work on that?
Commit Queue Requirements:
Description:
This PR updates the cpld_bmark_p8 tests to a prototype GEFS test case of fully coupled s2swa+IAU+stochastic physics, with configuration and warm starts from restarts of EP5r2 ensemble member 1 for 2021-03-25 06Z.
The EP5r2 test case was kindly provided by @bingfu-NOAA via @junwang-noaa with aerosol input data and configurations from @lipan-NOAA.
A separate INPUTDATA_ROOT_BMIC is no longer needed and is removed.
This PR is in draft mode, subject to meeting basic reproducibility/quality checks. The following have been tested on Hera:
140: The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this release.
140: Workarounds are to run on a single node, or to use a system with an RDMA
140: capable network such as Infiniband.
Error in handle_err: get_var3_r4 get_vara_real delp_inc NetCDF: HDF error
This wav line calls ESMF_MeshCreate
Input data is currently in user space on Hera, and scripts need updating once the file paths are in shared space.
Commit Message:
Priority:
Git Tracking
UFSWM:
Sub component Pull Requests:
UFSWM Blocking Dependencies:
Changes
Regression Test Changes (Please commit test_changes.list):
Input data Changes:
Library Changes/Upgrades:
Testing Log: