Fullchem run overflows #2686

Open
Xinying331 opened this issue Jan 16, 2025 · 3 comments
Assignees
Labels
category: Bug Something isn't working topic: GCHP Related to GCHP only topic: Runtime Error Related to runtime issues (e.g. simulation stopped w/ error)

Comments

@Xinying331

Your name

Xyw

Your affiliation

Rutgers University

What happened? What did you expect to happen?

The GCHP full-chem run crashed during the execution of GCHPchem in ucx_mod.F90.

What are the steps to reproduce the bug?

I am running a global 4x5 fullchem simulation using GCHP v14.5.0 on the Pleiades platform with GEOS-FP input data. To debug the issue, I built GCHP with debug flags enabled. The error message indicates that the problem occurs in the ucx_mod.F90 code.

Here are the steps I have taken so far:

1) I referred to issues #243 and #526 and updated the GFortran compiler to version 12.3.0, but this did not resolve the problem.
2) I verified the v14.5.0 restart file from AWS and checked the meteorology files (e.g., with a test run covering only July 2019) to ensure they are valid.
3) I printed the values of variables that might be causing the overflow error in ucx_mod.F90 based on the run log.
image
In ucx_mod.F90, the error occurs at:
image

After printing the related variables, I noticed a drastic drop in H2O and HNO3 concentrations, which might be causing the overflow. (More information is available in run.log.)
image

I am unsure whether the issue is related to the ESMF version or another configuration on Pleiades. I would greatly appreciate any advice you could provide. Thank you so much! @yantosca @lizziel @msulprizio

Please attach any relevant configuration and log files.

allPEs.log
ExtData.rc.txt
HEMCO_Config.rc.txt
run.log.txt

What GEOS-Chem version were you using?

14.5.0

What environment were you running GEOS-Chem on?

Other (please explain below)

What compiler and version were you using?

Compilers: GNU 12.3.0; ESMF-8.3.1

Will you be addressing this bug yourself?

No

In what configuration were you running GEOS-Chem?

GCHP

What simulation were you running?

Full chemistry

At what resolution were you running GEOS-Chem?

C24

What meteorology fields did you use?

GEOS-FP

Additional information

No response

@Xinying331 Xinying331 added the category: Bug Something isn't working label Jan 16, 2025
@Xinying331
Author

My apologies: the error in ucx_mod.F90 occurs at line 1868:
image

@yantosca yantosca added topic: GCHP Related to GCHP only topic: Runtime Error Related to runtime issues (e.g. simulation stopped w/ error) labels Jan 16, 2025
@lizziel
Contributor

lizziel commented Jan 16, 2025

Hi @Xinying331, it looks like there was an error in the log file earlier than the overflow that might shed light on the problem. It is here:

     GCHPctmEnv: INFO: Configured to expect 'bottom-up' meteorological data from 'ExtData'
     GCHPctmEnv: INFO: Configured to use dry air pressure in advection
     GCHPctmEnv: INFO: Configured to correct native mass flux (if using) for humidity
 Real*4 Resource Parameter: GCHPchem_DT:1200.000000
 Integer*4 Resource Parameter: GCHPchem_REFERENCE_TIME:1000
pe=00111 FAIL at line=08412    MAPL_Generic.F90                         <status=41>
pe=00111 FAIL at line=08334    MAPL_Generic.F90                         <status=41>
pe=00097 FAIL at line=08412    MAPL_Generic.F90                         <status=41>
pe=00097 FAIL at line=08334    MAPL_Generic.F90                         <status=41>
pe=00098 FAIL at line=08412    MAPL_Generic.F90                         <status=41>
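
To see at a glance which source locations are failing and on how many PEs, the log can be condensed with standard tools. This is only a sketch: `summarize_fails` is a hypothetical helper name, and it assumes every failure line has the same layout as the excerpt above (`pe=… FAIL at line=… <file> <status=…>`).

```shell
# Hypothetical helper: count "FAIL at line=" messages per source location.
# Assumes the log line layout shown in the excerpt above.
summarize_fails() {
    grep 'FAIL at line=' "$1" \
      | sed -E 's/.*pe=[0-9]+ FAIL at line=0*([0-9]+) +([^ ]+) +<status=([0-9]+)>.*/\2:\1 status=\3/' \
      | sort \
      | uniq -c
}

# Example call (file name taken from the attachments in this issue):
# summarize_fails run.log.txt
```

Each output line is a count followed by `file:line status=N`, so repeated failures from many PEs collapse into one entry per location.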

The run kept going, but I wonder if there was a problem reading a file that ultimately caused a problem later on. Would you be able to go to the relevant lines in MAPL and add some prints to see what it was looking for? You can search for the file at the top of the code directory with find . -name MAPL_Generic.F90.
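
As a rough sketch of that lookup, the `find` command above can be combined with `sed` to print the source around a reported line before adding prints. `show_fail_context` is a hypothetical helper name, and the default of five lines of context is an arbitrary choice.

```shell
# Hypothetical helper: print the lines around LINE in every copy of FILE
# found under the current directory.
# Usage: show_fail_context FILE LINE [CONTEXT]
show_fail_context() {
    file_name="$1"
    line_no="$2"
    context="${3:-5}"
    # Clamp the start of the window to line 1.
    start=$(( line_no > context ? line_no - context : 1 ))
    end=$(( line_no + context ))
    find . -name "$file_name" | while IFS= read -r path; do
        echo "== $path (lines $start-$end) =="
        sed -n "${start},${end}p" "$path"
    done
}

# Example calls, run from the top of the code directory:
# show_fail_context MAPL_Generic.F90 8412
# show_fail_context MAPL_Generic.F90 8334
```

Pass the line number without the leading zero that appears in the log (`8412`, not `08412`), since bash arithmetic would otherwise try to read it as an octal number.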

A few tips for the debug:

  1. In your logging.yaml file, you can change level: DEBUG to level: WARNING. A previous iteration of our docs said to set both level and root_level to DEBUG, but this was incorrect; you only need to set root_level. This change will greatly cut down on the length of allPEs.log.
  2. Run with a minimal number of cores for the debug runs.
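
For the first tip, one way to find which entries in logging.yaml are still set to DEBUG is a plain `grep`. This is a minimal sketch: `list_debug_levels` is a hypothetical helper name, and it assumes the settings appear as `level: DEBUG` or `root_level: DEBUG` on their own lines.

```shell
# Hypothetical helper: list every "level:" or "root_level:" entry set to DEBUG,
# with line numbers, so they can be reviewed and edited by hand.
list_debug_levels() {
    grep -nE '(^|[[:space:]])(root_)?level:[[:space:]]*DEBUG' "${1:-logging.yaml}"
}

# Example call from the run directory:
# list_debug_levels logging.yaml
```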

@Xinying331
Author

Hi Lizziel,

Thank you for getting back to me. However, Pleiades is currently undergoing its annual maintenance. I will update this request once Pleiades is back online.

Thanks!
Xinying

3 participants