running GCHP segmentation fault #476

Open
YaqingZhou1001 opened this issue Feb 19, 2025 · 1 comment
Labels
category: Bug Something isn't working topic: Runtime Related to runtime issues (e.g. simulation stops with error)

Comments

@YaqingZhou1001

Your name

Yaqing Zhou

Your affiliation

Jinan University

What happened? What did you expect to happen?

When I run GCHP 14.5.0 as a batch job, the GCHP log is empty, but the batch job log shows a segmentation fault.
It seems that GCHP is only being configured, but it hasn’t actually run, and the error occurs at that point.
A similar error also occurs when running interactively.

By the way, some warnings were displayed during compilation, but the build completed in the end. I'm not sure whether these warnings could affect the GCHP run, so I have also attached the environment file below.

What are the steps to reproduce the bug?

After compiling and setting up the run directory, run GCHP either as a batch job or interactively. The segmentation fault appears in both cases.

Please attach any relevant configuration and log files.

  1. GCHP log file

gchp_log.txt

  2. setCommonRunSettings.sh

setCommonRunSettings.txt

  3. batch job run script

batchjob_script.txt

  4. batch job log

batchjob_log.txt

  5. gchp environment file

gchp.env.txt

What GCHP version were you using?

14.5.0

What environment were you running GCHP on?

Local cluster

What compiler and version were you using?

Intel 2021.3.0

What MPI library and version were you using?

OpenMPI 4.0.3

Will you be addressing this bug yourself?

No

Additional information

No response

@YaqingZhou1001 YaqingZhou1001 added the category: Bug Something isn't working label Feb 19, 2025
@yantosca yantosca added the topic: Runtime Related to runtime issues (e.g. simulation stops with error) label Feb 21, 2025
@lizziel lizziel self-assigned this Mar 17, 2025
@lizziel
Contributor

lizziel commented Mar 18, 2025

Hi @YaqingZhou1001, have you successfully used this set of libraries and ESMF with GCHP before on your cluster? I suggest trying the following:

  1. Turn on error prints in ESMF. Open the configuration file ESMF.rc in your run directory and set the logKindFlag parameter to ESMF_LOGKIND_MULTI_ON_ERROR. When you rerun, you should get ESMF error log files: one per processor, each with a name starting with PET. If it is an ESMF problem, the error message will usually appear in every file.
  2. Check whether there are any messages in the output log file allPEs.log.
  3. Build GCHP with debug flags on by reconfiguring with -DCMAKE_BUILD_TYPE=Debug. Then run again.
  4. Contact your system administrator to see if they have ever seen this issue before since it appears to be MPI-related.
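
Steps 1 and 2 above can be scripted. The following is only a minimal Python sketch under stated assumptions, not part of GCHP or ESMF: the helper names `set_log_kind` and `summarize_pet_errors` are hypothetical, ESMF.rc is assumed to use `key: value` lines, and ESMF log entries are assumed to mark errors with the word `ERROR` followed by a `PETn` tag. Adjust the patterns if your files look different.

```python
import re
from collections import Counter
from pathlib import Path


def set_log_kind(rc_text: str, value: str = "ESMF_LOGKIND_MULTI_ON_ERROR") -> str:
    """Return the ESMF.rc text with logKindFlag set to `value`.

    Assumption: the file uses 'key: value' lines. If the key is
    missing, it is appended at the end of the text.
    """
    pattern = re.compile(r"^([ \t]*logKindFlag[ \t]*:[ \t]*)\S+", re.MULTILINE)
    if pattern.search(rc_text):
        # Replace only the value and keep the original key formatting.
        return pattern.sub(lambda m: m.group(1) + value, rc_text, count=1)
    sep = "" if (not rc_text or rc_text.endswith("\n")) else "\n"
    return f"{rc_text}{sep}logKindFlag: {value}\n"


def summarize_pet_errors(run_dir: str) -> Counter:
    """Count how many PET* log files contain each distinct error message.

    Assumption: error lines contain the word 'ERROR', optionally
    followed by a 'PETn' tag, and then the message itself.
    """
    counts = Counter()
    for log in sorted(Path(run_dir).glob("PET*")):
        if not log.is_file():
            continue
        seen = set()
        for line in log.read_text(errors="replace").splitlines():
            if "ERROR" not in line:
                continue
            message = line.split("ERROR", 1)[1].strip()
            # Drop the per-processor tag so identical messages compare equal.
            message = re.sub(r"^PET\d+\s+", "", message)
            seen.add(message)
        # Each file contributes at most once per distinct message.
        counts.update(seen)
    return counts
```

To use it, read ESMF.rc from the run directory, write back the result of `set_log_kind`, rerun GCHP, and then call `summarize_pet_errors` on the run directory. A message whose count equals the number of PET files appeared on every processor, which matches the pattern described in step 1 for an ESMF problem.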
