Add genesis experiment #534

Merged
merged 7 commits into from
Jan 17, 2025

Conversation

slabasan
Collaborator

@slabasan slabasan commented Jan 8, 2025

Description

To test on Fugaku:

cd benchpark
. setup-env.sh
benchpark system init --dest fugaku-system fugaku 
benchpark experiment init --dest genesis-openmp genesis +openmp
benchpark setup genesis-openmp ./fugaku-system workspace/

Then follow the output steps.

Please validate the runtime parameters in the ramble.yaml (workspace/genesis-openmp/ramble.yaml)
Please validate the output when the job completes

Experiment.py for genesis.

Dependencies: None

Fixes issue(s): Part of #460.

Type of Change

  • {X} Adding a system, benchmark, or experiment
  • { } Modifying an existing system, benchmark, or experiment
  • { } Documentation update
  • { } Build/CI update
  • { } Benchpark core functionality

Checklist:

If adding/modifying a system:

  • { } Create a new directory for the system and a new system.py file
  • { } Add a new dry run unit test in .github/workflows
  • { } System appears in System Specifications table in docs catalogue section

If adding/modifying a benchmark:

  • { } Add a new application.py and (maybe) package.py under a new directory
    for this benchmark
  • { } Configure an experiment
  • { } Benchmark appears in Benchmarks and Experiments table in docs catalogue
    section

If adding/modifying an experiment:

  • { x } Extend experiment.py under existing directory for specific benchmark
  • { } Define single-node and multi-node experiments

If adding/modifying core functionality:

  • { } Update docs
  • { } Update .github/workflows and .gitlab/ci unit tests (if needed)

@slabasan slabasan added the experiment (New or modified experiment), application (New or modified application), and WIP (A work-in-progress not yet ready to commit) labels Jan 8, 2025
@github-actions github-actions bot added the ci (Involving Project CI & Unit Tests) label Jan 8, 2025
@slabasan slabasan marked this pull request as draft January 8, 2025 02:28
@slabasan slabasan changed the title from "Add genesis experiement" to "Add genesis experiment" Jan 8, 2025
@pearce8 pearce8 mentioned this pull request Dec 9, 2024
4 tasks
@pearce8 pearce8 requested a review from dyokelson January 8, 2025 16:02
Collaborator

@dyokelson dyokelson left a comment

Tested on Ruby; the job currently fails during execution with the following error:

srun: error: ruby50: task 0: Exited with exit code 1
 Open_file> File /usr/WS2/yokelson/benchpark/workspace/genesis-openmp/Cts-6d48f81/workspace/inputs/genesis/DHFR/benchmark-2020/npt/genesis2.0beta/jac_amber/p1.inp does not exist  rank_no =     0

Steps to reproduce:

benchpark experiment init --dest=genesis-openmp genesis +openmp
benchpark setup genesis-openmp ./ruby-system workspace/

Then follow the rest of the prompts. The ramble on step successfully submits the batch job, but the job fails.

It looks like it can't find the input file p1.inp. Input files do exist in that directory, but under different names (p8.inp, p16.inp, p32.inp, p64.inp). Maybe we need to increase the number of processes so it picks up one that exists?
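
A minimal sketch of how one might check which rank counts have a matching input before submitting. It assumes the inputs follow the pN.inp naming seen in the directory above, where N is the MPI rank count; the function name available_ranks is hypothetical and not part of Benchpark or Ramble.

```shell
# Hypothetical helper: list the rank counts N for which a pN.inp file exists.
# $1 = directory holding the benchmark inputs (e.g. the jac_amber directory).
available_ranks() {
    local dir="$1"
    for f in "$dir"/p*.inp; do
        [ -e "$f" ] || continue          # glob matched nothing
        n="${f##*/}"                     # strip directory: p16.inp
        n="${n#p}"                       # strip leading p: 16.inp
        n="${n%.inp}"                    # strip suffix:    16
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip names that are not pN.inp
        esac
        echo "$n"
    done | sort -n
}
```

Called on a directory containing p8.inp, p16.inp, p32.inp, and p64.inp, this prints 8, 16, 32, and 64 on separate lines, which makes it obvious that a single-rank run has no input to read.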

Collaborator

@dyokelson dyokelson left a comment

OK, increasing the number of MPI ranks seems to have fixed it, but I am not sure whether it ran all the way through. We should have the RIKEN folks validate the parameters and test on Fugaku.
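
One rough way to check whether a run reached the end is to look for the final timer summary in the job output. The sketch below assumes that the "Output_Time> Averaged timer profile" header, as it appears in the successful output quoted later in this thread, is only printed by a run that finished; that is an assumption rather than documented GENESIS behavior, and run_completed is a hypothetical name.

```shell
# Hypothetical check: succeed if the log contains the final timer profile header.
# $1 = path to the job's output log.
run_completed() {
    grep -q 'Output_Time> Averaged timer profile' "$1"
}
```

For example, `run_completed path/to/job.out && echo finished` would print "finished" only when the header is present; the log path here is a placeholder.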

@pearce8 pearce8 requested a review from jdomke January 9, 2025 00:01
@pearce8
Collaborator

pearce8 commented Jan 9, 2025

@jdomke can you or your team please review this PR to validate the parameters and test on Fugaku?

@slabasan
Collaborator Author

slabasan commented Jan 17, 2025

Confirmed this works on Ruby with the most recent changes from @jdomke. The last part of the output is:

[STEP6] Deallocate Arrays
 
Output_Time> Averaged timer profile (Min, Max)
  total time      =      36.228
    setup         =       1.175
    dynamics      =      35.053
      energy      =      24.199
      integrator  =       7.526
      pairlist    =       2.155 (       2.071,       2.243)
... 
  integrator       
    constraint    =       0.466 (       0.437,       0.494)
    update        =       1.060 (       1.045,       1.073)
    comm_coord    =       1.213 (       1.126,       1.311)
    comm_force    =       1.367 (       1.152,       1.509)
    comm_migrate  =       0.062 (       0.060,       0.064)
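
For validating output across runs, a timer value can be pulled out of a log like the one above with a few lines of awk. This is a sketch that assumes the "label = average (min, max)" layout shown here; timer_value is a hypothetical helper, not something GENESIS, Ramble, or Benchpark provides.

```shell
# Hypothetical helper: print the averaged value of the first timer whose
# label matches exactly.
# $1 = timer label (e.g. "total time"), $2 = path to the output log.
timer_value() {
    awk -F'=' -v key="$1" '
        NF >= 2 {                                # only lines that contain "="
            label = $1
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim whitespace around the label
            if (label == key) {
                split($2, parts, " ")            # first field after "=" is the average
                print parts[1]
                exit
            }
        }' "$2"
}
```

Run against the output above, `timer_value "total time" job.out` prints 36.228 and `timer_value pairlist job.out` prints 2.155 (job.out is a placeholder file name). Because it stops at the first match, a label that appears in more than one section, such as integrator, returns the first occurrence only.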

@slabasan slabasan marked this pull request as ready for review January 17, 2025 21:40
@slabasan
Collaborator Author

@pearce8 @jdomke This is ready for final review before merge.

@slabasan slabasan requested a review from pearce8 January 17, 2025 21:41
@pearce8 pearce8 merged commit b8792c7 into develop Jan 17, 2025
14 of 15 checks passed
@pearce8 pearce8 deleted the genesis-rewrite branch January 17, 2025 23:07
Labels
application New or modified application ci Involving Project CI & Unit Tests experiment New or modified experiment WIP A work-in-progress not yet ready to commit
4 participants