Testing out what partial charges could look over the CLI #1068

IAlibay · 2025-01-07T19:03:35Z

Checklist

Added a news entry

Developers certificate of origin

I certify that this contribution is covered by the MIT License here and the Developer Certificate of Origin at https://developercertificate.org/.

jthorton · 2025-01-16T16:44:10Z

Questions:

In 283d9e9 I have added the basic functionality discussed we have a new CLI command openfe charge-molecules (open to opinions on the name!) we can optionally also overwrite user charges using the --overwrite-charges flag as I thought it would awkward for a user to have to load the ligands removed the charges manually and then dump the ligands back to SDF. Would it also make sense to expose this flag to the plan RBFE and RHFE CLI commands?

In the RBFE, I also run the cofactors through the charge generation pipeline. Does this make sense? I think that's what happens later at the protocol level, and I wanted to match that.

For the RBFE and RHFE should we expose a flag to skip charge generation, there was mention of this in a meeting I think.

IAlibay

Some additional quick thoughts - we need to be careful about replacing the charge generation from being inplace to being a return.

IAlibay · 2025-01-16T17:09:30Z

openfe/protocols/openmm_utils/charge_generation.py

    """

    # If you have non-zero charges and not overwriting, just return
    if (offmol.partial_charges is not None and np.any(offmol.partial_charges)):
        if not overwrite:
-            return
+            return offmol


This is probably going to silently break a bunch of things (particlularly protocols).

Would it be worth doing an inplace style flag where we optionally just overwrite the input molecule and otherwise return a molecule?

I think that as we still assign the charges to the input molecule and not a copy this should still be fine. I tried to make a small example of this here, which should hopefully mean we don't break anything!

from openff.toolkit import Molecule def charge_mol(mol) -> Molecule: mol.assign_partial_charges("am1bcc") return mol m = Molecule.from_smiles("CCO") charge_mol(m) print(m.partial_charges)

[-0.13610011111111114 0.12639988888888887 -0.5998001111111111 0.04236688888888887 0.04236688888888887 0.04236688888888887 0.04319988888888887 0.04319988888888887 0.3959998888888889] elementary_charge

Ah sorry - I think I hallucinated that you had removed the offmol_copy reassignment step below.

So just to confirm, this now does both an inplace asssignment and returns the offmol itself?

I guess, do we want to have the option to not do the inplace assignment? (i.e. could there be cases where this would be useful?)

Yeah it still does in place but I need the return for the multiprocessing to work, we can add an option for not in place but maybe that should be in another PR?

IAlibay · 2025-01-16T17:10:04Z

openfe/protocols/openmm_utils/charge_generation.py

-    )  # type: ignore
+    # limit the number of threads used by SQM
+    # <https://github.com/openforcefield/openff-toolkit/issues/1831>
+    with temp_env({"OMP_NUM_THREADS": "2"}):


There's some more pythonic tools for thread limiting like this, let me have a dig in the morning.

i.e. we'll need them going forward, so maybe we should bite the bullet and go for it.

This is what we use in MDAnalysis-land: https://github.com/joblib/threadpoolctl

It might be particularly helpful here because I'm not 100% sure all BLAS/LAPACK libraries respect OMP_NUM_THREADS (I'm thinking of you MKL library).

There might be better ones too!

IAlibay

Mostly doing a pass on the "partial charge generation" side of things and the overall concept (I'll do a finer grained review once it's not a draft).

Overall looks good to me!

IAlibay · 2025-01-17T13:12:37Z

openfe/protocols/openmm_utils/charge_generation.py

    """

    # If you have non-zero charges and not overwriting, just return
    if (offmol.partial_charges is not None and np.any(offmol.partial_charges)):
        if not overwrite:
-            return
+            return offmol


Ah sorry - I think I hallucinated that you had removed the offmol_copy reassignment step below.

So just to confirm, this now does both an inplace asssignment and returns the offmol itself?

I guess, do we want to have the option to not do the inplace assignment? (i.e. could there be cases where this would be useful?)

IAlibay · 2025-01-17T13:21:20Z

openfe/protocols/openmm_utils/charge_generation.py

-    )  # type: ignore
+    # limit the number of threads used by SQM
+    # <https://github.com/openforcefield/openff-toolkit/issues/1831>
+    with temp_env({"OMP_NUM_THREADS": "2"}):


This is what we use in MDAnalysis-land: https://github.com/joblib/threadpoolctl

It might be particularly helpful here because I'm not 100% sure all BLAS/LAPACK libraries respect OMP_NUM_THREADS (I'm thinking of you MKL library).

There might be better ones too!

IAlibay · 2025-01-17T13:23:18Z

openfe/protocols/openmm_utils/charge_generation.py

Probably another issue/PR - do we want to move this module somewhere else within the repo? It feels like it's more than just "openmm_utils".

Long term I think I want to move it to "openfe-utils" or similar, that way upstream protocols can use them without depending on OpenFE.

IAlibay · 2025-01-17T13:25:35Z

openfe/protocols/openmm_utils/charge_generation.py

+    ------
+    ValueError
+      If the ``toolkit_backend`` is not supported by the selected ``method``.
+      If ``generate_n_conformers`` is ``None``, but the input ``offmol``


This might actually be the case where we want to not inplace modify charges - what if things fails half way through a charge assignment loop, do we end up with have half our ligands charged and half our ligands uncharged?

If charging fails for one or more molecules the whole loop will break, so we might want to more gracefully handle failing though I am not sure what the end result would be one file with charged ligands and another with fails.

Might be worth testing - could be in a follow-up (for the sake of the CLI if this fails the whole thing fails, but the API is a bit different).

IAlibay · 2025-01-17T13:26:29Z

openfe/protocols/openmm_utils/charge_generation.py

+    return offmol
+
+
+def bulk_assign_partial_charges(


This looks great! Have you tested if it scales as intended?

Yes we get a good speed up in local testing, charging 16 malt1 ligands (~42 atoms) with am1bcc (ambertools) takes:

4:02 2 cores

2:20 4 cores

1: 48 6 cores

atravitz

Here's my partial review - I'll look at the rest on Tuesday!

atravitz · 2025-01-17T22:50:27Z

openfecli/parameters/utils.py

+
+
+WORKERS = Option(
+    "-w",


Is it more consistent with other tools to make this -n, or is there precedent elsewhere using -w?

I know I just added -n as the shorthand for --n-protocol-repeats in this PR, but I think maybe that one just shouldn't have a short version, and that -n should be used for multiprocessing instead.

Also, could you move this to a new misc.py file like I did in that PR? I'd like to keep utils.py for only helper functions.

Yeah -n is probably more common I just didn't want to clash with anything so happy to chance to -n and -ncores here.

openfecli/parameters/plan_network_options.py

Co-authored-by: Alyssa Travitz <[email protected]>

codecov · 2025-01-21T13:45:23Z

Codecov Report

Attention: Patch coverage is 98.14815% with 5 lines in your changes missing coverage. Please review.

Project coverage is 92.87%. Comparing base (915d110) to head (6ab8e9f).

Files with missing lines	Patch %	Lines
openfe/protocols/openmm_utils/charge_generation.py	75.00%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1068      +/-   ##
==========================================
- Coverage   94.46%   92.87%   -1.60%     
==========================================
  Files         135      138       +3     
  Lines       10090    10343     +253     
==========================================
+ Hits         9532     9606      +74     
- Misses        558      737     +179

Flag	Coverage Δ
fast-tests	`92.87% <98.14%> (?)`
slow-tests	`?`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

IAlibay · 2025-01-21T16:57:05Z

openfe/protocols/openmm_utils/charge_generation.py

-    )  # type: ignore
+    # limit the number of threads used by SQM
+    # <https://github.com/openforcefield/openff-toolkit/issues/1831>
+    with threadpool_limits(limits=2):


What's the advantage of 2 vs 1 here?

I thought 1 might have been harsh but it seems fine in testing, maybe slightly faster while using more cores so changed to 1.

atravitz

Looks great, I just have a few comments. And please add a news entry!

atravitz · 2025-01-21T18:09:35Z

openfe/tests/data/cdk8.zip

is this test data being changed?

Yeah, the ligands now have charges stored in the SDF so it should skip generating the charges and keep the CI running quickly!

openfecli/commands/generate_partial_charges.py

atravitz · 2025-01-21T18:38:12Z

openfecli/parameters/plan_network_options.py

+Partial Charges
+===============
+
+Methods
+-------
+
+Supported partial charge method choices are:
+    - ``am1bcc``
+    - ``am1bccelf10`` (only possible if `off_toolkit_backend` in settings is set to `openeye`)
+    - ``nagl``
+    - ``espaloma`` (must have `espaloma_charge` installed)
+
+Settings
+--------
+
+The following settings can also be set:
+
+    - ``off_toolkit_backend``: The backend to use for partial charge generation. Choose from  ``ambertools`` (default), ``openeye`` or ``rdkit``.
+    - ``number_of_conformers``: The number of conformers to use for partial charge generation.
+      If unset (default), the input conformer will be used.
+    - ``nagl_model``: The NAGL model to use.
+      If unset (default), the latest available production charge model will be used.
+
+For more information on the different options, please refer to https://docs.openfree.energy/en/stable/reference/api/openmm_protocol_settings.html#openfe.protocols.openmm_utils.omm_settings.OpenFFPartialChargeSettings.
+


I'm not sure this is the best place for these docs to live, but that can be a follow-up PR as I do more docs review.

Yeah this makes the help message on the CLI way to long we should just move all of this to the yaml settings docs once we get them setup.

Yeah I'm the guilty party on this one - please don't feel like you have to go with this (or really anything else I wrote), this was some late night ramblings.

atravitz · 2025-01-21T18:39:49Z

openfecli/tests/test_rbfe_tutorial.py

-            'easy_rbfe_lig_ejm_46_solvent_lig_jmc_23_solvent.json',
-            'easy_rbfe_lig_ejm_46_complex_lig_jmc_23_complex.json',
-            'easy_rbfe_lig_jmc_23_complex_lig_jmc_27_complex.json',
            'easy_rbfe_lig_jmc_23_solvent_lig_jmc_27_solvent.json',
+            'easy_rbfe_lig_jmc_23_complex_lig_jmc_27_complex.json',
            'easy_rbfe_lig_jmc_23_solvent_lig_jmc_28_solvent.json',
-            'easy_rbfe_lig_jmc_23_complex_lig_jmc_28_complex.json']
+            'easy_rbfe_lig_jmc_23_complex_lig_jmc_28_complex.json',
+            'easy_rbfe_lig_ejm_46_solvent_lig_jmc_23_solvent.json',
+            'easy_rbfe_lig_ejm_46_complex_lig_jmc_23_complex.json']


Just curious, what caused this ordering change?

Good catch this was me playing with the edges in the network I'll undo this change.

Co-authored-by: Alyssa Travitz <[email protected]>

jthorton · 2025-01-23T15:23:44Z

Narrowed down the test error to (with openeye installed):

from gufe import SmallMoleculeComponent
gdp = SmallMoleculeComponent.from_sdf_file("eg5_cofactor.sdf")
gdp.to_openff().to_file("gdp_ugly.sdf", "sdf")

openff.toolkit.utils.exceptions.InconsistentStereochemistryError: Programming error: OpenEye atom stereochemistry assumptions failed. The atom in the oemol has stereochemistry None and the atom in the offmol has stereochemistry S.

…an CLI

github-actions · 2025-01-24T11:52:37Z

No API break detected ✅

IAlibay

I can't do a "request changes" since I'm the og PR author - but "requested changed": looks good, just a couple of things and possibly needs a test with multiprocessing.

IAlibay · 2025-01-25T02:10:07Z

openfe/tests/data/eg5/eg5_cofactor.sdf

Are you adding charges here just because of the failed openeye test or because of other reasons too?

If it's just NAGL openeye issue, then I would ask that we actually keep the falling test and we can xfail it or mark it for skip. The reason is that this failure is actually a real issue that would impact our users - so keeping a test that reminds us it is a problem is very useful.

Alternatively we add it as a separate test/issue, that's ok too - I just don't want to lose visibility on this failure (this type of problem has bitten us enough times in the past).

IAlibay · 2025-01-28T02:26:07Z

openfe/protocols/openmm_utils/charge_generation.py

+    ------
+    ValueError
+      If the ``toolkit_backend`` is not supported by the selected ``method``.
+      If ``generate_n_conformers`` is ``None``, but the input ``offmol``


Might be worth testing - could be in a follow-up (for the sake of the CLI if this fails the whole thing fails, but the API is a bit different).

IAlibay · 2025-01-28T02:28:13Z

openfe/protocols/openmm_utils/charge_generation.py

+    }
+    charged_ligands = []
+
+    if processors > 1:


Just a reminder to add a test for this.

IAlibay · 2025-01-28T02:29:39Z

openfecli/commands/generate_partial_charges.py

+    help=YAML_OPTIONS.kwargs["help"],
+)
+@OUTPUT_FILE_AND_EXT.parameter(
+    help="The name of the SDF file the charged ligands should be written to."


(apologies if it's somewhere and I've missed it), We probably need a check that we're not allowing users to overwrite their input SDF? 100% can see someone accidentally passing the same SDF as arguments (or some other existing one), by accident.

IAlibay · 2025-01-28T02:33:39Z

openfecli/commands/plan_rbfe_network.py

 @print_duration
 def plan_rbfe_network(
        molecules: list[str], protein: str, cofactors: tuple[str],
        yaml_settings: str,
        output_dir: str,
+        n_cores: int,
+        overwrite_charges: bool


I know they should know better but.. I know someone's going to make a mistake here. Should we be extra and write out that they chose this option so it's going to happen?

IAlibay · 2025-01-28T02:34:49Z

openfecli/parameters/plan_network_options.py

+Partial Charges
+===============
+
+Methods
+-------
+
+Supported partial charge method choices are:
+    - ``am1bcc``
+    - ``am1bccelf10`` (only possible if `off_toolkit_backend` in settings is set to `openeye`)
+    - ``nagl``
+    - ``espaloma`` (must have `espaloma_charge` installed)
+
+Settings
+--------
+
+The following settings can also be set:
+
+    - ``off_toolkit_backend``: The backend to use for partial charge generation. Choose from  ``ambertools`` (default), ``openeye`` or ``rdkit``.
+    - ``number_of_conformers``: The number of conformers to use for partial charge generation.
+      If unset (default), the input conformer will be used.
+    - ``nagl_model``: The NAGL model to use.
+      If unset (default), the latest available production charge model will be used.
+
+For more information on the different options, please refer to https://docs.openfree.energy/en/stable/reference/api/openmm_protocol_settings.html#openfe.protocols.openmm_utils.omm_settings.OpenFFPartialChargeSettings.
+


Yeah I'm the guilty party on this one - please don't feel like you have to go with this (or really anything else I wrote), this was some late night ramblings.

IAlibay and others added 5 commits January 7, 2025 19:03

Testing out what partial charges could look over the CLI

b716571

update that one thing

a5075eb

fix signature

020ba71

fix up a few things

45b8025

Merge branch 'main' into cli-partial-charges

c5f0371

This was linked to issues Jan 16, 2025

Add partial charge generation to the CLI setup. #1067

Open

Partial charge generation outside the protocol (user charges) #969

Open

add process pool, expose charge cli command

283d9e9

IAlibay commented Jan 16, 2025

View reviewed changes

update flag name, fix tests to use nagl for CI speed

8383718

IAlibay commented Jan 17, 2025

View reviewed changes

jameseastwood assigned atravitz and jthorton Jan 17, 2025

atravitz reviewed Jan 17, 2025

View reviewed changes

atravitz mentioned this pull request Jan 17, 2025

add support for one repeat per json #1076

Open

2 tasks

jthorton and others added 7 commits January 20, 2025 13:48

Update openfecli/parameters/plan_network_options.py

aa53762

Co-authored-by: Alyssa Travitz <[email protected]>

Update openfecli/parameters/plan_network_options.py

c5e5900

Co-authored-by: Alyssa Travitz <[email protected]>

Update openfecli/parameters/plan_network_options.py

e192f95

Co-authored-by: Alyssa Travitz <[email protected]>

Update openfecli/parameters/plan_network_options.py

43efb18

Co-authored-by: Alyssa Travitz <[email protected]>

move ncores cli flag, fix tests, add tests for charge generation

8602a39

fix edge sort dependent tests, fix mypy

c350c19

fix mypy, pre-charge ligands for reproducible tests

b3ef278

IAlibay commented Jan 21, 2025

View reviewed changes

jthorton marked this pull request as ready for review January 21, 2025 17:08

atravitz reviewed Jan 22, 2025

View reviewed changes

jthorton and others added 3 commits January 23, 2025 14:05

Update openfecli/commands/generate_partial_charges.py

c2fc143

Co-authored-by: Alyssa Travitz <[email protected]>

Update openfecli/commands/generate_partial_charges.py

d16bc3e

Co-authored-by: Alyssa Travitz <[email protected]>

set threads to 1, undo tutorial file changes

99a891f

jthorton added 2 commits January 23, 2025 14:19

add news entry

73fa615

update openff-toolkit pin

37add05

jthorton added 2 commits January 24, 2025 11:39

add charges to cofactor sdf, fix docs, expose overwrite charges to pl…

c46e7c1

…an CLI

fix test

6ab8e9f

jthorton changed the title ~~[DNM] Testing out what partial charges could look over the CLI~~ Testing out what partial charges could look over the CLI Jan 24, 2025

IAlibay commented Jan 28, 2025

View reviewed changes

Testing out what partial charges could look over the CLI #1068

Are you sure you want to change the base?

Testing out what partial charges could look over the CLI #1068

Conversation

IAlibay commented Jan 7, 2025 • edited by jthorton Loading

Developers certificate of origin

jthorton commented Jan 16, 2025

Questions:

IAlibay left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jthorton Jan 20, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

IAlibay left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

atravitz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jan 21, 2025 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

atravitz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jthorton commented Jan 23, 2025 • edited Loading

github-actions bot commented Jan 24, 2025

IAlibay left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

IAlibay commented Jan 7, 2025 •

edited by jthorton

Loading

jthorton Jan 20, 2025 •

edited

Loading

codecov bot commented Jan 21, 2025 •

edited

Loading

jthorton commented Jan 23, 2025 •

edited

Loading