
Generalizes GPT3CompletionModel to work with other providers, adds Anthropic #71

Draft · wants to merge 9 commits into `main`
Conversation

falquaddoomi (Collaborator)

This PR generalizes `GPT3CompletionModel` to work with API clients for other model providers. The class now takes a `model_provider` string parameter, which must be a valid key in the `manubot_ai_editor.models.MODEL_PROVIDERS` dictionary. Explicit references to OpenAI have been generalized to apply to other model providers; e.g., the `openai_api_key` parameter is now just `api_key`.
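For illustration, a minimal usage sketch, assuming the constructor keeps its existing manuscript-metadata parameters (`title`, `keywords`); the exact signature is whatever the diff defines:

```python
from manubot_ai_editor.models import GPT3CompletionModel

# model_provider must be a key in manubot_ai_editor.models.MODEL_PROVIDERS;
# the title/keywords values here are placeholders.
model = GPT3CompletionModel(
    title="An example manuscript",
    keywords=["manubot", "ai-editor"],
    model_provider="anthropic",
    api_key="sk-ant-...",  # formerly openai_api_key; now provider-agnostic
)
```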

`GPT3CompletionModel` now supports Anthropic as a second model provider, and more can be added by extending the `MODEL_PROVIDERS` dict mentioned previously.
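Registering a further provider might then look like the sketch below; the `MistralClient` class and the dict's value type are hypothetical, since they depend on how the PR structures `MODEL_PROVIDERS` entries:

```python
from manubot_ai_editor import models


class MistralClient:
    """Hypothetical API client for an additional provider."""

    def __init__(self, api_key: str):
        self.api_key = api_key


# Register the client under the key users would pass as model_provider.
models.MODEL_PROVIDERS["mistral"] = MistralClient
```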

The PR modifies the "cost" end-to-end test `tests.test_prompt_config.test_prompts_apply_gpt3` to also check Anthropic. To run the tests against both OpenAI and Anthropic, make sure you've exported both `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` with valid API keys for each, then run `poetry run pytest --runcost` to execute the end-to-end tests.
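For example (the key values are placeholders):

```sh
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
poetry run pytest --runcost
```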

End-to-end test tweaks: Note that the "cost" test always has the potential to break, since the LLM doesn't always obey the prompt's request to insert a special keyword into the text. This morning, the OpenAI model failed to insert "bottle" into the "abstract" section, so I changed the keyword to "violin", which passed. Also, it was rendering the keyword "breakdown" as "break down", so I modified the test to remove spaces from the response before checking for the keyword.
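A sketch of that relaxed check (the names are illustrative, not the actual test code):

```python
def contains_keyword(response_text: str, keyword: str) -> bool:
    # Strip spaces so a response writing "break down" still matches the
    # requested keyword "breakdown".
    return keyword in response_text.replace(" ", "")

assert contains_keyword("We break down the results by cohort.", "breakdown")
```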

Documentation: I've gone through the README and tried to tweak it to explain that we now support multiple model providers, but it may require further tweaking. Also, I'm unsure if "model provider" is the preferred term for companies like OpenAI and Anthropic that provide APIs to query LLMs, or if we should use something else; feedback appreciated!

1. If you haven't already, [make an OpenAI account](https://openai.com/api/) and [create an API key](https://platform.openai.com/api-keys).
1. In your fork's "⚙️ Settings" tab, make a new Actions repository secret with the name `OPENAI_API_KEY` and paste in your API key as the secret.
1. If you haven't already, follow the directions above to create an account and get an API key for your chosen model provider.
1. In your fork's "⚙️ Settings" tab, make a new Actions repository secret with the name `<PROVIDER>_API_KEY` and paste in your API key as the secret. Replace `<PROVIDER>` with the uppercase name of your chosen provider, e.g. `OPENAI` or `ANTHROPIC`.
Collaborator

We might want to make a PR to add the Anthropic key to the rootstock workflow here:
https://github.com/manubot/rootstock/blob/main/.github/workflows/ai-revision.yaml#L59

If at some point in the future we end up supporting a dozen or more services, maybe we should just instruct the user to update their ai-revision workflow for whichever services they're using.

falquaddoomi (Collaborator, Author)

Excellent point; I've converted this PR to a draft until I figure out the upstream implications, including the one you raised. I'm wondering if we should relax the requirement that `<PROVIDER>_API_KEY` exists and has a non-empty value for every provider, and instead just check that the key is valid when we actually use it to query the API.

I don't know how many services we'll end up supporting, but ideally we won't have to make PRs in multiple repos for every change going forward. Let me think on it; perhaps we could take in a value in a structured format from rootstock for all the AI Editor options, with the definition of that format living in this repo, too.

Collaborator

I can take care of that small rootstock PR. Per our discussion, we'll:

- add a comment above the workflow step saying something like "duplicate this step as necessary to use different providers"
- rename the "open ai key" var to just "ai key"
- add a provider env var

falquaddoomi marked this pull request as draft December 18, 2024 20:56
@d33bs (Collaborator) left a comment:

Nice job, wanted to add some comments in case they're helpful along the journey here.

> support whichever model providers LangChain supports. That said, we currently support OpenAI and Anthropic models only,
> and are working to add support for other model providers.
>
> When using OpenAI models, [our evaluations](https://github.com/pivlab/manubot-ai-editor-evals) show that `gpt-4-turbo`
Collaborator

Slightly outside the bounds of this PR: I wondered if versioning the evals could make sense (perhaps through a DOI per finding, or maybe through the poster that was shared). There could come a time (probably sooner than we think) when GPT-4-Turbo isn't available or relevant.

falquaddoomi (Collaborator, Author)

That's a good point; I wonder if we should move the statement about which model performed best in our evaluations to the https://github.com/pivlab/manubot-ai-editor-evals repo, so it can be updated without also having to keep this repo current. I suppose @vincerubinetti and @miltondp might have opinions here, since they're the primary contributors to the evals repo.

Resolved review threads: `libs/manubot_ai_editor/models.py`; `tests/test_model_basics.py` (outdated)
falquaddoomi marked this pull request as ready for review January 15, 2025 20:50
falquaddoomi (Collaborator, Author)

We need to figure out how we're going to handle the additional API-key environment variables for new providers, since they require updates to rootstock, as @vincerubinetti mentioned, and might quickly become unmanageable as the number of providers we support grows.

I'd be in favor of resolving the API key like so (see the sketch after this list):

- if a provider-specific API key is supplied, e.g. `ANTHROPIC_API_KEY`, it'll be used with that provider
- if the provider-specific API key isn't found, a generic one, e.g. `PROVIDER_API_KEY`, will be checked
- if neither can be found and the provider requires a key, an error will be thrown
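A minimal sketch of that resolution order, assuming a hypothetical `resolve_api_key` helper and an illustrative set of key-requiring providers:

```python
import os

# Illustrative only; per this PR, the authoritative list would come from
# MODEL_PROVIDERS rather than a hardcoded set.
PROVIDERS_REQUIRING_KEYS = {"openai", "anthropic"}


def resolve_api_key(provider: str) -> str | None:
    # 1. provider-specific key, e.g. ANTHROPIC_API_KEY
    key = os.environ.get(f"{provider.upper()}_API_KEY")
    # 2. fall back to a generic PROVIDER_API_KEY
    if not key:
        key = os.environ.get("PROVIDER_API_KEY")
    # 3. error only if the provider actually requires a key
    if not key and provider in PROVIDERS_REQUIRING_KEYS:
        raise ValueError(f"no API key found for provider {provider!r}")
    return key
```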

Happy to hear differing opinions, of course!

falquaddoomi marked this pull request as draft January 15, 2025 21:17