git-bob uses AI to solve GitHub issues and review pull requests. It runs inside the GitHub CI, so there is no need to install anything on your computer. Read more in the preprint.
Under the hood, it uses Anthropic's Claude, OpenAI's ChatGPT or Google's Gemini to understand the text and pygithub to interact with the issues and pull requests. As its discussions are preserved, you can document how things were done using AI, and others can learn how to prompt for the things you did. For example, the pair-plot discussion above is available online.
git-bob is a research project aiming at streamlining GitHub interaction in software development projects. Under the hood it uses artificial intelligence / large language models to generate text and code fulfilling the user's requests.
Users are responsible for verifying the generated code according to good scientific practice.
When using git-bob, you configure it to use an API key to access the AI models.
You have to pay for the usage and must be careful in using the software.
Do not use this technology if you are not aware of the costs and consequences.
Caution
When using the Anthropic, OpenAI, Google Gemini, Mistral or any other endpoint via git-bob, you are bound to the terms of service of the respective companies or organizations. The GitHub issues, pull requests and messages you enter are transferred to their servers and may be processed and stored there. Make sure not to submit any sensitive, confidential or personal data. Also, using these services may cost money.
There is a detailed tutorial on how to install git-bob as a GitHub action in your repository. In short, to use git-bob in your GitHub repository, you need to
- Copy the git-bob GitHub workflow in folder .github/workflows/ to your repository.
  - Make sure to replace pip install -e . with a specific git-bob version such as pip install git-bob==0.16.0.
  - Configure the LLM you want to use in the workflow files by specifying the GIT_BOB_LLM_NAME environment variable. These were tested:
    - claude-3-5-sonnet-20241022
    - gpt-4o-2024-08-06
    - github_models:gpt-4o
    - github_models:meta-llama-3.1-405b-instruct
    - gemini-1.5-pro-002
    - mistral-large-2411 (uses pixtral-12b-2409 for vision tasks)
- Configure a GitHub secret called OPENAI_API_KEY or ANTHROPIC_API_KEY or GH_MODELS_API_KEY or GOOGLE_API_KEY or MISTRAL_API_KEY or KISSKI_API_KEY or BLABBLADOR_API_KEY with the corresponding key from the LLM provider depending on the above configured LLM. You can get these keys here:
- Configure GitHub actions to run the workflow on issues and pull requests. Also give write-access to the workflow using the GITHUB_TOKEN.
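The relationship between the configured model name and the secret it needs can be sketched in a few lines of Python. This is only an illustration: the prefix-to-secret mapping is an assumption inferred from the model and secret names listed above, the helper names required_secret and missing_secret are hypothetical, and git-bob's own lookup may work differently. KISSKI and Blablador models are left out because their model names are not given here.

```python
import os

# Hypothetical mapping from model-name prefix to secret name, inferred from
# the lists above. This is an assumption, not git-bob's actual lookup table.
KEY_FOR_PREFIX = [
    ("github_models:", "GH_MODELS_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
    ("gpt", "OPENAI_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("mistral", "MISTRAL_API_KEY"),
]


def required_secret(llm_name):
    """Return the secret name that a given GIT_BOB_LLM_NAME value would need."""
    for prefix, secret in KEY_FOR_PREFIX:
        if llm_name.startswith(prefix):
            return secret
    raise ValueError(f"No known secret for model name {llm_name!r}")


def missing_secret(env=None):
    """Return the name of the required secret if it is unset, otherwise None."""
    env = os.environ if env is None else env
    secret = required_secret(env.get("GIT_BOB_LLM_NAME", ""))
    return None if env.get(secret) else secret
```

Calling missing_secret() inside a workflow step would then report which secret still has to be configured for the chosen model.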
When using it in your repository, you can also set a custom system message, for example for:
- General Data Science / Python Programming
- Bio-Image Analysis
- Giving advice on a specific repository / library
- Manuscript writing
Furthermore, to guide discussions, you may want to set up issue templates, e.g.
Since version 0.10.1, git-bob has experimental support for GitLab. You can find detailed instructions on how to install it here.
To trigger git-bob, you need to comment on an issue or pull request with the comment trigger word (or aliases think about, review, respond):
git-bob comment
If you want to ask git-bob for a review of a pull request, you can use the review trigger word. Also make sure to mention explicitly what you want to be reviewed.
git-bob review this PR. Check code quality and comments.
After some back-and-forth discussion, you can also use the solve trigger word (or aliases implement, apply) to make git-bob solve an issue and send a pull request.
This trigger can also be used to modify code in pull requests.
git-bob solve
You can ask git-bob to implement a solution for testing, without sending a pull request, using the try trigger:
git-bob try
If you have multiple API keys for different LLMs configured, you can specify the LLM in the command using the ask <LLM-Name> to trigger command:
git-bob ask claude-3-5-sonnet-20241022 to solve this issue.
If the issue is complex and should be split into sub-issues, you can use the following command:
git-bob split
If you have two GitHub secrets TWINE_USERNAME and TWINE_PASSWORD configured, you can also use the following command to publish a new version of your library to PyPI:
git-bob deploy
All trigger words can be combined with please and/or a comma (,), which makes no difference compared to calling git-bob without these words:
git-bob, please ask gemini-1.5-pro-002 to solve this issue.
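To make the trigger rules above concrete, here is a minimal sketch of how such comments could be interpreted. It is not git-bob's actual parser: the function parse_command and the ALIASES table are hypothetical, built only from the trigger words and aliases listed above.

```python
import re

# Trigger words and aliases as listed above. The table is illustrative and
# not taken from git-bob's source code.
ALIASES = {
    "comment": "comment", "think about": "comment",
    "review": "comment", "respond": "comment",
    "solve": "solve", "implement": "solve", "apply": "solve",
    "try": "try", "split": "split", "deploy": "deploy",
}


def parse_command(text):
    """Return (action, llm_name) for a comment addressed to git-bob, else None."""
    match = re.match(r"\s*git-bob\b[\s,]*(.*)", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    rest = match.group(1)
    # "please" and "," make no difference, so drop them.
    rest = re.sub(r"^please\b[\s,]*", "", rest, flags=re.IGNORECASE)
    # Optional "ask <LLM-Name> to ..." prefix selects a specific model.
    llm_name = None
    ask = re.match(r"ask\s+(\S+)\s+to\s+(.*)", rest, flags=re.IGNORECASE | re.DOTALL)
    if ask:
        llm_name, rest = ask.group(1), ask.group(2)
    lowered = rest.lower()
    # Try longer aliases first so that "think about" is matched as a whole.
    for alias in sorted(ALIASES, key=len, reverse=True):
        if re.match(re.escape(alias) + r"\b", lowered):
            return ALIASES[alias], llm_name
    return None
```

With this sketch, the example comment above ("git-bob, please ask gemini-1.5-pro-002 to solve this issue.") resolves to the solve action with gemini-1.5-pro-002 as the requested model, while a comment that does not start with git-bob is ignored.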
Here's the recommended workflow for using git-bob:
- Create an issue describing the problem or task.
- Comment on the issue with git-bob comment, or git-bob think about this (an alias for comment) to trigger git-bob making a plan.
- Respond to git-bob with any clarifications or additional information it requests.
- Comment on the issue with git-bob solve or git-bob implement this (an alias for solve) to trigger git-bob.
- Wait for git-bob to create a pull request (PR) addressing the issue.
- Review the PR and comment on the PR or on the original issue if changes are needed.
- Wait for git-bob to create a new PR or modify the existing PR with the requested changes.
- Repeat steps 3-5 as necessary until the issue is resolved satisfactorily.
A wide variety of use cases for git-bob is conceivable. Here are some examples. Many serve purely demonstrative purposes. Some were parts of real scientific data analysis projects.
- Question answering
- Translation
- Bio-image Analysis
- Programming
- Prompting
- Continuous Integration and Deployment
- Data & Code Management
- Write a Data Management Plan (DMP)
- Research Data Management & Folder Structures
- Documenting source code
- Determining licenses of dependencies
- Assisting scientific manuscript writing
- Deleting files
- Converting tables to key-value pairs
- Exporting Google Scholar profile as bibtex
- Deciding for file formats: JSON versus YAML
- Generating Galaxy workflows
- Count citations of given DOIs
- Convert PDF documents to PNG images
- Convert PDF documents to animated GIFs
- Querying the arxiv
- Retrieving meta-data of arxiv articles
- Graphical User Interfaces
- Statistics
- Plotting
- Science Communication
- Fun
- Things that didn't work well
git clone https://github.com/haesleinhuepf/git-bob.git
cd git-bob
You can also install git-bob locally and run it from the terminal.
In this case, create a GitHub token and store it in an environment variable named GITHUB_API_KEY.
Also create an environment variable GIT_BOB_LLM_NAME with the name of the LLM you want to use, e.g. "gpt-4o-2024-05-13" or "claude-3-5-sonnet-20241022" or "github_models:gpt-4o".
Then you can install git-bob using pip:
pip install git-bob
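Before calling git-bob locally, a quick check like the following could confirm that both environment variables named above are set. This is a small sketch and not part of git-bob; the helper name missing_variables is hypothetical.

```python
import os

# The two environment variables named in the text above.
REQUIRED_VARIABLES = ("GITHUB_API_KEY", "GIT_BOB_LLM_NAME")


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

An empty list means both variables are available. Note that the API key of the chosen LLM provider is needed in addition.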
You can then use git-bob from the terminal on repositories you have read/write access to. It is recommended to call it from the root folder of the repository you want to interact with.
git clone https://github.com/<organization>/<repository>
cd <repository>
git-bob <action> <organization>/<repository> <issue-number>
Available actions:
review-pull-request
comment-on-issue
solve-issue
split-issue
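If you want to call the command-line tool from a script, the invocation pattern above can be assembled and validated first. The sketch below only builds the argument list. The function build_command is hypothetical, and the action names are the ones listed above.

```python
# Actions as listed above.
ACTIONS = ("review-pull-request", "comment-on-issue", "solve-issue", "split-issue")


def build_command(action, repository, issue_number):
    """Build the argument list for: git-bob <action> <organization>/<repository> <issue-number>."""
    if action not in ACTIONS:
        raise ValueError(f"Unknown action {action!r}; expected one of {ACTIONS}")
    parts = repository.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError("repository must look like <organization>/<repository>")
    return ["git-bob", action, repository, str(int(issue_number))]
```

The resulting list can be passed to subprocess.run(..., check=True) from the root folder of the repository, assuming git-bob is installed and the environment variables are set.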
git-bob is a research project and has limitations. It serves as a basis for discussion and further development. Once LLMs become better, git-bob will become better as well.
At the moment, these limitations can be observed:
- git-bob was tested for Python projects mostly. It seems to be able to process Java and C++ as well.
- It can only execute code in Jupyter Notebooks.
- It sometimes hallucinates, especially in code reviews. E.g. it claimed to have tested code, which was certainly not true.
- It cannot solve issues where changing long files is required, as the output of the LLMs is limited by a maximum number of tokens (e.g. 16k for gpt-4o-2024-08-06). When using OpenAI's models, it combines the output of multiple requests up to a maximum file length of about 64k tokens. It may then miss some spaces or a line break where responses were stitched. When using GitHub models, the maximum file length is 4k tokens. When using Anthropic's Claude, the maximum file length is 8k tokens.
- When changing multiple files, it may introduce conflicts between the files, as it does not know about the changed contents of the other files.
- It has only limited logic to control who is allowed to trigger it. If you are a repository member, you can trigger it. If others send a pull request, a repository member must allow the action to run manually.
- git-bob is incompatible with locally running open-source/-weight LLMs. Using such models might make sense when git-bob is executed locally only. In the GitHub CI this might be impossible.
- In recent tests, claude-3-5-sonnet-20241022, gpt-4o-2024-08-06, github_models:gpt-4o, github_models:meta-llama-3.1-405b-instruct and gemini-1.5-pro-002 produced useful results.
- git-bob is not allowed to modify workflow files, because it also uses GitHub workflows.
- As git-bob is installed as part of GitHub workflows, its download statistics might be misleading. There are not as many people downloading it as the number of downloads suggests.
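The file-length limits mentioned above can be turned into a rough pre-check of whether a file is likely too long to be rewritten. The sketch below is only an estimate: it assumes about four characters per token, which is a common rule of thumb and not a property of any specific tokenizer. The provider keys and function names are hypothetical, and the limits are the approximate values stated above.

```python
# Approximate maximum file lengths in tokens, as stated in the list above.
MAX_FILE_TOKENS = {
    "openai": 64_000,         # stitched from multiple responses
    "github_models": 4_000,
    "anthropic": 8_000,
}

# Assumption: roughly four characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Estimate the token count of a text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def probably_fits(text, provider):
    """Return True if the estimated token count is within the provider's limit."""
    return estimate_tokens(text) <= MAX_FILE_TOKENS[provider]
```

A file that fails this check is a candidate for splitting into smaller files before asking git-bob to modify it.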
There are similar projects out there:
- Claude Engineer
- BioChatter
- aider
- OpenDevin
- Devika
- GPT-Codemaster
- GitHub Copilot Workspace
- agentless
- git-aid
- SWE-agent
- gh-gitgen
Feedback and contributions are welcome! Just open an issue and let's discuss before you send a pull request. A human will respond and comment on your ideas!
If you use git-bob, please cite it:
@misc{haase_2024_13928832,
author = {Haase, Robert},
title = {{Towards Transparency and Knowledge Exchange in AI-assisted Data Analysis Code Generation}},
month = oct,
year = 2024,
publisher = {Zenodo},
doi = {10.5281/zenodo.13928832},
url = {https://doi.org/10.5281/zenodo.13928832}
}
We acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research "Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig", project identification number: ScaDS.AI