Add Jinja template support #11016
Feel free to also add the option to llama-run for basic testing, @ochafik.
I approve the llama-run parts at least, but the more code we can share with llama-server, etc., the better. There's probably room for more de-duplication.
IMO we can extend |
Subset of #9639 with just the Jinja templating support.
Proper tool support (grammar constraints, lazy grammar triggering, tool call parsing & stop reason) will come in a follow-up PR.
- `--jinja` flag to llama-server, llama-cli, llama-run
- `--chat-template-file` flag to llama-server, llama-cli (related: Added chat template support to llama-run #11215)
- `tokenizer.chat_template` (or `tokenizer.chat_template.tool_use` if defined, only when the request has tools).
- `trim_blocks = true, lstrip_blocks = true`

Example usage:
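Separately from the PR's own usage example, the template-selection rule in the list above (use `tokenizer.chat_template.tool_use` if defined, and only when the request has tools) can be sketched as follows. This is a minimal illustration, not the PR's code: the helper name `select_chat_template_key` and the use of a plain `std::map` for model metadata are assumptions made for the example.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not part of llama.cpp's API): returns the metadata key
// whose template should be used, following the rule described in the PR.
// `metadata` stands in for the model's key/value metadata (an assumption).
std::string select_chat_template_key(
        const std::map<std::string, std::string> & metadata,
        bool request_has_tools) {
    const std::string tool_use_key = "tokenizer.chat_template.tool_use";
    // Prefer the tool-use variant only if the request has tools AND the
    // model actually defines that variant.
    if (request_has_tools && metadata.count(tool_use_key) > 0) {
        return tool_use_key;
    }
    // Otherwise fall back to the default chat template.
    return "tokenizer.chat_template";
}
```

With metadata that defines both keys, `select_chat_template_key(metadata, true)` returns `tokenizer.chat_template.tool_use`, while passing `false` returns `tokenizer.chat_template`. If the tool-use variant is not defined, the default key is returned in both cases.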
TODO: