[Feature Request]: Use custom models #23
Comments
Hmm, no. In the example there is a list of models...
For example, let's say I wish to have Mistral Instruct v0.3 quantized as f16 for the output and embed tensors and q6_k for the other tensors. How should I proceed?
@0wwafa I understand the need here. Let me explain. First, the prerequisite for custom models to run on WebLLM Chat is that the models must be compiled to MLC format. For more details, check the MLC LLM instructions here. Once you have the MLC-format models on your local machine, the proposal here is to allow one of the three following ways to use them in the web app:
These are planned to be released in the coming months. Do any of these fulfill what you need?
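For concreteness, here is a minimal sketch (not from the thread) of how a custom MLC-format model can be registered with the WebLLM engine through an appConfig entry. The Hugging Face repo URL, the model_id, and the WASM model-library URL are placeholders standing in for your own compiled artifacts, and exact field names can differ between web-llm versions:

```typescript
import { CreateMLCEngine, prebuiltAppConfig } from "@mlc-ai/web-llm";

// Placeholder URLs: point these at wherever your compiled MLC weights
// and the matching WebGPU model library (.wasm) are hosted.
const appConfig = {
  ...prebuiltAppConfig,
  model_list: [
    ...prebuiltAppConfig.model_list,
    {
      model: "https://huggingface.co/your-name/Mistral-7B-Instruct-v0.3-custom-MLC",
      model_id: "Mistral-7B-Instruct-v0.3-custom",
      model_lib: "https://your-host.example/Mistral-7B-Instruct-v0.3-custom-webgpu.wasm",
    },
  ],
};

async function main() {
  // Load the custom entry by its model_id; the weights are fetched and
  // cached by the browser on first use.
  const engine = await CreateMLCEngine("Mistral-7B-Instruct-v0.3-custom", { appConfig });
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```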
Well, I just wish to see how Mistral works in the web browser using one of my quantizations. In other words, I quantized the output and embed tensors to f16 (or q8) and the other tensors to q6 or q5.
The app has been updated to support custom models through the MLC-LLM REST API by switching the model type in Settings.
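The REST-API route is independent of the built-in model list: you serve your own MLC-compiled model locally (for example with `mlc_llm serve`) and point the chat app at that server. Below is a hedged sketch of a client-side call against the OpenAI-compatible endpoint such a server exposes; the port and the model name are assumptions to be replaced with whatever your server reports:

```typescript
// Assumes a local MLC-LLM server was started with something like:
//   mlc_llm serve <path-or-HF-id-of-your-MLC-model>
// which exposes an OpenAI-compatible API (here assumed at http://127.0.0.1:8000).
async function askLocalServer(prompt: string): Promise<string> {
  const resp = await fetch("http://127.0.0.1:8000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "Mistral-7B-Instruct-v0.3-custom", // placeholder model name
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await resp.json();
  return data.choices[0].message.content;
}

askLocalServer("Hello!").then(console.log);
```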
My models are available here. I still don't understand how to use them with mlc_llm.
Problem Description
mlc-ai/web-llm#421
Users want to be able to upload their own models from their local machine.
Solution Description
The WebLLM Engine is capable of loading any MLC-format model.
https://github.com/mlc-ai/web-llm/tree/main/examples/simple-chat-upload is an example of supporting a local model in the app.
We want to do something similar to allow uploading.
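One conceptual sketch of what upload support could look like, under the assumption that the engine resolves model artifacts through the browser's Cache API keyed by request URL. The cache name "webllm/model", the placeholder base URL, and the element id "#model-upload" are all hypothetical; the simple-chat-upload example linked above is the actual reference:

```typescript
// Conceptual sketch only: assumes the engine looks up model artifacts in the
// Cache API keyed by the URL it would otherwise fetch (cache name is assumed).
const BASE_URL = "https://local.upload/Mistral-7B-Instruct-v0.3-custom/"; // placeholder origin

async function stageUploadedFiles(files: FileList): Promise<void> {
  const cache = await caches.open("webllm/model");
  for (const file of Array.from(files)) {
    // Key each uploaded artifact by the URL the engine will later request.
    const url = BASE_URL + file.name;
    await cache.put(url, new Response(file));
  }
}

// Hypothetical file input in the page: <input type="file" id="model-upload" multiple>
document.querySelector<HTMLInputElement>("#model-upload")!
  .addEventListener("change", (e) => {
    const input = e.target as HTMLInputElement;
    if (input.files) void stageUploadedFiles(input.files);
  });
```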