Adds DonutSwin to models exportable with ONNX #19401
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi @WaterKnight1998,
Thanks for your PR. It looks clean.
Nice catch on the `model_type` variable, which could be tricky to find: https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa/blob/main/config.json#L138
First DocumentQuestionAnswering model added. It's pretty cool!
I don't see the comment. Do I need to fix anything? However, for testing locally I was using the following code, but I can't export the model :( I exported just the encoder like this:

```python
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.encoder.save_pretrained("./swin")
```

Then, trying to convert to ONNX, I get:

Do I need to add more code?
Yes, it would help if you overrode the function that generates dummy inputs. Check this here: transformers/src/transformers/models/layoutlmv3/configuration_layoutlmv3.py, lines 227 to 294 (at bc21aac).

This can help too, it's the base class: transformers/src/transformers/onnx/config.py, lines 264 to 378 (at bc21aac).
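For reference, here is a rough sketch of what such an override could look like for a pure vision encoder like DonutSwin. The class name matches the one used later in this thread, but the body, the trimmed signature, and the config attributes used (`image_size`, `num_channels`) are illustrative assumptions, not the code from this PR:

```python
# Illustrative sketch only -- not the exact code from this PR.
from collections import OrderedDict
from typing import Any, Mapping, Optional

import torch

from transformers.onnx import OnnxConfig


class DonutSwinOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # A vision encoder takes a single 4D image tensor.
        return OrderedDict(
            [("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"})]
        )

    @property
    def atol_for_validation(self) -> float:
        return 1e-4

    def generate_dummy_inputs(
        self,
        preprocessor,                 # unused here; kept to mirror the base signature
        batch_size: int = 2,
        seq_length: int = -1,
        framework: Optional[str] = None,
    ) -> Mapping[str, Any]:
        # Trace the model on a random batch with the resolution the config expects.
        image_size = self._config.image_size
        if isinstance(image_size, int):
            image_size = (image_size, image_size)
        pixel_values = torch.rand(batch_size, self._config.num_channels, *image_size)
        return {"pixel_values": pixel_values}
```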
Thanks for adding support for this new model @WaterKnight1998 and welcome to the 🤗 Transformers community!
As suggested by @chainyo, you'll need to override the function that generates dummy data. I also left a nit regarding one of the imports.
@chainyo @lewtun Relative imports fixed, and I also added the function to generate dummy inputs. But when I convert the model to ONNX like this:

```python
import transformers
from pathlib import Path
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.encoder.save_pretrained("./swin")

from transformers.onnx import export
from transformers import AutoConfig
from transformers.models.donut import *

onnx_config = AutoConfig.from_pretrained("./swin")
onnx_config = DonutSwinOnnxConfig(onnx_config)
processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
onnx_inputs, onnx_outputs = export(processor, model.encoder, onnx_config, onnx_config.default_onnx_opset, Path("model.onnx"))
```

I get the following warnings:

Is it ok?
Hi @WaterKnight1998,
Yes, I get the files.
Yes, the model loaded.
I am testing with:

```python
from pathlib import Path

from transformers.onnx import validate_model_outputs

validate_model_outputs(
    onnx_config, processor, model.encoder, Path("model.onnx"), onnx_outputs, onnx_config.atol_for_validation
)
```

But the Python process is killed here on my computer: https://github.com/huggingface/transformers/blob/main/src/transformers/onnx/convert.py#L392 Maybe the model is too big for CPU?
Hi, I tested in Databricks and got this error:

Do I need to update anything, @chainyo & @lewtun? Or is it OK?
I didn't think about this, but do you have enough RAM locally? If the model is 20 GB, you need roughly double that (~40 GB) to convert it, because the script needs to load both models simultaneously. The error I see on Databricks is about
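As a rough sanity check on the RAM question, you can estimate the size of the PyTorch weights alone (a sketch only; it ignores activations and the extra copy held during tracing):

```python
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")

# fp32 parameters take 4 bytes each; the ONNX export needs roughly twice this,
# since the original model and the traced copy are in memory at the same time.
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters, ~{num_params * 4 / 1e9:.2f} GB in fp32")
```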
Good point, I only have 32 GB of RAM locally, so that is probably it.
I tested with this:

```python
import transformers
from pathlib import Path
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.encoder.save_pretrained("./swin")

from transformers.onnx import export
from transformers import AutoConfig
from transformers.models.donut import *

onnx_config = AutoConfig.from_pretrained("./swin")
onnx_config = DonutSwinOnnxConfig(onnx_config)
processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
onnx_inputs, onnx_outputs = export(processor, model.encoder, onnx_config, onnx_config.default_onnx_opset, Path("model.onnx"))

from transformers.onnx import validate_model_outputs

validate_model_outputs(
    onnx_config, processor, model.encoder, Path("model.onnx"), onnx_outputs, onnx_config.atol_for_validation
)
```

In my config it is set to:

```python
@property
def atol_for_validation(self) -> float:
    return 1e-4
```

Should I test with 1e-3? But I am getting 0.05. I don't get why the difference is so big; maybe it's related to the warnings I mentioned in the other comment?
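If you just want to try a looser tolerance without editing the config class, `validate_model_outputs` takes the tolerance as an explicit argument, so (reusing the variables from the snippet above) something like this should work as a quick experiment:

```python
from pathlib import Path

from transformers.onnx import validate_model_outputs

# Same objects as above; 1e-3 is only an experiment -- a max difference of ~0.05
# would still fail even at this looser tolerance.
validate_model_outputs(
    onnx_config, processor, model.encoder, Path("model.onnx"), onnx_outputs, atol=1e-3
)
```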
I think it just means that it's a bit random. I don't think it's linked to the hardware; you would have to check the atol something like 10k times per hardware setup to be sure. IMO it seems evident that atol=1e-2 could do the trick, but it looks bad to accept atol > 1e-3. To come back to the warnings you had earlier while converting the model: did you check whether all layers are implemented in ONNX?
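One way to act on that suggestion is to list the operators that ended up in the exported graph with the `onnx` package; anything exotic or unexpected stands out quickly. A small sketch, assuming the file is the `model.onnx` produced above:

```python
from collections import Counter

import onnx

onnx_model = onnx.load("model.onnx")

# Count every operator type used in the exported graph.
op_counts = Counter(node.op_type for node in onnx_model.graph.node)
for op_type, count in sorted(op_counts.items()):
    print(f"{op_type}: {count}")

# Also verify the graph is structurally valid.
onnx.checker.check_model(onnx_model)
```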
Hey @WaterKnight1998, I recently implemented a fix in #19475 for an issue that was causing all the Swin models to have incorrect ONNX graphs. Could you first try rebasing on
Commits:
- Added document question answering task to ONNX features.
- Adding the necessary changes to the donut module init.
- Black formatting.
- Imports are now relative.
- Added a function to generate dummy inputs for DonutSwin tracing.
- Black formatting.
- Reordering imports.
- Sorting imports.
Hi @lewtun, as you can see in the PR, I rebased and tested again, and I am seeing the same issue:
Thanks for that insight @WaterKnight1998, although I'd be surprised if that's the source of the issue. I'll take a closer look at the dummy data generation ASAP.
Hi @WaterKnight1998, now that #19254 has been merged, can't you export the Donut checkpoints directly using this feature:

My understanding is that Donut falls under the general class of vision encoder-decoder models, so a separate ONNX export might not be needed.
Hi @lewtun, I tested this but it is not working owing to the tolerance issue. In addition, maybe some users just want to export the encoder part. Adding @NielsRogge, as he implemented this in #18488.
@lewtun While converting, I am facing an output value error (for the same command mentioned above).
But separately I am able to convert the encoder and decoder models to ONNX and verified the output shapes; that went well. But I don't know how to implement generation with them. @lewtun @WaterKnight1998 Any suggestions here? (I can share the Colab if required.) Thanks and regards.
@BakingBrains Are you using the code from my PR to do the encoder conversion?
@lewtun and @WaterKnight1998 any updates on the decoder? I am able to convert the decoder model. Not sure if that's the right method. (But the output shapes from the Donut decoder and the ONNX decoder are the same.)
Hi @lewtun, @chainyo, @BakingBrains, any news on this? I need this to get the model into production :(

@sgugger could you help us? We are looking forward to this feature 🙂
Hey @WaterKnight1998, I'm taking a look at this, but it's turning out to be tricky to figure out where the discrepancy between the ONNX graph and the PyTorch model arises.
Thank you very much for looking at it 😊 |
FYI if you need a temporary workaround and are willing to tolerate some error on the decoder, you can export one of the donut checkpoints on the
This will produce two ONNX files (
Good question @BakingBrains! As of now, you'll have to roll your own generation loop with
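For anyone who ends up rolling that loop by hand, here is a rough sketch of greedy decoding over two exported ONNX files with `onnxruntime`. The file names, the tensor names (`pixel_values`, `encoder_hidden_states`, `input_ids`, the logits output order), the prompt token, and the absence of a past-key-values cache are all assumptions to adapt to your actual export:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image
from transformers import DonutProcessor

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
tokenizer = processor.tokenizer

# Assumed file names -- inspect session.get_inputs()/get_outputs() on your own export.
encoder = ort.InferenceSession("encoder_model.onnx")
decoder = ort.InferenceSession("decoder_model.onnx")

image = Image.open("document.png").convert("RGB")  # placeholder input image
pixel_values = processor(image, return_tensors="np").pixel_values
encoder_hidden_states = encoder.run(None, {"pixel_values": pixel_values})[0]

# Greedy decoding without a cache: re-feed the whole prefix at every step.
# Donut normally starts from a task prompt token (e.g. "<s_docvqa>"); bos is a stand-in.
input_ids = np.array([[tokenizer.bos_token_id]], dtype=np.int64)
for _ in range(128):
    logits = decoder.run(
        None,
        {"input_ids": input_ids, "encoder_hidden_states": encoder_hidden_states},
    )[0]
    next_token = logits[:, -1].argmax(-1).reshape(1, 1).astype(np.int64)
    input_ids = np.concatenate([input_ids, next_token], axis=-1)
    if next_token.item() == tokenizer.eos_token_id:
        break

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```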
Thank you @lewtun. Got it.

Ok, thank you very much. I hope you find a solution and we can merge this branch.
I've created an issue to track the problem with exporting Donut checkpoints specifically: #19983. @WaterKnight1998, can you please share some code snippets on how you currently use the DonutSwin models for document QA and image classification? If I'm not mistaken, inference with these models is only supported via the
Yes, you are right, maybe we can remove those tasks. However, I think it would be good to allow users to export the encoder independently. Maybe someone wants to reuse it for a different model or architecture.
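On the encoder-only point: once the Swin encoder has been exported (the `model.onnx` from the snippets above, with the input name assumed to be `pixel_values`), it can be consumed on its own, e.g. to feed image features into another model. A minimal sketch:

```python
import onnxruntime as ort
from PIL import Image
from transformers import DonutProcessor

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
session = ort.InferenceSession("model.onnx")  # the encoder exported earlier in this thread

image = Image.open("document.png").convert("RGB")  # any document image
pixel_values = processor(image, return_tensors="np").pixel_values

# First output is assumed to be the (batch, sequence, hidden) patch features.
last_hidden_state = session.run(None, {"pixel_values": pixel_values})[0]
print(last_hidden_state.shape)
```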
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@lewtun reopen
What does this PR do?
Fixes #16308
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@lewtun & @chainyo for ONNX and @NielsRogge for Donut and Document Question Answering.