If you use this model (llava-hf/LLaVA-NeXT-Video-7B-32K-hf) on transformers==4.47.1, you will get this error because its config specifies the class LlavaNextVideoProcessor from processing_llava_next_video.py, and its __call__ method does not expect that kwarg.
The quick fix is this:
Modify __call__ (line 101) in processing_llava_next_video.py
Notice the unused kwargs at the end. This reflects the pattern used for __init__, which looks like this:
def __init__(
    self,
    video_processor=None,
    image_processor=None,
    tokenizer=None,
    chat_template=None,
    patch_size=None,
    vision_feature_select_strategy=None,
    video_token="<video>",
    image_token="<image>",
    num_additional_image_tokens=0,
    **kwargs,  # <-- this guy
):
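To illustrate the suggested fix, here is a minimal sketch using stand-in classes. StrictProcessor, TolerantProcessor, and call_like_pipeline are hypothetical names invented for this example; they are not the real LlavaNextVideoProcessor or pipeline code, and the argument list is simplified. The sketch only shows why a trailing **kwargs on __call__ lets an extra legacy=False pass through without a TypeError.

```python
class StrictProcessor:
    """Hypothetical stand-in for a processor whose __call__ has a fixed signature."""

    def __call__(self, text=None, images=None, videos=None):
        return {"text": text, "images": images, "videos": videos}


class TolerantProcessor:
    """Same stand-in, but __call__ accepts and ignores unknown keyword arguments."""

    def __call__(self, text=None, images=None, videos=None, **kwargs):
        # Extra keywords such as legacy=False land in kwargs and are not used.
        return {"text": text, "images": images, "videos": videos}


def call_like_pipeline(processor):
    # Mimics a caller that always passes legacy=False, as described in the issue.
    return processor(text="What's in this video?", videos=["clip.mp4"], legacy=False)


try:
    call_like_pipeline(StrictProcessor())
    strict_error = None
except TypeError as exc:
    # The strict signature rejects the unexpected 'legacy' keyword.
    strict_error = str(exc)

tolerant_result = call_like_pipeline(TolerantProcessor())
```

The trade-off of this pattern is that silently swallowing keywords can also hide typos in argument names, which is one reason a fix at the caller level may be preferable.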
I ain't got time to step through the PR process, so I hope this helps the HF staff either make this quick patch, or solve the problem at a higher level in the code for image_text_to_text.py.
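One way the problem might be handled at the caller level, rather than in each processor, is to drop keywords the target callable does not declare. This is only a sketch under that assumption: filter_supported_kwargs and VideoProcessorStub are hypothetical names, and nothing here reflects how image_text_to_text.py is actually written or how the maintainers would choose to fix it.

```python
import inspect


def filter_supported_kwargs(func, kwargs):
    """Return only the keyword arguments that `func` declares it can accept."""
    params = inspect.signature(func).parameters
    # If func declares **kwargs, every keyword is accepted as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


class VideoProcessorStub:
    """Hypothetical processor whose __call__ does not know about 'legacy'."""

    def __call__(self, text=None, videos=None):
        return {"text": text, "videos": videos}


processor = VideoProcessorStub()
call_kwargs = {"text": "hi", "videos": ["clip.mp4"], "legacy": False}

# Remove 'legacy' before calling, because the stub's signature lacks it.
safe_kwargs = filter_supported_kwargs(processor.__call__, call_kwargs)
result = processor(**safe_kwargs)
```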
Who can help?
HF staff
Information
The official example scripts
My own modified scripts
Tasks
[x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...) (image-text-to-text)
Reproduction
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/LLaVA-NeXT-Video-7B-32K-hf")
messages = {'role': 'user', 'content': [{'type': 'text', 'text': "What's in this image?"}, {'type': 'video'}]}
videos = ["https://huggingface.co/datasets/raushan-testing-hf/videos-test/resolve/main/sample_demo_1.mp4"]
out = pipe(text=messages, videos=videos)  # raises the TypeError about the 'legacy' kwarg
Expected behavior
No exception raised due to an unexpected kwarg.
inf3rnus changed the title from "LlavaNextVideoProcessor -> TypeError: LlavaNextVideoProcessor.__call__() got an unexpected keyword argument 'legacy'" to "LlavaNextVideoProcessor -> TypeError: LlavaNextVideoProcessor.__call__() got an unexpected keyword argument 'legacy' (I have the fix)" on Jan 10, 2025
I guess that, as a quick fix, we need to standardize the processor kwargs API for video LLMs as well, since those models can also work in an image-text-to-text setting. That is why I am not surprised that users want to apply the image-text-to-text pipeline, as it is the closest of the existing pipelines.
PS. I had an idea to add video-text-to-text as a pipeline, but we know that all video models are also image models, so I'm not sure exactly how to separate the two. Anyway, the pipeline idea will come after videos have their own standard video processors, separate from images, so we can wait with that :)
Agreed for the pipeline, I'm just not sure if it will be intuitive for users to use image-text-to-text with video models.
As for this issue, thank you @inf3rnus for bringing it up! I forgot the video models in the kwargs standardization, I can make a note to address that indeed.
I also think the legacy kwarg for image-text-to-text models/pipeline and the corresponding deprecation warning have been there for a while so it might be time to remove them altogether?
System Info
The problem's root cause is in the ImageTextToTextPipeline class in the image_text_to_text.py pipeline, line 438.
Notice how legacy is always specified as False?