
Any plans to integrate GTE model natively into transformers #35568

Closed · 2 tasks done
yaswanth19 opened this issue Jan 8, 2025 · 8 comments

Comments

@yaswanth19 commented Jan 8, 2025

Model description

Any plans to integrate the GTE model natively into transformers? Right now we are using this model with the trust_remote_code=True argument.

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

Model Implementation: https://huggingface.co/Alibaba-NLP/new-impl/blob/main/modeling.py
Model Weights: https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5
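
For context, this is roughly how the checkpoint is loaded today; a minimal sketch using the standard transformers remote-code pattern, with the model name taken from the weights link above:

```python
from transformers import AutoModel, AutoTokenizer

model_name = "Alibaba-NLP/gte-base-en-v1.5"

# The repo ships its own modeling.py, so transformers must be allowed
# to download and execute that custom code.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
```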

@yaswanth19 (Author)

@ArthurZucker @Rocketknight1 If we do intend to integrate this model then I can work on creating a draft PR.

@mahimairaja

Is there a place where I can help you add the model, @yaswanth19?

@yaswanth19 (Author)

@ArthurZucker A gentle ping.
@tomaarsen Can this also be integrated into sentence-transformers?
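
For reference, the GTE checkpoints can already be used through sentence-transformers today by forwarding the same remote-code flag. A minimal sketch, assuming a sentence-transformers version recent enough to accept trust_remote_code:

```python
from sentence_transformers import SentenceTransformer

# trust_remote_code is passed through to transformers, so the repo's
# custom modeling code still runs under the hood.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

embeddings = model.encode(["an example sentence", "another one"])
print(embeddings.shape)  # (2, embedding_dim)
```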

@Rocketknight1 (Member)

This seems popular enough to justify an integration, yes. WDYT @tomaarsen?

@tomaarsen (Member) commented Jan 17, 2025

@Rocketknight1
I suspect there are 3 popular and promising models built on this architecture:

Beyond that, the authors are now using another implementation on top of Qwen:

Some of the mechanisms are similar to ModernBERT (I see unpadding), but some differ as well (xformers). It might require a good bit of effort to get everything to line up with transformers, and I think there's a chance that there will be no more big models based on this architecture.

  • Tom Aarsen
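
To make the unpadding point above concrete: the idea is to drop pad tokens and run attention only over the real tokens of a batch. A rough sketch of the concept (a hypothetical helper, not the actual GTE or ModernBERT code):

```python
import torch

def unpad_input(hidden_states: torch.Tensor, attention_mask: torch.Tensor):
    """Pack real tokens into one flat sequence, dropping padding.

    hidden_states: (batch, seq_len, hidden)
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    batch, seq_len, hidden = hidden_states.shape
    flat_mask = attention_mask.reshape(-1).bool()
    indices = flat_mask.nonzero(as_tuple=True)[0]        # positions of real tokens
    packed = hidden_states.reshape(-1, hidden)[indices]  # (total_tokens, hidden)
    # Per-sequence lengths, needed to rebuild (re-pad) the batch afterwards.
    seqlens = attention_mask.sum(dim=1)
    return packed, indices, seqlens
```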

@yaswanth19 (Author)

@Rocketknight1 Should I start implementing support for this model, or do you think the effort outweighs the potential benefit and we should keep using these models with trust_remote_code?

@Rocketknight1 (Member)

Hi @yaswanth19, given @tomaarsen's comment above, I think it's okay to leave them as trust_remote_code models, especially since newer versions of GTE already exist on a different architecture.

@tomaarsen (Member)

On this topic, the Alibaba team actually just released superior models based on the new ModernBERT architecture today:

I imagine that they might not move forward with their previous architecture, especially considering they mention that the only parameter they changed for these, compared to their previous models, was the base model.

  • Tom Aarsen
