
Added falcon model converter #2040

Open · mehtamansi29 wants to merge 2 commits into master

Conversation

mehtamansi29 (Collaborator)

The Falcon model converter is missing; this PR adds it. Fixes #1988

import pytest

from keras_hub.models import FalconCausalLM
from keras_hub.src.tests.test_case import TestCase


class TestTask(TestCase):
    @pytest.mark.large
    def test_convert_tiny_preset(self):
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-7b")
Member

I don't think we can afford to download this ~15 GB file in our testing setup. You could try the 1b model, or create a small test model on HF, as was done for Llama and others.
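For reference, a minimal sketch of what the cheaper test could look like, assuming the real ~1b checkpoint `tiiuae/falcon-rw-1b` on Hugging Face; the class name, method name, and prompt are illustrative, not part of this PR:

```python
import pytest

from keras_hub.models import FalconCausalLM
from keras_hub.src.tests.test_case import TestCase


class TestFalconConversion(TestCase):
    @pytest.mark.large
    def test_convert_1b_preset(self):
        # falcon-rw-1b is a ~1b checkpoint; a purpose-built tiny test model
        # on HF (as was done for Llama) would be even cheaper for CI.
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-rw-1b")
        output = model.generate("What is Keras?", max_length=16)
        self.assertIsInstance(output, str)
```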

Collaborator Author

@mattdangerw - I'll create a small test with the 1b Falcon model and commit again.

@mattdangerw (Member)

@SamanehSaadat can you take a look at the Falcon conversion options here? I remember there were some annoying gotchas (e.g. different tokenizer types) that this might not cover.
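For context, the gotcha is roughly that Falcon checkpoints on Hugging Face don't all ship the same tokenizer artifacts, so a converter has to check what is actually present. A hedged sketch, not the actual keras-hub converter API; the helper name, file check, and error text are all hypothetical:

```python
import json
import os


def load_falcon_tokenizer_config(checkpoint_dir):
    """Hypothetical helper: load whichever tokenizer file the checkpoint ships."""
    path = os.path.join(checkpoint_dir, "tokenizer.json")
    if not os.path.exists(path):
        # Some checkpoints may ship a different tokenizer type entirely;
        # raising here is safer than silently mis-converting.
        raise ValueError(
            "Expected a BPE `tokenizer.json`; this Falcon checkpoint ships "
            "a tokenizer type the converter does not handle yet."
        )
    with open(path) as f:
        return json.load(f)
```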


    @pytest.mark.large
    def test_class_detection(self):
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-7b")
Member

Does this work? I think we only have Falcon-1b support! The 7b model has a different attention mechanism, which hasn't been added!

Member

We should probably also attach a colab verifying that the outputs from the Hugging Face and KerasHub versions align. And it sounds like we might actually run into differences here due to what @SamanehSaadat is saying.

@SamanehSaadat how much work is needed on the architecture code to support the 7b and other variants? Is it something that could be added here, or a ton to do?
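A minimal sketch of the numerics check such a colab could run, assuming the 1b-style `tiiuae/falcon-rw-1b` checkpoint, that `CausalLM.predict` returns per-token logits for raw-string inputs (as keras-hub tasks do when a preprocessor is attached), and that padding is on the right; the slicing of padded positions is glossed and would need care in a real notebook:

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from keras_hub.models import FalconCausalLM

checkpoint = "tiiuae/falcon-rw-1b"  # a 1b-style checkpoint, per this thread

# Reference logits from the Hugging Face implementation.
hf_tokenizer = AutoTokenizer.from_pretrained(checkpoint)
hf_model = AutoModelForCausalLM.from_pretrained(checkpoint)
hf_inputs = hf_tokenizer("Keras is", return_tensors="pt")
with torch.no_grad():
    hf_logits = hf_model(**hf_inputs).logits.numpy()

# Logits from the KerasHub model loaded through the converter under review.
keras_model = FalconCausalLM.from_preset(f"hf://{checkpoint}")
keras_logits = keras_model.predict(["Keras is"])

# Compare only the real (non-padded) token positions.
n = hf_inputs["input_ids"].shape[1]
np.testing.assert_allclose(keras_logits[0, :n], hf_logits[0], atol=1e-3)
```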

Member

@mattdangerw I think adding support for the 7b is non-trivial. There are some major architectural differences, like ALiBi, GQA vs. MHA, and rotary embeddings (to me, it's almost like adding a new architecture!).

Member

Thanks! Sounds like we will need to either throw an error in the converter when we encounter Falcon Hugging Face options we don't currently support, or add them in (in a separate PR?).

@mehtamansi29 we'd probably need a colab verifying that the output matches for some subset of Falcon checkpoints on Hugging Face, and ideally that we throw an error for Falcon checkpoints that need arch options we don't yet support.
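A sketch of what that guard could look like, assuming the `alibi`, `multi_query`, and `new_decoder_architecture` keys from Hugging Face's FalconConfig, and assuming the ALiBi/MHA (1b-style) path is what keras-hub currently supports per the discussion above; the function name and messages are illustrative:

```python
def validate_falcon_config(hf_config):
    """Hypothetical guard: raise for HF Falcon configs needing options
    the keras-hub Falcon backbone does not implement yet."""
    unsupported = []
    if not hf_config.get("alibi", False):
        unsupported.append("rotary embeddings (alibi=False)")
    if hf_config.get("multi_query", False):
        unsupported.append("multi-query / grouped-query attention")
    if hf_config.get("new_decoder_architecture", False):
        unsupported.append("the new (40b-style) decoder architecture")
    if unsupported:
        raise ValueError(
            "Conversion of this Falcon checkpoint is not supported yet; "
            "it requires: " + ", ".join(unsupported)
        )
```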

@divyashreepathihalli added the kokoro:force-run (Runs Tests on GPU) label on Jan 22, 2025
@kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label on Jan 22, 2025
Successfully merging this pull request may close these issues:

Missing Transformers initializer for Falcon models (#1988)