Added falcon model converter #2040
base: master
Conversation
class TestTask(TestCase):
    @pytest.mark.large
    def test_convert_tiny_preset(self):
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-7b")
I don't think we can afford to download this ~15gb file in our testing setup. You could try the 1b model? Or create a small test model on hf, as was done for llama and others.
@mattdangerw - I'll create a small test with the 1b Falcon model and commit again.
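For reference, a minimal sketch of what the smaller test could look like; the `tiiuae/falcon-rw-1b` preset handle and the `TestCase` import path are assumptions based on the existing converter tests, not part of this PR:

```python
import pytest

from keras_hub.models import FalconCausalLM
from keras_hub.tests.test_case import TestCase  # assumed import path


class TestTask(TestCase):
    @pytest.mark.large
    def test_convert_tiny_preset(self):
        # Assumption: the ~1b RefinedWeb checkpoint (or a purpose-built
        # tiny test model on HF, as done for llama) is small enough for CI.
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-rw-1b")
        model.generate(["What is Keras?"], max_length=15)
```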
@SamanehSaadat can you take a look at the Falcon conversion options here? I remember there were some annoying gotchas (e.g. different tokenizer types) that this might not cover.
    @pytest.mark.large
    def test_class_detection(self):
        model = FalconCausalLM.from_preset("hf://tiiuae/falcon-7b")
Does this work? I think we only have Falcon-1b support! The 7b model has a different attention mechanism that hasn't been added!
We should probably also attach a colab verifying that the outputs from the HuggingFace and KerasHub versions align. And it sounds like that might actually run into differences here due to what @SamanehSaadat is saying.
@SamanehSaadat how much work is needed on the architecture code to support the 7b and other variants? Is it something that could be added here, or a ton to do?
@mattdangerw I think adding support for the 7b is non-trivial. There are some major architectural differences like ALiBi, GQA vs. MHA, and rotary embeddings (to me, it's almost like adding a new architecture!).
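For context, one quick way to see those differences is to compare the HuggingFace configs of the two checkpoints. The sketch below assumes the `transformers` `AutoConfig` API and the `alibi`/`multi_query`/`new_decoder_architecture` fields of the Falcon config; the field names should be double-checked against the actual checkpoints:

```python
from transformers import AutoConfig

# Illustrative only: print the architectural flags of two Falcon checkpoints.
for name in ["tiiuae/falcon-rw-1b", "tiiuae/falcon-7b"]:
    cfg = AutoConfig.from_pretrained(name)
    print(
        name,
        getattr(cfg, "alibi", None),                    # ALiBi vs. rotary positions
        getattr(cfg, "multi_query", None),              # MQA/GQA vs. MHA
        getattr(cfg, "new_decoder_architecture", None), # parallel decoder layout
    )
```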
Thanks! Sounds like we will need to either throw in the converter when we encounter Falcon HuggingFace options we don't currently support, or add them in (in a separate PR?).
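A minimal sketch of what that guard could look like inside the checkpoint converter, assuming a `convert_backbone_config(transformers_config)`-style entry point and the HuggingFace config fields mentioned above; all names here are illustrative, not the final API:

```python
def convert_backbone_config(transformers_config):
    """Illustrative guard: reject Falcon variants the backbone can't express yet."""
    # Assumption: KerasHub's Falcon backbone currently matches the
    # ALiBi + multi-head-attention 1b variant only.
    if not transformers_config.get("alibi", False):
        raise ValueError(
            "Only ALiBi-based Falcon checkpoints are currently supported."
        )
    if transformers_config.get("multi_query", False) or transformers_config.get(
        "new_decoder_architecture", False
    ):
        raise ValueError(
            "Multi-query / new-decoder Falcon variants (e.g. falcon-7b, "
            "falcon-40b) are not supported by the converter yet."
        )
    return {
        "vocabulary_size": transformers_config["vocab_size"],
        "num_layers": transformers_config["num_hidden_layers"],
        "num_attention_heads": transformers_config["num_attention_heads"],
        "hidden_dim": transformers_config["hidden_size"],
        # ... remaining fields omitted for brevity.
    }
```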
@mehtamansi29 we'd probably need a colab verifying that the output matches for some subset of Falcon checkpoints on HuggingFace, and ideally that we throw for Falcon checkpoints that need arch options we don't yet support.
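A rough sketch of the kind of check such a colab could run, assuming the public `transformers` and `keras_hub` APIs and a checkpoint the converter supports; the checkpoint handle and tolerance are placeholders:

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from keras_hub.models import FalconCausalLM

checkpoint = "tiiuae/falcon-rw-1b"  # placeholder: any supported checkpoint
prompt = "Keras is"

# HuggingFace reference logits.
hf_tokenizer = AutoTokenizer.from_pretrained(checkpoint)
hf_model = AutoModelForCausalLM.from_pretrained(checkpoint)
hf_inputs = hf_tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    hf_logits = hf_model(**hf_inputs).logits[0].numpy()

# KerasHub logits for the same token ids, loaded through the converter.
keras_model = FalconCausalLM.from_preset(f"hf://{checkpoint}")
token_ids = hf_inputs["input_ids"].numpy()
padding_mask = np.ones_like(token_ids, dtype=bool)
# Calling the task model directly returns per-token logits.
keras_logits = keras_model(
    {"token_ids": token_ids, "padding_mask": padding_mask}
)[0]

np.testing.assert_allclose(hf_logits, np.asarray(keras_logits), atol=1e-3)
```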
The Falcon model converter was missing; this PR adds it. Fixes #1988