Deep Dive NB: Quick Fix for AttributeError: 'CLIPTextTransformer' object has no attribute '_build_causal_attention_mask' #37
@drscotthawley another fix, without having to downgrade, could be to re-implement locally what the removed `_build_causal_attention_mask` function used to do:

PS: Thanks for the mps support changes
```python
import torch

def build_causal_attention_mask(bsz, seq_len, dtype):
    # Local re-implementation of the helper removed from CLIPTextTransformer
    mask = torch.empty(bsz, seq_len, seq_len, dtype=dtype)
    mask.fill_(torch.finfo(dtype).min)  # fill with a large negative number (acts like -inf)
    mask.triu_(1)  # zero out the diagonal and the lower triangle to enforce causality
    return mask.unsqueeze(1)  # add a singleton dimension for the attention heads

# Update your function call to use the new mask function
def get_output_embeds(input_embeddings):
    bsz, seq_len = input_embeddings.shape[:2]
    causal_attention_mask = build_causal_attention_mask(bsz, seq_len, dtype=input_embeddings.dtype)
    # Getting the output embeddings involves calling the model with output_hidden_states=True,
    # so that it doesn't just return the pooled final predictions:
    encoder_outputs = text_encoder.text_model.encoder(
        inputs_embeds=input_embeddings,
        attention_mask=None,  # We aren't using an attention mask, so that can be None
        causal_attention_mask=causal_attention_mask.to(torch_device),
        output_attentions=None,
        output_hidden_states=True,  # We want the output embeddings, not the final output
        return_dict=None,
    )
    # We're interested in the output hidden state only
    output = encoder_outputs[0]
    # There is a final layer norm we need to pass these through
    output = text_encoder.text_model.final_layer_norm(output)
    # And now they're ready!
    return output

out_embs_test = get_output_embeds(input_embeddings)  # Feed through the model with our new function
print(out_embs_test.shape)  # Check the output shape
out_embs_test  # Inspect the output
```
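For intuition, here is a quick check of the mask for a tiny sequence (an illustrative addition, not part of the original comment): zeros mark the positions a token may attend to, i.e. itself and earlier tokens.

```python
# Illustrative check: inspect the mask for a 4-token sequence.
m = build_causal_attention_mask(1, 4, torch.float32)
print(m.shape)               # torch.Size([1, 1, 4, 4])
print((m[0, 0] == 0).int())  # 1 = attendable (zero in the mask), 0 = blocked (-inf-like)
# tensor([[1, 0, 0, 0],
#         [1, 1, 0, 0],
#         [1, 1, 1, 0],
#         [1, 1, 1, 1]], dtype=torch.int32)
```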
If you install a newer version of the transformers library, such as 4.40.2, you can implement it as below:
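A sketch of one way this can look on transformers 4.40.x. It relies on the internal helper `_create_4d_causal_attention_mask` from `transformers.modeling_attn_mask_utils` (what `modeling_clip.py` itself uses to build this mask at that version); being private API, it may change:

```python
# Sketch for transformers ~4.40.x (assumption: this internal helper exists there,
# with signature (input_shape, dtype, device)).
from transformers.modeling_attn_mask_utils import _create_4d_causal_attention_mask

def get_output_embeds(input_embeddings):
    bsz, seq_len = input_embeddings.shape[:2]
    # Build the mask on the same device/dtype as the inputs, so no .to() is needed
    causal_attention_mask = _create_4d_causal_attention_mask(
        (bsz, seq_len), input_embeddings.dtype, device=input_embeddings.device
    )
    encoder_outputs = text_encoder.text_model.encoder(
        inputs_embeds=input_embeddings,
        attention_mask=None,
        causal_attention_mask=causal_attention_mask,
        output_hidden_states=True,
    )
    output = encoder_outputs[0]  # the last hidden state
    return text_encoder.text_model.final_layer_norm(output)
```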
In the Stable Diffusion Deep Dive notebook, in the code block immediately following the Transformer diagram, there is the definition of `get_output_embeds`, which includes a call to `text_encoder.text_model._build_causal_attention_mask`. That call is currently generating an error for me when I run the notebook on Colab (from a fresh instance) or on my home computer:
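```
AttributeError: 'CLIPTextTransformer' object has no attribute '_build_causal_attention_mask'
```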
Everything in the notebook prior to that line runs fine.
Perhaps I'm doing something wrong, or perhaps something has changed with the HF libraries being used since the notebook's original conception?
UPDATE: I see the same issue here: drboog/ProFusion#12. It seems that `transformers` has changed. Downgrading to version 4.25.1 fixed the problem. Thus, changing the `pip install` line at the top of the notebook to `!pip install -q --upgrade transformers==4.25.1 diffusers ftfy` will restore full functionality.
Feel free to close this issue at your convenience. Perhaps a PR is in order.
Presumably some way to keep up to date with `transformers` would be preferable, but for now this is a quick fix.
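A version-tolerant variant (a sketch, not from the thread; it assumes the old method, where present, keeps the `(bsz, seq_len, dtype)` signature it had in transformers 4.25.x):

```python
# Hedged sketch: prefer the model's own helper when it exists (transformers <= 4.25.x),
# otherwise fall back to the local build_causal_attention_mask defined above.
def make_causal_mask(text_model, bsz, seq_len, dtype):
    if hasattr(text_model, "_build_causal_attention_mask"):
        return text_model._build_causal_attention_mask(bsz, seq_len, dtype)
    return build_causal_attention_mask(bsz, seq_len, dtype)
```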