
fix time_distributed layer with mask and partial_batch_size #20765

Open
wants to merge 2 commits into master

Conversation

Surya2k1
Contributor

If a model has an Embedding layer with mask_zero=True followed by a TimeDistributed layer, training fails in graph mode whenever the dataset contains a partial batch. Concatenating the partial-batch dataset makes the static batch size None, so the input shape becomes (None, ...).

The model then fails with a graph execution error when this static batch size is compared against the corresponding dimension of the mask.

I therefore propose to skip the batch_size comparison for the TF backend in graph mode (see the sketch below). It would be better to relax the check only when there actually is a partial batch, but I am not sure how to propagate that information to the TimeDistributed layer.

Fixes #20754
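
For context, here is a minimal sketch of the kind of guard this PR has in mind. It is not the actual diff in keras/src/layers/rnn/time_distributed.py; the helper name and its structure are illustrative assumptions.

# Minimal sketch, assuming a stand-alone helper for the mask-shape check
# that TimeDistributed performs; names and structure are hypothetical.
import keras


def validate_time_distributed_mask(mask_shape, batch_size, timesteps):
    if mask_shape is None:
        return
    skip_batch_check = False
    if keras.backend.backend() == "tensorflow":
        import tensorflow as tf

        # With a partial last batch the concatenated dataset has a dynamic
        # batch dimension, so in graph mode the static batch_size is None
        # and comparing it against the mask's batch dimension fails.
        skip_batch_check = not tf.executing_eagerly() or batch_size is None
    if skip_batch_check:
        # Only the timesteps dimension can be validated reliably here.
        if mask_shape[1] != timesteps:
            raise ValueError(
                f"Mask has {mask_shape[1]} timesteps, expected {timesteps}."
            )
        return
    if mask_shape[:2] != (batch_size, timesteps):
        raise ValueError(
            f"Mask shape {mask_shape} is incompatible with "
            f"(batch_size, timesteps)=({batch_size}, {timesteps})."
        )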

Code to replicate the issue:

import keras
import numpy as np

# POS-tagging-style model: Embedding with mask_zero=True followed by a
# TimeDistributed head. Training fails in TF graph mode because 50 samples
# with batch_size=8 leave a partial final batch of 2.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Embedding(input_dim=10, output_dim=5, mask_zero=True),
    keras.layers.Bidirectional(keras.layers.LSTM(units=10, return_sequences=True)),
    # Fails when mask_zero=True and the last batch is partial.
    keras.layers.TimeDistributed(keras.layers.Dense(units=5, activation="softmax")),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
X_train = np.random.uniform(1, 10, size=(50, 20))
Y_train = np.random.randint(1, 2, size=(50, 20, 5))

model.fit(X_train, Y_train, epochs=2, batch_size=8)

codecov-commenter commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 42.85714% with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.01%. Comparing base (e345cbd) to head (028e953).
Report is 8 commits behind head on master.

Files with missing lines                    Patch %   Lines
keras/src/layers/rnn/time_distributed.py    42.85%    2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20765      +/-   ##
==========================================
+ Coverage   81.96%   82.01%   +0.04%     
==========================================
  Files         554      555       +1     
  Lines       51656    51861     +205     
  Branches     7996     8024      +28     
==========================================
+ Hits        42342    42533     +191     
- Misses       7367     7378      +11     
- Partials     1947     1950       +3     
Flag               Coverage Δ
keras              81.83% <28.57%> (+0.04%) ⬆️
keras-jax          64.10% <0.00%> (+0.07%) ⬆️
keras-numpy        58.90% <0.00%> (-0.02%) ⬇️
keras-openvino     29.91% <0.00%> (-0.05%) ⬇️
keras-tensorflow   64.75% <28.57%> (+0.01%) ⬆️
keras-torch        64.17% <0.00%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown.


Projects
Status: Assigned Reviewer
Development

Successfully merging this pull request may close these issues:
Issue with using masking in Embedding Layer for POS Tagging Model
3 participants