Do the additional position embeddings increase the parameters of the transformer? #226

ken-ando · 2021-12-08T20:41:24Z

This work introduces additional position embeddings for inputs longer than 512 tokens.

Lines 150 to 154 in 70b810e

    
    if(args.max_pos>512):
        my_pos_embeddings = nn.Embedding(args.max_pos, self.bert.model.config.hidden_size)
        my_pos_embeddings.weight.data[:512] = self.bert.model.embeddings.position_embeddings.weight.data
        my_pos_embeddings.weight.data[512:] = self.bert.model.embeddings.position_embeddings.weight.data[-1][None,:].repeat(args.max_pos-512,1)
        self.bert.model.embeddings.position_embeddings = my_pos_embeddings

However, this code only replaces the position embedding table; it does not seem to extend the transformer encoder itself.
I would expect a shape mismatch if the subsequent encoder layers did not also get additional parameters.

So I guess the transformers library automatically adds the corresponding encoder parameters. Is this understanding correct?
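
One way to check this is a minimal sketch with generic PyTorch modules, not the repository's BERT model. The sizes (`hidden_size = 16`, `new_max_pos = 800`) and the use of `nn.TransformerEncoderLayer` as a stand-in for a BERT layer are assumptions made for illustration only. The weights of a standard transformer layer are shaped by the hidden size rather than the sequence length, so the sketch tests whether an extended position table can feed an unchanged layer.

```python
import torch
import torch.nn as nn

hidden_size = 16    # hypothetical small size for illustration
old_max_pos = 512   # length of the pretrained position table
new_max_pos = 800   # hypothetical value standing in for args.max_pos

# Stand-in for the pretrained position embedding table.
old_pos = nn.Embedding(old_max_pos, hidden_size)

# Same extension as the quoted snippet: copy the first 512 rows, then fill
# the remaining rows with copies of the last pretrained row.
new_pos = nn.Embedding(new_max_pos, hidden_size)
new_pos.weight.data[:old_max_pos] = old_pos.weight.data
new_pos.weight.data[old_max_pos:] = (
    old_pos.weight.data[-1][None, :].repeat(new_max_pos - old_max_pos, 1)
)

# Stand-in for one encoder layer (assumption: not the repository's model).
layer = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=4, dim_feedforward=32)
layer.eval()
params_before = sum(p.numel() for p in layer.parameters())

# Default layout is (sequence, batch, hidden); feed all 800 positions at once.
position_ids = torch.arange(new_max_pos).unsqueeze(1)  # shape (800, 1)
with torch.no_grad():
    out = layer(new_pos(position_ids))                 # shape (800, 1, 16)

params_after = sum(p.numel() for p in layer.parameters())
extra_params = new_pos.weight.numel() - old_pos.weight.numel()

print(tuple(out.shape), params_before == params_after, extra_params)
```

In this sketch the only new parameters are the extra rows of the embedding table, `(new_max_pos - old_max_pos) * hidden_size` values, and the layer accepts the longer sequence without any change to its own weights. Whether the repository's model behaves the same way is what I would like to confirm.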
