In the field of computer vision, researchers have repeatedly shown the value of transfer learning: pre-training a neural network on a known task, for instance ImageNet, and then fine-tuning it, i.e. using the trained network as the basis of a new, purpose-specific model. In recent years, researchers have shown that a similar technique can be useful for many natural language tasks.
BERT is a multi-layer bidirectional Transformer encoder. Two model sizes are introduced in the paper:

- BERT Base – 12 layers (transformer blocks), 12 attention heads, and 110 million parameters.
- BERT Large – 24 layers (transformer blocks), 16 attention heads, and 340 million parameters.
We used a BERT layer to build a sentiment analysis model in TensorFlow 2.0 and reached 77% accuracy on the test dataset. The model performs well on raw text data.
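Below is a minimal sketch of such a model, assuming the BERT encoder and its matching preprocessing layer from TensorFlow Hub; the specific hub handles, dropout rate, and learning rate are illustrative assumptions rather than the original configuration.

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the ops used by the preprocessing model

# Assumed TF Hub handles for a BERT Base encoder and its matching text preprocessor.
PREPROCESS_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"

def build_sentiment_model() -> tf.keras.Model:
    # Raw strings go in; the preprocessing layer tokenizes them for BERT.
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
    preprocessed = hub.KerasLayer(PREPROCESS_URL, name="preprocessing")(text_input)

    # trainable=True fine-tunes all 12 transformer blocks on the sentiment task.
    bert_outputs = hub.KerasLayer(ENCODER_URL, trainable=True, name="bert_encoder")(preprocessed)

    # pooled_output is the [CLS] representation summarizing the whole sentence.
    x = tf.keras.layers.Dropout(0.1)(bert_outputs["pooled_output"])
    sentiment = tf.keras.layers.Dense(1, activation="sigmoid", name="sentiment")(x)
    return tf.keras.Model(text_input, sentiment)

model = build_sentiment_model()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_texts, train_labels, validation_data=(val_texts, val_labels), epochs=3)
```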
Reason: BERT learns from words in all positions, meaning it uses the entire sentence as context. Its architecture incorporates the Transformer, whose attention mechanism attends over the whole input sentence, and this is a large part of what makes the model so accurate.
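As a quick illustration of what "learning from all positions" means, the snippet below (using the Hugging Face transformers library, which is separate from the TensorFlow pipeline above) asks BERT to fill in a masked word; the prediction draws on context from both the left and the right of the mask.

```python
from transformers import pipeline

# BERT's pre-training task is masked language modelling: predict a hidden word
# from the words on BOTH sides of it, not only the ones that come before it.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("The movie was absolutely [MASK], I loved every minute of it."):
    print(prediction["token_str"], round(prediction["score"], 3))
```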
Text preprocessing steps (a code sketch of these follows the list):

- Expanding contractions
- Cleaning and removing punctuation
- Cleaning and removing URLs
- Cleaning and removing stop words
- Cleaning and removing repeating characters
- Cleaning and removing numbers
- Applying stemming
- Applying lemmatization
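A minimal sketch of these steps, assuming NLTK and the third-party contractions package; the exact regular expressions are illustrative, and applying both stemming and lemmatization simply mirrors the list above (in practice one of the two is usually sufficient).

```python
import re
import string

import contractions  # assumed helper package for expanding contractions
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")

STOP_WORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

def clean_text(text: str) -> str:
    text = contractions.fix(text)                                      # expand contractions ("don't" -> "do not")
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)                 # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))   # remove punctuation
    text = re.sub(r"(.)\1{2,}", r"\1", text)                           # collapse repeating characters ("sooo" -> "so")
    text = re.sub(r"\d+", " ", text)                                   # remove numbers
    tokens = [w for w in text.lower().split() if w not in STOP_WORDS]  # remove stop words
    tokens = [stemmer.stem(w) for w in tokens]                         # stemming
    tokens = [lemmatizer.lemmatize(w) for w in tokens]                 # lemmatization
    return " ".join(tokens)

print(clean_text("Loved it sooooo much!!! Check https://example.com, 10/10"))
```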