- This project builds an NLP model for named-entity recognition (NER) in the cybersecurity domain. It starts from the pretrained `distilbert-base-uncased` model and fine-tunes it for this purpose.
- To open the project, click the Google Colab badge above.
- The model is trained on the MITRE dataset, which is available in this repository as `MITRE.zip`.
- The MITRE dataset is also uploaded to Hugging Face.
- You can also load the dataset by running the code below:

      from datasets import load_dataset

      dataset = load_dataset("bnsapa/cybersecurity-ner")
- The fine-tuned model is available on Hugging Face.
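
Fine-tuning `distilbert-base-uncased` for NER typically requires aligning word-level labels with the subword tokens the tokenizer produces. The sketch below is not taken from this project's notebook; it shows one common way to do that, assuming a list of word indices like the one returned by a fast tokenizer's `word_ids()` method. The function name `align_labels` and the example values are hypothetical.

```python
from typing import List, Optional

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Expand word-level label ids to one label per subword token.

    Only the first subword of each word keeps the word's label; special
    tokens (word id None) and later subwords get IGNORE_INDEX.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special token such as [CLS] or [SEP]
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a new word
            aligned.append(word_labels[word_id])
        else:
            # Continuation subword of the same word
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical example: five words, the first one split into two subwords.
word_ids = [None, 0, 0, 1, 2, 3, 4, None]
word_labels = [1, 0, 0, 0, 0]  # assumed label ids, 1 = an entity class, 0 = "O"
print(align_labels(word_ids, word_labels))
# [-100, 1, -100, 0, 0, 0, 0, -100]
```

Labelling only the first subword is a design choice; another option is to repeat the word's label on every subword, which changes how the loss weights long words.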
I typed `abcde is a computer malware`. I chose the context so that it implies `abcde` is a virus, and the model is able to capture that.
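
To make the example above concrete, here is a minimal sketch of how per-token predictions can be turned into entity spans. It assumes the model emits BIO-style tags; the tag name `B-Malware` and the helper `group_entities` are hypothetical and may not match the label set of this dataset.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one first.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag, or an "I-" tag with no matching open entity (dropped).
            flush()
            current_tokens, current_type = [], None
    flush()
    return entities


# Assumed predictions for the sentence from the example above.
tokens = ["abcde", "is", "a", "computer", "malware"]
tags = ["B-Malware", "O", "O", "O", "O"]
print(group_entities(tokens, tags))
# [('abcde', 'Malware')]
```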