SpanMarker with roberta-large on FewNERD, CoNLL2003, and OntoNotes v5

This is a SpanMarker model trained on the FewNERD, CoNLL2003, and OntoNotes v5 datasets that can be used for Named Entity Recognition. It uses roberta-large as the underlying encoder.

Model Details

Model Description

Model Type: SpanMarker
Encoder: roberta-large
Maximum Sequence Length: 256 tokens
Maximum Entity Length: 8 words
Training Dataset: FewNERD, CoNLL2003, and OntoNotes v5
Language: en
License: cc-by-sa-4.0
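
To illustrate what the maximum entity length of 8 words implies, the sketch below enumerates candidate word spans for a sentence. It is a simplified illustration that assumes every contiguous span of up to 8 words is a candidate entity. The function name `candidate_spans` is hypothetical and is not part of the SpanMarker API.

```python
from typing import List, Tuple


def candidate_spans(words: List[str], max_entity_length: int = 8) -> List[Tuple[int, int]]:
    """Return (start, end) word-index pairs, end exclusive, for every
    contiguous span of at most `max_entity_length` words."""
    spans = []
    for start in range(len(words)):
        # The longest span starting here is capped by the limit and the sentence end.
        last_end = min(start + max_entity_length, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


words = "The team competes in the Big 12 Conference .".split()
spans = candidate_spans(words)
print(len(spans))  # number of candidate spans for this 9-word sentence
```

An entity longer than 8 words can never appear among these candidates, which is why the limit matters when choosing training or inference data.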

Model Sources

Repository: SpanMarker on GitHub
Thesis: SpanMarker For Named Entity Recognition

Model Labels

Label	Examples
ORG	"IAEA", "Church 's Chicken", "Texas Chicken"

Evaluation

Metrics

Label	Precision	Recall	F1
ORG	0.8238	0.7970	0.8102
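
As a sanity check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the other two columns. The helper name `f1_score` below is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for the ORG label, taken from the table above.
f1 = f1_score(0.8238, 0.7970)
print(round(f1, 4))
```

The recomputed value agrees with the reported F1 to four decimal places; any remaining difference comes from the precision and recall columns themselves being rounded.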

Uses

Direct Use for Inference

from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("nbroad/span-marker-roberta-large-orgs-v1")
# Run inference
entities = model.predict("The program is classified in the National Collegiate Athletic Association (NCAA) Division I Bowl Subdivision (FBS), and the team competes in the Big 12 Conference.")
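
For a single sentence, `model.predict` is expected to return a list of dictionaries, one per detected entity, with keys such as `span`, `label`, `score`, `char_start_index`, and `char_end_index`. The sketch below assumes that shape and works on a hand-written sample, so it runs without downloading the model. The scores in the sample are made up for illustration, and `filter_entities` is a hypothetical helper, not part of SpanMarker.

```python
from typing import Dict, List

text = "The team competes in the Big 12 Conference."

# Hypothetical output shaped like the result of model.predict(text).
# The scores are invented; the offsets index into `text`.
entities: List[Dict] = [
    {"span": "Big 12 Conference", "label": "ORG", "score": 0.97,
     "char_start_index": 25, "char_end_index": 42},
    {"span": "team", "label": "ORG", "score": 0.41,
     "char_start_index": 4, "char_end_index": 8},
]


def filter_entities(entities: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep only predictions whose confidence is at least `min_score`."""
    return [entity for entity in entities if entity["score"] >= min_score]


confident = filter_entities(entities, min_score=0.5)
for entity in confident:
    # The character offsets let you recover the span from the original text.
    start, end = entity["char_start_index"], entity["char_end_index"]
    print(entity["label"], text[start:end], entity["score"])
```

The threshold of 0.5 is arbitrary; a suitable value depends on whether precision or recall matters more for your application.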

Downstream Use

You can fine-tune this model on your own dataset.

from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("nbroad/span-marker-roberta-large-orgs-v1")

# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("conll2003")  # For example CoNLL2003

# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("nbroad/span-marker-roberta-large-orgs-v1-finetuned")

Training Details

Training Set Metrics

Training set	Min	Median	Max
Sentence length	1	23.5706	263
Entities per sentence	0	0.7865	39

Training Hyperparameters

learning_rate: 3e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.05
num_epochs: 3
mixed_precision_training: Native AMP
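
To make the `linear` scheduler with a warmup ratio of 0.05 concrete, the sketch below computes the learning rate at a given step: it rises linearly from 0 to 3e-05 during warmup and then decays linearly to 0. This is a plain-Python approximation of the usual linear-warmup, linear-decay shape, not the trainer's actual code. The total step count is an assumption, estimated from the Training Results table (step 4200 corresponds to epoch 1.0010, so roughly 4,196 steps per epoch over 3 epochs), and `linear_lr` is a hypothetical helper name.

```python
import math

BASE_LR = 3e-5        # learning_rate from the list above
WARMUP_RATIO = 0.05   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 12_588  # assumption: about 4,196 steps per epoch * 3 epochs


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = BASE_LR, warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 315, 630, 6_000, 12_588):
    print(step, linear_lr(step))
```

Under these assumptions the peak learning rate is reached after about 630 steps, shortly after the first evaluation at step 600.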

Training Results

Epoch	Step	Validation Loss	Validation Precision	Validation Recall	Validation F1	Validation Accuracy
0.1430	600	0.0085	0.7425	0.7383	0.7404	0.9726
0.2860	1200	0.0078	0.7503	0.7516	0.7510	0.9741
0.4290	1800	0.0077	0.6962	0.8107	0.7491	0.9718
0.5720	2400	0.0060	0.8074	0.7486	0.7769	0.9753
0.7150	3000	0.0057	0.8135	0.7717	0.7921	0.9770
0.8580	3600	0.0059	0.7997	0.7764	0.7879	0.9763
1.0010	4200	0.0057	0.7860	0.8051	0.7954	0.9771
1.1439	4800	0.0058	0.7907	0.7717	0.7811	0.9763
1.2869	5400	0.0058	0.8116	0.7803	0.7956	0.9774
1.4299	6000	0.0056	0.7918	0.7850	0.7884	0.9770
1.5729	6600	0.0056	0.8097	0.7837	0.7965	0.9769
1.7159	7200	0.0055	0.8113	0.7790	0.7948	0.9765
1.8589	7800	0.0052	0.8095	0.7970	0.8032	0.9782
2.0019	8400	0.0054	0.8244	0.7782	0.8006	0.9774
2.1449	9000	0.0053	0.8238	0.7970	0.8102	0.9782
2.2879	9600	0.0053	0.8200	0.7901	0.8048	0.9773
2.4309	10200	0.0053	0.8243	0.7936	0.8086	0.9785
2.5739	10800	0.0053	0.8159	0.7953	0.8055	0.9781
2.7169	11400	0.0053	0.8072	0.8034	0.8053	0.9784
2.8599	12000	0.0052	0.8111	0.8017	0.8064	0.9782

Framework Versions

Python: 3.10.12
SpanMarker: 1.5.0
Transformers: 4.35.2
PyTorch: 2.1.0a0+32f93b1
Datasets: 2.15.0
Tokenizers: 0.15.0

Citation

BibTeX

@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}
