How to use kbsooo/layoutlmv3_finetuned_doclaynet with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("token-classification", model="kbsooo/layoutlmv3_finetuned_doclaynet")
# Load model directly
from transformers import AutoProcessor, AutoModelForTokenClassification
processor = AutoProcessor.from_pretrained("kbsooo/layoutlmv3_finetuned_doclaynet")
model = AutoModelForTokenClassification.from_pretrained("kbsooo/layoutlmv3_finetuned_doclaynet")
This model is a fine-tuned version of LayoutLMv3 for token classification on the DocLayNet dataset.
It is designed to classify each token in a document image based on both textual and layout information.
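The layout half of that input is one bounding box per word. LayoutLM-family models expect boxes as (x0, y0, x1, y1) scaled to a 0-1000 range relative to the page size. A minimal sketch of that normalization follows; the helper name `normalize_box` and the sample page size are illustrative assumptions, not part of this model card.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical word box on a 1000 x 2000 pixel page.
print(normalize_box((50, 100, 200, 300), 1000, 2000))  # [50, 50, 200, 150]
```

A helper like this is likely only needed when words and boxes come from your own OCR; when the processor runs OCR itself, it should already produce boxes in this format.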
This model can be used for:
from transformers import LayoutLMv3ForTokenClassification, AutoProcessor
import torch

repo = "kbsooo/layoutlmv3_finetuned_doclaynet"
model = LayoutLMv3ForTokenClassification.from_pretrained(repo)
processor = AutoProcessor.from_pretrained(repo)
model.eval()  # inference mode: disables dropout

image = ... # PIL.Image or np.array
text = "Sample document text"
# Encode the image and text into model inputs
encoding = processor(image, text, return_tensors="pt")
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**encoding)
# Pick the highest-scoring class id for every token
preds = torch.argmax(outputs.logits, dim=-1)
print(preds)
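The printed tensor holds class ids, not label names. The sketch below shows one way to turn ids into readable labels and collapse consecutive tokens that share a label. The `ID2LABEL` mapping and the sample ids are hypothetical; with the real model you would presumably pass `preds[0].tolist()` and `model.config.id2label` instead.

```python
from itertools import groupby

# Hypothetical mapping for illustration only. The real mapping is stored in
# model.config.id2label and may use different ids and label names.
ID2LABEL = {0: "Text", 1: "Title", 2: "Table"}


def decode(pred_ids, id2label):
    """Map each predicted class id to its label name."""
    return [id2label[i] for i in pred_ids]


def group_runs(labels):
    """Collapse consecutive identical labels into (label, token_count) pairs."""
    return [(label, len(list(run))) for label, run in groupby(labels)]


pred_ids = [1, 1, 0, 0, 0, 2]  # made-up ids standing in for preds[0].tolist()
labels = decode(pred_ids, ID2LABEL)
print(group_runs(labels))  # [('Title', 2), ('Text', 3), ('Table', 1)]
```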
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
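As a rough illustration of the kind of estimate such a calculator produces, emissions can be approximated as hardware power draw times runtime times the carbon intensity of the local grid. The function name and every number below are hypothetical placeholders, not measurements for this model.

```python
def estimate_emissions_kg(power_kw, hours, carbon_intensity_kg_per_kwh):
    """Approximate emissions in kg CO2eq: energy used times grid carbon intensity."""
    energy_kwh = power_kw * hours
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hypothetical inputs: a 0.3 kW accelerator running 10 hours on a
# grid emitting 0.4 kg CO2eq per kWh.
estimate = estimate_emissions_kg(
    power_kw=0.3, hours=10, carbon_intensity_kg_per_kwh=0.4
)
print(f"{estimate:.2f} kg CO2eq")
```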
BibTeX:
@article{huang2022layoutlmv3,
  title={LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking},
  author={Huang, Yupan and Lv, Tengchao and Cui, Lei and Lu, Yutong and Wei, Furu},
  journal={arXiv preprint arXiv:2204.08387},
  year={2022}
}
APA:
Huang, Y., Lv, T., Cui, L., Lu, Y., & Wei, F. (2022). LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. arXiv preprint arXiv:2204.08387.
Base model
microsoft/layoutlmv3-base