How to use qikp/pika with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("qikp/pika", dtype="auto")

pika is a simple and public domain-like tokenizer.
Special tokens: [UNK], [EOS], [PAD]

pika was trained on the first 1000 rows for each language in agentlans/multilingual-text.
Because its training corpus is small, pika may split words into smaller pieces than larger tokenizers would. Some uncommon special tokens are also absent; you'll have to add them manually if you need them.
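Adding missing special tokens manually could look roughly like the sketch below. It assumes the repository loads through `AutoTokenizer` (not confirmed here), and the token names `[BOS]` and `[SEP]` are hypothetical examples rather than tokens pika is known to need. The helper names are made up for illustration.

```python
def missing_special_tokens(vocab, wanted):
    """Return the tokens in `wanted` that are absent from `vocab`.

    Order is preserved and duplicates are dropped.
    """
    seen = set()
    missing = []
    for token in wanted:
        if token not in vocab and token not in seen:
            seen.add(token)
            missing.append(token)
    return missing


def add_missing_special_tokens(tokenizer, wanted):
    """Register any absent tokens as special tokens; return the ones added."""
    missing = missing_special_tokens(tokenizer.get_vocab(), wanted)
    if missing:
        # add_tokens(..., special_tokens=True) keeps each token unsplit.
        tokenizer.add_tokens(missing, special_tokens=True)
    return missing


def demo():
    """Hypothetical usage; requires `transformers` and network access."""
    from transformers import AutoTokenizer

    # Assumption: qikp/pika can be loaded as a tokenizer this way.
    tokenizer = AutoTokenizer.from_pretrained("qikp/pika")
    # Inspect how a long word is split before relying on the tokenizer.
    print(tokenizer.tokenize("internationalization"))
    # "[BOS]" and "[SEP]" are illustrative names, not part of the card.
    print(add_missing_special_tokens(tokenizer, ["[BOS]", "[SEP]"]))
```

Calling `demo()` prints the pieces for one sample word and the list of tokens that were added. If the tokenizer is paired with a model, the model's embedding matrix usually needs resizing afterwards, for example with `model.resize_token_embeddings(len(tokenizer))`.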