How to use jb2k/bert-base-multilingual-cased-language-detection with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="jb2k/bert-base-multilingual-cased-language-detection")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("jb2k/bert-base-multilingual-cased-language-detection")
model = AutoModelForSequenceClassification.from_pretrained("jb2k/bert-base-multilingual-cased-language-detection")
```
bert-base-multilingual-cased-language-detection
A model for language detection with support for 45 languages
Model description
This model was created by fine-tuning bert-base-multilingual-cased on the common language dataset. The dataset covers 45 languages, listed below:
Arabic, Basque, Breton, Catalan, Chinese_China, Chinese_Hongkong, Chinese_Taiwan, Chuvash, Czech, Dhivehi, Dutch, English, Esperanto, Estonian, French, Frisian, Georgian, German, Greek, Hakha_Chin, Indonesian, Interlingua, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Maltese, Mongolian, Persian, Polish, Portuguese, Romanian, Romansh_Sursilvan, Russian, Sakha, Slovenian, Spanish, Swedish, Tamil, Tatar, Turkish, Ukranian, Welsh
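As a minimal sketch of how the model might be used to detect the language of a sentence, the snippet below wraps the Transformers `text-classification` pipeline. The helper names (`build_detector`, `best_label`, `detect_language`) are hypothetical and not part of the model card. It also assumes the pipeline returns a list of `{"label", "score"}` dictionaries when called with `top_k=None`; the exact label strings depend on the model's configuration and may be generic identifiers rather than the language names listed above.

```python
from typing import Any, Dict, List

MODEL_ID = "jb2k/bert-base-multilingual-cased-language-detection"


def build_detector():
    """Create a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the pure helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def best_label(predictions: List[Dict[str, Any]]) -> str:
    """Return the label with the highest score from a list of prediction dicts."""
    return max(predictions, key=lambda p: p["score"])["label"]


def detect_language(detector, text: str) -> str:
    """Classify `text` and return the top-scoring label (hypothetical helper)."""
    # Assumption: top_k=None asks the pipeline for scores over all classes.
    predictions = detector(text, top_k=None)
    return best_label(predictions)
```

A typical call would be `detect_language(build_detector(), "Bonjour tout le monde")`. Building the detector once and reusing it avoids reloading the model for every sentence.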
Evaluation
This model was evaluated on the test split of the common language dataset and achieved the following result:
- Accuracy: 97.8%
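The card does not describe the evaluation script, so the following is only a sketch of the standard accuracy definition (the fraction of examples whose predicted label equals the reference label). The function name `accuracy` and the sample labels are illustrative assumptions, not taken from the original evaluation.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of positions where the predicted label matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Illustrative labels only: three of the four predictions match.
example = accuracy(
    ["French", "German", "Dutch", "Welsh"],
    ["French", "German", "Dutch", "Tamil"],
)
print(example)  # 0.75
```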