How to use google-bert/bert-base-uncased with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="google-bert/bert-base-uncased")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-uncased")
```
Convert model to GGUF or format compatible with LM Studio
Hi @anthropoleo
BERT is a relatively small model and is not auto-regressive. In most cases, I would say a simple Python backend such as transformers should suffice, even for running the model locally on CPU.
To convert to GGUF, I would advise you to open an issue on the ggml / llama.cpp repositories on GitHub and see if the maintainers are keen to add BERT support!
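For readers who want to try the transformers route suggested above, here is a minimal sketch of running the model on CPU. The helper names `load_fill_mask` and `top_predictions` are hypothetical and chosen only for illustration. The sketch assumes that transformers and a backend such as PyTorch are installed, and that the model can be downloaded from the Hub on first use.

```python
def load_fill_mask(model_id="google-bert/bert-base-uncased"):
    """Build a fill-mask pipeline pinned to the CPU (hypothetical helper)."""
    # Imported lazily so that top_predictions() can be reused with any
    # compatible callable, even where transformers is not installed.
    from transformers import pipeline

    # device=-1 asks the pipeline to run on the CPU.
    return pipeline("fill-mask", model=model_id, device=-1)


def top_predictions(fill_mask, text, k=5):
    """Return up to k (token, score) pairs for the [MASK] slot in text."""
    if "[MASK]" not in text:
        raise ValueError("text must contain a [MASK] token")
    # For a single mask, the fill-mask pipeline returns a list of dicts
    # that include the keys "token_str" and "score".
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], round(r["score"], 4)) for r in results]
```

A typical call would be `top_predictions(load_fill_mask(), "Paris is the [MASK] of France.")`, which downloads the weights on first use and then runs entirely locally.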
Hey, did you find any way to do it?
Hi @anthropoleo, I'm in the same predicament as you. Did you find a solution? I'd be grateful if you could share it.
I would also like to find a similar solution. It would be great to be able to use BERT in any UI that runs on llama.cpp.
I found this thread on the llama.cpp issue tracker:
https://github.com/ggerganov/llama.cpp/issues/7924
It seems no one has succeeded yet in converting BERT to GGUF; there is a lack of interest from experienced quantizers.
The transformers library also does not yet support loading BERT from a GGUF file anyway, because of the same mapping issue: