Transformers documentation
Auto Classes
In many cases, the architecture you want to use can be guessed from the name or path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.
Instantiating one of AutoConfig, AutoModel, and AutoTokenizer will directly create a class of the relevant architecture. For instance,

```python
model = AutoModel.from_pretrained("google-bert/bert-base-cased")
```

will create a model that is an instance of BertModel.

There is one class of AutoModel for each task, and for each backend (PyTorch, TensorFlow, or Flax).
Extending the Auto Classes
Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a custom class of model NewModel, make sure you have a NewModelConfig, then you can add those to the auto classes like this:

```python
from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
```

You will then be able to use the auto classes as you normally would!

If your NewModelConfig is a subclass of PreTrainedConfig, make sure its model_type attribute is set to the same key you use when registering the config (here "new-model"). Likewise, if your NewModel is a subclass of PreTrainedModel, make sure its config_class attribute is set to the same class you use when registering the model (here NewModelConfig).
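As a self-contained illustration (pure Python, not the actual transformers internals), the registration mechanism boils down to two lookup tables, one from the model_type key to the config class and one from the config class to the model class, together with the consistency checks described above. The `NewModelConfig`/`NewModel` classes below are stand-ins, and `register_config`/`register_model` are hypothetical helpers mimicking `AutoConfig.register`/`AutoModel.register`:

```python
# Illustrative sketch of the auto-class registry: two mappings drive dispatch.

CONFIG_MAPPING = {}   # model_type string -> config class
MODEL_MAPPING = {}    # config class -> model class

class NewModelConfig:
    model_type = "new-model"  # must match the key used at registration

class NewModel:
    config_class = NewModelConfig  # must match the registered config class

def register_config(model_type, config_cls):
    # Mirrors AutoConfig.register: the key must equal the config's model_type.
    if getattr(config_cls, "model_type", None) != model_type:
        raise ValueError("model_type mismatch")
    CONFIG_MAPPING[model_type] = config_cls

def register_model(config_cls, model_cls):
    # Mirrors AutoModel.register: the model's config_class must match.
    if getattr(model_cls, "config_class", None) is not config_cls:
        raise ValueError("config_class mismatch")
    MODEL_MAPPING[config_cls] = model_cls

register_config("new-model", NewModelConfig)
register_model(NewModelConfig, NewModel)

# Dispatch: given a model_type, resolve the config class, then the model class.
cfg_cls = CONFIG_MAPPING["new-model"]
model_cls = MODEL_MAPPING[cfg_cls]
print(model_cls.__name__)  # NewModel
```

This is why the attribute checks matter: the lookup from key to config to model only succeeds when model_type and config_class agree with what was registered.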
AutoConfig
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
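The guard against direct instantiation can be sketched as follows. This is an illustrative pattern with a hypothetical `AutoConfigSketch` class, not the exact transformers source:

```python
class AutoConfigSketch:
    # Raises on direct construction; only the classmethod entry point works.
    def __init__(self):
        raise EnvironmentError(
            "AutoConfigSketch is designed to be instantiated using the "
            "`AutoConfigSketch.from_pretrained(name_or_path)` method."
        )

    @classmethod
    def from_pretrained(cls, name_or_path, **kwargs):
        # A real implementation would resolve and build a concrete config
        # class here; this sketch just reports what it would do.
        return f"would load a config from {name_or_path}"

try:
    AutoConfigSketch()
except EnvironmentError as err:
    print("direct init failed:", err)

print(AutoConfigSketch.from_pretrained("google-bert/bert-base-cased"))
```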
from_pretrained

( pretrained_model_name_or_path: str | os.PathLike[str], **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co.
  - A path to a directory containing a configuration file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final configuration object. If True, it returns a tuple (config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
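The return_unused_kwargs behavior amounts to splitting kwargs into keys the config already knows about (which override the loaded values) and everything else. A minimal sketch, using a hypothetical ToyConfig rather than a real transformers config class:

```python
class ToyConfig:
    # A stand-in config with two known attributes.
    def __init__(self):
        self.hidden_size = 768
        self.num_layers = 12

def apply_kwargs(config, return_unused_kwargs=False, **kwargs):
    # Keys matching existing config attributes override the loaded values;
    # the rest are collected and either returned or silently ignored.
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            unused[key] = value
    if return_unused_kwargs:
        return config, unused
    return config

cfg, unused = apply_kwargs(ToyConfig(), return_unused_kwargs=True,
                           hidden_size=1024, not_an_attribute=3)
print(cfg.hidden_size, unused)  # 1024 {'not_an_attribute': 3}
```

With return_unused_kwargs=False, the same call would return only the config and drop not_an_attribute without complaint, which is why the flag is useful for catching typos in overrides.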
Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that
is loaded, or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- EvollaModel — EvollaConfig (EvollaConfig model)
- afmoe — AfmoeConfig (AfmoeConfig model)
- aimv2 — Aimv2Config (Aimv2Config model)
- aimv2_text_model — Aimv2TextConfig (Aimv2TextConfig model)
- aimv2_vision_model — Aimv2VisionConfig (Aimv2VisionConfig model)
- albert — AlbertConfig (AlbertConfig model)
- align — AlignConfig (AlignConfig model)
- align_text_model — AlignTextConfig (AlignTextConfig model)
- align_vision_model — AlignVisionConfig (AlignVisionConfig model)
- altclip — AltCLIPConfig (AltCLIPConfig model)
- altclip_text_model — AltCLIPTextConfig (AltCLIPTextConfig model)
- altclip_vision_model — AltCLIPVisionConfig (AltCLIPVisionConfig model)
- apertus — ApertusConfig (ApertusConfig model)
- arcee — ArceeConfig (ArceeConfig model)
- aria — AriaConfig (AriaConfig model)
- aria_text — AriaTextConfig (AriaTextConfig model)
- audio-spectrogram-transformer — ASTConfig (ASTConfig model)
- audioflamingo3 — AudioFlamingo3Config (AudioFlamingo3Config model)
- audioflamingo3_encoder — AudioFlamingo3EncoderConfig (AudioFlamingo3EncoderConfig model)
- autoformer — AutoformerConfig (AutoformerConfig model)
- aya_vision — AyaVisionConfig (AyaVisionConfig model)
- bamba — BambaConfig (BambaConfig model)
- bark — BarkConfig (BarkConfig model)
- bart — BartConfig (BartConfig model)
- beit — BeitConfig (BeitConfig model)
- bert — BertConfig (BertConfig model)
- bert-generation — BertGenerationConfig (BertGenerationConfig model)
- big_bird — BigBirdConfig (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusConfig (BigBirdPegasusConfig model)
- biogpt — BioGptConfig (BioGptConfig model)
- bit — BitConfig (BitConfig model)
- bitnet — BitNetConfig (BitNetConfig model)
- blenderbot — BlenderbotConfig (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallConfig (BlenderbotSmallConfig model)
- blip — BlipConfig (BlipConfig model)
- blip-2 — Blip2Config (Blip2Config model)
- blip_2_qformer — Blip2QFormerConfig (Blip2QFormerConfig model)
- blip_2_vision_model — Blip2VisionConfig (Blip2VisionConfig model)
- blip_text_model — BlipTextConfig (BlipTextConfig model)
- blip_vision_model — BlipVisionConfig (BlipVisionConfig model)
- bloom — BloomConfig (BloomConfig model)
- blt — BltConfig (BltConfig model)
- blt_global_transformer — BltGlobalTransformerConfig (BltGlobalTransformerConfig model)
- blt_local_decoder — BltLocalDecoderConfig (BltLocalDecoderConfig model)
- blt_local_encoder — BltLocalEncoderConfig (BltLocalEncoderConfig model)
- blt_patcher — BltPatcherConfig (BltPatcherConfig model)
- bridgetower — BridgeTowerConfig (BridgeTowerConfig model)
- bridgetower_text_model — BridgeTowerTextConfig (BridgeTowerTextConfig model)
- bridgetower_vision_model — BridgeTowerVisionConfig (BridgeTowerVisionConfig model)
- bros — BrosConfig (BrosConfig model)
- camembert — CamembertConfig (CamembertConfig model)
- canine — CanineConfig (CanineConfig model)
- chameleon — ChameleonConfig (ChameleonConfig model)
- chameleon_vqgan — ChameleonVQVAEConfig (ChameleonVQVAEConfig model)
- chinese_clip — ChineseCLIPConfig (ChineseCLIPConfig model)
- chinese_clip_text_model — ChineseCLIPTextConfig (ChineseCLIPTextConfig model)
- chinese_clip_vision_model — ChineseCLIPVisionConfig (ChineseCLIPVisionConfig model)
- chmv2 — CHMv2Config (CHMv2Config model)
- clap — ClapConfig (ClapConfig model)
- clap_audio_model — ClapAudioConfig (ClapAudioConfig model)
- clap_text_model — ClapTextConfig (ClapTextConfig model)
- clip — CLIPConfig (CLIPConfig model)
- clip_text_model — CLIPTextConfig (CLIPTextConfig model)
- clip_vision_model — CLIPVisionConfig (CLIPVisionConfig model)
- clipseg — CLIPSegConfig (CLIPSegConfig model)
- clipseg_text_model — CLIPSegTextConfig (CLIPSegTextConfig model)
- clipseg_vision_model — CLIPSegVisionConfig (CLIPSegVisionConfig model)
- clvp — ClvpConfig (ClvpConfig model)
- clvp_decoder — ClvpDecoderConfig (ClvpDecoderConfig model)
- clvp_encoder — ClvpEncoderConfig (ClvpEncoderConfig model)
- codegen — CodeGenConfig (CodeGenConfig model)
- cohere — CohereConfig (CohereConfig model)
- cohere2 — Cohere2Config (Cohere2Config model)
- cohere2_vision — Cohere2VisionConfig (Cohere2VisionConfig model)
- cohere_asr — CohereAsrConfig (CohereAsrConfig model)
- colmodernvbert — ColModernVBertConfig (ColModernVBertConfig model)
- colpali — ColPaliConfig (ColPaliConfig model)
- colqwen2 — ColQwen2Config (ColQwen2Config model)
- conditional_detr — ConditionalDetrConfig (ConditionalDetrConfig model)
- convbert — ConvBertConfig (ConvBertConfig model)
- convnext — ConvNextConfig (ConvNextConfig model)
- convnextv2 — ConvNextV2Config (ConvNextV2Config model)
- cpmant — CpmAntConfig (CpmAntConfig model)
- csm — CsmConfig (CsmConfig model)
- csm_depth_decoder_model — CsmDepthDecoderConfig (CsmDepthDecoderConfig model)
- ctrl — CTRLConfig (CTRLConfig model)
- cvt — CvtConfig (CvtConfig model)
- cwm — CwmConfig (CwmConfig model)
- d_fine — DFineConfig (DFineConfig model)
- dab-detr — DabDetrConfig (DabDetrConfig model)
- dac — DacConfig (DacConfig model)
- data2vec-audio — Data2VecAudioConfig (Data2VecAudioConfig model)
- data2vec-text — Data2VecTextConfig (Data2VecTextConfig model)
- data2vec-vision — Data2VecVisionConfig (Data2VecVisionConfig model)
- dbrx — DbrxConfig (DbrxConfig model)
- deberta — DebertaConfig (DebertaConfig model)
- deberta-v2 — DebertaV2Config (DebertaV2Config model)
- decision_transformer — DecisionTransformerConfig (DecisionTransformerConfig model)
- deepseek_v2 — DeepseekV2Config (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3Config (DeepseekV3Config model)
- deepseek_vl — DeepseekVLConfig (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridConfig (DeepseekVLHybridConfig model)
- deformable_detr — DeformableDetrConfig (DeformableDetrConfig model)
- deit — DeiTConfig (DeiTConfig model)
- depth_anything — DepthAnythingConfig (DepthAnythingConfig model)
- depth_pro — DepthProConfig (DepthProConfig model)
- detr — DetrConfig (DetrConfig model)
- dia — DiaConfig (DiaConfig model)
- dia_decoder — DiaDecoderConfig (DiaDecoderConfig model)
- dia_encoder — DiaEncoderConfig (DiaEncoderConfig model)
- diffllama — DiffLlamaConfig (DiffLlamaConfig model)
- dinat — DinatConfig (DinatConfig model)
- dinov2 — Dinov2Config (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersConfig (Dinov2WithRegistersConfig model)
- dinov3_convnext — DINOv3ConvNextConfig (DINOv3ConvNextConfig model)
- dinov3_vit — DINOv3ViTConfig (DINOv3ViTConfig model)
- distilbert — DistilBertConfig (DistilBertConfig model)
- doge — DogeConfig (DogeConfig model)
- donut-swin — DonutSwinConfig (DonutSwinConfig model)
- dots1 — Dots1Config (Dots1Config model)
- dpr — DPRConfig (DPRConfig model)
- dpt — DPTConfig (DPTConfig model)
- edgetam — EdgeTamConfig (EdgeTamConfig model)
- edgetam_video — EdgeTamVideoConfig (EdgeTamVideoConfig model)
- edgetam_vision_model — EdgeTamVisionConfig (EdgeTamVisionConfig model)
- efficientloftr — EfficientLoFTRConfig (EfficientLoFTRConfig model)
- efficientnet — EfficientNetConfig (EfficientNetConfig model)
- electra — ElectraConfig (ElectraConfig model)
- emu3 — Emu3Config (Emu3Config model)
- emu3_text_model — Emu3TextConfig (Emu3TextConfig model)
- emu3_vqgan — Emu3VQVAEConfig (Emu3VQVAEConfig model)
- encodec — EncodecConfig (EncodecConfig model)
- encoder-decoder — EncoderDecoderConfig (EncoderDecoderConfig model)
- eomt — EomtConfig (EomtConfig model)
- eomt_dinov3 — EomtDinov3Config (EomtDinov3Config model)
- ernie — ErnieConfig (ErnieConfig model)
- ernie4_5 — Ernie4_5Config (Ernie4_5Config model)
- ernie4_5_moe — Ernie4_5_MoeConfig (Ernie4_5_MoeConfig model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeConfig (Ernie4_5_VLMoeConfig model)
- ernie4_5_vl_moe_text — Ernie4_5_VLMoeTextConfig (Ernie4_5_VLMoeTextConfig model)
- ernie4_5_vl_moe_vision — Ernie4_5_VLMoeVisionConfig (Ernie4_5_VLMoeVisionConfig model)
- esm — EsmConfig (EsmConfig model)
- eurobert — EuroBertConfig (EuroBertConfig model)
- evolla — EvollaConfig (EvollaConfig model)
- exaone4 — Exaone4Config (Exaone4Config model)
- exaone_moe — ExaoneMoeConfig (ExaoneMoeConfig model)
- falcon — FalconConfig (FalconConfig model)
- falcon_h1 — FalconH1Config (FalconH1Config model)
- falcon_mamba — FalconMambaConfig (FalconMambaConfig model)
- fast_vlm — FastVlmConfig (FastVlmConfig model)
- fastspeech2_conformer — FastSpeech2ConformerConfig (FastSpeech2ConformerConfig model)
- fastspeech2_conformer_hifigan — FastSpeech2ConformerHifiGanConfig (FastSpeech2ConformerHifiGanConfig model)
- fastspeech2_conformer_with_hifigan — FastSpeech2ConformerWithHifiGanConfig (FastSpeech2ConformerWithHifiGanConfig model)
- flaubert — FlaubertConfig (FlaubertConfig model)
- flava — FlavaConfig (FlavaConfig model)
- flava_image_model — FlavaImageConfig (FlavaImageConfig model)
- flava_multimodal_model — FlavaMultimodalConfig (FlavaMultimodalConfig model)
- flava_text_model — FlavaTextConfig (FlavaTextConfig model)
- flex_olmo — FlexOlmoConfig (FlexOlmoConfig model)
- florence2 — Florence2Config (Florence2Config model)
- florence_vision — Florence2VisionConfig (Florence2VisionConfig model)
- fnet — FNetConfig (FNetConfig model)
- focalnet — FocalNetConfig (FocalNetConfig model)
- fsmt — FSMTConfig (FSMTConfig model)
- funnel — FunnelConfig (FunnelConfig model)
- fuyu — FuyuConfig (FuyuConfig model)
- gemma — GemmaConfig (GemmaConfig model)
- gemma2 — Gemma2Config (Gemma2Config model)
- gemma3 — Gemma3Config (Gemma3Config model)
- gemma3_text — Gemma3TextConfig (Gemma3TextConfig model)
- gemma3n — Gemma3nConfig (Gemma3nConfig model)
- gemma3n_audio — Gemma3nAudioConfig (Gemma3nAudioConfig model)
- gemma3n_text — Gemma3nTextConfig (Gemma3nTextConfig model)
- gemma3n_vision — Gemma3nVisionConfig (Gemma3nVisionConfig model)
- gemma4 — Gemma4Config (Gemma4Config model)
- gemma4_audio — Gemma4AudioConfig (Gemma4AudioConfig model)
- gemma4_text — Gemma4TextConfig (Gemma4TextConfig model)
- gemma4_vision — Gemma4VisionConfig (Gemma4VisionConfig model)
- git — GitConfig (GitConfig model)
- git_vision_model — GitVisionConfig (GitVisionConfig model)
- glm — GlmConfig (GlmConfig model)
- glm4 — Glm4Config (Glm4Config model)
- glm46v — Glm46VConfig (Glm46VConfig model)
- glm4_moe — Glm4MoeConfig (Glm4MoeConfig model)
- glm4_moe_lite — Glm4MoeLiteConfig (Glm4MoeLiteConfig model)
- glm4v — Glm4vConfig (Glm4vConfig model)
- glm4v_moe — Glm4vMoeConfig (Glm4vMoeConfig model)
- glm4v_moe_text — Glm4vMoeTextConfig (Glm4vMoeTextConfig model)
- glm4v_moe_vision — Glm4vMoeVisionConfig (Glm4vMoeVisionConfig model)
- glm4v_text — Glm4vTextConfig (Glm4vTextConfig model)
- glm4v_vision — Glm4vVisionConfig (Glm4vVisionConfig model)
- glm_image — GlmImageConfig (GlmImageConfig model)
- glm_image_text — GlmImageTextConfig (GlmImageTextConfig model)
- glm_image_vision — GlmImageVisionConfig (GlmImageVisionConfig model)
- glm_image_vqmodel — GlmImageVQVAEConfig (GlmImageVQVAEConfig model)
- glm_moe_dsa — GlmMoeDsaConfig (GlmMoeDsaConfig model)
- glm_ocr — GlmOcrConfig (GlmOcrConfig model)
- glm_ocr_text — GlmOcrTextConfig (GlmOcrTextConfig model)
- glm_ocr_vision — GlmOcrVisionConfig (GlmOcrVisionConfig model)
- glmasr — GlmAsrConfig (GlmAsrConfig model)
- glmasr_encoder — GlmAsrEncoderConfig (GlmAsrEncoderConfig model)
- glpn — GLPNConfig (GLPNConfig model)
- got_ocr2 — GotOcr2Config (GotOcr2Config model)
- gpt-sw3 — GPT2Config (GPT2Config model)
- gpt2 — GPT2Config (GPT2Config model)
- gpt_bigcode — GPTBigCodeConfig (GPTBigCodeConfig model)
- gpt_neo — GPTNeoConfig (GPTNeoConfig model)
- gpt_neox — GPTNeoXConfig (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseConfig (GPTNeoXJapaneseConfig model)
- gpt_oss — GptOssConfig (GptOssConfig model)
- gptj — GPTJConfig (GPTJConfig model)
- granite — GraniteConfig (GraniteConfig model)
- granite_speech — GraniteSpeechConfig (GraniteSpeechConfig model)
- granite_speech_encoder — GraniteSpeechEncoderConfig (GraniteSpeechEncoderConfig model)
- granitemoe — GraniteMoeConfig (GraniteMoeConfig model)
- granitemoehybrid — GraniteMoeHybridConfig (GraniteMoeHybridConfig model)
- granitemoeshared — GraniteMoeSharedConfig (GraniteMoeSharedConfig model)
- grounding-dino — GroundingDinoConfig (GroundingDinoConfig model)
- groupvit — GroupViTConfig (GroupViTConfig model)
- groupvit_text_model — GroupViTTextConfig (GroupViTTextConfig model)
- groupvit_vision_model — GroupViTVisionConfig (GroupViTVisionConfig model)
- helium — HeliumConfig (HeliumConfig model)
- hgnet_v2 — HGNetV2Config (HGNetV2Config model)
- hiera — HieraConfig (HieraConfig model)
- higgs_audio_v2 — HiggsAudioV2Config (HiggsAudioV2Config model)
- higgs_audio_v2_tokenizer — HiggsAudioV2TokenizerConfig (HiggsAudioV2TokenizerConfig model)
- hubert — HubertConfig (HubertConfig model)
- hunyuan_v1_dense — HunYuanDenseV1Config (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1Config (HunYuanMoEV1Config model)
- ibert — IBertConfig (IBertConfig model)
- idefics — IdeficsConfig (IdeficsConfig model)
- idefics2 — Idefics2Config (Idefics2Config model)
- idefics2_perceiver — Idefics2PerceiverConfig (Idefics2PerceiverConfig model)
- idefics2_vision — Idefics2VisionConfig (Idefics2VisionConfig model)
- idefics3 — Idefics3Config (Idefics3Config model)
- idefics3_vision — Idefics3VisionConfig (Idefics3VisionConfig model)
- idefics_perciever — IdeficsPerceiverConfig (IdeficsPerceiverConfig model)
- idefics_vision — IdeficsVisionConfig (IdeficsVisionConfig model)
- ijepa — IJepaConfig (IJepaConfig model)
- imagegpt — ImageGPTConfig (ImageGPTConfig model)
- informer — InformerConfig (InformerConfig model)
- instructblip — InstructBlipConfig (InstructBlipConfig model)
- instructblip_qformer — InstructBlipQFormerConfig (InstructBlipQFormerConfig model)
- instructblip_vision_model — InstructBlipVisionConfig (InstructBlipVisionConfig model)
- instructblipvideo — InstructBlipVideoConfig (InstructBlipVideoConfig model)
- instructblipvideo_qformer — InstructBlipVideoQFormerConfig (InstructBlipVideoQFormerConfig model)
- instructblipvideo_vision_model — InstructBlipVideoVisionConfig (InstructBlipVideoVisionConfig model)
- internvl — InternVLConfig (InternVLConfig model)
- internvl_vision — InternVLVisionConfig (InternVLVisionConfig model)
- jais2 — Jais2Config (Jais2Config model)
- jamba — JambaConfig (JambaConfig model)
- janus — JanusConfig (JanusConfig model)
- janus_vision_model — JanusVisionConfig (JanusVisionConfig model)
- janus_vqgan — JanusVQVAEConfig (JanusVQVAEConfig model)
- jetmoe — JetMoeConfig (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3Config (JinaEmbeddingsV3Config model)
- kosmos-2 — Kosmos2Config (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5Config (Kosmos2_5Config model)
- kosmos_2_5_text_model — Kosmos2_5TextConfig (Kosmos2_5TextConfig model)
- kosmos_2_5_vision_model — Kosmos2_5VisionConfig (Kosmos2_5VisionConfig model)
- kosmos_2_text_model — Kosmos2TextConfig (Kosmos2TextConfig model)
- kosmos_2_vision_model — Kosmos2VisionConfig (Kosmos2VisionConfig model)
- kyutai_speech_to_text — KyutaiSpeechToTextConfig (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrCTCConfig (LasrCTCConfig model)
- lasr_encoder — LasrEncoderConfig (LasrEncoderConfig model)
- layoutlm — LayoutLMConfig (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2Config (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Config (LayoutLMv3Config model)
- layoutxlm — LayoutXLMConfig (LayoutXLMConfig model)
- led — LEDConfig (LEDConfig model)
- levit — LevitConfig (LevitConfig model)
- lfm2 — Lfm2Config (Lfm2Config model)
- lfm2_moe — Lfm2MoeConfig (Lfm2MoeConfig model)
- lfm2_vl — Lfm2VlConfig (Lfm2VlConfig model)
- lightglue — LightGlueConfig (LightGlueConfig model)
- lighton_ocr — LightOnOcrConfig (LightOnOcrConfig model)
- lilt — LiltConfig (LiltConfig model)
- llama — LlamaConfig (LlamaConfig model)
- llama4 — Llama4Config (Llama4Config model)
- llama4_text — Llama4TextConfig (Llama4TextConfig model)
- llama4_vision_model — Llama4VisionConfig (Llama4VisionConfig model)
- llava — LlavaConfig (LlavaConfig model)
- llava_next — LlavaNextConfig (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoConfig (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionConfig (LlavaOnevisionConfig model)
- longcat_flash — LongcatFlashConfig (LongcatFlashConfig model)
- longformer — LongformerConfig (LongformerConfig model)
- longt5 — LongT5Config (LongT5Config model)
- luke — LukeConfig (LukeConfig model)
- lw_detr — LwDetrConfig (LwDetrConfig model)
- lw_detr_vit — LwDetrViTConfig (LwDetrViTConfig model)
- lxmert — LxmertConfig (LxmertConfig model)
- m2m_100 — M2M100Config (M2M100Config model)
- mamba — MambaConfig (MambaConfig model)
- mamba2 — Mamba2Config (Mamba2Config model)
- marian — MarianConfig (MarianConfig model)
- markuplm — MarkupLMConfig (MarkupLMConfig model)
- mask2former — Mask2FormerConfig (Mask2FormerConfig model)
- maskformer — MaskFormerConfig (MaskFormerConfig model)
- maskformer-swin — MaskFormerSwinConfig (MaskFormerSwinConfig model)
- mbart — MBartConfig (MBartConfig model)
- megatron-bert — MegatronBertConfig (MegatronBertConfig model)
- metaclip_2 — MetaClip2Config (MetaClip2Config model)
- metaclip_2_text_model — MetaClip2TextConfig (MetaClip2TextConfig model)
- metaclip_2_vision_model — MetaClip2VisionConfig (MetaClip2VisionConfig model)
- mgp-str — MgpstrConfig (MgpstrConfig model)
- mimi — MimiConfig (MimiConfig model)
- minimax — MiniMaxConfig (MiniMaxConfig model)
- minimax_m2 — MiniMaxM2Config (MiniMaxM2Config model)
- ministral — MinistralConfig (MinistralConfig model)
- ministral3 — Ministral3Config (Ministral3Config model)
- mistral — MistralConfig (MistralConfig model)
- mistral3 — Mistral3Config (Mistral3Config model)
- mistral4 — Mistral4Config (Mistral4Config model)
- mixtral — MixtralConfig (MixtralConfig model)
- mlcd — MLCDVisionConfig (MLCDVisionConfig model)
- mlcd_vision_model — MLCDVisionConfig (MLCDVisionConfig model)
- mllama — MllamaConfig (MllamaConfig model)
- mllama_text_model — MllamaTextConfig (MllamaTextConfig model)
- mllama_vision_model — MllamaVisionConfig (MllamaVisionConfig model)
- mm-grounding-dino — MMGroundingDinoConfig (MMGroundingDinoConfig model)
- mobilebert — MobileBertConfig (MobileBertConfig model)
- mobilenet_v1 — MobileNetV1Config (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2Config (MobileNetV2Config model)
- mobilevit — MobileViTConfig (MobileViTConfig model)
- mobilevitv2 — MobileViTV2Config (MobileViTV2Config model)
- modernbert — ModernBertConfig (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderConfig (ModernBertDecoderConfig model)
- modernvbert — ModernVBertConfig (ModernVBertConfig model)
- moonshine — MoonshineConfig (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingConfig (MoonshineStreamingConfig model)
- moonshine_streaming_encoder — MoonshineStreamingEncoderConfig (MoonshineStreamingEncoderConfig model)
- moshi — MoshiConfig (MoshiConfig model)
- moshi_depth — MoshiDepthConfig (MoshiDepthConfig model)
- mpnet — MPNetConfig (MPNetConfig model)
- mpt — MptConfig (MptConfig model)
- mra — MraConfig (MraConfig model)
- mt5 — MT5Config (MT5Config model)
- musicflamingo — MusicFlamingoConfig (MusicFlamingoConfig model)
- musicgen — MusicgenConfig (MusicgenConfig model)
- musicgen_decoder — MusicgenDecoderConfig (MusicgenDecoderConfig model)
- musicgen_melody — MusicgenMelodyConfig (MusicgenMelodyConfig model)
- musicgen_melody_decoder — MusicgenMelodyDecoderConfig (MusicgenMelodyDecoderConfig model)
- mvp — MvpConfig (MvpConfig model)
- nanochat — NanoChatConfig (NanoChatConfig model)
- nemotron — NemotronConfig (NemotronConfig model)
- nemotron_h — NemotronHConfig (NemotronHConfig model)
- nllb-moe — NllbMoeConfig (NllbMoeConfig model)
- nomic_bert — NomicBertConfig (NomicBertConfig model)
- nougat — NougatConfig (NougatConfig model)
- nystromformer — NystromformerConfig (NystromformerConfig model)
- olmo — OlmoConfig (OlmoConfig model)
- olmo2 — Olmo2Config (Olmo2Config model)
- olmo3 — Olmo3Config (Olmo3Config model)
- olmo_hybrid — OlmoHybridConfig (OlmoHybridConfig model)
- olmoe — OlmoeConfig (OlmoeConfig model)
- omdet-turbo — OmDetTurboConfig (OmDetTurboConfig model)
- oneformer — OneFormerConfig (OneFormerConfig model)
- openai-gpt — OpenAIGPTConfig (OpenAIGPTConfig model)
- opt — OPTConfig (OPTConfig model)
- ovis2 — Ovis2Config (Ovis2Config model)
- owlv2 — Owlv2Config (Owlv2Config model)
- owlv2_text_model — Owlv2TextConfig (Owlv2TextConfig model)
- owlv2_vision_model — Owlv2VisionConfig (Owlv2VisionConfig model)
- owlvit — OwlViTConfig (OwlViTConfig model)
- owlvit_text_model — OwlViTTextConfig (OwlViTTextConfig model)
- owlvit_vision_model — OwlViTVisionConfig (OwlViTVisionConfig model)
- paddleocr_vl — PaddleOCRVLConfig (PaddleOCRVLConfig model)
- paddleocr_vl_text — PaddleOCRTextConfig (PaddleOCRTextConfig model)
- paddleocr_vl_vision — PaddleOCRVisionConfig (PaddleOCRVisionConfig model)
- paligemma — PaliGemmaConfig (PaliGemmaConfig model)
- parakeet_ctc — ParakeetCTCConfig (ParakeetCTCConfig model)
- parakeet_encoder — ParakeetEncoderConfig (ParakeetEncoderConfig model)
- patchtsmixer — PatchTSMixerConfig (PatchTSMixerConfig model)
- patchtst — PatchTSTConfig (PatchTSTConfig model)
- pe_audio — PeAudioConfig (PeAudioConfig model)
- pe_audio_encoder — PeAudioEncoderConfig (PeAudioEncoderConfig model)
- pe_audio_video — PeAudioVideoConfig (PeAudioVideoConfig model)
- pe_audio_video_encoder — PeAudioVideoEncoderConfig (PeAudioVideoEncoderConfig model)
- pe_video — PeVideoConfig (PeVideoConfig model)
- pe_video_encoder — PeVideoEncoderConfig (PeVideoEncoderConfig model)
- pegasus — PegasusConfig (PegasusConfig model)
- pegasus_x — PegasusXConfig (PegasusXConfig model)
- perceiver — PerceiverConfig (PerceiverConfig model)
- perception_lm — PerceptionLMConfig (PerceptionLMConfig model)
- persimmon — PersimmonConfig (PersimmonConfig model)
- phi — PhiConfig (PhiConfig model)
- phi3 — Phi3Config (Phi3Config model)
- phi4_multimodal — Phi4MultimodalConfig (Phi4MultimodalConfig model)
- phi4_multimodal_audio — Phi4MultimodalAudioConfig (Phi4MultimodalAudioConfig model)
- phi4_multimodal_vision — Phi4MultimodalVisionConfig (Phi4MultimodalVisionConfig model)
- phimoe — PhimoeConfig (PhimoeConfig model)
- pi0 — PI0Config (PI0Config model)
- pix2struct — Pix2StructConfig (Pix2StructConfig model)
- pix2struct_text_model — Pix2StructTextConfig (Pix2StructTextConfig model)
- pix2struct_vision_model — Pix2StructVisionConfig (Pix2StructVisionConfig model)
- pixio — PixioConfig (PixioConfig model)
- pixtral — PixtralVisionConfig (PixtralVisionConfig model)
- plbart — PLBartConfig (PLBartConfig model)
- poolformer — PoolFormerConfig (PoolFormerConfig model)
- pop2piano — Pop2PianoConfig (Pop2PianoConfig model)
- pp_chart2table — PPChart2TableConfig (PPChart2TableConfig model)
- pp_doclayout_v2 — PPDocLayoutV2Config (PPDocLayoutV2Config model)
- pp_doclayout_v3 — PPDocLayoutV3Config (PPDocLayoutV3Config model)
- pp_lcnet — PPLCNetConfig (PPLCNetConfig model)
- pp_lcnet_v3 — PPLCNetV3Config (PPLCNetV3Config model)
- pp_ocrv5_mobile_det — PPOCRV5MobileDetConfig (PPOCRV5MobileDetConfig model)
- pp_ocrv5_mobile_rec — PPOCRV5MobileRecConfig (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_det — PPOCRV5ServerDetConfig (PPOCRV5ServerDetConfig model)
- pp_ocrv5_server_rec — PPOCRV5ServerRecConfig (PPOCRV5ServerRecConfig model)
- prompt_depth_anything — PromptDepthAnythingConfig (PromptDepthAnythingConfig model)
- prophetnet — ProphetNetConfig (ProphetNetConfig model)
- pvt — PvtConfig (PvtConfig model)
- pvt_v2 — PvtV2Config (PvtV2Config model)
- qianfan_ocr — QianfanOCRConfig (QianfanOCRConfig model)
- qianfan_ocr_vision — QianfanOCRVisionConfig (QianfanOCRVisionConfig model)
- qwen2 — Qwen2Config (Qwen2Config model)
- qwen2_5_omni — Qwen2_5OmniConfig (Qwen2_5OmniConfig model)
- qwen2_5_omni_audio_encoder — Qwen2_5OmniAudioEncoderConfig (Qwen2_5OmniAudioEncoderConfig model)
- qwen2_5_omni_bigvgan — Qwen2_5OmniBigVGANConfig (Qwen2_5OmniBigVGANConfig model)
- qwen2_5_omni_dit — Qwen2_5OmniDiTConfig (Qwen2_5OmniDiTConfig model)
- qwen2_5_omni_talker — Qwen2_5OmniTalkerConfig (Qwen2_5OmniTalkerConfig model)
- qwen2_5_omni_text — Qwen2_5OmniTextConfig (Qwen2_5OmniTextConfig model)
- qwen2_5_omni_thinker — Qwen2_5OmniThinkerConfig (Qwen2_5OmniThinkerConfig model)
- qwen2_5_omni_token2wav — Qwen2_5OmniToken2WavConfig (Qwen2_5OmniToken2WavConfig model)
- qwen2_5_omni_vision_encoder — Qwen2_5OmniVisionEncoderConfig (Qwen2_5OmniVisionEncoderConfig model)
- qwen2_5_vl — Qwen2_5_VLConfig (Qwen2_5_VLConfig model)
- qwen2_5_vl_text — Qwen2_5_VLTextConfig (Qwen2_5_VLTextConfig model)
- qwen2_5_vl_vision — Qwen2_5_VLVisionConfig (Qwen2_5_VLVisionConfig model)
- qwen2_audio — Qwen2AudioConfig (Qwen2AudioConfig model)
- qwen2_audio_encoder — Qwen2AudioEncoderConfig (Qwen2AudioEncoderConfig model)
- qwen2_moe — Qwen2MoeConfig (Qwen2MoeConfig model)
- qwen2_vl — Qwen2VLConfig (Qwen2VLConfig model)
- qwen2_vl_text — Qwen2VLTextConfig (Qwen2VLTextConfig model)
- qwen2_vl_vision — Qwen2VLVisionConfig (Qwen2VLVisionConfig model)
- qwen3 — Qwen3Config (Qwen3Config model)
- qwen3_5 — Qwen3_5Config (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeConfig (Qwen3_5MoeConfig model)
- qwen3_5_moe_text — Qwen3_5MoeTextConfig (Qwen3_5MoeTextConfig model)
- qwen3_5_moe_vision — Qwen3_5MoeVisionConfig (Qwen3_5MoeVisionConfig model)
- qwen3_5_text — Qwen3_5TextConfig (Qwen3_5TextConfig model)
- qwen3_5_vision — Qwen3_5VisionConfig (Qwen3_5VisionConfig model)
- qwen3_moe — Qwen3MoeConfig (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextConfig (Qwen3NextConfig model)
- qwen3_omni_moe — Qwen3OmniMoeConfig (Qwen3OmniMoeConfig model)
- qwen3_omni_moe_audio_encoder — Qwen3OmniMoeAudioEncoderConfig (Qwen3OmniMoeAudioEncoderConfig model)
- qwen3_omni_moe_talker_code_predictor — Qwen3OmniMoeTalkerCodePredictorConfig (Qwen3OmniMoeTalkerCodePredictorConfig model)
- qwen3_omni_moe_talker_text — Qwen3OmniMoeTalkerTextConfig (Qwen3OmniMoeTalkerTextConfig model)
- qwen3_omni_moe_text — Qwen3OmniMoeTextConfig (Qwen3OmniMoeTextConfig model)
- qwen3_omni_moe_thinker — Qwen3OmniMoeThinkerConfig (Qwen3OmniMoeThinkerConfig model)
- qwen3_omni_moe_vision_encoder — Qwen3OmniMoeVisionEncoderConfig (Qwen3OmniMoeVisionEncoderConfig model)
- qwen3_vl — Qwen3VLConfig (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeConfig (Qwen3VLMoeConfig model)
- qwen3_vl_moe_text — Qwen3VLMoeTextConfig (Qwen3VLMoeTextConfig model)
- qwen3_vl_moe_vision — Qwen3VLMoeVisionConfig (Qwen3VLMoeVisionConfig model)
- qwen3_vl_text — Qwen3VLTextConfig (Qwen3VLTextConfig model)
- qwen3_vl_vision — Qwen3VLVisionConfig (Qwen3VLVisionConfig model)
- rag — RagConfig (RagConfig model)
- recurrent_gemma — RecurrentGemmaConfig (RecurrentGemmaConfig model)
- reformer — ReformerConfig (ReformerConfig model)
- regnet — RegNetConfig (RegNetConfig model)
- rembert — RemBertConfig (RemBertConfig model)
- resnet — ResNetConfig (ResNetConfig model)
- roberta — RobertaConfig (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormConfig (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertConfig (RoCBertConfig model)
- roformer — RoFormerConfig (RoFormerConfig model)
- rt_detr — RTDetrConfig (RTDetrConfig model)
- rt_detr_resnet — RTDetrResNetConfig (RTDetrResNetConfig model)
- rt_detr_v2 — RTDetrV2Config (RTDetrV2Config model)
- rwkv — RwkvConfig (RwkvConfig model)
- sam — SamConfig (SamConfig model)
- sam2 — Sam2Config (Sam2Config model)
- sam2_hiera_det_model — Sam2HieraDetConfig (Sam2HieraDetConfig model)
- sam2_video — Sam2VideoConfig (Sam2VideoConfig model)
- sam2_vision_model — Sam2VisionConfig (Sam2VisionConfig model)
- sam3 — Sam3Config (Sam3Config model)
- sam3_detr_decoder — Sam3DETRDecoderConfig (Sam3DETRDecoderConfig model)
- sam3_detr_encoder — Sam3DETREncoderConfig (Sam3DETREncoderConfig model)
- sam3_geometry_encoder — Sam3GeometryEncoderConfig (Sam3GeometryEncoderConfig model)
- sam3_lite_text — Sam3LiteTextConfig (Sam3LiteTextConfig model)
- sam3_lite_text_detr_decoder — Sam3LiteTextDETRDecoderConfig (Sam3LiteTextDETRDecoderConfig model)
- sam3_lite_text_detr_encoder — Sam3LiteTextDETREncoderConfig (Sam3LiteTextDETREncoderConfig model)
- sam3_lite_text_geometry_encoder — Sam3LiteTextGeometryEncoderConfig (Sam3LiteTextGeometryEncoderConfig model)
- sam3_lite_text_mask_decoder — Sam3LiteTextMaskDecoderConfig (Sam3LiteTextMaskDecoderConfig model)
- sam3_lite_text_text_model — Sam3LiteTextTextConfig (Sam3LiteTextTextConfig model)
- sam3_mask_decoder — Sam3MaskDecoderConfig (Sam3MaskDecoderConfig model)
- sam3_tracker — Sam3TrackerConfig (Sam3TrackerConfig model)
- sam3_tracker_video — Sam3TrackerVideoConfig (Sam3TrackerVideoConfig model)
- sam3_video — Sam3VideoConfig (Sam3VideoConfig model)
- sam3_vision_model — Sam3VisionConfig (Sam3VisionConfig model)
- sam3_vit_model — Sam3ViTConfig (Sam3ViTConfig model)
- sam_hq — SamHQConfig (SamHQConfig model)
- sam_hq_vision_model — SamHQVisionConfig (SamHQVisionConfig model)
- sam_vision_model — SamVisionConfig (SamVisionConfig model)
- seamless_m4t — SeamlessM4TConfig (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2Config (SeamlessM4Tv2Config model)
- seed_oss — SeedOssConfig (SeedOssConfig model)
- segformer — SegformerConfig (SegformerConfig model)
- seggpt — SegGptConfig (SegGptConfig model)
- sew — SEWConfig (SEWConfig model)
- sew-d — SEWDConfig (SEWDConfig model)
- shieldgemma2 — ShieldGemma2Config (ShieldGemma2Config model)
- siglip — SiglipConfig (SiglipConfig model)
- siglip2 — Siglip2Config (Siglip2Config model)
- siglip2_text_model — Siglip2TextConfig (Siglip2TextConfig model)
- siglip2_vision_model — Siglip2VisionConfig (Siglip2VisionConfig model)
- siglip_text_model — SiglipTextConfig (SiglipTextConfig model)
- siglip_vision_model — SiglipVisionConfig (SiglipVisionConfig model)
- slanext — SLANeXtConfig (SLANeXtConfig model)
- smollm3 — SmolLM3Config (SmolLM3Config model)
- smolvlm — SmolVLMConfig (SmolVLMConfig model)
- smolvlm_vision — SmolVLMVisionConfig (SmolVLMVisionConfig model)
- solar_open — SolarOpenConfig (SolarOpenConfig model)
- speech-encoder-decoder — SpeechEncoderDecoderConfig (SpeechEncoderDecoderConfig model)
- speech_to_text — Speech2TextConfig (Speech2TextConfig model)
- speecht5 — SpeechT5Config (SpeechT5Config model)
- speecht5_hifigan — SpeechT5HifiGanConfig (SpeechT5HifiGanConfig model)
- splinter — SplinterConfig (SplinterConfig model)
- squeezebert — SqueezeBertConfig (SqueezeBertConfig model)
- stablelm —
StableLmConfig(StableLmConfig model) - starcoder2 —
Starcoder2Config(Starcoder2Config model) - superglue —
SuperGlueConfig(SuperGlueConfig model) - superpoint —
SuperPointConfig(SuperPointConfig model) - swiftformer —
SwiftFormerConfig(SwiftFormerConfig model) - swin —
SwinConfig(SwinConfig model) - swin2sr —
Swin2SRConfig(Swin2SRConfig model) - swinv2 —
Swinv2Config(Swinv2Config model) - switch_transformers —
SwitchTransformersConfig(SwitchTransformersConfig model) - t5 —
T5Config(T5Config model) - t5_gemma_module —
T5GemmaModuleConfig(T5GemmaModuleConfig model) - t5gemma —
T5GemmaConfig(T5GemmaConfig model) - t5gemma2 —
T5Gemma2Config(T5Gemma2Config model) - t5gemma2_decoder —
T5Gemma2DecoderConfig(T5Gemma2DecoderConfig model) - t5gemma2_encoder —
T5Gemma2EncoderConfig(T5Gemma2EncoderConfig model) - t5gemma2_text —
T5Gemma2TextConfig(T5Gemma2TextConfig model) - table-transformer —
TableTransformerConfig(TableTransformerConfig model) - tapas —
TapasConfig(TapasConfig model) - textnet —
TextNetConfig(TextNetConfig model) - time_series_transformer —
TimeSeriesTransformerConfig(TimeSeriesTransformerConfig model) - timesfm —
TimesFmConfig(TimesFmConfig model) - timesfm2_5 —
TimesFm2_5Config(TimesFm2_5Config model) - timesformer —
TimesformerConfig(TimesformerConfig model) - timm_backbone —
TimmBackboneConfig(TimmBackboneConfig model) - timm_wrapper —
TimmWrapperConfig(TimmWrapperConfig model) - trocr —
TrOCRConfig(TrOCRConfig model) - tvp —
TvpConfig(TvpConfig model) - udop —
UdopConfig(UdopConfig model) - umt5 —
UMT5Config(UMT5Config model) - unispeech —
UniSpeechConfig(UniSpeechConfig model) - unispeech-sat —
UniSpeechSatConfig(UniSpeechSatConfig model) - univnet —
UnivNetConfig(UnivNetConfig model) - upernet —
UperNetConfig(UperNetConfig model) - uvdoc —
UVDocConfig(UVDocConfig model) - uvdoc_backbone —
UVDocBackboneConfig(UVDocBackboneConfig model) - vaultgemma —
VaultGemmaConfig(VaultGemmaConfig model) - vibevoice_acoustic_tokenizer —
VibeVoiceAcousticTokenizerConfig(VibeVoiceAcousticTokenizerConfig model) - vibevoice_acoustic_tokenizer_decoder —
VibeVoiceAcousticTokenizerDecoderConfig(VibeVoiceAcousticTokenizerDecoderConfig model) - vibevoice_acoustic_tokenizer_encoder —
VibeVoiceAcousticTokenizerEncoderConfig(VibeVoiceAcousticTokenizerEncoderConfig model) - vibevoice_asr —
VibeVoiceAsrConfig(VibeVoiceAsrConfig model) - video_llama_3 —
VideoLlama3Config(VideoLlama3Config model) - video_llama_3_vision —
VideoLlama3VisionConfig(VideoLlama3VisionConfig model) - video_llava —
VideoLlavaConfig(VideoLlavaConfig model) - videomae —
VideoMAEConfig(VideoMAEConfig model) - videomt —
VideomtConfig(VideomtConfig model) - vilt —
ViltConfig(ViltConfig model) - vipllava —
VipLlavaConfig(VipLlavaConfig model) - vision-encoder-decoder —
VisionEncoderDecoderConfig(VisionEncoderDecoderConfig model) - vision-text-dual-encoder —
VisionTextDualEncoderConfig(VisionTextDualEncoderConfig model) - visual_bert —
VisualBertConfig(VisualBertConfig model) - vit —
ViTConfig(ViTConfig model) - vit_mae —
ViTMAEConfig(ViTMAEConfig model) - vit_msn —
ViTMSNConfig(ViTMSNConfig model) - vitdet —
VitDetConfig(VitDetConfig model) - vitmatte —
VitMatteConfig(VitMatteConfig model) - vitpose —
VitPoseConfig(VitPoseConfig model) - vitpose_backbone —
VitPoseBackboneConfig(VitPoseBackboneConfig model) - vits —
VitsConfig(VitsConfig model) - vivit —
VivitConfig(VivitConfig model) - vjepa2 —
VJEPA2Config(VJEPA2Config model) - voxtral —
VoxtralConfig(VoxtralConfig model) - voxtral_encoder —
VoxtralEncoderConfig(VoxtralEncoderConfig model) - voxtral_realtime —
VoxtralRealtimeConfig(VoxtralRealtimeConfig model) - voxtral_realtime_encoder —
VoxtralRealtimeEncoderConfig(VoxtralRealtimeEncoderConfig model) - voxtral_realtime_text —
VoxtralRealtimeTextConfig(VoxtralRealtimeTextConfig model) - wav2vec2 —
Wav2Vec2Config(Wav2Vec2Config model) - wav2vec2-bert —
Wav2Vec2BertConfig(Wav2Vec2BertConfig model) - wav2vec2-conformer —
Wav2Vec2ConformerConfig(Wav2Vec2ConformerConfig model) - wavlm —
WavLMConfig(WavLMConfig model) - whisper —
WhisperConfig(WhisperConfig model) - xclip —
XCLIPConfig(XCLIPConfig model) - xclip_text_model —
XCLIPTextConfig(XCLIPTextConfig model) - xclip_vision_model —
XCLIPVisionConfig(XCLIPVisionConfig model) - xcodec —
XcodecConfig(XcodecConfig model) - xglm —
XGLMConfig(XGLMConfig model) - xlm —
XLMConfig(XLMConfig model) - xlm-roberta —
XLMRobertaConfig(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaXLConfig(XLMRobertaXLConfig model) - xlnet —
XLNetConfig(XLNetConfig model) - xlstm —
xLSTMConfig(xLSTMConfig model) - xmod —
XmodConfig(XmodConfig model) - yolos —
YolosConfig(YolosConfig model) - yoso —
YosoConfig(YosoConfig model) - youtu —
YoutuConfig(YoutuConfig model) - zamba —
ZambaConfig(ZambaConfig model) - zamba2 —
Zamba2Config(Zamba2Config model) - zoedepth —
ZoeDepthConfig(ZoeDepthConfig model)
Examples:
>>> from transformers import AutoConfig
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")
>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")
>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")
>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained(
... "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}

register
< source >( model_type, config, exist_ok = False )
Parameters
- model_type (str) — The model type like “bert” or “gpt”.
- config (PreTrainedConfig) — The config to register.
Register a new configuration for this class.
AutoTokenizer
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path, *inputs, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path to a single saved vocabulary file, if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes.)
- inputs (additional positional arguments, optional) — Will be passed along to the Tokenizer __init__() method.
- config (PreTrainedConfig, optional) — The configuration object used to determine the tokenizer class to instantiate.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- subfolder (str, optional) — In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
- tokenizer_type (str, optional) — Tokenizer type to be loaded.
- backend (str, optional, defaults to "tokenizers") — Backend to use for tokenization. Valid options are:
  - "tokenizers": Use the Hugging Face tokenizers library backend (default).
  - "sentencepiece": Use the SentencePiece backend.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.
Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- aimv2 — CLIPTokenizer (Aimv2Config model)
- albert — AlbertTokenizer (AlbertConfig model)
- align — BertTokenizer (AlignConfig model)
- audioflamingo3 —
Qwen2Tokenizer(AudioFlamingo3Config model) - aya_vision —
CohereTokenizer(AyaVisionConfig model) - bark — BertTokenizer (BarkConfig model)
- bart — RobertaTokenizer (BartConfig model)
- bert — BertTokenizer (BertConfig model)
- bert-generation — BertGenerationTokenizer (BertGenerationConfig model)
- big_bird — BigBirdTokenizer (BigBirdConfig model)
- bigbird_pegasus —
PegasusTokenizer(BigBirdPegasusConfig model) - biogpt — BioGptTokenizer (BioGptConfig model)
- blenderbot — BlenderbotTokenizer (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallTokenizer (BlenderbotSmallConfig model)
- blip — BertTokenizer (BlipConfig model)
- blip-2 —
GPT2Tokenizer(Blip2Config model) - bridgetower — RobertaTokenizer (BridgeTowerConfig model)
- bros — BertTokenizer (BrosConfig model)
- camembert — CamembertTokenizer (CamembertConfig model)
- canine — CanineTokenizer (CanineConfig model)
- chameleon — TokenizersBackend (ChameleonConfig model)
- chinese_clip — BertTokenizer (ChineseCLIPConfig model)
- clap — RobertaTokenizer (ClapConfig model)
- clip — CLIPTokenizer (CLIPConfig model)
- clipseg — CLIPTokenizer (CLIPSegConfig model)
- clvp — ClvpTokenizer (ClvpConfig model)
- codegen —
GPT2Tokenizer(CodeGenConfig model) - cohere —
CohereTokenizer(CohereConfig model) - cohere2 —
CohereTokenizer(Cohere2Config model) - cohere_asr — TokenizersBackend (CohereAsrConfig model)
- colqwen2 —
Qwen2Tokenizer(ColQwen2Config model) - convbert — BertTokenizer (ConvBertConfig model)
- cpmant — CpmAntTokenizer (CpmAntConfig model)
- ctrl — CTRLTokenizer (CTRLConfig model)
- data2vec-audio —
Wav2Vec2CTCTokenizer(Data2VecAudioConfig model) - data2vec-text — RobertaTokenizer (Data2VecTextConfig model)
- dbrx —
GPT2Tokenizer(DbrxConfig model) - deberta — DebertaTokenizer (DebertaConfig model)
- deberta-v2 — DebertaV2Tokenizer (DebertaV2Config model)
- deepseek_v2 — TokenizersBackend (DeepseekV2Config model)
- deepseek_v3 — TokenizersBackend (DeepseekV3Config model)
- deepseek_vl — TokenizersBackend (DeepseekVLConfig model)
- deepseek_vl_hybrid — TokenizersBackend (DeepseekVLHybridConfig model)
- dia —
DiaTokenizer(DiaConfig model) - distilbert — BertTokenizer (DistilBertConfig model)
- dpr —
DPRQuestionEncoderTokenizer(DPRConfig model) - electra — BertTokenizer (ElectraConfig model)
- emu3 —
GPT2Tokenizer(Emu3Config model) - ernie — BertTokenizer (ErnieConfig model)
- esm —
EsmTokenizer(EsmConfig model) - falcon_mamba —
GPTNeoXTokenizer(FalconMambaConfig model) - fastspeech2_conformer —
None(FastSpeech2ConformerConfig model) - flaubert —
FlaubertTokenizer(FlaubertConfig model) - flava — BertTokenizer (FlavaConfig model)
- flex_olmo —
GPT2Tokenizer(FlexOlmoConfig model) - florence2 — BartTokenizer (Florence2Config model)
- fnet —
FNetTokenizer(FNetConfig model) - fsmt —
FSMTTokenizer(FSMTConfig model) - funnel —
FunnelTokenizer(FunnelConfig model) - fuyu — TokenizersBackend (FuyuConfig model)
- gemma —
GemmaTokenizer(GemmaConfig model) - gemma2 —
GemmaTokenizer(Gemma2Config model) - gemma3 —
GemmaTokenizer(Gemma3Config model) - gemma3_text —
GemmaTokenizer(Gemma3TextConfig model) - gemma3n —
GemmaTokenizer(Gemma3nConfig model) - gemma3n_text —
GemmaTokenizer(Gemma3nTextConfig model) - git — BertTokenizer (GitConfig model)
- glm — TokenizersBackend (GlmConfig model)
- glm4 — TokenizersBackend (Glm4Config model)
- glm4_moe — TokenizersBackend (Glm4MoeConfig model)
- glm4_moe_lite — TokenizersBackend (Glm4MoeLiteConfig model)
- glm4v — TokenizersBackend (Glm4vConfig model)
- glm4v_moe — TokenizersBackend (Glm4vMoeConfig model)
- glm_image — TokenizersBackend (GlmImageConfig model)
- glmasr — TokenizersBackend (GlmAsrConfig model)
- got_ocr2 — TokenizersBackend (GotOcr2Config model)
- gpt-sw3 —
GPTSw3Tokenizer(GPT2Config model) - gpt2 —
GPT2Tokenizer(GPT2Config model) - gpt_bigcode —
GPT2Tokenizer(GPTBigCodeConfig model) - gpt_neo —
GPT2Tokenizer(GPTNeoConfig model) - gpt_neox —
GPTNeoXTokenizer(GPTNeoXConfig model) - gpt_neox_japanese —
GPTNeoXJapaneseTokenizer(GPTNeoXJapaneseConfig model) - gptj —
GPT2Tokenizer(GPTJConfig model) - granite —
GPT2Tokenizer(GraniteConfig model) - granitemoe —
GPT2Tokenizer(GraniteMoeConfig model) - granitemoehybrid —
GPT2Tokenizer(GraniteMoeHybridConfig model) - granitemoeshared —
GPT2Tokenizer(GraniteMoeSharedConfig model) - grounding-dino — BertTokenizer (GroundingDinoConfig model)
- groupvit — CLIPTokenizer (GroupViTConfig model)
- hubert —
Wav2Vec2CTCTokenizer(HubertConfig model) - ibert — RobertaTokenizer (IBertConfig model)
- idefics —
LlamaTokenizer(IdeficsConfig model) - idefics2 —
LlamaTokenizer(Idefics2Config model) - instructblip —
GPT2Tokenizer(InstructBlipConfig model) - instructblipvideo —
GPT2Tokenizer(InstructBlipVideoConfig model) - internvl —
Qwen2Tokenizer(InternVLConfig model) - jais2 —
GPT2Tokenizer(Jais2Config model) - jamba — TokenizersBackend (JambaConfig model)
- janus — TokenizersBackend (JanusConfig model)
- jina_embeddings_v3 —
XLMRobertaTokenizer(JinaEmbeddingsV3Config model) - kosmos-2 —
XLMRobertaTokenizer(Kosmos2Config model) - lasr_ctc —
LasrTokenizer(LasrCTCConfig model) - lasr_encoder —
LasrTokenizer(LasrEncoderConfig model) - layoutlm — BertTokenizer (LayoutLMConfig model)
- layoutlmv2 —
LayoutLMv2Tokenizer(LayoutLMv2Config model) - layoutlmv3 —
LayoutLMv3Tokenizer(LayoutLMv3Config model) - layoutxlm —
LayoutXLMTokenizer(LayoutXLMConfig model) - led — LEDTokenizer (LEDConfig model)
- lighton_ocr —
Qwen2TokenizerFast(LightOnOcrConfig model) - lilt — RobertaTokenizer (LiltConfig model)
- llava — TokenizersBackend (LlavaConfig model)
- llava_next — TokenizersBackend (LlavaNextConfig model)
- longformer — RobertaTokenizer (LongformerConfig model)
- luke —
LukeTokenizer(LukeConfig model) - lxmert — LxmertTokenizer (LxmertConfig model)
- m2m_100 —
M2M100Tokenizer(M2M100Config model) - mamba —
GPTNeoXTokenizer(MambaConfig model) - mamba2 —
GPTNeoXTokenizer(Mamba2Config model) - marian —
MarianTokenizer(MarianConfig model) - markuplm —
MarkupLMTokenizer(MarkupLMConfig model) - mbart —
MBartTokenizer(MBartConfig model) - megatron-bert — BertTokenizer (MegatronBertConfig model)
- metaclip_2 —
XLMRobertaTokenizer(MetaClip2Config model) - mgp-str —
MgpstrTokenizer(MgpstrConfig model) - minimax_m2 — TokenizersBackend (MiniMaxM2Config model)
- ministral —
MistralCommonBackend(MinistralConfig model) - ministral3 —
MistralCommonBackend(Ministral3Config model) - mistral —
MistralCommonBackend(MistralConfig model) - mistral3 —
MistralCommonBackend(Mistral3Config model) - mixtral —
MistralCommonBackend(MixtralConfig model) - mm-grounding-dino — BertTokenizer (MMGroundingDinoConfig model)
- mobilebert — MobileBertTokenizer (MobileBertConfig model)
- modernbert — TokenizersBackend (ModernBertConfig model)
- mpnet —
MPNetTokenizer(MPNetConfig model) - mpt —
GPTNeoXTokenizer(MptConfig model) - mra — RobertaTokenizer (MraConfig model)
- mt5 —
T5Tokenizer(MT5Config model) - musicgen —
T5Tokenizer(MusicgenConfig model) - musicgen_melody —
T5Tokenizer(MusicgenMelodyConfig model) - mvp — MvpTokenizer (MvpConfig model)
- nemotron — TokenizersBackend (NemotronConfig model)
- nllb-moe —
NllbTokenizer(NllbMoeConfig model) - nomic_bert — BertTokenizer (NomicBertConfig model)
- nougat —
NougatTokenizer(NougatConfig model) - nystromformer — AlbertTokenizer (NystromformerConfig model)
- olmo —
GPTNeoXTokenizer(OlmoConfig model) - olmo2 —
GPTNeoXTokenizer(Olmo2Config model) - olmo3 — TokenizersBackend (Olmo3Config model)
- olmo_hybrid — TokenizersBackend (OlmoHybridConfig model)
- olmoe —
GPTNeoXTokenizer(OlmoeConfig model) - omdet-turbo — CLIPTokenizer (OmDetTurboConfig model)
- oneformer — CLIPTokenizer (OneFormerConfig model)
- openai-gpt —
OpenAIGPTTokenizer(OpenAIGPTConfig model) - opt —
GPT2Tokenizer(OPTConfig model) - ovis2 —
Qwen2Tokenizer(Ovis2Config model) - owlv2 — CLIPTokenizer (Owlv2Config model)
- owlvit — CLIPTokenizer (OwlViTConfig model)
- pegasus —
PegasusTokenizer(PegasusConfig model) - pegasus_x —
PegasusTokenizer(PegasusXConfig model) - perceiver —
PerceiverTokenizer(PerceiverConfig model) - phi —
GPT2Tokenizer(PhiConfig model) - phi3 — TokenizersBackend (Phi3Config model)
- phimoe — TokenizersBackend (PhimoeConfig model)
- pix2struct —
T5Tokenizer(Pix2StructConfig model) - pixtral —
MistralCommonBackend(PixtralVisionConfig model) - plbart —
PLBartTokenizer(PLBartConfig model) - prophetnet —
ProphetNetTokenizer(ProphetNetConfig model) - qianfan_ocr —
Qwen2Tokenizer(QianfanOCRConfig model) - qwen2 —
Qwen2Tokenizer(Qwen2Config model) - qwen2_5_omni —
Qwen2Tokenizer(Qwen2_5OmniConfig model) - qwen2_5_vl —
Qwen2Tokenizer(Qwen2_5_VLConfig model) - qwen2_audio —
Qwen2Tokenizer(Qwen2AudioConfig model) - qwen2_moe —
Qwen2Tokenizer(Qwen2MoeConfig model) - qwen2_vl —
Qwen2Tokenizer(Qwen2VLConfig model) - qwen3 —
Qwen2Tokenizer(Qwen3Config model) - qwen3_5 —
Qwen3_5Tokenizer(Qwen3_5Config model) - qwen3_5_moe —
Qwen3_5Tokenizer(Qwen3_5MoeConfig model) - qwen3_moe —
Qwen2Tokenizer(Qwen3MoeConfig model) - qwen3_next —
Qwen2Tokenizer(Qwen3NextConfig model) - qwen3_omni_moe —
Qwen2Tokenizer(Qwen3OmniMoeConfig model) - qwen3_vl —
Qwen2Tokenizer(Qwen3VLConfig model) - qwen3_vl_moe —
Qwen2Tokenizer(Qwen3VLMoeConfig model) - rag —
RagTokenizer(RagConfig model) - recurrent_gemma —
GemmaTokenizer(RecurrentGemmaConfig model) - reformer —
ReformerTokenizer(ReformerConfig model) - rembert —
RemBertTokenizer(RemBertConfig model) - roberta — RobertaTokenizer (RobertaConfig model)
- roberta-prelayernorm — RobertaTokenizer (RobertaPreLayerNormConfig model)
- roc_bert —
RoCBertTokenizer(RoCBertConfig model) - roformer —
RoFormerTokenizer(RoFormerConfig model) - rwkv —
GPTNeoXTokenizer(RwkvConfig model) - sam3 — CLIPTokenizer (Sam3Config model)
- sam3_video — CLIPTokenizer (Sam3VideoConfig model)
- seamless_m4t —
SeamlessM4TTokenizer(SeamlessM4TConfig model) - seamless_m4t_v2 —
SeamlessM4TTokenizer(SeamlessM4Tv2Config model) - shieldgemma2 —
GemmaTokenizer(ShieldGemma2Config model) - siglip —
SiglipTokenizer(SiglipConfig model) - siglip2 —
Siglip2Tokenizer(Siglip2Config model) - speech_to_text —
Speech2TextTokenizer(Speech2TextConfig model) - speecht5 —
SpeechT5Tokenizer(SpeechT5Config model) - splinter —
SplinterTokenizer(SplinterConfig model) - squeezebert — BertTokenizer (SqueezeBertConfig model)
- stablelm —
GPTNeoXTokenizer(StableLmConfig model) - starcoder2 —
GPT2Tokenizer(Starcoder2Config model) - switch_transformers —
T5Tokenizer(SwitchTransformersConfig model) - t5 —
T5Tokenizer(T5Config model) - t5gemma —
GemmaTokenizer(T5GemmaConfig model) - tapas —
TapasTokenizer(TapasConfig model) - trocr —
XLMRobertaTokenizer(TrOCRConfig model) - tvp — BertTokenizer (TvpConfig model)
- udop —
UdopTokenizer(UdopConfig model) - umt5 —
T5Tokenizer(UMT5Config model) - unispeech —
Wav2Vec2CTCTokenizer(UniSpeechConfig model) - unispeech-sat —
Wav2Vec2CTCTokenizer(UniSpeechSatConfig model) - vilt — BertTokenizer (ViltConfig model)
- vipllava — TokenizersBackend (VipLlavaConfig model)
- visual_bert — BertTokenizer (VisualBertConfig model)
- vits —
VitsTokenizer(VitsConfig model) - voxtral —
MistralCommonBackend(VoxtralConfig model) - voxtral_realtime —
MistralCommonBackend(VoxtralRealtimeConfig model) - wav2vec2 —
Wav2Vec2CTCTokenizer(Wav2Vec2Config model) - wav2vec2-bert —
Wav2Vec2CTCTokenizer(Wav2Vec2BertConfig model) - wav2vec2-conformer —
Wav2Vec2CTCTokenizer(Wav2Vec2ConformerConfig model) - whisper —
WhisperTokenizer(WhisperConfig model) - xclip — CLIPTokenizer (XCLIPConfig model)
- xglm —
XGLMTokenizer(XGLMConfig model) - xlm —
XLMTokenizer(XLMConfig model) - xlm-roberta —
XLMRobertaTokenizer(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaTokenizer(XLMRobertaXLConfig model) - xlnet —
XLNetTokenizer(XLNetConfig model) - xlstm —
GPTNeoXTokenizer(xLSTMConfig model) - xmod —
XLMRobertaTokenizer(XmodConfig model) - yoso — AlbertTokenizer (YosoConfig model)
Examples:
>>> from transformers import AutoTokenizer
>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")
>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)
>>> # Explicitly use the tokenizers backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="tokenizers")
>>> # Explicitly use the sentencepiece backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="sentencepiece")

register
< source >( config_class, tokenizer_class = None, slow_tokenizer_class = None, fast_tokenizer_class = None, exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- tokenizer_class — The tokenizer class to register (preferred parameter as of v5).
- slow_tokenizer_class — (Deprecated) The slow tokenizer to register.
- fast_tokenizer_class — (Deprecated) The fast tokenizer to register.
Register a new tokenizer in this mapping.
AutoFeatureExtractor
This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained feature extractor hosted inside a model repo on huggingface.co.
  - a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the feature extractor files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final feature extractor object. If True, it returns a tuple (feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the feature extractor classes of the library from a pretrained model.
The feature extractor class to instantiate is selected based on the model_type property of the config object
(either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s
missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- audio-spectrogram-transformer — ASTFeatureExtractor (ASTConfig model)
- audioflamingo3 —
WhisperFeatureExtractor(AudioFlamingo3Config model) - clap — ClapFeatureExtractor (ClapConfig model)
- clvp — ClvpFeatureExtractor (ClvpConfig model)
- cohere_asr —
CohereAsrFeatureExtractor(CohereAsrConfig model) - csm —
EncodecFeatureExtractor(CsmConfig model) - dac —
DacFeatureExtractor(DacConfig model) - data2vec-audio —
Wav2Vec2FeatureExtractor(Data2VecAudioConfig model) - dia —
DiaFeatureExtractor(DiaConfig model) - encodec —
EncodecFeatureExtractor(EncodecConfig model) - gemma3n —
Gemma3nAudioFeatureExtractor(Gemma3nConfig model) - gemma4 —
Gemma4AudioFeatureExtractor(Gemma4Config model) - glmasr —
WhisperFeatureExtractor(GlmAsrConfig model) - granite_speech —
GraniteSpeechFeatureExtractor(GraniteSpeechConfig model) - higgs_audio_v2_tokenizer —
DacFeatureExtractor(HiggsAudioV2TokenizerConfig model) - hubert —
Wav2Vec2FeatureExtractor(HubertConfig model) - kyutai_speech_to_text —
KyutaiSpeechToTextFeatureExtractor(KyutaiSpeechToTextConfig model) - lasr_ctc —
LasrFeatureExtractor(LasrCTCConfig model) - lasr_encoder —
LasrFeatureExtractor(LasrEncoderConfig model) - markuplm —
MarkupLMFeatureExtractor(MarkupLMConfig model) - mimi —
EncodecFeatureExtractor(MimiConfig model) - moonshine —
Wav2Vec2FeatureExtractor(MoonshineConfig model) - moshi —
EncodecFeatureExtractor(MoshiConfig model) - musicgen —
EncodecFeatureExtractor(MusicgenConfig model) - musicgen_melody —
MusicgenMelodyFeatureExtractor(MusicgenMelodyConfig model) - parakeet_ctc —
ParakeetFeatureExtractor(ParakeetCTCConfig model) - parakeet_encoder —
ParakeetFeatureExtractor(ParakeetEncoderConfig model) - pe_audio —
PeAudioFeatureExtractor(PeAudioConfig model) - pe_audio_video —
PeAudioFeatureExtractor(PeAudioVideoConfig model) - phi4_multimodal —
Phi4MultimodalFeatureExtractor(Phi4MultimodalConfig model) - pop2piano —
Pop2PianoFeatureExtractor(Pop2PianoConfig model) - qwen2_5_omni —
WhisperFeatureExtractor(Qwen2_5OmniConfig model) - qwen2_audio —
WhisperFeatureExtractor(Qwen2AudioConfig model) - qwen3_omni_moe —
WhisperFeatureExtractor(Qwen3OmniMoeConfig model) - seamless_m4t —
SeamlessM4TFeatureExtractor(SeamlessM4TConfig model) - seamless_m4t_v2 —
SeamlessM4TFeatureExtractor(SeamlessM4Tv2Config model) - sew —
Wav2Vec2FeatureExtractor(SEWConfig model) - sew-d —
Wav2Vec2FeatureExtractor(SEWDConfig model) - speech_to_text —
Speech2TextFeatureExtractor(Speech2TextConfig model) - speecht5 —
SpeechT5FeatureExtractor(SpeechT5Config model) - unispeech —
Wav2Vec2FeatureExtractor(UniSpeechConfig model) - unispeech-sat —
Wav2Vec2FeatureExtractor(UniSpeechSatConfig model) - univnet —
UnivNetFeatureExtractor(UnivNetConfig model) - vibevoice_acoustic_tokenizer —
VibeVoiceAcousticTokenizerFeatureExtractor(VibeVoiceAcousticTokenizerConfig model) - vibevoice_asr —
VibeVoiceAcousticTokenizerFeatureExtractor(VibeVoiceAsrConfig model) - voxtral —
WhisperFeatureExtractor(VoxtralConfig model) - voxtral_realtime —
VoxtralRealtimeFeatureExtractor(VoxtralRealtimeConfig model) - wav2vec2 —
Wav2Vec2FeatureExtractor(Wav2Vec2Config model) - wav2vec2-bert —
Wav2Vec2FeatureExtractor(Wav2Vec2BertConfig model) - wav2vec2-conformer —
Wav2Vec2FeatureExtractor(Wav2Vec2ConformerConfig model) - wavlm —
Wav2Vec2FeatureExtractor(WavLMConfig model) - whisper —
WhisperFeatureExtractor(WhisperConfig model) - xcodec —
DacFeatureExtractor(XcodecConfig model)
Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoFeatureExtractor
>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
register
< source >( config_class feature_extractor_class exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- feature_extractor_class (FeatureExtractorMixin) — The feature extractor to register.
Register a new feature extractor for this class.
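A minimal sketch of the registration flow: register a custom config with AutoConfig, then a custom feature extractor for that config. NewAudioConfig and NewAudioFeatureExtractor are hypothetical names used only for illustration; the try/except covers the rename of the config base class in newer releases.

```python
# Hypothetical custom classes registered with the auto classes.
from transformers import AutoConfig, AutoFeatureExtractor, FeatureExtractionMixin

try:  # newer releases export PreTrainedConfig
    from transformers import PreTrainedConfig
except ImportError:  # older releases export PretrainedConfig
    from transformers import PretrainedConfig as PreTrainedConfig


class NewAudioConfig(PreTrainedConfig):
    # model_type must match the key passed to AutoConfig.register
    model_type = "new-audio"


class NewAudioFeatureExtractor(FeatureExtractionMixin):
    pass


# Register the config first, then the feature extractor for that config.
AutoConfig.register("new-audio", NewAudioConfig)
AutoFeatureExtractor.register(NewAudioConfig, NewAudioFeatureExtractor)
```

After this, AutoFeatureExtractor resolves NewAudioConfig to NewAudioFeatureExtractor like any built-in pairing.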
AutoImageProcessor
This is a generic image processor class that will be instantiated as one of the image processor classes of the library when created with the AutoImageProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
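The auto classes are factories, so calling the constructor directly raises, as documented above. A quick check:

```python
# Direct instantiation of an auto class raises an EnvironmentError that
# points at from_pretrained().
from transformers import AutoImageProcessor

try:
    AutoImageProcessor()
except EnvironmentError as err:
    message = str(err)

print(message)
```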
from_pretrained
< source >( pretrained_model_name_or_path *inputs **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained image processor hosted inside a model repo on huggingface.co.
  - a path to a directory containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path to a saved image processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force a (re-)download of the image processor files and override the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- use_fast (bool, optional, defaults to False) — Deprecated: use backend="torchvision" instead. This parameter is kept for backward compatibility. Use a fast torchvision-based image processor if it is supported for a given model. If a fast image processor is not available for a given model, a normal numpy-based image processor is returned instead.
- backend (str, optional, defaults to None) — The backend to use for image processing. Can be:
  - None: automatically select the best available backend (torchvision if available, otherwise pil)
  - "torchvision": use the torchvision backend (GPU-accelerated, faster)
  - "pil": use the PIL backend (portable, CPU-only)
  - any custom backend name registered via the register() method
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final image processor object. If True, it returns a tuple (image_processor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of kwargs which has not been used to update image_processor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- image_processor_filename (str, optional, defaults to "config.json") — The name of the file in the model directory to use for the image processor config.
- kwargs (dict[str, Any], optional) — The values in kwargs for any keys which are image processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not image processor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the image processor classes of the library from a pretrained model vocabulary.
The image processor class to instantiate is selected based on the model_type property of the config object
(either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s
missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- aimv2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Aimv2Config model)
- aimv2_vision_model — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Aimv2VisionConfig model)
- align — {'torchvision': 'EfficientNetImageProcessor', 'pil': 'EfficientNetImageProcessorPil'} (AlignConfig model)
- altclip — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (AltCLIPConfig model)
- aria — {'pil': 'AriaImageProcessorPil', 'torchvision': 'AriaImageProcessor'} (AriaConfig model)
- aya_vision — {'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'} (AyaVisionConfig model)
- beit — {'pil': 'BeitImageProcessorPil', 'torchvision': 'BeitImageProcessor'} (BeitConfig model)
- bit — {'pil': 'BitImageProcessorPil', 'torchvision': 'BitImageProcessor'} (BitConfig model)
- blip — {'pil': 'BlipImageProcessorPil', 'torchvision': 'BlipImageProcessor'} (BlipConfig model)
- blip-2 — {'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'} (Blip2Config model)
- bridgetower — {'pil': 'BridgeTowerImageProcessorPil', 'torchvision': 'BridgeTowerImageProcessor'} (BridgeTowerConfig model)
- chameleon — {'pil': 'ChameleonImageProcessorPil', 'torchvision': 'ChameleonImageProcessor'} (ChameleonConfig model)
- chinese_clip — {'pil': 'ChineseCLIPImageProcessorPil', 'torchvision': 'ChineseCLIPImageProcessor'} (ChineseCLIPConfig model)
- chmv2 — {'torchvision': 'CHMv2ImageProcessor'} (CHMv2Config model)
- clip — {'pil': 'CLIPImageProcessorPil', 'torchvision': 'CLIPImageProcessor'} (CLIPConfig model)
- clipseg — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (CLIPSegConfig model)
- cohere2_vision — {'torchvision': 'Cohere2VisionImageProcessor'} (Cohere2VisionConfig model)
- colpali — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (ColPaliConfig model)
- colqwen2 — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (ColQwen2Config model)
- conditional_detr — {'pil': 'ConditionalDetrImageProcessorPil', 'torchvision': 'ConditionalDetrImageProcessor'} (ConditionalDetrConfig model)
- convnext — {'pil': 'ConvNextImageProcessorPil', 'torchvision': 'ConvNextImageProcessor'} (ConvNextConfig model)
- convnextv2 — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (ConvNextV2Config model)
- cvt — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (CvtConfig model)
- data2vec-vision — {'torchvision': 'BeitImageProcessor', 'pil': 'BeitImageProcessorPil'} (Data2VecVisionConfig model)
- deepseek_vl — {'pil': 'DeepseekVLImageProcessorPil', 'torchvision': 'DeepseekVLImageProcessor'} (DeepseekVLConfig model)
- deepseek_vl_hybrid — {'pil': 'DeepseekVLHybridImageProcessorPil', 'torchvision': 'DeepseekVLHybridImageProcessor'} (DeepseekVLHybridConfig model)
- deformable_detr — {'pil': 'DeformableDetrImageProcessorPil', 'torchvision': 'DeformableDetrImageProcessor'} (DeformableDetrConfig model)
- deit — {'pil': 'DeiTImageProcessorPil', 'torchvision': 'DeiTImageProcessor'} (DeiTConfig model)
- depth_anything — {'torchvision': 'DPTImageProcessor', 'pil': 'DPTImageProcessorPil'} (DepthAnythingConfig model)
- depth_pro — {'torchvision': 'DepthProImageProcessor'} (DepthProConfig model)
- detr — {'pil': 'DetrImageProcessorPil', 'torchvision': 'DetrImageProcessor'} (DetrConfig model)
- dinat — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (DinatConfig model)
- dinov2 — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (Dinov2Config model)
- dinov3_vit — {'torchvision': 'DINOv3ViTImageProcessor'} (DINOv3ViTConfig model)
- donut-swin — {'torchvision': 'DonutImageProcessor', 'pil': 'DonutImageProcessorPil'} (DonutSwinConfig model)
- dpt — {'pil': 'DPTImageProcessorPil', 'torchvision': 'DPTImageProcessor'} (DPTConfig model)
- edgetam — {'torchvision': 'Sam2ImageProcessor'} (EdgeTamConfig model)
- efficientloftr — {'pil': 'EfficientLoFTRImageProcessorPil', 'torchvision': 'EfficientLoFTRImageProcessor'} (EfficientLoFTRConfig model)
- efficientnet — {'pil': 'EfficientNetImageProcessorPil', 'torchvision': 'EfficientNetImageProcessor'} (EfficientNetConfig model)
- emu3 — {'pil': 'Emu3ImageProcessor'} (Emu3Config model)
- eomt — {'pil': 'EomtImageProcessorPil', 'torchvision': 'EomtImageProcessor'} (EomtConfig model)
- eomt_dinov3 — {'torchvision': 'EomtImageProcessor', 'pil': 'EomtImageProcessorPil'} (EomtDinov3Config model)
- ernie4_5_vl_moe — {'pil': 'Ernie4_5_VLMoeImageProcessorPil', 'torchvision': 'Ernie4_5_VLMoeImageProcessor'} (Ernie4_5_VLMoeConfig model)
- flava — {'pil': 'FlavaImageProcessorPil', 'torchvision': 'FlavaImageProcessor'} (FlavaConfig model)
- florence2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Florence2Config model)
- focalnet — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (FocalNetConfig model)
- fuyu — {'pil': 'FuyuImageProcessorPil', 'torchvision': 'FuyuImageProcessor'} (FuyuConfig model)
- gemma3 — {'pil': 'Gemma3ImageProcessorPil', 'torchvision': 'Gemma3ImageProcessor'} (Gemma3Config model)
- gemma3n — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (Gemma3nConfig model)
- gemma4 — {'pil': 'Gemma4ImageProcessorPil', 'torchvision': 'Gemma4ImageProcessor'} (Gemma4Config model)
- git — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (GitConfig model)
- glm46v — {'pil': 'Glm46VImageProcessorPil', 'torchvision': 'Glm46VImageProcessor'} (Glm46VConfig model)
- glm4v — {'pil': 'Glm4vImageProcessorPil', 'torchvision': 'Glm4vImageProcessor'} (Glm4vConfig model)
- glm_image — {'pil': 'GlmImageImageProcessorPil', 'torchvision': 'GlmImageImageProcessor'} (GlmImageConfig model)
- glpn — {'pil': 'GLPNImageProcessorPil', 'torchvision': 'GLPNImageProcessor'} (GLPNConfig model)
- got_ocr2 — {'pil': 'GotOcr2ImageProcessorPil', 'torchvision': 'GotOcr2ImageProcessor'} (GotOcr2Config model)
- grounding-dino — {'pil': 'GroundingDinoImageProcessorPil', 'torchvision': 'GroundingDinoImageProcessor'} (GroundingDinoConfig model)
- groupvit — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (GroupViTConfig model)
- hiera — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (HieraConfig model)
- idefics — {'pil': 'IdeficsImageProcessorPil', 'torchvision': 'IdeficsImageProcessor'} (IdeficsConfig model)
- idefics2 — {'pil': 'Idefics2ImageProcessorPil', 'torchvision': 'Idefics2ImageProcessor'} (Idefics2Config model)
- idefics3 — {'pil': 'Idefics3ImageProcessorPil', 'torchvision': 'Idefics3ImageProcessor'} (Idefics3Config model)
- ijepa — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (IJepaConfig model)
- imagegpt — {'pil': 'ImageGPTImageProcessorPil', 'torchvision': 'ImageGPTImageProcessor'} (ImageGPTConfig model)
- instructblip — {'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'} (InstructBlipConfig model)
- internvl — {'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'} (InternVLConfig model)
- janus — {'pil': 'JanusImageProcessorPil', 'torchvision': 'JanusImageProcessor'} (JanusConfig model)
- kosmos-2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Kosmos2Config model)
- kosmos-2.5 — {'torchvision': 'Kosmos2_5ImageProcessor', 'pil': 'Kosmos2_5ImageProcessorPil'} (Kosmos2_5Config model)
- layoutlmv2 — {'pil': 'LayoutLMv2ImageProcessorPil', 'torchvision': 'LayoutLMv2ImageProcessor'} (LayoutLMv2Config model)
- layoutlmv3 — {'pil': 'LayoutLMv3ImageProcessorPil', 'torchvision': 'LayoutLMv3ImageProcessor'} (LayoutLMv3Config model)
- layoutxlm — {'torchvision': 'LayoutLMv2ImageProcessor', 'pil': 'LayoutLMv2ImageProcessorPil'} (LayoutXLMConfig model)
- levit — {'pil': 'LevitImageProcessorPil', 'torchvision': 'LevitImageProcessor'} (LevitConfig model)
- lfm2_vl — {'torchvision': 'Lfm2VlImageProcessor'} (Lfm2VlConfig model)
- lightglue — {'pil': 'LightGlueImageProcessorPil', 'torchvision': 'LightGlueImageProcessor'} (LightGlueConfig model)
- lighton_ocr — {'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'} (LightOnOcrConfig model)
- llama4 — {'torchvision': 'Llama4ImageProcessor'} (Llama4Config model)
- llava — {'pil': 'LlavaImageProcessorPil', 'torchvision': 'LlavaImageProcessor'} (LlavaConfig model)
- llava_next — {'pil': 'LlavaNextImageProcessorPil', 'torchvision': 'LlavaNextImageProcessor'} (LlavaNextConfig model)
- llava_next_video — {'torchvision': 'LlavaNextImageProcessor', 'pil': 'LlavaNextImageProcessorPil'} (LlavaNextVideoConfig model)
- llava_onevision — {'pil': 'LlavaOnevisionImageProcessorPil', 'torchvision': 'LlavaOnevisionImageProcessor'} (LlavaOnevisionConfig model)
- lw_detr — {'torchvision': 'DeformableDetrImageProcessor', 'pil': 'DeformableDetrImageProcessorPil'} (LwDetrConfig model)
- mask2former — {'pil': 'Mask2FormerImageProcessorPil', 'torchvision': 'Mask2FormerImageProcessor'} (Mask2FormerConfig model)
- maskformer — {'pil': 'MaskFormerImageProcessorPil', 'torchvision': 'MaskFormerImageProcessor'} (MaskFormerConfig model)
- metaclip_2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (MetaClip2Config model)
- mgp-str — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (MgpstrConfig model)
- mistral3 — {'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'} (Mistral3Config model)
- mlcd — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (MLCDVisionConfig model)
- mllama — {'pil': 'MllamaImageProcessorPil', 'torchvision': 'MllamaImageProcessor'} (MllamaConfig model)
- mm-grounding-dino — {'torchvision': 'GroundingDinoImageProcessor', 'pil': 'GroundingDinoImageProcessorPil'} (MMGroundingDinoConfig model)
- mobilenet_v1 — {'pil': 'MobileNetV1ImageProcessorPil', 'torchvision': 'MobileNetV1ImageProcessor'} (MobileNetV1Config model)
- mobilenet_v2 — {'pil': 'MobileNetV2ImageProcessorPil', 'torchvision': 'MobileNetV2ImageProcessor'} (MobileNetV2Config model)
- mobilevit — {'pil': 'MobileViTImageProcessorPil', 'torchvision': 'MobileViTImageProcessor'} (MobileViTConfig model)
- mobilevitv2 — {'torchvision': 'MobileViTImageProcessor', 'pil': 'MobileViTImageProcessorPil'} (MobileViTV2Config model)
- nougat — {'pil': 'NougatImageProcessorPil', 'torchvision': 'NougatImageProcessor'} (NougatConfig model)
- omdet-turbo — {'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'} (OmDetTurboConfig model)
- oneformer — {'pil': 'OneFormerImageProcessorPil', 'torchvision': 'OneFormerImageProcessor'} (OneFormerConfig model)
- ovis2 — {'pil': 'Ovis2ImageProcessorPil', 'torchvision': 'Ovis2ImageProcessor'} (Ovis2Config model)
- owlv2 — {'pil': 'Owlv2ImageProcessorPil', 'torchvision': 'Owlv2ImageProcessor'} (Owlv2Config model)
- owlvit — {'pil': 'OwlViTImageProcessorPil', 'torchvision': 'OwlViTImageProcessor'} (OwlViTConfig model)
- paddleocr_vl — {'pil': 'PaddleOCRVLImageProcessorPil', 'torchvision': 'PaddleOCRVLImageProcessor'} (PaddleOCRVLConfig model)
- paligemma — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (PaliGemmaConfig model)
- perceiver — {'pil': 'PerceiverImageProcessorPil', 'torchvision': 'PerceiverImageProcessor'} (PerceiverConfig model)
- perception_lm — {'torchvision': 'PerceptionLMImageProcessor'} (PerceptionLMConfig model)
- phi4_multimodal — {'torchvision': 'Phi4MultimodalImageProcessor'} (Phi4MultimodalConfig model)
- pi0 — {'torchvision': 'PI0ImageProcessor'} (PI0Config model)
- pix2struct — {'pil': 'Pix2StructImageProcessorPil', 'torchvision': 'Pix2StructImageProcessor'} (Pix2StructConfig model)
- pixio — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (PixioConfig model)
- pixtral — {'pil': 'PixtralImageProcessorPil', 'torchvision': 'PixtralImageProcessor'} (PixtralVisionConfig model)
- poolformer — {'pil': 'PoolFormerImageProcessorPil', 'torchvision': 'PoolFormerImageProcessor'} (PoolFormerConfig model)
- pp_chart2table — {'pil': 'PPChart2TableImageProcessorPil', 'torchvision': 'PPChart2TableImageProcessor'} (PPChart2TableConfig model)
- pp_doclayout_v2 — {'torchvision': 'PPDocLayoutV2ImageProcessor'} (PPDocLayoutV2Config model)
- pp_doclayout_v3 — {'torchvision': 'PPDocLayoutV3ImageProcessor'} (PPDocLayoutV3Config model)
- pp_lcnet — {'torchvision': 'PPLCNetImageProcessor'} (PPLCNetConfig model)
- pp_ocrv5_mobile_det — {'torchvision': 'PPOCRV5ServerDetImageProcessor'} (PPOCRV5MobileDetConfig model)
- pp_ocrv5_mobile_rec — {'torchvision': 'PPOCRV5ServerRecImageProcessor'} (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_det — {'torchvision': 'PPOCRV5ServerDetImageProcessor'} (PPOCRV5ServerDetConfig model)
- pp_ocrv5_server_rec — {'torchvision': 'PPOCRV5ServerRecImageProcessor'} (PPOCRV5ServerRecConfig model)
- prompt_depth_anything — {'pil': 'PromptDepthAnythingImageProcessorPil', 'torchvision': 'PromptDepthAnythingImageProcessor'} (PromptDepthAnythingConfig model)
- pvt — {'pil': 'PvtImageProcessorPil', 'torchvision': 'PvtImageProcessor'} (PvtConfig model)
- pvt_v2 — {'torchvision': 'PvtImageProcessor', 'pil': 'PvtImageProcessorPil'} (PvtV2Config model)
- qianfan_ocr — {'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'} (QianfanOCRConfig model)
- qwen2_5_omni — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen2_5OmniConfig model)
- qwen2_5_vl — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen2_5_VLConfig model)
- qwen2_vl — {'pil': 'Qwen2VLImageProcessorPil', 'torchvision': 'Qwen2VLImageProcessor'} (Qwen2VLConfig model)
- qwen3_5 — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3_5Config model)
- qwen3_5_moe — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3_5MoeConfig model)
- qwen3_omni_moe — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3OmniMoeConfig model)
- qwen3_vl — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3VLConfig model)
- regnet — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (RegNetConfig model)
- resnet — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (ResNetConfig model)
- rt_detr — {'pil': 'RTDetrImageProcessorPil', 'torchvision': 'RTDetrImageProcessor'} (RTDetrConfig model)
- sam — {'pil': 'SamImageProcessorPil', 'torchvision': 'SamImageProcessor'} (SamConfig model)
- sam2 — {'torchvision': 'Sam2ImageProcessor'} (Sam2Config model)
- sam2_video — {'torchvision': 'Sam2ImageProcessor'} (Sam2VideoConfig model)
- sam3 — {'torchvision': 'Sam3ImageProcessor'} (Sam3Config model)
- sam3_lite_text — {'torchvision': 'Sam3ImageProcessor'} (Sam3LiteTextConfig model)
- sam3_tracker — {'torchvision': 'Sam3ImageProcessor'} (Sam3TrackerConfig model)
- sam3_tracker_video — {'torchvision': 'Sam3ImageProcessor'} (Sam3TrackerVideoConfig model)
- sam3_video — {'torchvision': 'Sam3ImageProcessor'} (Sam3VideoConfig model)
- sam_hq — {'torchvision': 'SamImageProcessor', 'pil': 'SamImageProcessorPil'} (SamHQConfig model)
- segformer — {'pil': 'SegformerImageProcessorPil', 'torchvision': 'SegformerImageProcessor'} (SegformerConfig model)
- seggpt — {'pil': 'SegGptImageProcessorPil', 'torchvision': 'SegGptImageProcessor'} (SegGptConfig model)
- shieldgemma2 — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (ShieldGemma2Config model)
- siglip — {'pil': 'SiglipImageProcessorPil', 'torchvision': 'SiglipImageProcessor'} (SiglipConfig model)
- siglip2 — {'pil': 'Siglip2ImageProcessorPil', 'torchvision': 'Siglip2ImageProcessor'} (Siglip2Config model)
- slanext — {'torchvision': 'SLANeXtImageProcessor'} (SLANeXtConfig model)
- smolvlm — {'pil': 'SmolVLMImageProcessorPil', 'torchvision': 'SmolVLMImageProcessor'} (SmolVLMConfig model)
- superglue — {'pil': 'SuperGlueImageProcessorPil', 'torchvision': 'SuperGlueImageProcessor'} (SuperGlueConfig model)
- superpoint — {'pil': 'SuperPointImageProcessorPil', 'torchvision': 'SuperPointImageProcessor'} (SuperPointConfig model)
- swiftformer — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (SwiftFormerConfig model)
- swin — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (SwinConfig model)
- swin2sr — {'pil': 'Swin2SRImageProcessorPil', 'torchvision': 'Swin2SRImageProcessor'} (Swin2SRConfig model)
- swinv2 — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (Swinv2Config model)
- t5gemma2 — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (T5Gemma2Config model)
- t5gemma2_encoder — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (T5Gemma2EncoderConfig model)
- table-transformer — {'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'} (TableTransformerConfig model)
- textnet — {'pil': 'TextNetImageProcessorPil', 'torchvision': 'TextNetImageProcessor'} (TextNetConfig model)
- timesformer — {'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'} (TimesformerConfig model)
- timm_wrapper — {'pil': 'TimmWrapperImageProcessor'} (TimmWrapperConfig model)
- trocr — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (TrOCRConfig model)
- tvp — {'pil': 'TvpImageProcessorPil', 'torchvision': 'TvpImageProcessor'} (TvpConfig model)
- udop — {'torchvision': 'LayoutLMv3ImageProcessor', 'pil': 'LayoutLMv3ImageProcessorPil'} (UdopConfig model)
- upernet — {'torchvision': 'SegformerImageProcessor', 'pil': 'SegformerImageProcessorPil'} (UperNetConfig model)
- uvdoc — {'torchvision': 'UVDocImageProcessor'} (UVDocConfig model)
- video_llama_3 — {'pil': 'VideoLlama3ImageProcessorPil', 'torchvision': 'VideoLlama3ImageProcessor'} (VideoLlama3Config model)
- video_llava — {'pil': 'VideoLlavaImageProcessor'} (VideoLlavaConfig model)
- videomae — {'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'} (VideoMAEConfig model)
- vilt — {'pil': 'ViltImageProcessorPil', 'torchvision': 'ViltImageProcessor'} (ViltConfig model)
- vipllava — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (VipLlavaConfig model)
- vit — {'pil': 'ViTImageProcessorPil', 'torchvision': 'ViTImageProcessor'} (ViTConfig model)
- vit_mae — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (ViTMAEConfig model)
- vit_msn — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (ViTMSNConfig model)
- vitmatte — {'pil': 'VitMatteImageProcessorPil', 'torchvision': 'VitMatteImageProcessor'} (VitMatteConfig model)
- vitpose — {'pil': 'VitPoseImageProcessorPil', 'torchvision': 'VitPoseImageProcessor'} (VitPoseConfig model)
- vivit — {'torchvision': 'VivitImageProcessor'} (VivitConfig model)
- xclip — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (XCLIPConfig model)
- yolos — {'pil': 'YolosImageProcessorPil', 'torchvision': 'YolosImageProcessor'} (YolosConfig model)
- zoedepth — {'pil': 'ZoeDepthImageProcessorPil', 'torchvision': 'ZoeDepthImageProcessor'} (ZoeDepthConfig model)
Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoImageProcessor
>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")
register
< source >( config_class slow_image_processor_class: type | None = None fast_image_processor_class: type | None = None image_processor_classes: dict[str, type] | None = None exist_ok: bool = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- slow_image_processor_class (type, optional) — The PIL backend image processor class (deprecated, use image_processor_classes={"pil": ...}).
- fast_image_processor_class (type, optional) — The torchvision backend image processor class (deprecated, use image_processor_classes={"torchvision": ...}).
- image_processor_classes (dict[str, type], optional) — Dictionary mapping backend names to image processor classes. Allows registering custom backends. Example: {"pil": MyPilProcessor, "torchvision": MyTorchvisionProcessor, "custom": MyCustomProcessor}
- exist_ok (bool, optional, defaults to False) — If True, allow overwriting existing registrations.
Register a new image processor for this class.
AutoProcessor
This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co.
  - a path to a directory containing processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force a (re-)download of the feature extractor files and override the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final feature extractor object. If True, it returns a tuple (feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (dict[str, Any], optional) — The values in kwargs for any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the processor classes of the library from a pretrained model vocabulary.
The processor class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible):
- aimv2 — CLIPProcessor (Aimv2Config model)
- align — AlignProcessor (AlignConfig model)
- altclip — AltCLIPProcessor (AltCLIPConfig model)
- aria —
AriaProcessor(AriaConfig model) - audioflamingo3 —
AudioFlamingo3Processor(AudioFlamingo3Config model) - aya_vision —
AyaVisionProcessor(AyaVisionConfig model) - bark — BarkProcessor (BarkConfig model)
- blip — BlipProcessor (BlipConfig model)
- blip-2 — Blip2Processor (Blip2Config model)
- bridgetower — BridgeTowerProcessor (BridgeTowerConfig model)
- chameleon —
ChameleonProcessor(ChameleonConfig model) - chinese_clip — ChineseCLIPProcessor (ChineseCLIPConfig model)
- clap — ClapProcessor (ClapConfig model)
- clip — CLIPProcessor (CLIPConfig model)
- clipseg — CLIPSegProcessor (CLIPSegConfig model)
- clvp — ClvpProcessor (ClvpConfig model)
- cohere2_vision —
Cohere2VisionProcessor(Cohere2VisionConfig model) - cohere_asr —
CohereAsrProcessor(CohereAsrConfig model) - colmodernvbert —
ColModernVBertProcessor(ColModernVBertConfig model) - colpali —
ColPaliProcessor(ColPaliConfig model) - colqwen2 —
ColQwen2Processor(ColQwen2Config model) - deepseek_vl —
DeepseekVLProcessor(DeepseekVLConfig model) - deepseek_vl_hybrid —
DeepseekVLHybridProcessor(DeepseekVLHybridConfig model) - dia —
DiaProcessor(DiaConfig model) - edgetam —
Sam2Processor(EdgeTamConfig model) - emu3 —
Emu3Processor(Emu3Config model) - ernie4_5_vl_moe —
Ernie4_5_VLMoeProcessor(Ernie4_5_VLMoeConfig model) - evolla —
EvollaProcessor(EvollaConfig model) - flava —
FlavaProcessor(FlavaConfig model) - florence2 —
Florence2Processor(Florence2Config model) - fuyu —
FuyuProcessor(FuyuConfig model) - gemma3 —
Gemma3Processor(Gemma3Config model) - gemma3n —
Gemma3nProcessor(Gemma3nConfig model) - gemma4 —
Gemma4Processor(Gemma4Config model) - git —
GitProcessor(GitConfig model) - glm46v —
Glm46VProcessor(Glm46VConfig model) - glm4v —
Glm4vProcessor(Glm4vConfig model) - glm4v_moe —
Glm4vProcessor(Glm4vMoeConfig model) - glm_image —
Glm4vProcessor(GlmImageConfig model) - glmasr —
GlmAsrProcessor(GlmAsrConfig model) - got_ocr2 —
GotOcr2Processor(GotOcr2Config model) - granite_speech —
GraniteSpeechProcessor(GraniteSpeechConfig model) - grounding-dino —
GroundingDinoProcessor(GroundingDinoConfig model) - groupvit — CLIPProcessor (GroupViTConfig model)
- higgs_audio_v2 —
HiggsAudioV2Processor(HiggsAudioV2Config model) - hubert —
Wav2Vec2Processor(HubertConfig model) - idefics —
IdeficsProcessor(IdeficsConfig model) - idefics2 —
Idefics2Processor(Idefics2Config model) - idefics3 —
Idefics3Processor(Idefics3Config model) - instructblip —
InstructBlipProcessor(InstructBlipConfig model) - instructblipvideo —
InstructBlipVideoProcessor(InstructBlipVideoConfig model) - internvl —
InternVLProcessor(InternVLConfig model) - janus —
JanusProcessor(JanusConfig model) - kosmos-2 —
Kosmos2Processor(Kosmos2Config model) - kosmos-2.5 —
Kosmos2_5Processor(Kosmos2_5Config model) - kyutai_speech_to_text —
KyutaiSpeechToTextProcessor(KyutaiSpeechToTextConfig model) - lasr_ctc —
LasrProcessor(LasrCTCConfig model) - lasr_encoder —
LasrProcessor(LasrEncoderConfig model) - layoutlmv2 —
LayoutLMv2Processor(LayoutLMv2Config model) - layoutlmv3 —
LayoutLMv3Processor(LayoutLMv3Config model) - layoutxlm —
LayoutXLMProcessor(LayoutXLMConfig model) - lfm2_vl —
Lfm2VlProcessor(Lfm2VlConfig model) - lighton_ocr —
LightOnOcrProcessor(LightOnOcrConfig model) - llama4 —
Llama4Processor(Llama4Config model) - llava —
LlavaProcessor(LlavaConfig model) - llava_next —
LlavaNextProcessor(LlavaNextConfig model) - llava_next_video —
LlavaNextVideoProcessor(LlavaNextVideoConfig model) - llava_onevision —
LlavaOnevisionProcessor(LlavaOnevisionConfig model) - markuplm —
MarkupLMProcessor(MarkupLMConfig model) - metaclip_2 — CLIPProcessor (MetaClip2Config model)
- mgp-str — MgpstrProcessor (MgpstrConfig model)
- mistral3 — PixtralProcessor (Mistral3Config model)
- mllama — MllamaProcessor (MllamaConfig model)
- mm-grounding-dino — GroundingDinoProcessor (MMGroundingDinoConfig model)
- modernvbert — Idefics3Processor (ModernVBertConfig model)
- moonshine — Wav2Vec2Processor (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingProcessor (MoonshineStreamingConfig model)
- musicflamingo — MusicFlamingoProcessor (MusicFlamingoConfig model)
- omdet-turbo — OmDetTurboProcessor (OmDetTurboConfig model)
- oneformer — OneFormerProcessor (OneFormerConfig model)
- ovis2 — Ovis2Processor (Ovis2Config model)
- owlv2 — Owlv2Processor (Owlv2Config model)
- owlvit — OwlViTProcessor (OwlViTConfig model)
- paddleocr_vl — PaddleOCRVLProcessor (PaddleOCRVLConfig model)
- paligemma — PaliGemmaProcessor (PaliGemmaConfig model)
- perception_lm — PerceptionLMProcessor (PerceptionLMConfig model)
- phi4_multimodal — Phi4MultimodalProcessor (Phi4MultimodalConfig model)
- pi0 — PI0Processor (PI0Config model)
- pix2struct — Pix2StructProcessor (Pix2StructConfig model)
- pixtral — PixtralProcessor (PixtralVisionConfig model)
- pop2piano — Pop2PianoProcessor (Pop2PianoConfig model)
- pp_chart2table — PPChart2TableProcessor (PPChart2TableConfig model)
- qianfan_ocr — QianfanOCRProcessor (QianfanOCRConfig model)
- qwen2_5_omni — Qwen2_5OmniProcessor (Qwen2_5OmniConfig model)
- qwen2_5_vl — Qwen2_5_VLProcessor (Qwen2_5_VLConfig model)
- qwen2_audio — Qwen2AudioProcessor (Qwen2AudioConfig model)
- qwen2_vl — Qwen2VLProcessor (Qwen2VLConfig model)
- qwen3_5 — Qwen3VLProcessor (Qwen3_5Config model)
- qwen3_5_moe — Qwen3VLProcessor (Qwen3_5MoeConfig model)
- qwen3_omni_moe — Qwen3OmniMoeProcessor (Qwen3OmniMoeConfig model)
- qwen3_vl — Qwen3VLProcessor (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLProcessor (Qwen3VLMoeConfig model)
- sam — SamProcessor (SamConfig model)
- sam2 — Sam2Processor (Sam2Config model)
- sam3 — Sam3Processor (Sam3Config model)
- sam3_lite_text — Sam3Processor (Sam3LiteTextConfig model)
- sam_hq — SamHQProcessor (SamHQConfig model)
- seamless_m4t — SeamlessM4TProcessor (SeamlessM4TConfig model)
- sew — Wav2Vec2Processor (SEWConfig model)
- sew-d — Wav2Vec2Processor (SEWDConfig model)
- shieldgemma2 — ShieldGemma2Processor (ShieldGemma2Config model)
- siglip — SiglipProcessor (SiglipConfig model)
- siglip2 — Siglip2Processor (Siglip2Config model)
- smolvlm — SmolVLMProcessor (SmolVLMConfig model)
- speech_to_text — Speech2TextProcessor (Speech2TextConfig model)
- speecht5 — SpeechT5Processor (SpeechT5Config model)
- t5gemma2 — Gemma3Processor (T5Gemma2Config model)
- t5gemma2_encoder — Gemma3Processor (T5Gemma2EncoderConfig model)
- trocr — TrOCRProcessor (TrOCRConfig model)
- tvp — TvpProcessor (TvpConfig model)
- udop — UdopProcessor (UdopConfig model)
- unispeech — Wav2Vec2Processor (UniSpeechConfig model)
- unispeech-sat — Wav2Vec2Processor (UniSpeechSatConfig model)
- vibevoice_asr — VibeVoiceAsrProcessor (VibeVoiceAsrConfig model)
- video_llava — VideoLlavaProcessor (VideoLlavaConfig model)
- vilt — ViltProcessor (ViltConfig model)
- vipllava — LlavaProcessor (VipLlavaConfig model)
- vision-text-dual-encoder — VisionTextDualEncoderProcessor (VisionTextDualEncoderConfig model)
- voxtral — VoxtralProcessor (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeProcessor (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2Processor (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2Processor (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2Processor (Wav2Vec2ConformerConfig model)
- wavlm — Wav2Vec2Processor (WavLMConfig model)
- whisper — WhisperProcessor (WhisperConfig model)
- xclip — XCLIPProcessor (XCLIPConfig model)
Passing token=True is required when you want to use a private model.
Examples:
>>> from transformers import AutoProcessor
>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")

register
< source >( config_class processor_class exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- processor_class (ProcessorMixin) — The processor to register.
- exist_ok (bool, optional, defaults to False) — If True, do not raise an error if the configuration class is already registered.
Register a new processor for this class.
Generic model classes
The following auto classes are available for instantiating a base model class without a specific head.
AutoModel
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
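The "cannot be instantiated directly" behavior follows a common factory pattern: __init__ raises, while the classmethod looks up the concrete class for the given config and returns an instance of that class (never of the auto class itself). A heavily simplified, hypothetical sketch of the pattern — not the real AutoModel code:

```python
# Illustrative sketch only: a factory class that forbids direct
# construction but builds concrete model instances from a config.

class BertLikeConfig: ...

class BertLikeModel:
    def __init__(self, config):
        self.config = config

class AutoModelSketch:
    # config class -> concrete model class (stand-in mapping)
    _mapping = {BertLikeConfig: BertLikeModel}

    def __init__(self):
        # Direct instantiation is an error, mirroring the documented behavior.
        raise EnvironmentError(
            "AutoModelSketch is designed to be instantiated via "
            "`AutoModelSketch.from_config(config)`."
        )

    @classmethod
    def from_config(cls, config):
        model_class = cls._mapping[type(config)]
        # Returns a concrete model instance, not an AutoModelSketch.
        return model_class(config)


model = AutoModelSketch.from_config(BertLikeConfig())
assert isinstance(model, BertLikeModel)
```

The mapping list below plays the role of `_mapping` here, at the scale of the whole library.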
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTModel (ASTConfig model)
- AfmoeConfig configuration class: AfmoeModel (AfmoeConfig model)
- Aimv2Config configuration class: Aimv2Model (Aimv2Config model)
- Aimv2VisionConfig configuration class: Aimv2VisionModel (Aimv2VisionConfig model)
- AlbertConfig configuration class: AlbertModel (AlbertConfig model)
- AlignConfig configuration class: AlignModel (AlignConfig model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIPConfig model)
- ApertusConfig configuration class: ApertusModel (ApertusConfig model)
- ArceeConfig configuration class: ArceeModel (ArceeConfig model)
- AriaConfig configuration class: AriaModel (AriaConfig model)
- AriaTextConfig configuration class: AriaTextModel (AriaTextConfig model)
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- AudioFlamingo3EncoderConfig configuration class: AudioFlamingo3Encoder (AudioFlamingo3EncoderConfig model)
- AutoformerConfig configuration class: AutoformerModel (AutoformerConfig model)
- AyaVisionConfig configuration class: AyaVisionModel (AyaVisionConfig model)
- BambaConfig configuration class: BambaModel (BambaConfig model)
- BarkConfig configuration class: BarkModel (BarkConfig model)
- BartConfig configuration class: BartModel (BartConfig model)
- BeitConfig configuration class: BeitModel (BeitConfig model)
- BertConfig configuration class: BertModel (BertConfig model)
- BertGenerationConfig configuration class: BertGenerationEncoder (BertGenerationConfig model)
- BigBirdConfig configuration class: BigBirdModel (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptModel (BioGptConfig model)
- BitConfig configuration class: BitModel (BitConfig model)
- BitNetConfig configuration class: BitNetModel (BitNetConfig model)
- BlenderbotConfig configuration class: BlenderbotModel (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmallConfig model)
- Blip2Config configuration class: Blip2Model (Blip2Config model)
- Blip2QFormerConfig configuration class: Blip2QFormerModel (Blip2QFormerConfig model)
- BlipConfig configuration class: BlipModel (BlipConfig model)
- BloomConfig configuration class: BloomModel (BloomConfig model)
- BltConfig configuration class: BltModel (BltConfig model)
- BridgeTowerConfig configuration class: BridgeTowerModel (BridgeTowerConfig model)
- BrosConfig configuration class: BrosModel (BrosConfig model)
- CLIPConfig configuration class: CLIPModel (CLIPConfig model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSegConfig model)
- CLIPTextConfig configuration class: CLIPTextModel (CLIPTextConfig model)
- CLIPVisionConfig configuration class: CLIPVisionModel (CLIPVisionConfig model)
- CTRLConfig configuration class: CTRLModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertModel (CamembertConfig model)
- CanineConfig configuration class: CanineModel (CanineConfig model)
- ChameleonConfig configuration class: ChameleonModel (ChameleonConfig model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (ChineseCLIPConfig model)
- ChineseCLIPVisionConfig configuration class: ChineseCLIPVisionModel (ChineseCLIPVisionConfig model)
- ClapConfig configuration class: ClapModel (ClapConfig model)
- ClvpConfig configuration class: ClvpModelForConditionalGeneration (ClvpConfig model)
- CodeGenConfig configuration class: CodeGenModel (CodeGenConfig model)
- Cohere2Config configuration class: Cohere2Model (Cohere2Config model)
- Cohere2VisionConfig configuration class: Cohere2VisionModel (Cohere2VisionConfig model)
- CohereAsrConfig configuration class: CohereAsrModel (CohereAsrConfig model)
- CohereConfig configuration class: CohereModel (CohereConfig model)
- ConditionalDetrConfig configuration class: ConditionalDetrModel (ConditionalDetrConfig model)
- ConvBertConfig configuration class: ConvBertModel (ConvBertConfig model)
- ConvNextConfig configuration class: ConvNextModel (ConvNextConfig model)
- ConvNextV2Config configuration class: ConvNextV2Model (ConvNextV2Config model)
- CpmAntConfig configuration class: CpmAntModel (CpmAntConfig model)
- CsmConfig configuration class: CsmForConditionalGeneration (CsmConfig model)
- CvtConfig configuration class: CvtModel (CvtConfig model)
- CwmConfig configuration class: CwmModel (CwmConfig model)
- DFineConfig configuration class: DFineModel (DFineConfig model)
- DINOv3ConvNextConfig configuration class: DINOv3ConvNextModel (DINOv3ConvNextConfig model)
- DINOv3ViTConfig configuration class: DINOv3ViTModel (DINOv3ViTConfig model)
- DPRConfig configuration class: DPRQuestionEncoder (DPRConfig model)
- DPTConfig configuration class: DPTModel (DPTConfig model)
- DabDetrConfig configuration class: DabDetrModel (DabDetrConfig model)
- DacConfig configuration class: DacModel (DacConfig model)
- Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudioConfig model)
- Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecTextConfig model)
- Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVisionConfig model)
- DbrxConfig configuration class: DbrxModel (DbrxConfig model)
- DebertaConfig configuration class: DebertaModel (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2Model (DebertaV2Config model)
- DecisionTransformerConfig configuration class: DecisionTransformerModel (DecisionTransformerConfig model)
- DeepseekV2Config configuration class: DeepseekV2Model (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3Model (DeepseekV3Config model)
- DeepseekVLConfig configuration class: DeepseekVLModel (DeepseekVLConfig model)
- DeepseekVLHybridConfig configuration class: DeepseekVLHybridModel (DeepseekVLHybridConfig model)
- DeformableDetrConfig configuration class: DeformableDetrModel (DeformableDetrConfig model)
- DeiTConfig configuration class: DeiTModel (DeiTConfig model)
- DepthProConfig configuration class: DepthProModel (DepthProConfig model)
- DetrConfig configuration class: DetrModel (DetrConfig model)
- DiaConfig configuration class: DiaModel (DiaConfig model)
- DiffLlamaConfig configuration class: DiffLlamaModel (DiffLlamaConfig model)
- DinatConfig configuration class: DinatModel (DinatConfig model)
- Dinov2Config configuration class: Dinov2Model (Dinov2Config model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersModel (Dinov2WithRegistersConfig model)
- DistilBertConfig configuration class: DistilBertModel (DistilBertConfig model)
- DogeConfig configuration class: DogeModel (DogeConfig model)
- DonutSwinConfig configuration class: DonutSwinModel (DonutSwinConfig model)
- Dots1Config configuration class: Dots1Model (Dots1Config model)
- EdgeTamConfig configuration class: EdgeTamModel (EdgeTamConfig model)
- EdgeTamVideoConfig configuration class: EdgeTamVideoModel (EdgeTamVideoConfig model)
- EdgeTamVisionConfig configuration class: EdgeTamVisionModel (EdgeTamVisionConfig model)
- EfficientLoFTRConfig configuration class: EfficientLoFTRModel (EfficientLoFTRConfig model)
- EfficientNetConfig configuration class: EfficientNetModel (EfficientNetConfig model)
- ElectraConfig configuration class: ElectraModel (ElectraConfig model)
- Emu3Config configuration class: Emu3Model (Emu3Config model)
- EncodecConfig configuration class: EncodecModel (EncodecConfig model)
- Ernie4_5Config configuration class: Ernie4_5Model (Ernie4_5Config model)
- Ernie4_5_MoeConfig configuration class: Ernie4_5_MoeModel (Ernie4_5_MoeConfig model)
- Ernie4_5_VLMoeConfig configuration class: Ernie4_5_VLMoeModel (Ernie4_5_VLMoeConfig model)
- ErnieConfig configuration class: ErnieModel (ErnieConfig model)
- EsmConfig configuration class: EsmModel (EsmConfig model)
- EuroBertConfig configuration class: EuroBertModel (EuroBertConfig model)
- EvollaConfig configuration class: EvollaModel (EvollaConfig model)
- Exaone4Config configuration class: Exaone4Model (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeModel (ExaoneMoeConfig model)
- FNetConfig configuration class: FNetModel (FNetConfig model)
- FSMTConfig configuration class: FSMTModel (FSMTConfig model)
- FalconConfig configuration class: FalconModel (FalconConfig model)
- FalconH1Config configuration class: FalconH1Model (FalconH1Config model)
- FalconMambaConfig configuration class: FalconMambaModel (FalconMambaConfig model)
- FastSpeech2ConformerConfig configuration class: FastSpeech2ConformerModel (FastSpeech2ConformerConfig model)
- FastSpeech2ConformerWithHifiGanConfig configuration class: FastSpeech2ConformerWithHifiGan (FastSpeech2ConformerWithHifiGanConfig model)
- FastVlmConfig configuration class: FastVlmModel (FastVlmConfig model)
- FlaubertConfig configuration class: FlaubertModel (FlaubertConfig model)
- FlavaConfig configuration class: FlavaModel (FlavaConfig model)
- FlexOlmoConfig configuration class: FlexOlmoModel (FlexOlmoConfig model)
- Florence2Config configuration class: Florence2Model (Florence2Config model)
- FocalNetConfig configuration class: FocalNetModel (FocalNetConfig model)
- FunnelConfig configuration class: FunnelModel or FunnelBaseModel (FunnelConfig model)
- FuyuConfig configuration class: FuyuModel (FuyuConfig model)
- GLPNConfig configuration class: GLPNModel (GLPNConfig model)
- GPT2Config configuration class: GPT2Model (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeModel (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJModel (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoModel (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXModel (GPTNeoXConfig model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseModel (GPTNeoXJapaneseConfig model)
- Gemma2Config configuration class: Gemma2Model (Gemma2Config model)
- Gemma3Config configuration class: Gemma3Model (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3TextModel (Gemma3TextConfig model)
- Gemma3nAudioConfig configuration class: Gemma3nAudioEncoder (Gemma3nAudioConfig model)
- Gemma3nConfig configuration class: Gemma3nModel (Gemma3nConfig model)
- Gemma3nTextConfig configuration class: Gemma3nTextModel (Gemma3nTextConfig model)
- Gemma3nVisionConfig configuration class: TimmWrapperModel (Gemma3nVisionConfig model)
- Gemma4AudioConfig configuration class: Gemma4AudioModel (Gemma4AudioConfig model)
- Gemma4Config configuration class: Gemma4Model (Gemma4Config model)
- Gemma4TextConfig configuration class: Gemma4TextModel (Gemma4TextConfig model)
- Gemma4VisionConfig configuration class: Gemma4VisionModel (Gemma4VisionConfig model)
- GemmaConfig configuration class: GemmaModel (GemmaConfig model)
- GitConfig configuration class: GitModel (GitConfig model)
- Glm46VConfig configuration class: Glm46VModel (Glm46VConfig model)
- Glm4Config configuration class: Glm4Model (Glm4Config model)
- Glm4MoeConfig configuration class: Glm4MoeModel (Glm4MoeConfig model)
- Glm4MoeLiteConfig configuration class: Glm4MoeLiteModel (Glm4MoeLiteConfig model)
- Glm4vConfig configuration class: Glm4vModel (Glm4vConfig model)
- Glm4vMoeConfig configuration class: Glm4vMoeModel (Glm4vMoeConfig model)
- Glm4vMoeTextConfig configuration class: Glm4vMoeTextModel (Glm4vMoeTextConfig model)
- Glm4vMoeVisionConfig configuration class: Glm4vMoeVisionModel (Glm4vMoeVisionConfig model)
- Glm4vTextConfig configuration class: Glm4vTextModel (Glm4vTextConfig model)
- Glm4vVisionConfig configuration class: Glm4vVisionModel (Glm4vVisionConfig model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- GlmAsrEncoderConfig configuration class: GlmAsrEncoder (GlmAsrEncoderConfig model)
- GlmConfig configuration class: GlmModel (GlmConfig model)
- GlmImageConfig configuration class: GlmImageModel (GlmImageConfig model)
- GlmImageTextConfig configuration class: GlmImageTextModel (GlmImageTextConfig model)
- GlmImageVQVAEConfig configuration class: GlmImageVQVAE (GlmImageVQVAEConfig model)
- GlmImageVisionConfig configuration class: GlmImageVisionModel (GlmImageVisionConfig model)
- GlmMoeDsaConfig configuration class: GlmMoeDsaModel (GlmMoeDsaConfig model)
- GlmOcrConfig configuration class: GlmOcrModel (GlmOcrConfig model)
- GlmOcrTextConfig configuration class: GlmOcrTextModel (GlmOcrTextConfig model)
- GlmOcrVisionConfig configuration class: GlmOcrVisionModel (GlmOcrVisionConfig model)
- GotOcr2Config configuration class: GotOcr2Model (GotOcr2Config model)
- GptOssConfig configuration class: GptOssModel (GptOssConfig model)
- GraniteConfig configuration class: GraniteModel (GraniteConfig model)
- GraniteMoeConfig configuration class: GraniteMoeModel (GraniteMoeConfig model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridModel (GraniteMoeHybridConfig model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedModel (GraniteMoeSharedConfig model)
- GroundingDinoConfig configuration class: GroundingDinoModel (GroundingDinoConfig model)
- GroupViTConfig configuration class: GroupViTModel (GroupViTConfig model)
- HGNetV2Config configuration class: HGNetV2Backbone (HGNetV2Config model)
- HeliumConfig configuration class: HeliumModel (HeliumConfig model)
- HieraConfig configuration class: HieraModel (HieraConfig model)
- HiggsAudioV2Config configuration class: HiggsAudioV2ForConditionalGeneration (HiggsAudioV2Config model)
- HiggsAudioV2TokenizerConfig configuration class: HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- HubertConfig configuration class: HubertModel (HubertConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1Model (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1Model (HunYuanMoEV1Config model)
- IBertConfig configuration class: IBertModel (IBertConfig model)
- IJepaConfig configuration class: IJepaModel (IJepaConfig model)
- Idefics2Config configuration class: Idefics2Model (Idefics2Config model)
- Idefics3Config configuration class: Idefics3Model (Idefics3Config model)
- Idefics3VisionConfig configuration class: Idefics3VisionTransformer (Idefics3VisionConfig model)
- IdeficsConfig configuration class: IdeficsModel (IdeficsConfig model)
- ImageGPTConfig configuration class: ImageGPTModel (ImageGPTConfig model)
- InformerConfig configuration class: InformerModel (InformerConfig model)
- InstructBlipConfig configuration class: InstructBlipModel (InstructBlipConfig model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoModel (InstructBlipVideoConfig model)
- InternVLConfig configuration class: InternVLModel (InternVLConfig model)
- InternVLVisionConfig configuration class: InternVLVisionModel (InternVLVisionConfig model)
- Jais2Config configuration class: Jais2Model (Jais2Config model)
- JambaConfig configuration class: JambaModel (JambaConfig model)
- JanusConfig configuration class: JanusModel (JanusConfig model)
- JetMoeConfig configuration class: JetMoeModel (JetMoeConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3Model (JinaEmbeddingsV3Config model)
- Kosmos2Config configuration class: Kosmos2Model (Kosmos2Config model)
- Kosmos2_5Config configuration class: Kosmos2_5Model (Kosmos2_5Config model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextModel (KyutaiSpeechToTextConfig model)
- LEDConfig configuration class: LEDModel (LEDConfig model)
- LasrCTCConfig configuration class: LasrForCTC (LasrCTCConfig model)
- LasrEncoderConfig configuration class: LasrEncoder (LasrEncoderConfig model)
- LayoutLMConfig configuration class: LayoutLMModel (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3Config model)
- LevitConfig configuration class: LevitModel (LevitConfig model)
- Lfm2Config configuration class: Lfm2Model (Lfm2Config model)
- Lfm2MoeConfig configuration class: Lfm2MoeModel (Lfm2MoeConfig model)
- Lfm2VlConfig configuration class: Lfm2VlModel (Lfm2VlConfig model)
- LightGlueConfig configuration class: LightGlueForKeypointMatching (LightGlueConfig model)
- LightOnOcrConfig configuration class: LightOnOcrModel (LightOnOcrConfig model)
- LiltConfig configuration class: LiltModel (LiltConfig model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4Config model)
- Llama4TextConfig configuration class: Llama4TextModel (Llama4TextConfig model)
- LlamaConfig configuration class: LlamaModel (LlamaConfig model)
- LlavaConfig configuration class: LlavaModel (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextModel (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoModel (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionModel (LlavaOnevisionConfig model)
- LongT5Config configuration class: LongT5Model (LongT5Config model)
- LongcatFlashConfig configuration class: LongcatFlashModel (LongcatFlashConfig model)
- LongformerConfig configuration class: LongformerModel (LongformerConfig model)
- LukeConfig configuration class: LukeModel (LukeConfig model)
- LwDetrConfig configuration class: LwDetrModel (LwDetrConfig model)
- LxmertConfig configuration class: LxmertModel (LxmertConfig model)
- M2M100Config configuration class: M2M100Model (M2M100Config model)
- MBartConfig configuration class: MBartModel (MBartConfig model)
- MLCDVisionConfig configuration class: MLCDVisionModel (MLCDVisionConfig model)
- MMGroundingDinoConfig configuration class: MMGroundingDinoModel (MMGroundingDinoConfig model)
- MPNetConfig configuration class: MPNetModel (MPNetConfig model)
- MT5Config configuration class: MT5Model (MT5Config model)
- Mamba2Config configuration class: Mamba2Model (Mamba2Config model)
- MambaConfig configuration class: MambaModel (MambaConfig model)
- MarianConfig configuration class: MarianModel (MarianConfig model)
- MarkupLMConfig configuration class: MarkupLMModel (MarkupLMConfig model)
- Mask2FormerConfig configuration class: Mask2FormerModel (Mask2FormerConfig model)
- MaskFormerConfig configuration class: MaskFormerModel (MaskFormerConfig model)
- MaskFormerSwinConfig configuration class: MaskFormerSwinModel (MaskFormerSwinConfig model)
- MegatronBertConfig configuration class: MegatronBertModel (MegatronBertConfig model)
- MetaClip2Config configuration class: MetaClip2Model (MetaClip2Config model)
- MgpstrConfig configuration class: MgpstrForSceneTextRecognition (MgpstrConfig model)
- MimiConfig configuration class: MimiModel (MimiConfig model)
- MiniMaxConfig configuration class: MiniMaxModel (MiniMaxConfig model)
- MiniMaxM2Config configuration class: MiniMaxM2Model (MiniMaxM2Config model)
- Ministral3Config configuration class: Ministral3Model (Ministral3Config model)
- MinistralConfig configuration class: MinistralModel (MinistralConfig model)
- Mistral3Config configuration class: Mistral3Model (Mistral3Config model)
- Mistral4Config configuration class: Mistral4Model (Mistral4Config model)
- MistralConfig configuration class: MistralModel (MistralConfig model)
- MixtralConfig configuration class: MixtralModel (MixtralConfig model)
- MllamaConfig configuration class: MllamaModel (MllamaConfig model)
- MobileBertConfig configuration class: MobileBertModel (MobileBertConfig model)
- MobileNetV1Config configuration class: MobileNetV1Model (MobileNetV1Config model)
- MobileNetV2Config configuration class: MobileNetV2Model (MobileNetV2Config model)
- MobileViTConfig configuration class: MobileViTModel (MobileViTConfig model)
- MobileViTV2Config configuration class: MobileViTV2Model (MobileViTV2Config model)
- ModernBertConfig configuration class: ModernBertModel (ModernBertConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderModel (ModernBertDecoderConfig model)
- ModernVBertConfig configuration class: ModernVBertModel (ModernVBertConfig model)
- MoonshineConfig configuration class: MoonshineModel (MoonshineConfig model)
- MoonshineStreamingConfig configuration class: MoonshineStreamingModel (MoonshineStreamingConfig model)
- MoshiConfig configuration class: MoshiModel (MoshiConfig model)
- MptConfig configuration class: MptModel (MptConfig model)
- MraConfig configuration class: MraModel (MraConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MusicgenConfig configuration class: MusicgenModel (MusicgenConfig model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyModel (MusicgenMelodyConfig model)
- MvpConfig configuration class: MvpModel (MvpConfig model)
- NanoChatConfig configuration class: NanoChatModel (NanoChatConfig model)
- NemotronConfig configuration class: NemotronModel (NemotronConfig model)
- NemotronHConfig configuration class: NemotronHModel (NemotronHConfig model)
- NllbMoeConfig configuration class: NllbMoeModel (NllbMoeConfig model)
- NomicBertConfig configuration class: NomicBertModel (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerModel (NystromformerConfig model)
- OPTConfig configuration class: OPTModel (OPTConfig model)
- Olmo2Config configuration class: Olmo2Model (Olmo2Config model)
- Olmo3Config configuration class: Olmo3Model (Olmo3Config model)
- OlmoConfig configuration class: OlmoModel (OlmoConfig model)
- OlmoHybridConfig configuration class: OlmoHybridModel (OlmoHybridConfig model)
- OlmoeConfig configuration class: OlmoeModel (OlmoeConfig model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDetTurboConfig model)
- OneFormerConfig configuration class: OneFormerModel (OneFormerConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAIGPTConfig model)
- Ovis2Config configuration class: Ovis2Model (Ovis2Config model)
- OwlViTConfig configuration class: OwlViTModel (OwlViTConfig model)
- Owlv2Config configuration class: Owlv2Model (Owlv2Config model)
- PI0Config configuration class: PI0Model (PI0Config model)
- PLBartConfig configuration class: PLBartModel (PLBartConfig model)
- PPDocLayoutV3Config configuration class: PPDocLayoutV3Model (PPDocLayoutV3Config model)
- PPOCRV5MobileRecConfig configuration class: PPOCRV5MobileRecModel (PPOCRV5MobileRecConfig model)
- PPOCRV5ServerRecConfig configuration class: PPOCRV5ServerRecModel (PPOCRV5ServerRecConfig model)
- PaliGemmaConfig configuration class: PaliGemmaModel (PaliGemmaConfig model)
- ParakeetCTCConfig configuration class: ParakeetForCTC (ParakeetCTCConfig model)
- ParakeetEncoderConfig configuration class: ParakeetEncoder (ParakeetEncoderConfig model)
- PatchTSMixerConfig configuration class: PatchTSMixerModel (PatchTSMixerConfig model)
- PatchTSTConfig configuration class: PatchTSTModel (PatchTSTConfig model)
- PeAudioConfig configuration class: PeAudioModel (PeAudioConfig model)
- PeAudioEncoderConfig configuration class: PeAudioEncoder (PeAudioEncoderConfig model)
- PeAudioVideoConfig configuration class: PeAudioVideoModel (PeAudioVideoConfig model)
- PeAudioVideoEncoderConfig configuration class: PeAudioVideoEncoder (PeAudioVideoEncoderConfig model)
- PeVideoConfig configuration class: PeVideoModel (PeVideoConfig model)
- PeVideoEncoderConfig configuration class: PeVideoEncoder (PeVideoEncoderConfig model)
- PegasusConfig configuration class: PegasusModel (PegasusConfig model)
- PegasusXConfig configuration class: PegasusXModel (PegasusXConfig model)
- PerceiverConfig configuration class: PerceiverModel (PerceiverConfig model)
- PerceptionLMConfig configuration class: PerceptionLMModel (PerceptionLMConfig model)
- PersimmonConfig configuration class: PersimmonModel (PersimmonConfig model)
- Phi3Config configuration class: Phi3Model (Phi3Config model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalModel (Phi4MultimodalConfig model)
- PhiConfig configuration class: PhiModel (PhiConfig model)
- PhimoeConfig configuration class: PhimoeModel (PhimoeConfig model)
- PixioConfig configuration class: PixioModel (PixioConfig model)
- PixtralVisionConfig configuration class: PixtralVisionModel (PixtralVisionConfig model)
- PoolFormerConfig configuration class: PoolFormerModel (PoolFormerConfig model)
- ProphetNetConfig configuration class: ProphetNetModel (ProphetNetConfig model)
- PvtConfig configuration class: PvtModel (PvtConfig model)
- PvtV2Config configuration class: PvtV2Model (PvtV2Config model)
- QianfanOCRConfig configuration class: QianfanOCRModel (QianfanOCRConfig model)
- QianfanOCRVisionConfig configuration class: QianfanOCRVisionModel (QianfanOCRVisionConfig model)
- Qwen2AudioEncoderConfig configuration class: Qwen2AudioEncoder (Qwen2AudioEncoderConfig model)
- Qwen2Config configuration class: Qwen2Model (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeModel (Qwen2MoeConfig model)
- Qwen2VLConfig configuration class: Qwen2VLModel (Qwen2VLConfig model)
- Qwen2VLTextConfig configuration class: Qwen2VLTextModel (Qwen2VLTextConfig model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLModel (Qwen2_5_VLConfig model)
- Qwen2_5_VLTextConfig configuration class: Qwen2_5_VLTextModel (Qwen2_5_VLTextConfig model)
- Qwen3Config configuration class: Qwen3Model (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeModel (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextModel (Qwen3NextConfig model)
- Qwen3VLConfig configuration class: Qwen3VLModel (Qwen3VLConfig model)
- Qwen3VLMoeConfig configuration class: Qwen3VLMoeModel (Qwen3VLMoeConfig model)
- Qwen3VLMoeTextConfig configuration class: Qwen3VLMoeTextModel (Qwen3VLMoeTextConfig model)
- Qwen3VLTextConfig configuration class: Qwen3VLTextModel (Qwen3VLTextConfig model)
- Qwen3_5Config configuration class: Qwen3_5Model (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeModel (Qwen3_5MoeConfig model)
- Qwen3_5MoeTextConfig configuration class: Qwen3_5MoeTextModel (Qwen3_5MoeTextConfig model)
- Qwen3_5TextConfig configuration class: Qwen3_5TextModel (Qwen3_5TextConfig model)
- RTDetrConfig configuration class: RTDetrModel (RTDetrConfig model)
- RTDetrV2Config configuration class: RTDetrV2Model (RTDetrV2Config model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaModel (RecurrentGemmaConfig model)
- ReformerConfig configuration class: ReformerModel (ReformerConfig model)
- RegNetConfig configuration class: RegNetModel (RegNetConfig model)
- RemBertConfig configuration class: RemBertModel (RemBertConfig model)
- ResNetConfig configuration class: ResNetModel (ResNetConfig model)
- RoCBertConfig configuration class: RoCBertModel (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerModel (RoFormerConfig model)
- RobertaConfig configuration class: RobertaModel (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormModel (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvModel (RwkvConfig model)
- SEWConfig configuration class: SEWModel (SEWConfig model)
- SEWDConfig configuration class: SEWDModel (SEWDConfig model)
- Sam2Config configuration class: Sam2Model (Sam2Config model)
- Sam2HieraDetConfig configuration class: Sam2HieraDetModel (Sam2HieraDetConfig model)
- Sam2VideoConfig configuration class: Sam2VideoModel (Sam2VideoConfig model)
- Sam2VisionConfig configuration class: Sam2VisionModel (Sam2VisionConfig model)
- Sam3Config configuration class: Sam3Model (Sam3Config model)
- Sam3LiteTextConfig configuration class: Sam3LiteTextModel (Sam3LiteTextConfig model)
- Sam3LiteTextTextConfig configuration class: Sam3LiteTextTextModel (Sam3LiteTextTextConfig model)
- Sam3TrackerConfig configuration class: Sam3TrackerModel (Sam3TrackerConfig model)
- Sam3TrackerVideoConfig configuration class: Sam3TrackerVideoModel (Sam3TrackerVideoConfig model)
- Sam3ViTConfig configuration class: Sam3ViTModel (Sam3ViTConfig model)
- Sam3VideoConfig configuration class: Sam3VideoModel (Sam3VideoConfig model)
- Sam3VisionConfig configuration class: Sam3VisionModel (Sam3VisionConfig model)
- SamConfig configuration class: SamModel (SamConfig model)
- SamHQConfig configuration class: SamHQModel (SamHQConfig model)
- SamHQVisionConfig configuration class: SamHQVisionModel (SamHQVisionConfig model)
- SamVisionConfig configuration class: SamVisionModel (SamVisionConfig model)
- SeamlessM4TConfig configuration class: SeamlessM4TModel (SeamlessM4TConfig model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2Model (SeamlessM4Tv2Config model)
- SeedOssConfig configuration class: SeedOssModel (SeedOssConfig model)
- SegGptConfig configuration class: SegGptModel (SegGptConfig model)
- SegformerConfig configuration class: SegformerModel (SegformerConfig model)
- Siglip2Config configuration class: Siglip2Model (Siglip2Config model)
- Siglip2VisionConfig configuration class: Siglip2VisionModel (Siglip2VisionConfig model)
- SiglipConfig configuration class: SiglipModel (SiglipConfig model)
- SiglipVisionConfig configuration class: SiglipVisionModel (SiglipVisionConfig model)
- SmolLM3Config configuration class: SmolLM3Model (SmolLM3Config model)
- SmolVLMConfig configuration class: SmolVLMModel (SmolVLMConfig model)
- SmolVLMVisionConfig configuration class: SmolVLMVisionTransformer (SmolVLMVisionConfig model)
- SolarOpenConfig configuration class: SolarOpenModel (SolarOpenConfig model)
- Speech2TextConfig configuration class: Speech2TextModel (Speech2TextConfig model)
- SpeechT5Config configuration class: SpeechT5Model (SpeechT5Config model)
- SplinterConfig configuration class: SplinterModel (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmModel (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2Model (Starcoder2Config model)
- SwiftFormerConfig configuration class: SwiftFormerModel (SwiftFormerConfig model)
- Swin2SRConfig configuration class: Swin2SRModel (Swin2SRConfig model)
- SwinConfig configuration class: SwinModel (SwinConfig model)
- Swinv2Config configuration class: Swinv2Model (Swinv2Config model)
- SwitchTransformersConfig configuration class: SwitchTransformersModel (SwitchTransformersConfig model)
- T5Config configuration class: T5Model (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2Model (T5Gemma2Config model)
- T5Gemma2EncoderConfig configuration class: T5Gemma2Encoder (T5Gemma2EncoderConfig model)
- T5GemmaConfig configuration class: T5GemmaModel (T5GemmaConfig model)
- TableTransformerConfig configuration class: TableTransformerModel (TableTransformerConfig model)
- TapasConfig configuration class: TapasModel (TapasConfig model)
- TextNetConfig configuration class: TextNetModel (TextNetConfig model)
- TimeSeriesTransformerConfig configuration class: TimeSeriesTransformerModel (TimeSeriesTransformerConfig model)
- TimesFm2_5Config configuration class: TimesFm2_5Model (TimesFm2_5Config model)
- TimesFmConfig configuration class: TimesFmModel (TimesFmConfig model)
- TimesformerConfig configuration class: TimesformerModel (TimesformerConfig model)
- TimmBackboneConfig configuration class: TimmBackbone (TimmBackboneConfig model)
- TimmWrapperConfig configuration class: TimmWrapperModel (TimmWrapperConfig model)
- TvpConfig configuration class: TvpModel (TvpConfig model)
- UMT5Config configuration class: UMT5Model (UMT5Config model)
- UVDocConfig configuration class: UVDocModel (UVDocConfig model)
- UdopConfig configuration class: UdopModel (UdopConfig model)
- UniSpeechConfig configuration class: UniSpeechModel (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSatConfig model)
- UnivNetConfig configuration class: UnivNetModel (UnivNetConfig model)
- VJEPA2Config configuration class: VJEPA2Model (VJEPA2Config model)
- VaultGemmaConfig configuration class: VaultGemmaModel (VaultGemmaConfig model)
- ViTConfig configuration class: ViTModel (ViTConfig model)
- ViTMAEConfig configuration class: ViTMAEModel (ViTMAEConfig
model)ViTMSNConfigconfiguration class:ViTMSNModel(ViTMSNConfig model)VibeVoiceAcousticTokenizerConfigconfiguration class:VibeVoiceAcousticTokenizerModel(VibeVoiceAcousticTokenizerConfig model)VibeVoiceAcousticTokenizerDecoderConfigconfiguration class:VibeVoiceAcousticTokenizerDecoderModel(VibeVoiceAcousticTokenizerDecoderConfig model)VibeVoiceAcousticTokenizerEncoderConfigconfiguration class:VibeVoiceAcousticTokenizerEncoderModel(VibeVoiceAcousticTokenizerEncoderConfig model)VibeVoiceAsrConfigconfiguration class:VibeVoiceAsrForConditionalGeneration(VibeVoiceAsrConfig model)VideoLlama3Configconfiguration class:VideoLlama3Model(VideoLlama3Config model)VideoLlama3VisionConfigconfiguration class:VideoLlama3VisionModel(VideoLlama3VisionConfig model)VideoLlavaConfigconfiguration class:VideoLlavaModel(VideoLlavaConfig model)VideoMAEConfigconfiguration class:VideoMAEModel(VideoMAEConfig model)ViltConfigconfiguration class:ViltModel(ViltConfig model)VipLlavaConfigconfiguration class:VipLlavaModel(VipLlavaConfig model)VisionTextDualEncoderConfigconfiguration class:VisionTextDualEncoderModel(VisionTextDualEncoderConfig model)VisualBertConfigconfiguration class:VisualBertModel(VisualBertConfig model)VitDetConfigconfiguration class:VitDetModel(VitDetConfig model)VitsConfigconfiguration class:VitsModel(VitsConfig model)VivitConfigconfiguration class:VivitModel(VivitConfig model)VoxtralConfigconfiguration class:VoxtralForConditionalGeneration(VoxtralConfig model)VoxtralEncoderConfigconfiguration class:VoxtralEncoder(VoxtralEncoderConfig model)VoxtralRealtimeConfigconfiguration class:VoxtralRealtimeForConditionalGeneration(VoxtralRealtimeConfig model)VoxtralRealtimeEncoderConfigconfiguration class:VoxtralRealtimeEncoder(VoxtralRealtimeEncoderConfig model)VoxtralRealtimeTextConfigconfiguration class:VoxtralRealtimeTextModel(VoxtralRealtimeTextConfig model)Wav2Vec2BertConfigconfiguration class:Wav2Vec2BertModel(Wav2Vec2BertConfig model)Wav2Vec2Configconfiguration 
class:Wav2Vec2Model(Wav2Vec2Config model)Wav2Vec2ConformerConfigconfiguration class:Wav2Vec2ConformerModel(Wav2Vec2ConformerConfig model)WavLMConfigconfiguration class:WavLMModel(WavLMConfig model)WhisperConfigconfiguration class:WhisperModel(WhisperConfig model)XCLIPConfigconfiguration class:XCLIPModel(XCLIPConfig model)XGLMConfigconfiguration class:XGLMModel(XGLMConfig model)XLMConfigconfiguration class:XLMModel(XLMConfig model)XLMRobertaConfigconfiguration class:XLMRobertaModel(XLMRobertaConfig model)XLMRobertaXLConfigconfiguration class:XLMRobertaXLModel(XLMRobertaXLConfig model)XLNetConfigconfiguration class:XLNetModel(XLNetConfig model)XcodecConfigconfiguration class:XcodecModel(XcodecConfig model)XmodConfigconfiguration class:XmodModel(XmodConfig model)YolosConfigconfiguration class:YolosModel(YolosConfig model)YosoConfigconfiguration class:YosoModel(YosoConfig model)YoutuConfigconfiguration class:YoutuModel(YoutuConfig model)Zamba2Configconfiguration class:Zamba2Model(Zamba2Config model)ZambaConfigconfiguration class:ZambaModel(ZambaConfig model)xLSTMConfigconfiguration class:xLSTMModel(xLSTMConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
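The config/kwargs routing described above can be sketched offline with a tiny randomly initialized BERT (an illustrative setup, assuming transformers and PyTorch are installed; the small config sizes are arbitrary):

```python
import tempfile

from transformers import AutoModel, BertConfig, BertModel

# Save a tiny randomly initialized BERT locally so no download is needed.
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
with tempfile.TemporaryDirectory() as tmp:
    BertModel(config).save_pretrained(tmp)

    # No config argument is passed, so output_attentions is first routed to the
    # loaded configuration; the saved config.json's model_type selects BertModel.
    model = AutoModel.from_pretrained(tmp, output_attentions=True)

print(type(model).__name__, model.config.output_attentions)  # BertModel True
```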
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- afmoe — AfmoeModel (AfmoeConfig model)
- aimv2 — Aimv2Model (Aimv2Config model)
- aimv2_vision_model — Aimv2VisionModel (Aimv2VisionConfig model)
- albert — AlbertModel (AlbertConfig model)
- align — AlignModel (AlignConfig model)
- altclip — AltCLIPModel (AltCLIPConfig model)
- apertus — ApertusModel (ApertusConfig model)
- arcee — ArceeModel (ArceeConfig model)
- aria — AriaModel (AriaConfig model)
- aria_text — AriaTextModel (AriaTextConfig model)
- audio-spectrogram-transformer — ASTModel (ASTConfig model)
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- audioflamingo3_encoder — AudioFlamingo3Encoder (AudioFlamingo3EncoderConfig model)
- autoformer — AutoformerModel (AutoformerConfig model)
- aya_vision — AyaVisionModel (AyaVisionConfig model)
- bamba — BambaModel (BambaConfig model)
- bark — BarkModel (BarkConfig model)
- bart — BartModel (BartConfig model)
- beit — BeitModel (BeitConfig model)
- bert — BertModel (BertConfig model)
- bert-generation — BertGenerationEncoder (BertGenerationConfig model)
- big_bird — BigBirdModel (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusModel (BigBirdPegasusConfig model)
- biogpt — BioGptModel (BioGptConfig model)
- bit — BitModel (BitConfig model)
- bitnet — BitNetModel (BitNetConfig model)
- blenderbot — BlenderbotModel (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallModel (BlenderbotSmallConfig model)
- blip — BlipModel (BlipConfig model)
- blip-2 — Blip2Model (Blip2Config model)
- blip_2_qformer — Blip2QFormerModel (Blip2QFormerConfig model)
- bloom — BloomModel (BloomConfig model)
- blt — BltModel (BltConfig model)
- bridgetower — BridgeTowerModel (BridgeTowerConfig model)
- bros — BrosModel (BrosConfig model)
- camembert — CamembertModel (CamembertConfig model)
- canine — CanineModel (CanineConfig model)
- chameleon — ChameleonModel (ChameleonConfig model)
- chinese_clip — ChineseCLIPModel (ChineseCLIPConfig model)
- chinese_clip_vision_model — ChineseCLIPVisionModel (ChineseCLIPVisionConfig model)
- clap — ClapModel (ClapConfig model)
- clip — CLIPModel (CLIPConfig model)
- clip_text_model — CLIPTextModel (CLIPTextConfig model)
- clip_vision_model — CLIPVisionModel (CLIPVisionConfig model)
- clipseg — CLIPSegModel (CLIPSegConfig model)
- clvp — ClvpModelForConditionalGeneration (ClvpConfig model)
- codegen — CodeGenModel (CodeGenConfig model)
- cohere — CohereModel (CohereConfig model)
- cohere2 — Cohere2Model (Cohere2Config model)
- cohere2_vision — Cohere2VisionModel (Cohere2VisionConfig model)
- cohere_asr — CohereAsrModel (CohereAsrConfig model)
- conditional_detr — ConditionalDetrModel (ConditionalDetrConfig model)
- convbert — ConvBertModel (ConvBertConfig model)
- convnext — ConvNextModel (ConvNextConfig model)
- convnextv2 — ConvNextV2Model (ConvNextV2Config model)
- cpmant — CpmAntModel (CpmAntConfig model)
- csm — CsmForConditionalGeneration (CsmConfig model)
- ctrl — CTRLModel (CTRLConfig model)
- cvt — CvtModel (CvtConfig model)
- cwm — CwmModel (CwmConfig model)
- d_fine — DFineModel (DFineConfig model)
- dab-detr — DabDetrModel (DabDetrConfig model)
- dac — DacModel (DacConfig model)
- data2vec-audio — Data2VecAudioModel (Data2VecAudioConfig model)
- data2vec-text — Data2VecTextModel (Data2VecTextConfig model)
- data2vec-vision — Data2VecVisionModel (Data2VecVisionConfig model)
- dbrx — DbrxModel (DbrxConfig model)
- deberta — DebertaModel (DebertaConfig model)
- deberta-v2 — DebertaV2Model (DebertaV2Config model)
- decision_transformer — DecisionTransformerModel (DecisionTransformerConfig model)
- deepseek_v2 — DeepseekV2Model (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3Model (DeepseekV3Config model)
- deepseek_vl — DeepseekVLModel (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridModel (DeepseekVLHybridConfig model)
- deformable_detr — DeformableDetrModel (DeformableDetrConfig model)
- deit — DeiTModel (DeiTConfig model)
- depth_pro — DepthProModel (DepthProConfig model)
- detr — DetrModel (DetrConfig model)
- dia — DiaModel (DiaConfig model)
- diffllama — DiffLlamaModel (DiffLlamaConfig model)
- dinat — DinatModel (DinatConfig model)
- dinov2 — Dinov2Model (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersModel (Dinov2WithRegistersConfig model)
- dinov3_convnext — DINOv3ConvNextModel (DINOv3ConvNextConfig model)
- dinov3_vit — DINOv3ViTModel (DINOv3ViTConfig model)
- distilbert — DistilBertModel (DistilBertConfig model)
- doge — DogeModel (DogeConfig model)
- donut-swin — DonutSwinModel (DonutSwinConfig model)
- dots1 — Dots1Model (Dots1Config model)
- dpr — DPRQuestionEncoder (DPRConfig model)
- dpt — DPTModel (DPTConfig model)
- edgetam — EdgeTamModel (EdgeTamConfig model)
- edgetam_video — EdgeTamVideoModel (EdgeTamVideoConfig model)
- edgetam_vision_model — EdgeTamVisionModel (EdgeTamVisionConfig model)
- efficientloftr — EfficientLoFTRModel (EfficientLoFTRConfig model)
- efficientnet — EfficientNetModel (EfficientNetConfig model)
- electra — ElectraModel (ElectraConfig model)
- emu3 — Emu3Model (Emu3Config model)
- encodec — EncodecModel (EncodecConfig model)
- ernie — ErnieModel (ErnieConfig model)
- ernie4_5 — Ernie4_5Model (Ernie4_5Config model)
- ernie4_5_moe — Ernie4_5_MoeModel (Ernie4_5_MoeConfig model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeModel (Ernie4_5_VLMoeConfig model)
- esm — EsmModel (EsmConfig model)
- eurobert — EuroBertModel (EuroBertConfig model)
- evolla — EvollaModel (EvollaConfig model)
- exaone4 — Exaone4Model (Exaone4Config model)
- exaone_moe — ExaoneMoeModel (ExaoneMoeConfig model)
- falcon — FalconModel (FalconConfig model)
- falcon_h1 — FalconH1Model (FalconH1Config model)
- falcon_mamba — FalconMambaModel (FalconMambaConfig model)
- fast_vlm — FastVlmModel (FastVlmConfig model)
- fastspeech2_conformer — FastSpeech2ConformerModel (FastSpeech2ConformerConfig model)
- fastspeech2_conformer_with_hifigan — FastSpeech2ConformerWithHifiGan (FastSpeech2ConformerWithHifiGanConfig model)
- flaubert — FlaubertModel (FlaubertConfig model)
- flava — FlavaModel (FlavaConfig model)
- flex_olmo — FlexOlmoModel (FlexOlmoConfig model)
- florence2 — Florence2Model (Florence2Config model)
- fnet — FNetModel (FNetConfig model)
- focalnet — FocalNetModel (FocalNetConfig model)
- fsmt — FSMTModel (FSMTConfig model)
- funnel — FunnelModel or FunnelBaseModel (FunnelConfig model)
- fuyu — FuyuModel (FuyuConfig model)
- gemma — GemmaModel (GemmaConfig model)
- gemma2 — Gemma2Model (Gemma2Config model)
- gemma3 — Gemma3Model (Gemma3Config model)
- gemma3_text — Gemma3TextModel (Gemma3TextConfig model)
- gemma3n — Gemma3nModel (Gemma3nConfig model)
- gemma3n_audio — Gemma3nAudioEncoder (Gemma3nAudioConfig model)
- gemma3n_text — Gemma3nTextModel (Gemma3nTextConfig model)
- gemma3n_vision — TimmWrapperModel (Gemma3nVisionConfig model)
- gemma4 — Gemma4Model (Gemma4Config model)
- gemma4_audio — Gemma4AudioModel (Gemma4AudioConfig model)
- gemma4_text — Gemma4TextModel (Gemma4TextConfig model)
- gemma4_vision — Gemma4VisionModel (Gemma4VisionConfig model)
- git — GitModel (GitConfig model)
- glm — GlmModel (GlmConfig model)
- glm4 — Glm4Model (Glm4Config model)
- glm46v — Glm46VModel (Glm46VConfig model)
- glm4_moe — Glm4MoeModel (Glm4MoeConfig model)
- glm4_moe_lite — Glm4MoeLiteModel (Glm4MoeLiteConfig model)
- glm4v — Glm4vModel (Glm4vConfig model)
- glm4v_moe — Glm4vMoeModel (Glm4vMoeConfig model)
- glm4v_moe_text — Glm4vMoeTextModel (Glm4vMoeTextConfig model)
- glm4v_moe_vision — Glm4vMoeVisionModel (Glm4vMoeVisionConfig model)
- glm4v_text — Glm4vTextModel (Glm4vTextConfig model)
- glm4v_vision — Glm4vVisionModel (Glm4vVisionConfig model)
- glm_image — GlmImageModel (GlmImageConfig model)
- glm_image_text — GlmImageTextModel (GlmImageTextConfig model)
- glm_image_vision — GlmImageVisionModel (GlmImageVisionConfig model)
- glm_image_vqmodel — GlmImageVQVAE (GlmImageVQVAEConfig model)
- glm_moe_dsa — GlmMoeDsaModel (GlmMoeDsaConfig model)
- glm_ocr — GlmOcrModel (GlmOcrConfig model)
- glm_ocr_text — GlmOcrTextModel (GlmOcrTextConfig model)
- glm_ocr_vision — GlmOcrVisionModel (GlmOcrVisionConfig model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- glmasr_encoder — GlmAsrEncoder (GlmAsrEncoderConfig model)
- glpn — GLPNModel (GLPNConfig model)
- got_ocr2 — GotOcr2Model (GotOcr2Config model)
- gpt-sw3 — GPT2Model (GPT2Config model)
- gpt2 — GPT2Model (GPT2Config model)
- gpt_bigcode — GPTBigCodeModel (GPTBigCodeConfig model)
- gpt_neo — GPTNeoModel (GPTNeoConfig model)
- gpt_neox — GPTNeoXModel (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseModel (GPTNeoXJapaneseConfig model)
- gpt_oss — GptOssModel (GptOssConfig model)
- gptj — GPTJModel (GPTJConfig model)
- granite — GraniteModel (GraniteConfig model)
- granitemoe — GraniteMoeModel (GraniteMoeConfig model)
- granitemoehybrid — GraniteMoeHybridModel (GraniteMoeHybridConfig model)
- granitemoeshared — GraniteMoeSharedModel (GraniteMoeSharedConfig model)
- grounding-dino — GroundingDinoModel (GroundingDinoConfig model)
- groupvit — GroupViTModel (GroupViTConfig model)
- helium — HeliumModel (HeliumConfig model)
- hgnet_v2 — HGNetV2Backbone (HGNetV2Config model)
- hiera — HieraModel (HieraConfig model)
- higgs_audio_v2 — HiggsAudioV2ForConditionalGeneration (HiggsAudioV2Config model)
- higgs_audio_v2_tokenizer — HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- hubert — HubertModel (HubertConfig model)
- hunyuan_v1_dense — HunYuanDenseV1Model (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1Model (HunYuanMoEV1Config model)
- ibert — IBertModel (IBertConfig model)
- idefics — IdeficsModel (IdeficsConfig model)
- idefics2 — Idefics2Model (Idefics2Config model)
- idefics3 — Idefics3Model (Idefics3Config model)
- idefics3_vision — Idefics3VisionTransformer (Idefics3VisionConfig model)
- ijepa — IJepaModel (IJepaConfig model)
- imagegpt — ImageGPTModel (ImageGPTConfig model)
- informer — InformerModel (InformerConfig model)
- instructblip — InstructBlipModel (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoModel (InstructBlipVideoConfig model)
- internvl — InternVLModel (InternVLConfig model)
- internvl_vision — InternVLVisionModel (InternVLVisionConfig model)
- jais2 — Jais2Model (Jais2Config model)
- jamba — JambaModel (JambaConfig model)
- janus — JanusModel (JanusConfig model)
- jetmoe — JetMoeModel (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3Model (JinaEmbeddingsV3Config model)
- kosmos-2 — Kosmos2Model (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5Model (Kosmos2_5Config model)
- kyutai_speech_to_text — KyutaiSpeechToTextModel (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrForCTC (LasrCTCConfig model)
- lasr_encoder — LasrEncoder (LasrEncoderConfig model)
- layoutlm — LayoutLMModel (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2Model (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Model (LayoutLMv3Config model)
- led — LEDModel (LEDConfig model)
- levit — LevitModel (LevitConfig model)
- lfm2 — Lfm2Model (Lfm2Config model)
- lfm2_moe — Lfm2MoeModel (Lfm2MoeConfig model)
- lfm2_vl — Lfm2VlModel (Lfm2VlConfig model)
- lightglue — LightGlueForKeypointMatching (LightGlueConfig model)
- lighton_ocr — LightOnOcrModel (LightOnOcrConfig model)
- lilt — LiltModel (LiltConfig model)
- llama — LlamaModel (LlamaConfig model)
- llama4 — Llama4ForConditionalGeneration (Llama4Config model)
- llama4_text — Llama4TextModel (Llama4TextConfig model)
- llava — LlavaModel (LlavaConfig model)
- llava_next — LlavaNextModel (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoModel (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionModel (LlavaOnevisionConfig model)
- longcat_flash — LongcatFlashModel (LongcatFlashConfig model)
- longformer — LongformerModel (LongformerConfig model)
- longt5 — LongT5Model (LongT5Config model)
- luke — LukeModel (LukeConfig model)
- lw_detr — LwDetrModel (LwDetrConfig model)
- lxmert — LxmertModel (LxmertConfig model)
- m2m_100 — M2M100Model (M2M100Config model)
- mamba — MambaModel (MambaConfig model)
- mamba2 — Mamba2Model (Mamba2Config model)
- marian — MarianModel (MarianConfig model)
- markuplm — MarkupLMModel (MarkupLMConfig model)
- mask2former — Mask2FormerModel (Mask2FormerConfig model)
- maskformer — MaskFormerModel (MaskFormerConfig model)
- maskformer-swin — MaskFormerSwinModel (MaskFormerSwinConfig model)
- mbart — MBartModel (MBartConfig model)
- megatron-bert — MegatronBertModel (MegatronBertConfig model)
- metaclip_2 — MetaClip2Model (MetaClip2Config model)
- mgp-str — MgpstrForSceneTextRecognition (MgpstrConfig model)
- mimi — MimiModel (MimiConfig model)
- minimax — MiniMaxModel (MiniMaxConfig model)
- minimax_m2 — MiniMaxM2Model (MiniMaxM2Config model)
- ministral — MinistralModel (MinistralConfig model)
- ministral3 — Ministral3Model (Ministral3Config model)
- mistral — MistralModel (MistralConfig model)
- mistral3 — Mistral3Model (Mistral3Config model)
- mistral4 — Mistral4Model (Mistral4Config model)
- mixtral — MixtralModel (MixtralConfig model)
- mlcd — MLCDVisionModel (MLCDVisionConfig model)
- mlcd_vision_model — MLCDVisionModel (MLCDVisionConfig model)
- mllama — MllamaModel (MllamaConfig model)
- mm-grounding-dino — MMGroundingDinoModel (MMGroundingDinoConfig model)
- mobilebert — MobileBertModel (MobileBertConfig model)
- mobilenet_v1 — MobileNetV1Model (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2Model (MobileNetV2Config model)
- mobilevit — MobileViTModel (MobileViTConfig model)
- mobilevitv2 — MobileViTV2Model (MobileViTV2Config model)
- modernbert — ModernBertModel (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderModel (ModernBertDecoderConfig model)
- modernvbert — ModernVBertModel (ModernVBertConfig model)
- moonshine — MoonshineModel (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingModel (MoonshineStreamingConfig model)
- moshi — MoshiModel (MoshiConfig model)
- mpnet — MPNetModel (MPNetConfig model)
- mpt — MptModel (MptConfig model)
- mra — MraModel (MraConfig model)
- mt5 — MT5Model (MT5Config model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- musicgen — MusicgenModel (MusicgenConfig model)
- musicgen_melody — MusicgenMelodyModel (MusicgenMelodyConfig model)
- mvp — MvpModel (MvpConfig model)
- nanochat — NanoChatModel (NanoChatConfig model)
- nemotron — NemotronModel (NemotronConfig model)
- nemotron_h — NemotronHModel (NemotronHConfig model)
- nllb-moe — NllbMoeModel (NllbMoeConfig model)
- nomic_bert — NomicBertModel (NomicBertConfig model)
- nystromformer — NystromformerModel (NystromformerConfig model)
- olmo — OlmoModel (OlmoConfig model)
- olmo2 — Olmo2Model (Olmo2Config model)
- olmo3 — Olmo3Model (Olmo3Config model)
- olmo_hybrid — OlmoHybridModel (OlmoHybridConfig model)
- olmoe — OlmoeModel (OlmoeConfig model)
- omdet-turbo — OmDetTurboForObjectDetection (OmDetTurboConfig model)
- oneformer — OneFormerModel (OneFormerConfig model)
- openai-gpt — OpenAIGPTModel (OpenAIGPTConfig model)
- opt — OPTModel (OPTConfig model)
- ovis2 — Ovis2Model (Ovis2Config model)
- owlv2 — Owlv2Model (Owlv2Config model)
- owlvit — OwlViTModel (OwlViTConfig model)
- paligemma — PaliGemmaModel (PaliGemmaConfig model)
- parakeet_ctc — ParakeetForCTC (ParakeetCTCConfig model)
- parakeet_encoder — ParakeetEncoder (ParakeetEncoderConfig model)
- patchtsmixer — PatchTSMixerModel (PatchTSMixerConfig model)
- patchtst — PatchTSTModel (PatchTSTConfig model)
- pe_audio — PeAudioModel (PeAudioConfig model)
- pe_audio_encoder — PeAudioEncoder (PeAudioEncoderConfig model)
- pe_audio_video — PeAudioVideoModel (PeAudioVideoConfig model)
- pe_audio_video_encoder — PeAudioVideoEncoder (PeAudioVideoEncoderConfig model)
- pe_video — PeVideoModel (PeVideoConfig model)
- pe_video_encoder — PeVideoEncoder (PeVideoEncoderConfig model)
- pegasus — PegasusModel (PegasusConfig model)
- pegasus_x — PegasusXModel (PegasusXConfig model)
- perceiver — PerceiverModel (PerceiverConfig model)
- perception_lm — PerceptionLMModel (PerceptionLMConfig model)
- persimmon — PersimmonModel (PersimmonConfig model)
- phi — PhiModel (PhiConfig model)
- phi3 — Phi3Model (Phi3Config model)
- phi4_multimodal — Phi4MultimodalModel (Phi4MultimodalConfig model)
- phimoe — PhimoeModel (PhimoeConfig model)
- pi0 — PI0Model (PI0Config model)
- pixio — PixioModel (PixioConfig model)
- pixtral — PixtralVisionModel (PixtralVisionConfig model)
- plbart — PLBartModel (PLBartConfig model)
- poolformer — PoolFormerModel (PoolFormerConfig model)
- pp_doclayout_v3 — PPDocLayoutV3Model (PPDocLayoutV3Config model)
- pp_ocrv5_mobile_rec — PPOCRV5MobileRecModel (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_rec — PPOCRV5ServerRecModel (PPOCRV5ServerRecConfig model)
- prophetnet — ProphetNetModel (ProphetNetConfig model)
- pvt — PvtModel (PvtConfig model)
- pvt_v2 — PvtV2Model (PvtV2Config model)
- qianfan_ocr — QianfanOCRModel (QianfanOCRConfig model)
- qianfan_ocr_vision — QianfanOCRVisionModel (QianfanOCRVisionConfig model)
- qwen2 — Qwen2Model (Qwen2Config model)
- qwen2_5_vl — Qwen2_5_VLModel (Qwen2_5_VLConfig model)
- qwen2_5_vl_text — Qwen2_5_VLTextModel (Qwen2_5_VLTextConfig model)
- qwen2_audio_encoder — Qwen2AudioEncoder (Qwen2AudioEncoderConfig model)
- qwen2_moe — Qwen2MoeModel (Qwen2MoeConfig model)
- qwen2_vl — Qwen2VLModel (Qwen2VLConfig model)
- qwen2_vl_text — Qwen2VLTextModel (Qwen2VLTextConfig model)
- qwen3 — Qwen3Model (Qwen3Config model)
- qwen3_5 — Qwen3_5Model (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeModel (Qwen3_5MoeConfig model)
- qwen3_5_moe_text — Qwen3_5MoeTextModel (Qwen3_5MoeTextConfig model)
- qwen3_5_text — Qwen3_5TextModel (Qwen3_5TextConfig model)
- qwen3_moe — Qwen3MoeModel (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextModel (Qwen3NextConfig model)
- qwen3_vl — Qwen3VLModel (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeModel (Qwen3VLMoeConfig model)
- qwen3_vl_moe_text — Qwen3VLMoeTextModel (Qwen3VLMoeTextConfig model)
- qwen3_vl_text — Qwen3VLTextModel (Qwen3VLTextConfig model)
- recurrent_gemma — RecurrentGemmaModel (RecurrentGemmaConfig model)
- reformer — ReformerModel (ReformerConfig model)
- regnet — RegNetModel (RegNetConfig model)
- rembert — RemBertModel (RemBertConfig model)
- resnet — ResNetModel (ResNetConfig model)
- roberta — RobertaModel (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormModel (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertModel (RoCBertConfig model)
- roformer — RoFormerModel (RoFormerConfig model)
- rt_detr — RTDetrModel (RTDetrConfig model)
- rt_detr_v2 — RTDetrV2Model (RTDetrV2Config model)
- rwkv — RwkvModel (RwkvConfig model)
- sam — SamModel (SamConfig model)
- sam2 — Sam2Model (Sam2Config model)
- sam2_hiera_det_model — Sam2HieraDetModel (Sam2HieraDetConfig model)
- sam2_video — Sam2VideoModel (Sam2VideoConfig model)
- sam2_vision_model — Sam2VisionModel (Sam2VisionConfig model)
- sam3 — Sam3Model (Sam3Config model)
- sam3_lite_text — Sam3LiteTextModel (Sam3LiteTextConfig model)
- sam3_lite_text_text_model — Sam3LiteTextTextModel (Sam3LiteTextTextConfig model)
- sam3_tracker — Sam3TrackerModel (Sam3TrackerConfig model)
- sam3_tracker_video — Sam3TrackerVideoModel (Sam3TrackerVideoConfig model)
- sam3_video — Sam3VideoModel (Sam3VideoConfig model)
- sam3_vision_model — Sam3VisionModel (Sam3VisionConfig model)
- sam3_vit_model — Sam3ViTModel (Sam3ViTConfig model)
- sam_hq — SamHQModel (SamHQConfig model)
- sam_hq_vision_model — SamHQVisionModel (SamHQVisionConfig model)
- sam_vision_model — SamVisionModel (SamVisionConfig model)
- seamless_m4t — SeamlessM4TModel (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2Model (SeamlessM4Tv2Config model)
- seed_oss — SeedOssModel (SeedOssConfig model)
- segformer — SegformerModel (SegformerConfig model)
- seggpt — SegGptModel (SegGptConfig model)
- sew — SEWModel (SEWConfig model)
- sew-d — SEWDModel (SEWDConfig model)
- siglip — SiglipModel (SiglipConfig model)
- siglip2 — Siglip2Model (Siglip2Config model)
- siglip2_vision_model — Siglip2VisionModel (Siglip2VisionConfig model)
- siglip_vision_model — SiglipVisionModel (SiglipVisionConfig model)
- smollm3 — SmolLM3Model (SmolLM3Config model)
- smolvlm — SmolVLMModel (SmolVLMConfig model)
- smolvlm_vision — SmolVLMVisionTransformer (SmolVLMVisionConfig model)
- solar_open — SolarOpenModel (SolarOpenConfig model)
- speech_to_text — Speech2TextModel (Speech2TextConfig model)
- speecht5 — SpeechT5Model (SpeechT5Config model)
- splinter — SplinterModel (SplinterConfig model)
- squeezebert — SqueezeBertModel (SqueezeBertConfig model)
- stablelm — StableLmModel (StableLmConfig model)
- starcoder2 — Starcoder2Model (Starcoder2Config model)
- swiftformer — SwiftFormerModel (SwiftFormerConfig model)
- swin — SwinModel (SwinConfig model)
- swin2sr — Swin2SRModel (Swin2SRConfig model)
- swinv2 — Swinv2Model (Swinv2Config model)
- switch_transformers — SwitchTransformersModel (SwitchTransformersConfig model)
- t5 — T5Model (T5Config model)
- t5gemma — T5GemmaModel (T5GemmaConfig model)
- t5gemma2 — T5Gemma2Model (T5Gemma2Config model)
- t5gemma2_encoder — T5Gemma2Encoder (T5Gemma2EncoderConfig model)
- table-transformer — TableTransformerModel (TableTransformerConfig model)
- tapas — TapasModel (TapasConfig model)
- textnet — TextNetModel (TextNetConfig model)
- time_series_transformer — TimeSeriesTransformerModel (TimeSeriesTransformerConfig model)
- timesfm — TimesFmModel (TimesFmConfig model)
- timesfm2_5 — TimesFm2_5Model (TimesFm2_5Config model)
- timesformer — TimesformerModel (TimesformerConfig model)
- timm_backbone — TimmBackbone (TimmBackboneConfig model)
- timm_wrapper — TimmWrapperModel (TimmWrapperConfig model)
- tvp — TvpModel (TvpConfig model)
- udop — UdopModel (UdopConfig model)
- umt5 — UMT5Model (UMT5Config model)
- unispeech — UniSpeechModel (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatModel (UniSpeechSatConfig model)
- univnet — UnivNetModel (UnivNetConfig model)
- uvdoc — UVDocModel (UVDocConfig model)
- vaultgemma — VaultGemmaModel (VaultGemmaConfig model)
- vibevoice_acoustic_tokenizer — VibeVoiceAcousticTokenizerModel (VibeVoiceAcousticTokenizerConfig model)
- vibevoice_acoustic_tokenizer_decoder — VibeVoiceAcousticTokenizerDecoderModel (VibeVoiceAcousticTokenizerDecoderConfig model)
- vibevoice_acoustic_tokenizer_encoder — VibeVoiceAcousticTokenizerEncoderModel (VibeVoiceAcousticTokenizerEncoderConfig model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- video_llama_3 — VideoLlama3Model (VideoLlama3Config model)
- video_llama_3_vision — VideoLlama3VisionModel (VideoLlama3VisionConfig model)
- video_llava — VideoLlavaModel (VideoLlavaConfig model)
- videomae — VideoMAEModel (VideoMAEConfig model)
- vilt — ViltModel (ViltConfig model)
- vipllava — VipLlavaModel (VipLlavaConfig model)
- vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoderConfig model)
- visual_bert — VisualBertModel (VisualBertConfig model)
- vit — ViTModel (ViTConfig model)
- vit_mae — ViTMAEModel (ViTMAEConfig model)
- vit_msn — ViTMSNModel (ViTMSNConfig model)
- vitdet — VitDetModel (VitDetConfig model)
- vits — VitsModel (VitsConfig model)
- vivit — VivitModel (VivitConfig model)
- vjepa2 — VJEPA2Model (VJEPA2Config model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_encoder — VoxtralEncoder (VoxtralEncoderConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- voxtral_realtime_encoder — VoxtralRealtimeEncoder (VoxtralRealtimeEncoderConfig model)
- voxtral_realtime_text — VoxtralRealtimeTextModel (VoxtralRealtimeTextConfig model)
- wav2vec2 — Wav2Vec2Model (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertModel (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2ConformerConfig model)
- wavlm — WavLMModel (WavLMConfig model)
- whisper — WhisperModel (WhisperConfig model)
- xclip — XCLIPModel (XCLIPConfig model)
- xcodec — XcodecModel (XcodecConfig model)
- xglm — XGLMModel (XGLMConfig model)
- xlm — XLMModel (XLMConfig model)
- xlm-roberta —
XLMRobertaModel(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaXLModel(XLMRobertaXLConfig model) - xlnet —
XLNetModel(XLNetConfig model) - xlstm —
xLSTMModel(xLSTMConfig model) - xmod —
XmodModel(XmodConfig model) - yolos —
YolosModel(YolosConfig model) - yoso —
YosoModel(YosoConfig model) - youtu —
YoutuModel(YoutuConfig model) - zamba —
ZambaModel(ZambaConfig model) - zamba2 —
Zamba2Model(Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
Generic pretraining classes
The following auto class is available for instantiating a model with a pretraining head.
AutoModelForPreTraining
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
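The restriction on direct construction can be verified offline; no checkpoint is involved, the class itself refuses instantiation:

```python
from transformers import AutoModelForPreTraining

# Calling the constructor directly raises an EnvironmentError that points
# you to from_pretrained() / from_config() instead.
try:
    AutoModelForPreTraining()
    raised = False
except EnvironmentError:
    raised = True
```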
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForPreTraining (AlbertConfig model)
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BertConfig configuration class: BertForPreTraining (BertConfig model)
- BigBirdConfig configuration class: BigBirdForPreTraining (BigBirdConfig model)
- BloomConfig configuration class: BloomForCausalLM (BloomConfig model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamembertConfig model)
- ColModernVBertConfig configuration class: ColModernVBertForRetrieval (ColModernVBertConfig model)
- ColPaliConfig configuration class: ColPaliForRetrieval (ColPaliConfig model)
- ColQwen2Config configuration class: ColQwen2ForRetrieval (ColQwen2Config model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForMaskedLM (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForPreTraining (ElectraConfig model)
- ErnieConfig configuration class: ErnieForPreTraining (ErnieConfig model)
- EvollaConfig configuration class: EvollaForProteinText2Text (EvollaConfig model)
- Exaone4Config configuration class: Exaone4ForCausalLM (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- FNetConfig configuration class: FNetForPreTraining (FNetConfig model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FSMTConfig model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMambaConfig model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlaubertConfig model)
- FlavaConfig configuration class: FlavaForPreTraining (FlavaConfig model)
- Florence2Config configuration class: Florence2ForConditionalGeneration (Florence2Config model)
- FunnelConfig configuration class: FunnelForPreTraining (FunnelConfig model)
- GPT2Config configuration class: GPT2LMHeadModel (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- HieraConfig configuration class: HieraForPreTraining (HieraConfig model)
- IBertConfig configuration class: IBertForMaskedLM (IBertConfig model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2Config model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3Config model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IdeficsConfig model)
- JanusConfig configuration class: JanusForConditionalGeneration (JanusConfig model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLMConfig model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- LongformerConfig configuration class: LongformerForMaskedLM (LongformerConfig model)
- LukeConfig configuration class: LukeForMaskedLM (LukeConfig model)
- LxmertConfig configuration class: LxmertForPreTraining (LxmertConfig model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNetConfig model)
- Mamba2Config configuration class: Mamba2ForCausalLM (Mamba2Config model)
- MambaConfig configuration class: MambaForCausalLM (MambaConfig model)
- MegatronBertConfig configuration class: MegatronBertForPreTraining (MegatronBertConfig model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3Config model)
- Mistral4Config configuration class: Mistral4ForCausalLM (Mistral4Config model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (MllamaConfig model)
- MobileBertConfig configuration class: MobileBertForPreTraining (MobileBertConfig model)
- MptConfig configuration class: MptForCausalLM (MptConfig model)
- MraConfig configuration class: MraForMaskedLM (MraConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NanoChatConfig configuration class: NanoChatForCausalLM (NanoChatConfig model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NllbMoeConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- RoCBertConfig configuration class: RoCBertForPreTraining (RoCBertConfig model)
- RobertaConfig configuration class: RobertaForMaskedLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvForCausalLM (RwkvConfig model)
- SplinterConfig configuration class: SplinterForPreTraining (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBertConfig model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- T5Config configuration class: T5ForConditionalGeneration (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5GemmaConfig model)
- TapasConfig configuration class: TapasForMaskedLM (TapasConfig model)
- UniSpeechConfig configuration class: UniSpeechForPreTraining (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForPreTraining (UniSpeechSatConfig model)
- ViTMAEConfig configuration class: ViTMAEForPreTraining (ViTMAEConfig model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- VideoMAEConfig configuration class: VideoMAEForPreTraining (VideoMAEConfig model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlavaConfig model)
- VisualBertConfig configuration class: VisualBertForPreTraining (VisualBertConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForPreTraining (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForPreTraining (Wav2Vec2ConformerConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNetConfig model)
- XmodConfig configuration class: XmodForMaskedLM (XmodConfig model)
- xLSTMConfig configuration class: xLSTMForCausalLM (xLSTMConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
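A minimal sketch of config-only instantiation, using a deliberately tiny BERT configuration (the sizes below are arbitrary illustration values) so nothing is downloaded and the weights stay randomly initialized:

```python
from transformers import AutoConfig, AutoModelForPreTraining

# Build a tiny BERT config locally (hypothetical sizes); from_config then
# constructs the architecture with random weights and no download.
config = AutoConfig.for_model(
    "bert",
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModelForPreTraining.from_config(config)
```

Because the config's model_type is "bert", the instantiated class is BertForPreTraining.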
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
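The kwargs routing can be checked offline with a tiny locally saved model (the sizes below are arbitrary illustration values): output_attentions matches a configuration attribute, so it updates the config rather than reaching the model's __init__.

```python
import tempfile

from transformers import AutoConfig, AutoModelForPreTraining

# Create and save a tiny model locally (hypothetical sizes), then reload it
# while overriding a config attribute through **kwargs.
config = AutoConfig.for_model(
    "bert",
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModelForPreTraining.from_config(config)
with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    # output_attentions is consumed by the configuration, not the model.
    reloaded = AutoModelForPreTraining.from_pretrained(tmp, output_attentions=True)
```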
Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForPreTraining (AlbertConfig model)
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- bart — BartForConditionalGeneration (BartConfig model)
- bert — BertForPreTraining (BertConfig model)
- big_bird — BigBirdForPreTraining (BigBirdConfig model)
- bloom — BloomForCausalLM (BloomConfig model)
- camembert — CamembertForMaskedLM (CamembertConfig model)
- colmodernvbert — ColModernVBertForRetrieval (ColModernVBertConfig model)
- colpali — ColPaliForRetrieval (ColPaliConfig model)
- colqwen2 — ColQwen2ForRetrieval (ColQwen2Config model)
- ctrl — CTRLLMHeadModel (CTRLConfig model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecTextConfig model)
- deberta — DebertaForMaskedLM (DebertaConfig model)
- deberta-v2 — DebertaV2ForMaskedLM (DebertaV2Config model)
- distilbert — DistilBertForMaskedLM (DistilBertConfig model)
- electra — ElectraForPreTraining (ElectraConfig model)
- ernie — ErnieForPreTraining (ErnieConfig model)
- evolla — EvollaForProteinText2Text (EvollaConfig model)
- exaone4 — Exaone4ForCausalLM (Exaone4Config model)
- exaone_moe — ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- falcon_mamba — FalconMambaForCausalLM (FalconMambaConfig model)
- flaubert — FlaubertWithLMHeadModel (FlaubertConfig model)
- flava — FlavaForPreTraining (FlavaConfig model)
- florence2 — Florence2ForConditionalGeneration (Florence2Config model)
- fnet — FNetForPreTraining (FNetConfig model)
- fsmt — FSMTForConditionalGeneration (FSMTConfig model)
- funnel — FunnelForPreTraining (FunnelConfig model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- gpt-sw3 — GPT2LMHeadModel (GPT2Config model)
- gpt2 — GPT2LMHeadModel (GPT2Config model)
- gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- hiera — HieraForPreTraining (HieraConfig model)
- ibert — IBertForMaskedLM (IBertConfig model)
- idefics — IdeficsForVisionText2Text (IdeficsConfig model)
- idefics2 — Idefics2ForConditionalGeneration (Idefics2Config model)
- idefics3 — Idefics3ForConditionalGeneration (Idefics3Config model)
- janus — JanusForConditionalGeneration (JanusConfig model)
- layoutlm — LayoutLMForMaskedLM (LayoutLMConfig model)
- llava — LlavaForConditionalGeneration (LlavaConfig model)
- llava_next — LlavaNextForConditionalGeneration (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- longformer — LongformerForMaskedLM (LongformerConfig model)
- luke — LukeForMaskedLM (LukeConfig model)
- lxmert — LxmertForPreTraining (LxmertConfig model)
- mamba — MambaForCausalLM (MambaConfig model)
- mamba2 — Mamba2ForCausalLM (Mamba2Config model)
- megatron-bert — MegatronBertForPreTraining (MegatronBertConfig model)
- mistral3 — Mistral3ForConditionalGeneration (Mistral3Config model)
- mistral4 — Mistral4ForCausalLM (Mistral4Config model)
- mllama — MllamaForConditionalGeneration (MllamaConfig model)
- mobilebert — MobileBertForPreTraining (MobileBertConfig model)
- mpnet — MPNetForMaskedLM (MPNetConfig model)
- mpt — MptForCausalLM (MptConfig model)
- mra — MraForMaskedLM (MraConfig model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- mvp — MvpForConditionalGeneration (MvpConfig model)
- nanochat — NanoChatForCausalLM (NanoChatConfig model)
- nllb-moe — NllbMoeForConditionalGeneration (NllbMoeConfig model)
- openai-gpt — OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- paligemma — PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- roberta — RobertaForMaskedLM (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForPreTraining (RoCBertConfig model)
- rwkv — RwkvForCausalLM (RwkvConfig model)
- splinter — SplinterForPreTraining (SplinterConfig model)
- squeezebert — SqueezeBertForMaskedLM (SqueezeBertConfig model)
- switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- t5 — T5ForConditionalGeneration (T5Config model)
- t5gemma — T5GemmaForConditionalGeneration (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- tapas — TapasForMaskedLM (TapasConfig model)
- unispeech — UniSpeechForPreTraining (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForPreTraining (UniSpeechSatConfig model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- video_llava — VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- videomae — VideoMAEForPreTraining (VideoMAEConfig model)
- vipllava — VipLlavaForConditionalGeneration (VipLlavaConfig model)
- visual_bert — VisualBertForPreTraining (VisualBertConfig model)
- vit_mae — ViTMAEForPreTraining (ViTMAEConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2ForPreTraining (Wav2Vec2Config model)
- wav2vec2-conformer — Wav2Vec2ConformerForPreTraining (Wav2Vec2ConformerConfig model)
- xlm — XLMWithLMHeadModel (XLMConfig model)
- xlm-roberta — XLMRobertaForMaskedLM (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- xlnet — XLNetLMHeadModel (XLNetConfig model)
- xlstm — xLSTMForCausalLM (xLSTMConfig model)
- xmod — XmodForMaskedLM (XmodConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
Natural Language Processing
The following auto classes are available for the following natural language processing tasks.
AutoModelForCausalLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AfmoeConfig configuration class: AfmoeForCausalLM (AfmoeConfig model)
- ApertusConfig configuration class: ApertusForCausalLM (ApertusConfig model)
- ArceeConfig configuration class: ArceeForCausalLM (ArceeConfig model)
- AriaTextConfig configuration class: AriaTextForCausalLM (AriaTextConfig model)
- BambaConfig configuration class: BambaForCausalLM (BambaConfig model)
- BartConfig configuration class: BartForCausalLM (BartConfig model)
- BertConfig configuration class: BertLMHeadModel (BertConfig model)
- BertGenerationConfig configuration class: BertGenerationDecoder (BertGenerationConfig model)
- BigBirdConfig configuration class: BigBirdForCausalLM (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptForCausalLM (BioGptConfig model)
- BitNetConfig configuration class: BitNetForCausalLM (BitNetConfig model)
- BlenderbotConfig configuration class: BlenderbotForCausalLM (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmallConfig model)
- BloomConfig configuration class: BloomForCausalLM (BloomConfig model)
- BltConfig configuration class: BltForCausalLM (BltConfig model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertForCausalLM (CamembertConfig model)
- CodeGenConfig configuration class: CodeGenForCausalLM (CodeGenConfig model)
- Cohere2Config configuration class: Cohere2ForCausalLM (Cohere2Config model)
- CohereConfig configuration class: CohereForCausalLM (CohereConfig model)
- CpmAntConfig configuration class: CpmAntForCausalLM (CpmAntConfig model)
- CwmConfig configuration class: CwmForCausalLM (CwmConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecTextConfig model)
- DbrxConfig configuration class: DbrxForCausalLM (DbrxConfig model)
- DeepseekV2Config configuration class: DeepseekV2ForCausalLM (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForCausalLM (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForCausalLM (DiffLlamaConfig model)
- DogeConfig configuration class: DogeForCausalLM (DogeConfig model)
- Dots1Config configuration class: Dots1ForCausalLM (Dots1Config model)
- ElectraConfig configuration class: ElectraForCausalLM (ElectraConfig model)
- Emu3Config configuration class: Emu3ForCausalLM (Emu3Config model)
- Ernie4_5Config configuration class: Ernie4_5ForCausalLM (Ernie4_5Config model)
- Ernie4_5_MoeConfig configuration class: Ernie4_5_MoeForCausalLM (Ernie4_5_MoeConfig model)
- ErnieConfig configuration class: ErnieForCausalLM (ErnieConfig model)
- Exaone4Config configuration class: Exaone4ForCausalLM (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- FalconConfig configuration class: FalconForCausalLM (FalconConfig model)
- FalconH1Config configuration class: FalconH1ForCausalLM (FalconH1Config model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMambaConfig model)
- FlexOlmoConfig configuration class: FlexOlmoForCausalLM (FlexOlmoConfig model)
- FuyuConfig configuration class: FuyuForCausalLM (FuyuConfig model)
- GPT2Config configuration class: GPT2LMHeadModel (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJForCausalLM (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForCausalLM (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPTNeoXConfig model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseForCausalLM (GPTNeoXJapaneseConfig model)
- Gemma2Config configuration class: Gemma2ForCausalLM (Gemma2Config model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3ForCausalLM (Gemma3TextConfig model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nConfig model)
- Gemma3nTextConfig configuration class: Gemma3nForCausalLM (Gemma3nTextConfig model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- Gemma4TextConfig configuration class: Gemma4ForCausalLM (Gemma4TextConfig model)
- GemmaConfig configuration class: GemmaForCausalLM (GemmaConfig model)
- GitConfig configuration class: GitForCausalLM (GitConfig model)
- Glm4Config configuration class: Glm4ForCausalLM (Glm4Config model)
- Glm4MoeConfig configuration class: Glm4MoeForCausalLM (Glm4MoeConfig model)
- Glm4MoeLiteConfig configuration class: Glm4MoeLiteForCausalLM (Glm4MoeLiteConfig model)
- GlmConfig configuration class: GlmForCausalLM (GlmConfig model)
- GlmMoeDsaConfig configuration class: GlmMoeDsaForCausalLM (GlmMoeDsaConfig model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GotOcr2Config model)
- GptOssConfig configuration class: GptOssForCausalLM (GptOssConfig model)
- GraniteConfig configuration class: GraniteForCausalLM (GraniteConfig model)
- GraniteMoeConfig configuration class: GraniteMoeForCausalLM (GraniteMoeConfig model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridForCausalLM (GraniteMoeHybridConfig model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedForCausalLM (GraniteMoeSharedConfig model)
- HeliumConfig configuration class: HeliumForCausalLM (HeliumConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1ForCausalLM (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1ForCausalLM (HunYuanMoEV1Config model)
- Jais2Config configuration class: Jais2ForCausalLM (Jais2Config model)
- JambaConfig configuration class: JambaForCausalLM (JambaConfig model)
- JetMoeConfig configuration class: JetMoeForCausalLM (JetMoeConfig model)
- Lfm2Config configuration class: Lfm2ForCausalLM (Lfm2Config model)
- Lfm2MoeConfig configuration class: Lfm2MoeForCausalLM (Lfm2MoeConfig model)
- Llama4Config configuration class: Llama4ForCausalLM (Llama4Config model)
- Llama4TextConfig configuration class: Llama4ForCausalLM (Llama4TextConfig model)
- LlamaConfig configuration class: LlamaForCausalLM (LlamaConfig model)
- LongcatFlashConfig configuration class: LongcatFlashForCausalLM (LongcatFlashConfig model)
- MBartConfig configuration class: MBartForCausalLM (MBartConfig model)
- Mamba2Config configuration class: Mamba2ForCausalLM (Mamba2Config model)
- MambaConfig configuration class: MambaForCausalLM (MambaConfig model)
- MarianConfig configuration class: MarianForCausalLM (MarianConfig model)
- MegatronBertConfig configuration class: MegatronBertForCausalLM (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForCausalLM (MiniMaxConfig model)
- MiniMaxM2Config configuration class: MiniMaxM2ForCausalLM (MiniMaxM2Config model)
- Ministral3Config configuration class: Ministral3ForCausalLM (Ministral3Config model)
- MinistralConfig configuration class: MinistralForCausalLM (MinistralConfig model)
- MistralConfig configuration class: MistralForCausalLM (MistralConfig model)
- MixtralConfig configuration class: MixtralForCausalLM (MixtralConfig model)
- MllamaConfig configuration class: MllamaForCausalLM (MllamaConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderForCausalLM (ModernBertDecoderConfig model)
- MoshiConfig configuration class: MoshiForCausalLM (MoshiConfig model)
- MptConfig configuration class: MptForCausalLM (MptConfig model)
- MusicgenConfig configuration class: MusicgenForCausalLM (MusicgenConfig model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyForCausalLM (MusicgenMelodyConfig model)
- MvpConfig configuration class: MvpForCausalLM (MvpConfig model)
- NanoChatConfig configuration class: NanoChatForCausalLM (NanoChatConfig model)
- NemotronConfig configuration class: NemotronForCausalLM (NemotronConfig model)
- NemotronHConfig configuration class: NemotronHForCausalLM (NemotronHConfig model)
- OPTConfig configuration class: OPTForCausalLM (OPTConfig model)
- Olmo2Config configuration class: Olmo2ForCausalLM (Olmo2Config model)
- Olmo3Config configuration class: Olmo3ForCausalLM (Olmo3Config model)
- OlmoConfig configuration class: OlmoForCausalLM (OlmoConfig model)
- OlmoHybridConfig configuration class: OlmoHybridForCausalLM (OlmoHybridConfig model)
- OlmoeConfig configuration class: OlmoeForCausalLM (OlmoeConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- PLBartConfig configuration class: PLBartForCausalLM (PLBartConfig model)
- PegasusConfig configuration class: PegasusForCausalLM (PegasusConfig model)
- PersimmonConfig configuration class: PersimmonForCausalLM (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForCausalLM (Phi3Config model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalForCausalLM (Phi4MultimodalConfig model)
- PhiConfig configuration class: PhiForCausalLM (PhiConfig model)
- PhimoeConfig configuration class: PhimoeForCausalLM (PhimoeConfig model)
- ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNetConfig model)
- Qwen2Config configuration class: Qwen2ForCausalLM (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForCausalLM (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForCausalLM (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForCausalLM (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForCausalLM (Qwen3NextConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForCausalLM (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeForCausalLM (Qwen3_5MoeConfig model)
- Qwen3_5MoeTextConfig configuration class: Qwen3_5MoeForCausalLM (Qwen3_5MoeTextConfig model)
- Qwen3_5TextConfig configuration class: Qwen3_5ForCausalLM (Qwen3_5TextConfig model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaForCausalLM (RecurrentGemmaConfig model)
- ReformerConfig configuration class: ReformerModelWithLMHead (ReformerConfig model)
- RemBertConfig configuration class: RemBertForCausalLM (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForCausalLM (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForCausalLM (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForCausalLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForCausalLM (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvForCausalLM (RwkvConfig model)
- SeedOssConfig configuration class: SeedOssForCausalLM (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForCausalLM (SmolLM3Config model)
- SolarOpenConfig configuration class: SolarOpenForCausalLM (SolarOpenConfig model)
- StableLmConfig configuration class: StableLmForCausalLM (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForCausalLM (Starcoder2Config model)
- TrOCRConfig configuration class: TrOCRForCausalLM (TrOCRConfig model)
- VaultGemmaConfig configuration class: VaultGemmaForCausalLM (VaultGemmaConfig model)
- WhisperConfig configuration class: WhisperForCausalLM (WhisperConfig model)
- XGLMConfig configuration class: XGLMForCausalLM (XGLMConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNetConfig model)
- XmodConfig configuration class: XmodForCausalLM (XmodConfig model)
- YoutuConfig configuration class: YoutuForCausalLM (YoutuConfig model)
- Zamba2Config configuration class: Zamba2ForCausalLM (Zamba2Config model)
- ZambaConfig configuration class: ZambaForCausalLM (ZambaConfig model)
- xLSTMConfig configuration class: xLSTMForCausalLM (xLSTMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
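As a minimal sketch of the configuration-only path, the snippet below builds a randomly initialized causal LM from a config object, so no weights are downloaded; the tiny BertConfig hyperparameter values here are illustrative only, not recommended settings:

```python
from transformers import AutoModelForCausalLM, BertConfig

# Deliberately tiny, illustrative config; from_config never loads pretrained weights.
config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    is_decoder=True,  # the causal LM head expects a decoder-style config
)

# Dispatch happens on the config class: BertConfig -> BertLMHeadModel.
model = AutoModelForCausalLM.from_config(config)
print(type(model).__name__)  # BertLMHeadModel
```

To get the corresponding pretrained weights instead, use from_pretrained() with the model id or path.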
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- afmoe —
AfmoeForCausalLM(AfmoeConfig model) - apertus —
ApertusForCausalLM(ApertusConfig model) - arcee —
ArceeForCausalLM(ArceeConfig model) - aria_text —
AriaTextForCausalLM(AriaTextConfig model) - bamba —
BambaForCausalLM(BambaConfig model) - bart — BartForCausalLM (BartConfig model)
- bert — BertLMHeadModel (BertConfig model)
- bert-generation — BertGenerationDecoder (BertGenerationConfig model)
- big_bird — BigBirdForCausalLM (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForCausalLM (BigBirdPegasusConfig model)
- biogpt — BioGptForCausalLM (BioGptConfig model)
- bitnet —
BitNetForCausalLM(BitNetConfig model) - blenderbot — BlenderbotForCausalLM (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmallConfig model)
- bloom — BloomForCausalLM (BloomConfig model)
- blt —
BltForCausalLM(BltConfig model) - camembert — CamembertForCausalLM (CamembertConfig model)
- codegen — CodeGenForCausalLM (CodeGenConfig model)
- cohere —
CohereForCausalLM(CohereConfig model) - cohere2 —
Cohere2ForCausalLM(Cohere2Config model) - cpmant — CpmAntForCausalLM (CpmAntConfig model)
- ctrl — CTRLLMHeadModel (CTRLConfig model)
- cwm —
CwmForCausalLM(CwmConfig model) - data2vec-text — Data2VecTextForCausalLM (Data2VecTextConfig model)
- dbrx —
DbrxForCausalLM(DbrxConfig model) - deepseek_v2 —
DeepseekV2ForCausalLM(DeepseekV2Config model) - deepseek_v3 —
DeepseekV3ForCausalLM(DeepseekV3Config model) - diffllama —
DiffLlamaForCausalLM(DiffLlamaConfig model) - doge —
DogeForCausalLM(DogeConfig model) - dots1 —
Dots1ForCausalLM(Dots1Config model) - electra —
ElectraForCausalLM(ElectraConfig model) - emu3 —
Emu3ForCausalLM(Emu3Config model) - ernie —
ErnieForCausalLM(ErnieConfig model) - ernie4_5 —
Ernie4_5ForCausalLM(Ernie4_5Config model) - ernie4_5_moe —
Ernie4_5_MoeForCausalLM(Ernie4_5_MoeConfig model) - exaone4 —
Exaone4ForCausalLM(Exaone4Config model) - exaone_moe —
ExaoneMoeForCausalLM(ExaoneMoeConfig model) - falcon —
FalconForCausalLM(FalconConfig model) - falcon_h1 —
FalconH1ForCausalLM(FalconH1Config model) - falcon_mamba —
FalconMambaForCausalLM(FalconMambaConfig model) - flex_olmo —
FlexOlmoForCausalLM(FlexOlmoConfig model) - fuyu —
FuyuForCausalLM(FuyuConfig model) - gemma —
GemmaForCausalLM(GemmaConfig model) - gemma2 —
Gemma2ForCausalLM(Gemma2Config model) - gemma3 —
Gemma3ForConditionalGeneration(Gemma3Config model) - gemma3_text —
Gemma3ForCausalLM(Gemma3TextConfig model) - gemma3n —
Gemma3nForConditionalGeneration(Gemma3nConfig model) - gemma3n_text —
Gemma3nForCausalLM(Gemma3nTextConfig model) - gemma4 —
Gemma4ForConditionalGeneration(Gemma4Config model) - gemma4_text —
Gemma4ForCausalLM(Gemma4TextConfig model) - git —
GitForCausalLM(GitConfig model) - glm —
GlmForCausalLM(GlmConfig model) - glm4 —
Glm4ForCausalLM(Glm4Config model) - glm4_moe —
Glm4MoeForCausalLM(Glm4MoeConfig model) - glm4_moe_lite —
Glm4MoeLiteForCausalLM(Glm4MoeLiteConfig model) - glm_moe_dsa —
GlmMoeDsaForCausalLM(GlmMoeDsaConfig model) - got_ocr2 —
GotOcr2ForConditionalGeneration(GotOcr2Config model) - gpt-sw3 —
GPT2LMHeadModel(GPT2Config model) - gpt2 —
GPT2LMHeadModel(GPT2Config model) - gpt_bigcode —
GPTBigCodeForCausalLM(GPTBigCodeConfig model) - gpt_neo —
GPTNeoForCausalLM(GPTNeoConfig model) - gpt_neox —
GPTNeoXForCausalLM(GPTNeoXConfig model) - gpt_neox_japanese —
GPTNeoXJapaneseForCausalLM(GPTNeoXJapaneseConfig model) - gpt_oss —
GptOssForCausalLM(GptOssConfig model) - gptj —
GPTJForCausalLM(GPTJConfig model) - granite —
GraniteForCausalLM(GraniteConfig model) - granitemoe —
GraniteMoeForCausalLM(GraniteMoeConfig model) - granitemoehybrid —
GraniteMoeHybridForCausalLM(GraniteMoeHybridConfig model) - granitemoeshared —
GraniteMoeSharedForCausalLM(GraniteMoeSharedConfig model) - helium —
HeliumForCausalLM(HeliumConfig model) - hunyuan_v1_dense —
HunYuanDenseV1ForCausalLM(HunYuanDenseV1Config model) - hunyuan_v1_moe —
HunYuanMoEV1ForCausalLM(HunYuanMoEV1Config model) - jais2 —
Jais2ForCausalLM(Jais2Config model) - jamba —
JambaForCausalLM(JambaConfig model) - jetmoe —
JetMoeForCausalLM(JetMoeConfig model) - lfm2 —
Lfm2ForCausalLM(Lfm2Config model) - lfm2_moe —
Lfm2MoeForCausalLM(Lfm2MoeConfig model) - llama —
LlamaForCausalLM(LlamaConfig model) - llama4 —
Llama4ForCausalLM(Llama4Config model) - llama4_text —
Llama4ForCausalLM(Llama4TextConfig model) - longcat_flash —
LongcatFlashForCausalLM(LongcatFlashConfig model) - mamba —
MambaForCausalLM(MambaConfig model) - mamba2 —
Mamba2ForCausalLM(Mamba2Config model) - marian —
MarianForCausalLM(MarianConfig model) - mbart —
MBartForCausalLM(MBartConfig model) - megatron-bert —
MegatronBertForCausalLM(MegatronBertConfig model) - minimax —
MiniMaxForCausalLM(MiniMaxConfig model) - minimax_m2 —
MiniMaxM2ForCausalLM(MiniMaxM2Config model) - ministral —
MinistralForCausalLM(MinistralConfig model) - ministral3 —
Ministral3ForCausalLM(Ministral3Config model) - mistral —
MistralForCausalLM(MistralConfig model) - mixtral —
MixtralForCausalLM(MixtralConfig model) - mllama —
MllamaForCausalLM(MllamaConfig model) - modernbert-decoder —
ModernBertDecoderForCausalLM(ModernBertDecoderConfig model) - moshi —
MoshiForCausalLM(MoshiConfig model) - mpt —
MptForCausalLM(MptConfig model) - musicgen —
MusicgenForCausalLM(MusicgenConfig model) - musicgen_melody —
MusicgenMelodyForCausalLM(MusicgenMelodyConfig model) - mvp —
MvpForCausalLM(MvpConfig model) - nanochat —
NanoChatForCausalLM(NanoChatConfig model) - nemotron —
NemotronForCausalLM(NemotronConfig model) - nemotron_h —
NemotronHForCausalLM(NemotronHConfig model) - olmo —
OlmoForCausalLM(OlmoConfig model) - olmo2 —
Olmo2ForCausalLM(Olmo2Config model) - olmo3 —
Olmo3ForCausalLM(Olmo3Config model) - olmo_hybrid —
OlmoHybridForCausalLM(OlmoHybridConfig model) - olmoe —
OlmoeForCausalLM(OlmoeConfig model) - openai-gpt —
OpenAIGPTLMHeadModel(OpenAIGPTConfig model) - opt —
OPTForCausalLM(OPTConfig model) - pegasus —
PegasusForCausalLM(PegasusConfig model) - persimmon —
PersimmonForCausalLM(PersimmonConfig model) - phi —
PhiForCausalLM(PhiConfig model) - phi3 —
Phi3ForCausalLM(Phi3Config model) - phi4_multimodal —
Phi4MultimodalForCausalLM(Phi4MultimodalConfig model) - phimoe —
PhimoeForCausalLM(PhimoeConfig model) - plbart —
PLBartForCausalLM(PLBartConfig model) - prophetnet —
ProphetNetForCausalLM(ProphetNetConfig model) - qwen2 —
Qwen2ForCausalLM(Qwen2Config model) - qwen2_moe —
Qwen2MoeForCausalLM(Qwen2MoeConfig model) - qwen3 —
Qwen3ForCausalLM(Qwen3Config model) - qwen3_5 —
Qwen3_5ForCausalLM(Qwen3_5Config model) - qwen3_5_moe —
Qwen3_5MoeForCausalLM(Qwen3_5MoeConfig model) - qwen3_5_moe_text —
Qwen3_5MoeForCausalLM(Qwen3_5MoeTextConfig model) - qwen3_5_text —
Qwen3_5ForCausalLM(Qwen3_5TextConfig model) - qwen3_moe —
Qwen3MoeForCausalLM(Qwen3MoeConfig model) - qwen3_next —
Qwen3NextForCausalLM(Qwen3NextConfig model) - recurrent_gemma —
RecurrentGemmaForCausalLM(RecurrentGemmaConfig model) - reformer —
ReformerModelWithLMHead(ReformerConfig model) - rembert —
RemBertForCausalLM(RemBertConfig model) - roberta —
RobertaForCausalLM(RobertaConfig model) - roberta-prelayernorm —
RobertaPreLayerNormForCausalLM(RobertaPreLayerNormConfig model) - roc_bert —
RoCBertForCausalLM(RoCBertConfig model) - roformer —
RoFormerForCausalLM(RoFormerConfig model) - rwkv —
RwkvForCausalLM(RwkvConfig model) - seed_oss —
SeedOssForCausalLM(SeedOssConfig model) - smollm3 —
SmolLM3ForCausalLM(SmolLM3Config model) - solar_open —
SolarOpenForCausalLM(SolarOpenConfig model) - stablelm —
StableLmForCausalLM(StableLmConfig model) - starcoder2 —
Starcoder2ForCausalLM(Starcoder2Config model) - trocr —
TrOCRForCausalLM(TrOCRConfig model) - vaultgemma —
VaultGemmaForCausalLM(VaultGemmaConfig model) - whisper —
WhisperForCausalLM(WhisperConfig model) - xglm —
XGLMForCausalLM(XGLMConfig model) - xlm —
XLMWithLMHeadModel(XLMConfig model) - xlm-roberta —
XLMRobertaForCausalLM(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaXLForCausalLM(XLMRobertaXLConfig model) - xlnet —
XLNetLMHeadModel(XLNetConfig model) - xlstm —
xLSTMForCausalLM(xLSTMConfig model) - xmod —
XmodForCausalLM(XmodConfig model) - youtu —
YoutuForCausalLM(YoutuConfig model) - zamba —
ZambaForCausalLM(ZambaConfig model) - zamba2 —
Zamba2ForCausalLM(Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForMaskedLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMaskedLM (AlbertConfig model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BertConfig configuration class: BertForMaskedLM (BertConfig model)
- BigBirdConfig configuration class: BigBirdForMaskedLM (BigBirdConfig model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamembertConfig model)
- ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForMaskedLM (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForMaskedLM (ElectraConfig model)
- ErnieConfig configuration class: ErnieForMaskedLM (ErnieConfig model)
- EsmConfig configuration class: EsmForMaskedLM (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForMaskedLM (EuroBertConfig model)
- FNetConfig configuration class: FNetForMaskedLM (FNetConfig model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForMaskedLM (FunnelConfig model)
- IBertConfig configuration class: IBertForMaskedLM (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForMaskedLM (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLMConfig model)
- LongformerConfig configuration class: LongformerForMaskedLM (LongformerConfig model)
- LukeConfig configuration class: LukeForMaskedLM (LukeConfig model)
- MBartConfig configuration class: MBartForConditionalGeneration (MBartConfig model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForMaskedLM (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForMaskedLM (ModernBertConfig model)
- ModernVBertConfig configuration class: ModernVBertForMaskedLM (ModernVBertConfig model)
- MraConfig configuration class: MraForMaskedLM (MraConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NomicBertConfig configuration class: NomicBertForMaskedLM (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForMaskedLM (NystromformerConfig model)
- PerceiverConfig configuration class: PerceiverForMaskedLM (PerceiverConfig model)
- ReformerConfig configuration class: ReformerForMaskedLM (ReformerConfig model)
- RemBertConfig configuration class: RemBertForMaskedLM (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForMaskedLM (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForMaskedLM (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForMaskedLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBertConfig model)
- TapasConfig configuration class: TapasForMaskedLM (TapasConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- XmodConfig configuration class: XmodForMaskedLM (XmodConfig model)
- YosoConfig configuration class: YosoForMaskedLM (YosoConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
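As a minimal sketch, the snippet below instantiates a randomly initialized masked LM directly from a config object, without downloading any weights; the tiny BertConfig values here are illustrative only:

```python
from transformers import AutoModelForMaskedLM, BertConfig

# Deliberately tiny, illustrative config; from_config never loads pretrained weights.
config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)

# Dispatch happens on the config class: BertConfig -> BertForMaskedLM.
model = AutoModelForMaskedLM.from_config(config)
print(type(model).__name__)  # BertForMaskedLM
```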
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMaskedLM (AlbertConfig model)
- bart — BartForConditionalGeneration (BartConfig model)
- bert — BertForMaskedLM (BertConfig model)
- big_bird — BigBirdForMaskedLM (BigBirdConfig model)
- camembert — CamembertForMaskedLM (CamembertConfig model)
- convbert — ConvBertForMaskedLM (ConvBertConfig model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecTextConfig model)
- deberta — DebertaForMaskedLM (DebertaConfig model)
- deberta-v2 — DebertaV2ForMaskedLM (DebertaV2Config model)
- distilbert —
DistilBertForMaskedLM(DistilBertConfig model) - electra —
ElectraForMaskedLM(ElectraConfig model) - ernie —
ErnieForMaskedLM(ErnieConfig model) - esm —
EsmForMaskedLM(EsmConfig model) - eurobert —
EuroBertForMaskedLM(EuroBertConfig model) - flaubert —
FlaubertWithLMHeadModel(FlaubertConfig model) - fnet —
FNetForMaskedLM(FNetConfig model) - funnel —
FunnelForMaskedLM(FunnelConfig model) - ibert —
IBertForMaskedLM(IBertConfig model) - jina_embeddings_v3 —
JinaEmbeddingsV3ForMaskedLM(JinaEmbeddingsV3Config model) - layoutlm —
LayoutLMForMaskedLM(LayoutLMConfig model) - longformer —
LongformerForMaskedLM(LongformerConfig model) - luke —
LukeForMaskedLM(LukeConfig model) - mbart —
MBartForConditionalGeneration(MBartConfig model) - megatron-bert —
MegatronBertForMaskedLM(MegatronBertConfig model) - mobilebert —
MobileBertForMaskedLM(MobileBertConfig model) - modernbert —
ModernBertForMaskedLM(ModernBertConfig model) - modernvbert —
ModernVBertForMaskedLM(ModernVBertConfig model) - mpnet —
MPNetForMaskedLM(MPNetConfig model) - mra —
MraForMaskedLM(MraConfig model) - mvp —
MvpForConditionalGeneration(MvpConfig model) - nomic_bert —
NomicBertForMaskedLM(NomicBertConfig model) - nystromformer —
NystromformerForMaskedLM(NystromformerConfig model) - perceiver —
PerceiverForMaskedLM(PerceiverConfig model) - reformer —
ReformerForMaskedLM(ReformerConfig model) - rembert —
RemBertForMaskedLM(RemBertConfig model) - roberta —
RobertaForMaskedLM(RobertaConfig model) - roberta-prelayernorm —
RobertaPreLayerNormForMaskedLM(RobertaPreLayerNormConfig model) - roc_bert —
RoCBertForMaskedLM(RoCBertConfig model) - roformer —
RoFormerForMaskedLM(RoFormerConfig model) - squeezebert —
SqueezeBertForMaskedLM(SqueezeBertConfig model) - tapas —
TapasForMaskedLM(TapasConfig model) - xlm —
XLMWithLMHeadModel(XLMConfig model) - xlm-roberta —
XLMRobertaForMaskedLM(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaXLForMaskedLM(XLMRobertaXLConfig model) - xmod —
XmodForMaskedLM(XmodConfig model) - yoso —
YosoForMaskedLM(YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForMaskGeneration
AutoModelForSeq2SeqLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForConditionalGeneration (BigBirdPegasusConfig model)
- BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmallConfig model)
- EncoderDecoderConfig configuration class: EncoderDecoderModel (EncoderDecoderConfig model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FSMTConfig model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- LEDConfig configuration class: LEDForConditionalGeneration (LEDConfig model)
- LongT5Config configuration class: LongT5ForConditionalGeneration (LongT5Config model)
- M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100Config model)
- MBartConfig configuration class: MBartForConditionalGeneration (MBartConfig model)
- MT5Config configuration class: MT5ForConditionalGeneration (MT5Config model)
- MarianConfig configuration class: MarianMTModel (MarianConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NllbMoeConfig model)
- PLBartConfig configuration class: PLBartForConditionalGeneration (PLBartConfig model)
- PegasusConfig configuration class: PegasusForConditionalGeneration (PegasusConfig model)
- PegasusXConfig configuration class: PegasusXForConditionalGeneration (PegasusXConfig model)
- ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNetConfig model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- SeamlessM4TConfig configuration class: SeamlessM4TForTextToText (SeamlessM4TConfig model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForTextToText (SeamlessM4Tv2Config model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- T5Config configuration class: T5ForConditionalGeneration (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5GemmaConfig model)
- UMT5Config configuration class: UMT5ForConditionalGeneration (UMT5Config model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
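As a minimal sketch, the snippet below builds a randomly initialized sequence-to-sequence model from a config object, so no weights are downloaded; the tiny T5Config values here are illustrative only:

```python
from transformers import AutoModelForSeq2SeqLM, T5Config

# Deliberately tiny, illustrative config; from_config never loads pretrained weights.
config = T5Config(
    vocab_size=128,
    d_model=32,
    d_kv=8,
    d_ff=64,
    num_layers=2,
    num_heads=2,
)

# Dispatch happens on the config class: T5Config -> T5ForConditionalGeneration.
model = AutoModelForSeq2SeqLM.from_config(config)
print(type(model).__name__)  # T5ForConditionalGeneration
```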
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- bart — BartForConditionalGeneration (BartConfig model)
- bigbird_pegasus — BigBirdPegasusForConditionalGeneration (BigBirdPegasusConfig model)
- blenderbot — BlenderbotForConditionalGeneration (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallForConditionalGeneration (BlenderbotSmallConfig model)
- encoder-decoder — EncoderDecoderModel (EncoderDecoderConfig model)
- fsmt — FSMTForConditionalGeneration (FSMTConfig model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- led — LEDForConditionalGeneration (LEDConfig model)
- longt5 — LongT5ForConditionalGeneration (LongT5Config model)
- m2m_100 — M2M100ForConditionalGeneration (M2M100Config model)
- marian — MarianMTModel (MarianConfig model)
- mbart — MBartForConditionalGeneration (MBartConfig model)
- mt5 — MT5ForConditionalGeneration (MT5Config model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- mvp — MvpForConditionalGeneration (MvpConfig model)
- nllb-moe — NllbMoeForConditionalGeneration (NllbMoeConfig model)
- pegasus — PegasusForConditionalGeneration (PegasusConfig model)
- pegasus_x — PegasusXForConditionalGeneration (PegasusXConfig model)
- plbart — PLBartForConditionalGeneration (PLBartConfig model)
- prophetnet — ProphetNetForConditionalGeneration (ProphetNetConfig model)
- qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- seamless_m4t — SeamlessM4TForTextToText (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2ForTextToText (SeamlessM4Tv2Config model)
- switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- t5 — T5ForConditionalGeneration (T5Config model)
- t5gemma — T5GemmaForConditionalGeneration (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- umt5 — UMT5ForConditionalGeneration (UMT5Config model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
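The evaluation-mode default can be checked offline by round-tripping a tiny, randomly initialized model through save_pretrained(); the T5Config hyperparameters below are arbitrary:

```python
import tempfile

from transformers import AutoModelForSeq2SeqLM, T5Config, T5ForConditionalGeneration

config = T5Config(
    vocab_size=1000, d_model=64, d_kv=16, d_ff=128, num_layers=2, num_heads=4
)

with tempfile.TemporaryDirectory() as save_dir:
    # Save an untrained model locally, then reload it through the auto class.
    T5ForConditionalGeneration(config).save_pretrained(save_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(save_dir)

# from_pretrained() leaves the model in evaluation mode...
assert not model.training
# ...and model.train() switches it back to training mode.
model.train()
assert model.training
```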
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForSequenceClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForSequenceClassification (AlbertConfig model)
- ArceeConfig configuration class: ArceeForSequenceClassification (ArceeConfig model)
- BartConfig configuration class: BartForSequenceClassification (BartConfig model)
- BertConfig configuration class: BertForSequenceClassification (BertConfig model)
- BigBirdConfig configuration class: BigBirdForSequenceClassification (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForSequenceClassification (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptForSequenceClassification (BioGptConfig model)
- BloomConfig configuration class: BloomForSequenceClassification (BloomConfig model)
- CTRLConfig configuration class: CTRLForSequenceClassification (CTRLConfig model)
- CamembertConfig configuration class: CamembertForSequenceClassification (CamembertConfig model)
- CanineConfig configuration class: CanineForSequenceClassification (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForSequenceClassification (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForSequenceClassification (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DebertaV2Config model)
- DeepseekV2Config configuration class: DeepseekV2ForSequenceClassification (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForSequenceClassification (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForSequenceClassification (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBertConfig model)
- DogeConfig configuration class: DogeForSequenceClassification (DogeConfig model)
- ElectraConfig configuration class: ElectraForSequenceClassification (ElectraConfig model)
- ErnieConfig configuration class: ErnieForSequenceClassification (ErnieConfig model)
- EsmConfig configuration class: EsmForSequenceClassification (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForSequenceClassification (EuroBertConfig model)
- Exaone4Config configuration class: Exaone4ForSequenceClassification (Exaone4Config model)
- FNetConfig configuration class: FNetForSequenceClassification (FNetConfig model)
- FalconConfig configuration class: FalconForSequenceClassification (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForSequenceClassification (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForSequenceClassification (FunnelConfig model)
- GPT2Config configuration class: GPT2ForSequenceClassification (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForSequenceClassification (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJForSequenceClassification (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForSequenceClassification (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForSequenceClassification (GPTNeoXConfig model)
- Gemma2Config configuration class: Gemma2ForSequenceClassification (Gemma2Config model)
- Gemma3Config configuration class: Gemma3ForSequenceClassification (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3TextForSequenceClassification (Gemma3TextConfig model)
- GemmaConfig configuration class: GemmaForSequenceClassification (GemmaConfig model)
- Glm4Config configuration class: Glm4ForSequenceClassification (Glm4Config model)
- GlmConfig configuration class: GlmForSequenceClassification (GlmConfig model)
- GptOssConfig configuration class: GptOssForSequenceClassification (GptOssConfig model)
- HeliumConfig configuration class: HeliumForSequenceClassification (HeliumConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1ForSequenceClassification (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1ForSequenceClassification (HunYuanMoEV1Config model)
- IBertConfig configuration class: IBertForSequenceClassification (IBertConfig model)
- JambaConfig configuration class: JambaForSequenceClassification (JambaConfig model)
- JetMoeConfig configuration class: JetMoeForSequenceClassification (JetMoeConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForSequenceClassification (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForSequenceClassification (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForSequenceClassification (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForSequenceClassification (LiltConfig model)
- LlamaConfig configuration class: LlamaForSequenceClassification (LlamaConfig model)
- LongformerConfig configuration class: LongformerForSequenceClassification (LongformerConfig model)
- LukeConfig configuration class: LukeForSequenceClassification (LukeConfig model)
- MBartConfig configuration class: MBartForSequenceClassification (MBartConfig model)
- MPNetConfig configuration class: MPNetForSequenceClassification (MPNetConfig model)
- MT5Config configuration class: MT5ForSequenceClassification (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForSequenceClassification (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForSequenceClassification (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForSequenceClassification (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForSequenceClassification (Ministral3Config model)
- MinistralConfig configuration class: MinistralForSequenceClassification (MinistralConfig model)
- Mistral4Config configuration class: Mistral4ForSequenceClassification (Mistral4Config model)
- MistralConfig configuration class: MistralForSequenceClassification (MistralConfig model)
- MixtralConfig configuration class: MixtralForSequenceClassification (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForSequenceClassification (ModernBertConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderForSequenceClassification (ModernBertDecoderConfig model)
- ModernVBertConfig configuration class: ModernVBertForSequenceClassification (ModernVBertConfig model)
- MptConfig configuration class: MptForSequenceClassification (MptConfig model)
- MraConfig configuration class: MraForSequenceClassification (MraConfig model)
- MvpConfig configuration class: MvpForSequenceClassification (MvpConfig model)
- NemotronConfig configuration class: NemotronForSequenceClassification (NemotronConfig model)
- NomicBertConfig configuration class: NomicBertForSequenceClassification (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForSequenceClassification (NystromformerConfig model)
- OPTConfig configuration class: OPTForSequenceClassification (OPTConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAIGPTConfig model)
- PLBartConfig configuration class: PLBartForSequenceClassification (PLBartConfig model)
- PerceiverConfig configuration class: PerceiverForSequenceClassification (PerceiverConfig model)
- PersimmonConfig configuration class: PersimmonForSequenceClassification (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForSequenceClassification (Phi3Config model)
- PhiConfig configuration class: PhiForSequenceClassification (PhiConfig model)
- PhimoeConfig configuration class: PhimoeForSequenceClassification (PhimoeConfig model)
- Qwen2Config configuration class: Qwen2ForSequenceClassification (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForSequenceClassification (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForSequenceClassification (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForSequenceClassification (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForSequenceClassification (Qwen3NextConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForSequenceClassification (Qwen3_5Config model)
- Qwen3_5TextConfig configuration class: Qwen3_5ForSequenceClassification (Qwen3_5TextConfig model)
- ReformerConfig configuration class: ReformerForSequenceClassification (ReformerConfig model)
- RemBertConfig configuration class: RemBertForSequenceClassification (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForSequenceClassification (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForSequenceClassification (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForSequenceClassification (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForSequenceClassification (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForSequenceClassification (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForSequenceClassification (SmolLM3Config model)
- SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmForSequenceClassification (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForSequenceClassification (Starcoder2Config model)
- T5Config configuration class: T5ForSequenceClassification (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForSequenceClassification (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForSequenceClassification (T5GemmaConfig model)
- TapasConfig configuration class: TapasForSequenceClassification (TapasConfig model)
- UMT5Config configuration class: UMT5ForSequenceClassification (UMT5Config model)
- XLMConfig configuration class: XLMForSequenceClassification (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForSequenceClassification (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForSequenceClassification (XLNetConfig model)
- XmodConfig configuration class: XmodForSequenceClassification (XmodConfig model)
- YosoConfig configuration class: YosoForSequenceClassification (YosoConfig model)
- Zamba2Config configuration class: Zamba2ForSequenceClassification (Zamba2Config model)
- ZambaConfig configuration class: ZambaForSequenceClassification (ZambaConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
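As a sketch of the note above, a hand-built configuration (the tiny BertConfig hyperparameters here are arbitrary) yields a randomly initialized classifier without touching any weights file; num_labels sizes the classification head:

```python
from transformers import AutoModelForSequenceClassification, BertConfig

config = BertConfig(
    vocab_size=1000,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=3,  # size of the classification head
)

# from_config() builds the architecture only; the weights are random.
model = AutoModelForSequenceClassification.from_config(config)
print(type(model).__name__)  # BertForSequenceClassification
```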
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
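The two kwargs behaviors can be demonstrated offline by saving a tiny, randomly initialized model to a temporary directory first (the BertConfig hyperparameters below are arbitrary):

```python
import tempfile

from transformers import (
    AutoConfig,
    AutoModelForSequenceClassification,
    BertConfig,
    BertForSequenceClassification,
)

config = BertConfig(
    vocab_size=1000, hidden_size=64, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=128,
)

with tempfile.TemporaryDirectory() as save_dir:
    BertForSequenceClassification(config).save_pretrained(save_dir)

    # No config passed: kwargs matching configuration attributes are
    # routed to the automatically loaded configuration first.
    model = AutoModelForSequenceClassification.from_pretrained(
        save_dir, output_attentions=True
    )
    assert model.config.output_attentions

    # Config passed explicitly: it is used as-is, so all configuration
    # updates must be made on the config object beforehand.
    explicit = AutoConfig.from_pretrained(save_dir)
    model = AutoModelForSequenceClassification.from_pretrained(
        save_dir, config=explicit
    )
    assert not model.config.output_attentions  # default was kept
```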
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForSequenceClassification (AlbertConfig model)
- arcee — ArceeForSequenceClassification (ArceeConfig model)
- bart — BartForSequenceClassification (BartConfig model)
- bert — BertForSequenceClassification (BertConfig model)
- big_bird — BigBirdForSequenceClassification (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForSequenceClassification (BigBirdPegasusConfig model)
- biogpt — BioGptForSequenceClassification (BioGptConfig model)
- bloom — BloomForSequenceClassification (BloomConfig model)
- camembert — CamembertForSequenceClassification (CamembertConfig model)
- canine — CanineForSequenceClassification (CanineConfig model)
- convbert — ConvBertForSequenceClassification (ConvBertConfig model)
- ctrl — CTRLForSequenceClassification (CTRLConfig model)
- data2vec-text — Data2VecTextForSequenceClassification (Data2VecTextConfig model)
- deberta — DebertaForSequenceClassification (DebertaConfig model)
- deberta-v2 — DebertaV2ForSequenceClassification (DebertaV2Config model)
- deepseek_v2 — DeepseekV2ForSequenceClassification (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3ForSequenceClassification (DeepseekV3Config model)
- diffllama — DiffLlamaForSequenceClassification (DiffLlamaConfig model)
- distilbert — DistilBertForSequenceClassification (DistilBertConfig model)
- doge — DogeForSequenceClassification (DogeConfig model)
- electra — ElectraForSequenceClassification (ElectraConfig model)
- ernie — ErnieForSequenceClassification (ErnieConfig model)
- esm — EsmForSequenceClassification (EsmConfig model)
- eurobert — EuroBertForSequenceClassification (EuroBertConfig model)
- exaone4 — Exaone4ForSequenceClassification (Exaone4Config model)
- falcon — FalconForSequenceClassification (FalconConfig model)
- flaubert — FlaubertForSequenceClassification (FlaubertConfig model)
- fnet — FNetForSequenceClassification (FNetConfig model)
- funnel — FunnelForSequenceClassification (FunnelConfig model)
- gemma — GemmaForSequenceClassification (GemmaConfig model)
- gemma2 — Gemma2ForSequenceClassification (Gemma2Config model)
- gemma3 — Gemma3ForSequenceClassification (Gemma3Config model)
- gemma3_text — Gemma3TextForSequenceClassification (Gemma3TextConfig model)
- glm — GlmForSequenceClassification (GlmConfig model)
- glm4 — Glm4ForSequenceClassification (Glm4Config model)
- gpt-sw3 — GPT2ForSequenceClassification (GPT2Config model)
- gpt2 — GPT2ForSequenceClassification (GPT2Config model)
- gpt_bigcode — GPTBigCodeForSequenceClassification (GPTBigCodeConfig model)
- gpt_neo — GPTNeoForSequenceClassification (GPTNeoConfig model)
- gpt_neox — GPTNeoXForSequenceClassification (GPTNeoXConfig model)
- gpt_oss — GptOssForSequenceClassification (GptOssConfig model)
- gptj — GPTJForSequenceClassification (GPTJConfig model)
- helium — HeliumForSequenceClassification (HeliumConfig model)
- hunyuan_v1_dense — HunYuanDenseV1ForSequenceClassification (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1ForSequenceClassification (HunYuanMoEV1Config model)
- ibert — IBertForSequenceClassification (IBertConfig model)
- jamba — JambaForSequenceClassification (JambaConfig model)
- jetmoe — JetMoeForSequenceClassification (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForSequenceClassification (JinaEmbeddingsV3Config model)
- layoutlm — LayoutLMForSequenceClassification (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2ForSequenceClassification (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForSequenceClassification (LayoutLMv3Config model)
- lilt — LiltForSequenceClassification (LiltConfig model)
- llama — LlamaForSequenceClassification (LlamaConfig model)
- longformer — LongformerForSequenceClassification (LongformerConfig model)
- luke — LukeForSequenceClassification (LukeConfig model)
- markuplm — MarkupLMForSequenceClassification (MarkupLMConfig model)
- mbart — MBartForSequenceClassification (MBartConfig model)
- megatron-bert — MegatronBertForSequenceClassification (MegatronBertConfig model)
- minimax — MiniMaxForSequenceClassification (MiniMaxConfig model)
- ministral — MinistralForSequenceClassification (MinistralConfig model)
- ministral3 — Ministral3ForSequenceClassification (Ministral3Config model)
- mistral — MistralForSequenceClassification (MistralConfig model)
- mistral4 — Mistral4ForSequenceClassification (Mistral4Config model)
- mixtral — MixtralForSequenceClassification (MixtralConfig model)
- mobilebert — MobileBertForSequenceClassification (MobileBertConfig model)
- modernbert — ModernBertForSequenceClassification (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderForSequenceClassification (ModernBertDecoderConfig model)
- modernvbert — ModernVBertForSequenceClassification (ModernVBertConfig model)
- mpnet — MPNetForSequenceClassification (MPNetConfig model)
- mpt — MptForSequenceClassification (MptConfig model)
- mra — MraForSequenceClassification (MraConfig model)
- mt5 — MT5ForSequenceClassification (MT5Config model)
- mvp — MvpForSequenceClassification (MvpConfig model)
- nemotron — NemotronForSequenceClassification (NemotronConfig model)
- nomic_bert — NomicBertForSequenceClassification (NomicBertConfig model)
- nystromformer — NystromformerForSequenceClassification (NystromformerConfig model)
- openai-gpt — OpenAIGPTForSequenceClassification (OpenAIGPTConfig model)
- opt — OPTForSequenceClassification (OPTConfig model)
- perceiver — PerceiverForSequenceClassification (PerceiverConfig model)
- persimmon — PersimmonForSequenceClassification (PersimmonConfig model)
- phi — PhiForSequenceClassification (PhiConfig model)
- phi3 — Phi3ForSequenceClassification (Phi3Config model)
- phimoe — PhimoeForSequenceClassification (PhimoeConfig model)
- plbart — PLBartForSequenceClassification (PLBartConfig model)
- qwen2 — Qwen2ForSequenceClassification (Qwen2Config model)
- qwen2_moe — Qwen2MoeForSequenceClassification (Qwen2MoeConfig model)
- qwen3 — Qwen3ForSequenceClassification (Qwen3Config model)
- qwen3_5 — Qwen3_5ForSequenceClassification (Qwen3_5Config model)
- qwen3_5_text — Qwen3_5ForSequenceClassification (Qwen3_5TextConfig model)
- qwen3_moe — Qwen3MoeForSequenceClassification (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForSequenceClassification (Qwen3NextConfig model)
- reformer — ReformerForSequenceClassification (ReformerConfig model)
- rembert — RemBertForSequenceClassification (RemBertConfig model)
- roberta — RobertaForSequenceClassification (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForSequenceClassification (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForSequenceClassification (RoCBertConfig model)
- roformer — RoFormerForSequenceClassification (RoFormerConfig model)
- seed_oss — SeedOssForSequenceClassification (SeedOssConfig model)
- smollm3 — SmolLM3ForSequenceClassification (SmolLM3Config model)
- squeezebert — SqueezeBertForSequenceClassification (SqueezeBertConfig model)
- stablelm — StableLmForSequenceClassification (StableLmConfig model)
- starcoder2 — Starcoder2ForSequenceClassification (Starcoder2Config model)
- t5 — T5ForSequenceClassification (T5Config model)
- t5gemma — T5GemmaForSequenceClassification (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForSequenceClassification (T5Gemma2Config model)
- tapas — TapasForSequenceClassification (TapasConfig model)
- umt5 — UMT5ForSequenceClassification (UMT5Config model)
- xlm — XLMForSequenceClassification (XLMConfig model)
- xlm-roberta — XLMRobertaForSequenceClassification (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForSequenceClassification (XLMRobertaXLConfig model)
- xlnet — XLNetForSequenceClassification (XLNetConfig model)
- xmod — XmodForSequenceClassification (XmodConfig model)
- yoso — YosoForSequenceClassification (YosoConfig model)
- zamba — ZambaForSequenceClassification (ZambaConfig model)
- zamba2 — Zamba2ForSequenceClassification (Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForMultipleChoice
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMultipleChoice (AlbertConfig model)
- BertConfig configuration class: BertForMultipleChoice (BertConfig model)
- BigBirdConfig configuration class: BigBirdForMultipleChoice (BigBirdConfig model)
- CamembertConfig configuration class: CamembertForMultipleChoice (CamembertConfig model)
- CanineConfig configuration class: CanineForMultipleChoice (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForMultipleChoice (Data2VecTextConfig model)
- DebertaV2Config configuration class: DebertaV2ForMultipleChoice (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForMultipleChoice (ElectraConfig model)
- ErnieConfig configuration class: ErnieForMultipleChoice (ErnieConfig model)
- FNetConfig configuration class: FNetForMultipleChoice (FNetConfig model)
- FlaubertConfig configuration class: FlaubertForMultipleChoice (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForMultipleChoice (FunnelConfig model)
- IBertConfig configuration class: IBertForMultipleChoice (IBertConfig model)
- LongformerConfig configuration class: LongformerForMultipleChoice (LongformerConfig model)
- LukeConfig configuration class: LukeForMultipleChoice (LukeConfig model)
- MPNetConfig configuration class: MPNetForMultipleChoice (MPNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForMultipleChoice (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForMultipleChoice (ModernBertConfig model)
- MraConfig configuration class: MraForMultipleChoice (MraConfig model)
- NystromformerConfig configuration class: NystromformerForMultipleChoice (NystromformerConfig model)
- RemBertConfig configuration class: RemBertForMultipleChoice (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForMultipleChoice (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForMultipleChoice (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForMultipleChoice (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMultipleChoice (RobertaPreLayerNormConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBertConfig model)
- XLMConfig configuration class: XLMForMultipleChoice (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMultipleChoice (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForMultipleChoice (XLNetConfig model)
- XmodConfig configuration class: XmodForMultipleChoice (XmodConfig model)
- YosoConfig configuration class: YosoForMultipleChoice (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
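As a minimal sketch of the dispatch behavior (the tiny configuration values below are illustrative, not a real checkpoint): from_config() selects the model class from the configuration class, so a BertConfig yields a BertForMultipleChoice with randomly initialized weights.

```python
from transformers import AutoModelForMultipleChoice, BertConfig

# A deliberately tiny BertConfig; from_config() builds the architecture only,
# with random weights, and downloads nothing.
config = BertConfig(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)
model = AutoModelForMultipleChoice.from_config(config)
print(type(model).__name__)  # BertForMultipleChoice
```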
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMultipleChoice (AlbertConfig model)
- bert — BertForMultipleChoice (BertConfig model)
- big_bird — BigBirdForMultipleChoice (BigBirdConfig model)
- camembert — CamembertForMultipleChoice (CamembertConfig model)
- canine — CanineForMultipleChoice (CanineConfig model)
- convbert — ConvBertForMultipleChoice (ConvBertConfig model)
- data2vec-text — Data2VecTextForMultipleChoice (Data2VecTextConfig model)
- deberta-v2 — DebertaV2ForMultipleChoice (DebertaV2Config model)
- distilbert — DistilBertForMultipleChoice (DistilBertConfig model)
- electra — ElectraForMultipleChoice (ElectraConfig model)
- ernie — ErnieForMultipleChoice (ErnieConfig model)
- flaubert — FlaubertForMultipleChoice (FlaubertConfig model)
- fnet — FNetForMultipleChoice (FNetConfig model)
- funnel — FunnelForMultipleChoice (FunnelConfig model)
- ibert — IBertForMultipleChoice (IBertConfig model)
- longformer — LongformerForMultipleChoice (LongformerConfig model)
- luke — LukeForMultipleChoice (LukeConfig model)
- megatron-bert — MegatronBertForMultipleChoice (MegatronBertConfig model)
- mobilebert — MobileBertForMultipleChoice (MobileBertConfig model)
- modernbert — ModernBertForMultipleChoice (ModernBertConfig model)
- mpnet — MPNetForMultipleChoice (MPNetConfig model)
- mra — MraForMultipleChoice (MraConfig model)
- nystromformer — NystromformerForMultipleChoice (NystromformerConfig model)
- rembert — RemBertForMultipleChoice (RemBertConfig model)
- roberta — RobertaForMultipleChoice (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForMultipleChoice (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForMultipleChoice (RoCBertConfig model)
- roformer — RoFormerForMultipleChoice (RoFormerConfig model)
- squeezebert — SqueezeBertForMultipleChoice (SqueezeBertConfig model)
- xlm — XLMForMultipleChoice (XLMConfig model)
- xlm-roberta — XLMRobertaForMultipleChoice (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForMultipleChoice (XLMRobertaXLConfig model)
- xlnet — XLNetForMultipleChoice (XLNetConfig model)
- xmod — XmodForMultipleChoice (XmodConfig model)
- yoso — YosoForMultipleChoice (YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForNextSentencePrediction
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: BertForNextSentencePrediction (BertConfig model)
- ErnieConfig configuration class: ErnieForNextSentencePrediction (ErnieConfig model)
- FNetConfig configuration class: FNetForNextSentencePrediction (FNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForNextSentencePrediction (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBertConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
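The note above can be sketched concretely (the tiny configuration values are illustrative only): from_config() builds the architecture from the configuration class, so no pretrained weights are fetched or loaded.

```python
from transformers import AutoModelForNextSentencePrediction, BertConfig

# from_config() only constructs the model from the configuration; weights are
# randomly initialized (use from_pretrained() to load a checkpoint instead).
config = BertConfig(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)
model = AutoModelForNextSentencePrediction.from_config(config)
print(type(model).__name__)  # BertForNextSentencePrediction
```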
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- bert — BertForNextSentencePrediction (BertConfig model)
- ernie — ErnieForNextSentencePrediction (ErnieConfig model)
- fnet — FNetForNextSentencePrediction (FNetConfig model)
- megatron-bert — MegatronBertForNextSentencePrediction (MegatronBertConfig model)
- mobilebert — MobileBertForNextSentencePrediction (MobileBertConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTokenClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForTokenClassification (AlbertConfig model)
- ApertusConfig configuration class: ApertusForTokenClassification (ApertusConfig model)
- ArceeConfig configuration class: ArceeForTokenClassification (ArceeConfig model)
- BertConfig configuration class: BertForTokenClassification (BertConfig model)
- BigBirdConfig configuration class: BigBirdForTokenClassification (BigBirdConfig model)
- BioGptConfig configuration class: BioGptForTokenClassification (BioGptConfig model)
- BloomConfig configuration class: BloomForTokenClassification (BloomConfig model)
- BrosConfig configuration class: BrosForTokenClassification (BrosConfig model)
- CamembertConfig configuration class: CamembertForTokenClassification (CamembertConfig model)
- CanineConfig configuration class: CanineForTokenClassification (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForTokenClassification (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForTokenClassification (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForTokenClassification (DebertaV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForTokenClassification (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForTokenClassification (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForTokenClassification (ElectraConfig model)
- ErnieConfig configuration class: ErnieForTokenClassification (ErnieConfig model)
- EsmConfig configuration class: EsmForTokenClassification (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForTokenClassification (EuroBertConfig model)
- Exaone4Config configuration class: Exaone4ForTokenClassification (Exaone4Config model)
- FNetConfig configuration class: FNetForTokenClassification (FNetConfig model)
- FalconConfig configuration class: FalconForTokenClassification (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForTokenClassification (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForTokenClassification (FunnelConfig model)
- GPT2Config configuration class: GPT2ForTokenClassification (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForTokenClassification (GPTBigCodeConfig model)
- GPTNeoConfig configuration class: GPTNeoForTokenClassification (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForTokenClassification (GPTNeoXConfig model)
- Gemma2Config configuration class: Gemma2ForTokenClassification (Gemma2Config model)
- GemmaConfig configuration class: GemmaForTokenClassification (GemmaConfig model)
- Glm4Config configuration class: Glm4ForTokenClassification (Glm4Config model)
- GlmConfig configuration class: GlmForTokenClassification (GlmConfig model)
- GptOssConfig configuration class: GptOssForTokenClassification (GptOssConfig model)
- HeliumConfig configuration class: HeliumForTokenClassification (HeliumConfig model)
- IBertConfig configuration class: IBertForTokenClassification (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForTokenClassification (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForTokenClassification (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForTokenClassification (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForTokenClassification (LiltConfig model)
- LlamaConfig configuration class: LlamaForTokenClassification (LlamaConfig model)
- LongformerConfig configuration class: LongformerForTokenClassification (LongformerConfig model)
- LukeConfig configuration class: LukeForTokenClassification (LukeConfig model)
- MPNetConfig configuration class: MPNetForTokenClassification (MPNetConfig model)
- MT5Config configuration class: MT5ForTokenClassification (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForTokenClassification (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForTokenClassification (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForTokenClassification (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForTokenClassification (Ministral3Config model)
- MinistralConfig configuration class: MinistralForTokenClassification (MinistralConfig model)
- Mistral4Config configuration class: Mistral4ForTokenClassification (Mistral4Config model)
- MistralConfig configuration class: MistralForTokenClassification (MistralConfig model)
- MixtralConfig configuration class: MixtralForTokenClassification (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForTokenClassification (ModernBertConfig model)
- ModernVBertConfig configuration class: ModernVBertForTokenClassification (ModernVBertConfig model)
- MptConfig configuration class: MptForTokenClassification (MptConfig model)
- MraConfig configuration class: MraForTokenClassification (MraConfig model)
- NemotronConfig configuration class: NemotronForTokenClassification (NemotronConfig model)
- NomicBertConfig configuration class: NomicBertForTokenClassification (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForTokenClassification (NystromformerConfig model)
- PersimmonConfig configuration class: PersimmonForTokenClassification (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForTokenClassification (Phi3Config model)
- PhiConfig configuration class: PhiForTokenClassification (PhiConfig model)
- Qwen2Config configuration class: Qwen2ForTokenClassification (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForTokenClassification (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForTokenClassification (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForTokenClassification (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForTokenClassification (Qwen3NextConfig model)
- RemBertConfig configuration class: RemBertForTokenClassification (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForTokenClassification (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForTokenClassification (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForTokenClassification (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForTokenClassification (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForTokenClassification (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForTokenClassification (SmolLM3Config model)
- SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmForTokenClassification (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForTokenClassification (Starcoder2Config model)
- T5Config configuration class: T5ForTokenClassification (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForTokenClassification (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForTokenClassification (T5GemmaConfig model)
- UMT5Config configuration class: UMT5ForTokenClassification (UMT5Config model)
- XLMConfig configuration class: XLMForTokenClassification (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForTokenClassification (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForTokenClassification (XLNetConfig model)
- XmodConfig configuration class: XmodForTokenClassification (XmodConfig model)
- YosoConfig configuration class: YosoForTokenClassification (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
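A minimal sketch of this, assuming a tiny illustrative configuration: num_labels (inherited from the base configuration class) sizes the token classification head, and the weights stay randomly initialized because from_config() never loads a checkpoint.

```python
from transformers import AutoModelForTokenClassification, BertConfig

# num_labels sizes the classification head; from_config() dispatches on the
# configuration class and initializes all weights randomly.
config = BertConfig(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2,
    intermediate_size=64, num_labels=5,
)
model = AutoModelForTokenClassification.from_config(config)
print(type(model).__name__)     # BertForTokenClassification
print(model.config.num_labels)  # 5
```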
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForTokenClassification (AlbertConfig model)
- apertus — ApertusForTokenClassification (ApertusConfig model)
- arcee — ArceeForTokenClassification (ArceeConfig model)
- bert — BertForTokenClassification (BertConfig model)
- big_bird — BigBirdForTokenClassification (BigBirdConfig model)
- biogpt — BioGptForTokenClassification (BioGptConfig model)
- bloom — BloomForTokenClassification (BloomConfig model)
- bros — BrosForTokenClassification (BrosConfig model)
- camembert — CamembertForTokenClassification (CamembertConfig model)
- canine — CanineForTokenClassification (CanineConfig model)
- convbert — ConvBertForTokenClassification (ConvBertConfig model)
- data2vec-text — Data2VecTextForTokenClassification (Data2VecTextConfig model)
- deberta — DebertaForTokenClassification (DebertaConfig model)
- deberta-v2 — DebertaV2ForTokenClassification (DebertaV2Config model)
- deepseek_v3 —
DeepseekV3ForTokenClassification(DeepseekV3Config model) - diffllama —
DiffLlamaForTokenClassification(DiffLlamaConfig model) - distilbert —
DistilBertForTokenClassification(DistilBertConfig model) - electra —
ElectraForTokenClassification(ElectraConfig model) - ernie —
ErnieForTokenClassification(ErnieConfig model) - esm —
EsmForTokenClassification(EsmConfig model) - eurobert —
EuroBertForTokenClassification(EuroBertConfig model) - exaone4 —
Exaone4ForTokenClassification(Exaone4Config model) - falcon —
FalconForTokenClassification(FalconConfig model) - flaubert —
FlaubertForTokenClassification(FlaubertConfig model) - fnet —
FNetForTokenClassification(FNetConfig model) - funnel —
FunnelForTokenClassification(FunnelConfig model) - gemma —
GemmaForTokenClassification(GemmaConfig model) - gemma2 —
Gemma2ForTokenClassification(Gemma2Config model) - glm —
GlmForTokenClassification(GlmConfig model) - glm4 —
Glm4ForTokenClassification(Glm4Config model) - gpt-sw3 —
GPT2ForTokenClassification(GPT2Config model) - gpt2 —
GPT2ForTokenClassification(GPT2Config model) - gpt_bigcode —
GPTBigCodeForTokenClassification(GPTBigCodeConfig model) - gpt_neo —
GPTNeoForTokenClassification(GPTNeoConfig model) - gpt_neox —
GPTNeoXForTokenClassification(GPTNeoXConfig model) - gpt_oss —
GptOssForTokenClassification(GptOssConfig model) - helium —
HeliumForTokenClassification(HeliumConfig model) - ibert —
IBertForTokenClassification(IBertConfig model) - jina_embeddings_v3 —
JinaEmbeddingsV3ForTokenClassification(JinaEmbeddingsV3Config model) - layoutlm —
LayoutLMForTokenClassification(LayoutLMConfig model) - layoutlmv2 —
LayoutLMv2ForTokenClassification(LayoutLMv2Config model) - layoutlmv3 —
LayoutLMv3ForTokenClassification(LayoutLMv3Config model) - lilt —
LiltForTokenClassification(LiltConfig model) - llama —
LlamaForTokenClassification(LlamaConfig model) - longformer —
LongformerForTokenClassification(LongformerConfig model) - luke —
LukeForTokenClassification(LukeConfig model) - markuplm —
MarkupLMForTokenClassification(MarkupLMConfig model) - megatron-bert —
MegatronBertForTokenClassification(MegatronBertConfig model) - minimax —
MiniMaxForTokenClassification(MiniMaxConfig model) - ministral —
MinistralForTokenClassification(MinistralConfig model) - ministral3 —
Ministral3ForTokenClassification(Ministral3Config model) - mistral —
MistralForTokenClassification(MistralConfig model) - mistral4 —
Mistral4ForTokenClassification(Mistral4Config model) - mixtral —
MixtralForTokenClassification(MixtralConfig model) - mobilebert —
MobileBertForTokenClassification(MobileBertConfig model) - modernbert —
ModernBertForTokenClassification(ModernBertConfig model) - modernvbert —
ModernVBertForTokenClassification(ModernVBertConfig model) - mpnet —
MPNetForTokenClassification(MPNetConfig model) - mpt —
MptForTokenClassification(MptConfig model) - mra —
MraForTokenClassification(MraConfig model) - mt5 —
MT5ForTokenClassification(MT5Config model) - nemotron —
NemotronForTokenClassification(NemotronConfig model) - nomic_bert —
NomicBertForTokenClassification(NomicBertConfig model) - nystromformer —
NystromformerForTokenClassification(NystromformerConfig model) - persimmon —
PersimmonForTokenClassification(PersimmonConfig model) - phi —
PhiForTokenClassification(PhiConfig model) - phi3 —
Phi3ForTokenClassification(Phi3Config model) - qwen2 —
Qwen2ForTokenClassification(Qwen2Config model) - qwen2_moe —
Qwen2MoeForTokenClassification(Qwen2MoeConfig model) - qwen3 —
Qwen3ForTokenClassification(Qwen3Config model) - qwen3_moe —
Qwen3MoeForTokenClassification(Qwen3MoeConfig model) - qwen3_next —
Qwen3NextForTokenClassification(Qwen3NextConfig model) - rembert —
RemBertForTokenClassification(RemBertConfig model) - roberta —
RobertaForTokenClassification(RobertaConfig model) - roberta-prelayernorm —
RobertaPreLayerNormForTokenClassification(RobertaPreLayerNormConfig model) - roc_bert —
RoCBertForTokenClassification(RoCBertConfig model) - roformer —
RoFormerForTokenClassification(RoFormerConfig model) - seed_oss —
SeedOssForTokenClassification(SeedOssConfig model) - smollm3 —
SmolLM3ForTokenClassification(SmolLM3Config model) - squeezebert —
SqueezeBertForTokenClassification(SqueezeBertConfig model) - stablelm —
StableLmForTokenClassification(StableLmConfig model) - starcoder2 —
Starcoder2ForTokenClassification(Starcoder2Config model) - t5 —
T5ForTokenClassification(T5Config model) - t5gemma —
T5GemmaForTokenClassification(T5GemmaConfig model) - t5gemma2 —
T5Gemma2ForTokenClassification(T5Gemma2Config model) - umt5 —
UMT5ForTokenClassification(UMT5Config model) - xlm —
XLMForTokenClassification(XLMConfig model) - xlm-roberta —
XLMRobertaForTokenClassification(XLMRobertaConfig model) - xlm-roberta-xl —
XLMRobertaXLForTokenClassification(XLMRobertaXLConfig model) - xlnet —
XLNetForTokenClassification(XLNetConfig model) - xmod —
XmodForTokenClassification(XmodConfig model) - yoso —
YosoForTokenClassification(YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train()
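The eval/train toggle described above can be sketched without downloading any weights by building the model from a small, hypothetical configuration (the sizes below are illustrative, not a real checkpoint):

```python
from transformers import BertConfig, AutoModelForTokenClassification

# Tiny, made-up configuration so the model builds locally with random
# weights; from_config() never downloads pretrained weights.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=5,
)
model = AutoModelForTokenClassification.from_config(config)

model.eval()           # inference mode: dropout disabled (from_pretrained() does this for you)
print(model.training)  # False
model.train()          # switch back before fine-tuning
print(model.training)  # True
```

Note that `from_config()` returns an untrained model; only `from_pretrained()` puts the model in evaluation mode automatically.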
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForQuestionAnswering (AlbertConfig model)
- ArceeConfig configuration class: ArceeForQuestionAnswering (ArceeConfig model)
- BartConfig configuration class: BartForQuestionAnswering (BartConfig model)
- BertConfig configuration class: BertForQuestionAnswering (BertConfig model)
- BigBirdConfig configuration class: BigBirdForQuestionAnswering (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForQuestionAnswering (BigBirdPegasusConfig model)
- BloomConfig configuration class: BloomForQuestionAnswering (BloomConfig model)
- CamembertConfig configuration class: CamembertForQuestionAnswering (CamembertConfig model)
- CanineConfig configuration class: CanineForQuestionAnswering (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForQuestionAnswering (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForQuestionAnswering (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DebertaV2Config model)
- DiffLlamaConfig configuration class: DiffLlamaForQuestionAnswering (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForQuestionAnswering (ElectraConfig model)
- ErnieConfig configuration class: ErnieForQuestionAnswering (ErnieConfig model)
- Exaone4Config configuration class: Exaone4ForQuestionAnswering (Exaone4Config model)
- FNetConfig configuration class: FNetForQuestionAnswering (FNetConfig model)
- FalconConfig configuration class: FalconForQuestionAnswering (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForQuestionAnswering (FunnelConfig model)
- GPT2Config configuration class: GPT2ForQuestionAnswering (GPT2Config model)
- GPTJConfig configuration class: GPTJForQuestionAnswering (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForQuestionAnswering (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForQuestionAnswering (GPTNeoXConfig model)
- IBertConfig configuration class: IBertForQuestionAnswering (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForQuestionAnswering (JinaEmbeddingsV3Config model)
- LEDConfig configuration class: LEDForQuestionAnswering (LEDConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForQuestionAnswering (LiltConfig model)
- LlamaConfig configuration class: LlamaForQuestionAnswering (LlamaConfig model)
- LongformerConfig configuration class: LongformerForQuestionAnswering (LongformerConfig model)
- LukeConfig configuration class: LukeForQuestionAnswering (LukeConfig model)
- LxmertConfig configuration class: LxmertForQuestionAnswering (LxmertConfig model)
- MBartConfig configuration class: MBartForQuestionAnswering (MBartConfig model)
- MPNetConfig configuration class: MPNetForQuestionAnswering (MPNetConfig model)
- MT5Config configuration class: MT5ForQuestionAnswering (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForQuestionAnswering (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForQuestionAnswering (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForQuestionAnswering (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForQuestionAnswering (Ministral3Config model)
- MinistralConfig configuration class: MinistralForQuestionAnswering (MinistralConfig model)
- MistralConfig configuration class: MistralForQuestionAnswering (MistralConfig model)
- MixtralConfig configuration class: MixtralForQuestionAnswering (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForQuestionAnswering (ModernBertConfig model)
- MptConfig configuration class: MptForQuestionAnswering (MptConfig model)
- MraConfig configuration class: MraForQuestionAnswering (MraConfig model)
- MvpConfig configuration class: MvpForQuestionAnswering (MvpConfig model)
- NemotronConfig configuration class: NemotronForQuestionAnswering (NemotronConfig model)
- NystromformerConfig configuration class: NystromformerForQuestionAnswering (NystromformerConfig model)
- OPTConfig configuration class: OPTForQuestionAnswering (OPTConfig model)
- Qwen2Config configuration class: Qwen2ForQuestionAnswering (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForQuestionAnswering (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForQuestionAnswering (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForQuestionAnswering (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForQuestionAnswering (Qwen3NextConfig model)
- ReformerConfig configuration class: ReformerForQuestionAnswering (ReformerConfig model)
- RemBertConfig configuration class: RemBertForQuestionAnswering (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForQuestionAnswering (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForQuestionAnswering (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForQuestionAnswering (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForQuestionAnswering (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForQuestionAnswering (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForQuestionAnswering (SmolLM3Config model)
- SplinterConfig configuration class: SplinterForQuestionAnswering (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBertConfig model)
- T5Config configuration class: T5ForQuestionAnswering (T5Config model)
- UMT5Config configuration class: UMT5ForQuestionAnswering (UMT5Config model)
- XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForQuestionAnswering (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNetConfig model)
- XmodConfig configuration class: XmodForQuestionAnswering (XmodConfig model)
- YosoConfig configuration class: YosoForQuestionAnswering (YosoConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForQuestionAnswering (AlbertConfig model)
- arcee — ArceeForQuestionAnswering (ArceeConfig model)
- bart — BartForQuestionAnswering (BartConfig model)
- bert — BertForQuestionAnswering (BertConfig model)
- big_bird — BigBirdForQuestionAnswering (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForQuestionAnswering (BigBirdPegasusConfig model)
- bloom — BloomForQuestionAnswering (BloomConfig model)
- camembert — CamembertForQuestionAnswering (CamembertConfig model)
- canine — CanineForQuestionAnswering (CanineConfig model)
- convbert — ConvBertForQuestionAnswering (ConvBertConfig model)
- data2vec-text — Data2VecTextForQuestionAnswering (Data2VecTextConfig model)
- deberta — DebertaForQuestionAnswering (DebertaConfig model)
- deberta-v2 — DebertaV2ForQuestionAnswering (DebertaV2Config model)
- diffllama — DiffLlamaForQuestionAnswering (DiffLlamaConfig model)
- distilbert — DistilBertForQuestionAnswering (DistilBertConfig model)
- electra — ElectraForQuestionAnswering (ElectraConfig model)
- ernie — ErnieForQuestionAnswering (ErnieConfig model)
- exaone4 — Exaone4ForQuestionAnswering (Exaone4Config model)
- falcon — FalconForQuestionAnswering (FalconConfig model)
- flaubert — FlaubertForQuestionAnsweringSimple (FlaubertConfig model)
- fnet — FNetForQuestionAnswering (FNetConfig model)
- funnel — FunnelForQuestionAnswering (FunnelConfig model)
- gpt2 — GPT2ForQuestionAnswering (GPT2Config model)
- gpt_neo — GPTNeoForQuestionAnswering (GPTNeoConfig model)
- gpt_neox — GPTNeoXForQuestionAnswering (GPTNeoXConfig model)
- gptj — GPTJForQuestionAnswering (GPTJConfig model)
- ibert — IBertForQuestionAnswering (IBertConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForQuestionAnswering (JinaEmbeddingsV3Config model)
- layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- led — LEDForQuestionAnswering (LEDConfig model)
- lilt — LiltForQuestionAnswering (LiltConfig model)
- llama — LlamaForQuestionAnswering (LlamaConfig model)
- longformer — LongformerForQuestionAnswering (LongformerConfig model)
- luke — LukeForQuestionAnswering (LukeConfig model)
- lxmert — LxmertForQuestionAnswering (LxmertConfig model)
- markuplm — MarkupLMForQuestionAnswering (MarkupLMConfig model)
- mbart — MBartForQuestionAnswering (MBartConfig model)
- megatron-bert — MegatronBertForQuestionAnswering (MegatronBertConfig model)
- minimax — MiniMaxForQuestionAnswering (MiniMaxConfig model)
- ministral — MinistralForQuestionAnswering (MinistralConfig model)
- ministral3 — Ministral3ForQuestionAnswering (Ministral3Config model)
- mistral — MistralForQuestionAnswering (MistralConfig model)
- mixtral — MixtralForQuestionAnswering (MixtralConfig model)
- mobilebert — MobileBertForQuestionAnswering (MobileBertConfig model)
- modernbert — ModernBertForQuestionAnswering (ModernBertConfig model)
- mpnet — MPNetForQuestionAnswering (MPNetConfig model)
- mpt — MptForQuestionAnswering (MptConfig model)
- mra — MraForQuestionAnswering (MraConfig model)
- mt5 — MT5ForQuestionAnswering (MT5Config model)
- mvp — MvpForQuestionAnswering (MvpConfig model)
- nemotron — NemotronForQuestionAnswering (NemotronConfig model)
- nystromformer — NystromformerForQuestionAnswering (NystromformerConfig model)
- opt — OPTForQuestionAnswering (OPTConfig model)
- qwen2 — Qwen2ForQuestionAnswering (Qwen2Config model)
- qwen2_moe — Qwen2MoeForQuestionAnswering (Qwen2MoeConfig model)
- qwen3 — Qwen3ForQuestionAnswering (Qwen3Config model)
- qwen3_moe — Qwen3MoeForQuestionAnswering (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForQuestionAnswering (Qwen3NextConfig model)
- reformer — ReformerForQuestionAnswering (ReformerConfig model)
- rembert — RemBertForQuestionAnswering (RemBertConfig model)
- roberta — RobertaForQuestionAnswering (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForQuestionAnswering (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForQuestionAnswering (RoCBertConfig model)
- roformer — RoFormerForQuestionAnswering (RoFormerConfig model)
- seed_oss — SeedOssForQuestionAnswering (SeedOssConfig model)
- smollm3 — SmolLM3ForQuestionAnswering (SmolLM3Config model)
- splinter — SplinterForQuestionAnswering (SplinterConfig model)
- squeezebert — SqueezeBertForQuestionAnswering (SqueezeBertConfig model)
- t5 — T5ForQuestionAnswering (T5Config model)
- umt5 — UMT5ForQuestionAnswering (UMT5Config model)
- xlm — XLMForQuestionAnsweringSimple (XLMConfig model)
- xlm-roberta — XLMRobertaForQuestionAnswering (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForQuestionAnswering (XLMRobertaXLConfig model)
- xlnet — XLNetForQuestionAnsweringSimple (XLNetConfig model)
- xmod — XmodForQuestionAnswering (XmodConfig model)
- yoso — YosoForQuestionAnswering (YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTextEncoding
Computer vision
以下の自動クラスは、次のコンピュータービジョンタスクに利用可能です。
AutoModelForDepthEstimation
This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- CHMv2Config configuration class: CHMv2ForDepthEstimation (CHMv2Config model)
- DPTConfig configuration class: DPTForDepthEstimation (DPTConfig model)
- DepthAnythingConfig configuration class: DepthAnythingForDepthEstimation (DepthAnythingConfig model)
- DepthProConfig configuration class: DepthProForDepthEstimation (DepthProConfig model)
- GLPNConfig configuration class: GLPNForDepthEstimation (GLPNConfig model)
- PromptDepthAnythingConfig configuration class: PromptDepthAnythingForDepthEstimation (PromptDepthAnythingConfig model)
- ZoeDepthConfig configuration class: ZoeDepthForDepthEstimation (ZoeDepthConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- chmv2 — CHMv2ForDepthEstimation (CHMv2Config model)
- depth_anything — DepthAnythingForDepthEstimation (DepthAnythingConfig model)
- depth_pro — DepthProForDepthEstimation (DepthProConfig model)
- dpt — DPTForDepthEstimation (DPTConfig model)
- glpn — GLPNForDepthEstimation (GLPNConfig model)
- prompt_depth_anything — PromptDepthAnythingForDepthEstimation (PromptDepthAnythingConfig model)
- zoedepth — ZoeDepthForDepthEstimation (ZoeDepthConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train()
Examples:
>>> from transformers import AutoConfig, AutoModelForDepthEstimation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")
>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForImageClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForImageClassification (BeitConfig model)
- BitConfig configuration class: BitForImageClassification (BitConfig model)
- CLIPConfig configuration class: CLIPForImageClassification (CLIPConfig model)
- ConvNextConfig configuration class: ConvNextForImageClassification (ConvNextConfig model)
- ConvNextV2Config configuration class: ConvNextV2ForImageClassification (ConvNextV2Config model)
- CvtConfig configuration class: CvtForImageClassification (CvtConfig model)
- Data2VecVisionConfig configuration class: Data2VecVisionForImageClassification (Data2VecVisionConfig model)
- DeiTConfig configuration class: DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiTConfig model)
- DinatConfig configuration class: DinatForImageClassification (DinatConfig model)
- Dinov2Config configuration class: Dinov2ForImageClassification (Dinov2Config model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersForImageClassification (Dinov2WithRegistersConfig model)
- DonutSwinConfig configuration class: DonutSwinForImageClassification (DonutSwinConfig model)
- EfficientNetConfig configuration class: EfficientNetForImageClassification (EfficientNetConfig model)
- FocalNetConfig configuration class: FocalNetForImageClassification (FocalNetConfig model)
- HGNetV2Config configuration class: HGNetV2ForImageClassification (HGNetV2Config model)
- HieraConfig configuration class: HieraForImageClassification (HieraConfig model)
- IJepaConfig configuration class: IJepaForImageClassification (IJepaConfig model)
- ImageGPTConfig configuration class: ImageGPTForImageClassification (ImageGPTConfig model)
- LevitConfig configuration class: LevitForImageClassification or LevitForImageClassificationWithTeacher (LevitConfig model)
- MetaClip2Config configuration class: MetaClip2ForImageClassification (MetaClip2Config model)
- MobileNetV1Config configuration class: MobileNetV1ForImageClassification (MobileNetV1Config model)
- MobileNetV2Config configuration class: MobileNetV2ForImageClassification (MobileNetV2Config model)
- MobileViTConfig configuration class: MobileViTForImageClassification (MobileViTConfig model)
- MobileViTV2Config configuration class: MobileViTV2ForImageClassification (MobileViTV2Config model)
- PPLCNetConfig configuration class: PPLCNetForImageClassification (PPLCNetConfig model)
- PerceiverConfig configuration class: PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (PerceiverConfig model)
- PoolFormerConfig configuration class: PoolFormerForImageClassification (PoolFormerConfig model)
- PvtConfig configuration class: PvtForImageClassification (PvtConfig model)
- PvtV2Config configuration class: PvtV2ForImageClassification (PvtV2Config model)
- RegNetConfig configuration class: RegNetForImageClassification (RegNetConfig model)
- ResNetConfig configuration class: ResNetForImageClassification (ResNetConfig model)
- SegformerConfig configuration class: SegformerForImageClassification (SegformerConfig model)
- ShieldGemma2Config configuration class: ShieldGemma2ForImageClassification (ShieldGemma2Config model)
- Siglip2Config configuration class: Siglip2ForImageClassification (Siglip2Config model)
- SiglipConfig configuration class: SiglipForImageClassification (SiglipConfig model)
- SwiftFormerConfig configuration class: SwiftFormerForImageClassification (SwiftFormerConfig model)
- SwinConfig configuration class: SwinForImageClassification (SwinConfig model)
- Swinv2Config configuration class: Swinv2ForImageClassification (Swinv2Config model)
- TextNetConfig configuration class: TextNetForImageClassification (TextNetConfig model)
- TimmWrapperConfig configuration class: TimmWrapperForImageClassification (TimmWrapperConfig model)
- ViTConfig configuration class: ViTForImageClassification (ViTConfig model)
- ViTMSNConfig configuration class: ViTMSNForImageClassification (ViTMSNConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
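For instance, a minimal from_config sketch (num_labels=10 is an arbitrary value chosen for illustration):

```python
from transformers import AutoModelForImageClassification, ViTConfig

# Build a configuration in memory; num_labels=10 is arbitrary.
config = ViTConfig(num_labels=10)

# The auto class resolves ViTConfig to ViTForImageClassification.
model = AutoModelForImageClassification.from_config(config)
print(type(model).__name__)  # ViTForImageClassification
```

Because only the configuration is used, the weights here are randomly initialized; use from_pretrained() when you need trained weights.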
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitForImageClassification (BeitConfig model)
- bit — BitForImageClassification (BitConfig model)
- clip — CLIPForImageClassification (CLIPConfig model)
- convnext — ConvNextForImageClassification (ConvNextConfig model)
- convnextv2 — ConvNextV2ForImageClassification (ConvNextV2Config model)
- cvt — CvtForImageClassification (CvtConfig model)
- data2vec-vision — Data2VecVisionForImageClassification (Data2VecVisionConfig model)
- deit — DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiTConfig model)
- dinat — DinatForImageClassification (DinatConfig model)
- dinov2 — Dinov2ForImageClassification (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersForImageClassification (Dinov2WithRegistersConfig model)
- donut-swin — DonutSwinForImageClassification (DonutSwinConfig model)
- efficientnet — EfficientNetForImageClassification (EfficientNetConfig model)
- focalnet — FocalNetForImageClassification (FocalNetConfig model)
- hgnet_v2 — HGNetV2ForImageClassification (HGNetV2Config model)
- hiera — HieraForImageClassification (HieraConfig model)
- ijepa — IJepaForImageClassification (IJepaConfig model)
- imagegpt — ImageGPTForImageClassification (ImageGPTConfig model)
- levit — LevitForImageClassification or LevitForImageClassificationWithTeacher (LevitConfig model)
- metaclip_2 — MetaClip2ForImageClassification (MetaClip2Config model)
- mobilenet_v1 — MobileNetV1ForImageClassification (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2ForImageClassification (MobileNetV2Config model)
- mobilevit — MobileViTForImageClassification (MobileViTConfig model)
- mobilevitv2 — MobileViTV2ForImageClassification (MobileViTV2Config model)
- perceiver — PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (PerceiverConfig model)
- poolformer — PoolFormerForImageClassification (PoolFormerConfig model)
- pp_lcnet — PPLCNetForImageClassification (PPLCNetConfig model)
- pvt — PvtForImageClassification (PvtConfig model)
- pvt_v2 — PvtV2ForImageClassification (PvtV2Config model)
- regnet — RegNetForImageClassification (RegNetConfig model)
- resnet — ResNetForImageClassification (ResNetConfig model)
- segformer — SegformerForImageClassification (SegformerConfig model)
- shieldgemma2 — ShieldGemma2ForImageClassification (ShieldGemma2Config model)
- siglip — SiglipForImageClassification (SiglipConfig model)
- siglip2 — Siglip2ForImageClassification (Siglip2Config model)
- swiftformer — SwiftFormerForImageClassification (SwiftFormerConfig model)
- swin — SwinForImageClassification (SwinConfig model)
- swinv2 — Swinv2ForImageClassification (Swinv2Config model)
- textnet — TextNetForImageClassification (TextNetConfig model)
- timm_wrapper — TimmWrapperForImageClassification (TimmWrapperConfig model)
- vit — ViTForImageClassification (ViTConfig model)
- vit_msn — ViTMSNForImageClassification (ViTMSNConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
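The eval/train toggle described above is plain torch behavior; a minimal sketch with a bare module (the same training flag applies to any transformers model):

```python
from torch import nn

# Dropout is active only in training mode, which is why loaded models
# default to eval() for inference.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()   # what from_pretrained() does for you on load
print(model.training)  # False: dropout modules are deactivated

model.train()  # switch back before fine-tuning
print(model.training)  # True
```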
Examples:
>>> from transformers import AutoConfig, AutoModelForImageClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForVideoClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- TimesformerConfig configuration class: TimesformerForVideoClassification (TimesformerConfig model)
- VJEPA2Config configuration class: VJEPA2ForVideoClassification (VJEPA2Config model)
- VideoMAEConfig configuration class: VideoMAEForVideoClassification (VideoMAEConfig model)
- VivitConfig configuration class: VivitForVideoClassification (VivitConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a video classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
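A minimal from_config sketch for the video case (num_labels=5 is an arbitrary value for illustration; the weights are random):

```python
from transformers import AutoModelForVideoClassification, VideoMAEConfig

# Build a configuration in memory; num_labels=5 is arbitrary.
config = VideoMAEConfig(num_labels=5)

# The auto class resolves VideoMAEConfig to VideoMAEForVideoClassification.
model = AutoModelForVideoClassification.from_config(config)
print(type(model).__name__)  # VideoMAEForVideoClassification
```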
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- timesformer — TimesformerForVideoClassification (TimesformerConfig model)
- videomae — VideoMAEForVideoClassification (VideoMAEConfig model)
- vivit — VivitForVideoClassification (VivitConfig model)
- vjepa2 — VJEPA2ForVideoClassification (VJEPA2Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForVideoClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")
>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForMaskedImageModeling
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: DeiTForMaskedImageModeling (DeiTConfig model)
- FocalNetConfig configuration class: FocalNetForMaskedImageModeling (FocalNetConfig model)
- SwinConfig configuration class: SwinForMaskedImageModeling (SwinConfig model)
- Swinv2Config configuration class: Swinv2ForMaskedImageModeling (Swinv2Config model)
- ViTConfig configuration class: ViTForMaskedImageModeling (ViTConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
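A minimal from_config sketch for masked image modeling (a default SwinConfig; the weights are random, not pretrained):

```python
from transformers import AutoModelForMaskedImageModeling, SwinConfig

# A default SwinConfig resolves to SwinForMaskedImageModeling.
config = SwinConfig()
model = AutoModelForMaskedImageModeling.from_config(config)
print(type(model).__name__)  # SwinForMaskedImageModeling
```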
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- deit — DeiTForMaskedImageModeling (DeiTConfig model)
- focalnet — FocalNetForMaskedImageModeling (FocalNetConfig model)
- swin — SwinForMaskedImageModeling (SwinConfig model)
- swinv2 — Swinv2ForMaskedImageModeling (Swinv2Config model)
- vit — ViTForMaskedImageModeling (ViTConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-simmim-window6-192")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-simmim-window6-192", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForObjectDetection
This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ConditionalDetrConfig configuration class: ConditionalDetrForObjectDetection (ConditionalDetrConfig model)
- DFineConfig configuration class: DFineForObjectDetection (DFineConfig model)
- DabDetrConfig configuration class: DabDetrForObjectDetection (DabDetrConfig model)
- DeformableDetrConfig configuration class: DeformableDetrForObjectDetection (DeformableDetrConfig model)
- DetrConfig configuration class: DetrForObjectDetection (DetrConfig model)
- LwDetrConfig configuration class: LwDetrForObjectDetection (LwDetrConfig model)
- PPDocLayoutV2Config configuration class: PPDocLayoutV2ForObjectDetection (PPDocLayoutV2Config model)
- PPDocLayoutV3Config configuration class: PPDocLayoutV3ForObjectDetection (PPDocLayoutV3Config model)
- PPOCRV5MobileDetConfig configuration class: PPOCRV5MobileDetForObjectDetection (PPOCRV5MobileDetConfig model)
- PPOCRV5ServerDetConfig configuration class: PPOCRV5ServerDetForObjectDetection (PPOCRV5ServerDetConfig model)
- RTDetrConfig configuration class: RTDetrForObjectDetection (RTDetrConfig model)
- RTDetrV2Config configuration class: RTDetrV2ForObjectDetection (RTDetrV2Config model)
- TableTransformerConfig configuration class: TableTransformerForObjectDetection (TableTransformerConfig model)
- YolosConfig configuration class: YolosForObjectDetection (YolosConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
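A minimal from_config sketch for object detection (YolosConfig, since YOLOS needs no separate backbone; num_labels=3 is an arbitrary value for illustration):

```python
from transformers import AutoModelForObjectDetection, YolosConfig

# Build a configuration in memory; num_labels=3 is arbitrary and the
# resulting weights are randomly initialized.
config = YolosConfig(num_labels=3)
model = AutoModelForObjectDetection.from_config(config)
print(type(model).__name__)  # YolosForObjectDetection
```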
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- conditional_detr — ConditionalDetrForObjectDetection (ConditionalDetrConfig model)
- d_fine — DFineForObjectDetection (DFineConfig model)
- dab-detr — DabDetrForObjectDetection (DabDetrConfig model)
- deformable_detr — DeformableDetrForObjectDetection (DeformableDetrConfig model)
- detr — DetrForObjectDetection (DetrConfig model)
- lw_detr — LwDetrForObjectDetection (LwDetrConfig model)
- pp_doclayout_v2 — PPDocLayoutV2ForObjectDetection (PPDocLayoutV2Config model)
- pp_doclayout_v3 — PPDocLayoutV3ForObjectDetection (PPDocLayoutV3Config model)
- pp_ocrv5_mobile_det — PPOCRV5MobileDetForObjectDetection (PPOCRV5MobileDetConfig model)
- pp_ocrv5_server_det — PPOCRV5ServerDetForObjectDetection (PPOCRV5ServerDetConfig model)
- rt_detr — RTDetrForObjectDetection (RTDetrConfig model)
- rt_detr_v2 — RTDetrV2ForObjectDetection (RTDetrV2Config model)
- table-transformer — TableTransformerForObjectDetection (TableTransformerConfig model)
- yolos — YolosForObjectDetection (YolosConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForObjectDetection
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50")
>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50", output_attentions=True)
>>> model.config.output_attentions
TrueAutoModelForImageSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - DetrConfig configuration class: DetrForSegmentation (DetrConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- detr — DetrForSegmentation (DetrConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForImageSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForImageToImage
AutoModelForSemanticSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - BeitConfig configuration class: BeitForSemanticSegmentation (BeitConfig model)
  - DPTConfig configuration class: DPTForSemanticSegmentation (DPTConfig model)
  - Data2VecVisionConfig configuration class: Data2VecVisionForSemanticSegmentation (Data2VecVisionConfig model)
  - MobileNetV2Config configuration class: MobileNetV2ForSemanticSegmentation (MobileNetV2Config model)
  - MobileViTConfig configuration class: MobileViTForSemanticSegmentation (MobileViTConfig model)
  - MobileViTV2Config configuration class: MobileViTV2ForSemanticSegmentation (MobileViTV2Config model)
  - SegformerConfig configuration class: SegformerForSemanticSegmentation (SegformerConfig model)
  - UperNetConfig configuration class: UperNetForSemanticSegmentation (UperNetConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
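For instance, a from_config() call needs only a configuration object and builds the architecture with randomly initialized weights, so nothing is downloaded. A minimal sketch, assuming transformers and torch are installed (the num_labels=2 value is an arbitrary illustrative choice):

```python
from transformers import AutoModelForSemanticSegmentation, SegformerConfig

# Build a configuration by hand; no pretrained weights are fetched or loaded.
config = SegformerConfig(num_labels=2)  # num_labels chosen arbitrarily for illustration

# Dispatch happens on the configuration class, so this returns a
# randomly initialized SegformerForSemanticSegmentation.
model = AutoModelForSemanticSegmentation.from_config(config)
print(type(model).__name__)  # SegformerForSemanticSegmentation
```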
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitForSemanticSegmentation (BeitConfig model)
- data2vec-vision — Data2VecVisionForSemanticSegmentation (Data2VecVisionConfig model)
- dpt — DPTForSemanticSegmentation (DPTConfig model)
- mobilenet_v2 — MobileNetV2ForSemanticSegmentation (MobileNetV2Config model)
- mobilevit — MobileViTForSemanticSegmentation (MobileViTConfig model)
- mobilevitv2 — MobileViTV2ForSemanticSegmentation (MobileViTV2Config model)
- segformer — SegformerForSemanticSegmentation (SegformerConfig model)
- upernet — UperNetForSemanticSegmentation (UperNetConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForInstanceSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormerConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
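As a minimal sketch (assuming transformers and torch are installed), from_config() with the default MaskFormerConfig builds the architecture only; the weights are randomly initialized and nothing is downloaded:

```python
from transformers import AutoModelForInstanceSegmentation, MaskFormerConfig

# Default configuration (Swin backbone); weights are randomly initialized.
config = MaskFormerConfig()

# MaskFormerConfig is the only class in this auto mapping, so dispatch
# yields a MaskFormerForInstanceSegmentation.
model = AutoModelForInstanceSegmentation.from_config(config)
print(type(model).__name__)  # MaskFormerForInstanceSegmentation
```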
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- maskformer — MaskFormerForInstanceSegmentation (MaskFormerConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade")
>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForUniversalSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - DetrConfig configuration class: DetrForSegmentation (DetrConfig model)
  - EomtConfig configuration class: EomtForUniversalSegmentation (EomtConfig model)
  - EomtDinov3Config configuration class: EomtDinov3ForUniversalSegmentation (EomtDinov3Config model)
  - Mask2FormerConfig configuration class: Mask2FormerForUniversalSegmentation (Mask2FormerConfig model)
  - MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormerConfig model)
  - OneFormerConfig configuration class: OneFormerForUniversalSegmentation (OneFormerConfig model)
  - VideomtConfig configuration class: VideomtForUniversalSegmentation (VideomtConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
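A minimal from_config() sketch for this class, assuming transformers and torch are installed; the default Mask2FormerConfig is used purely for illustration and the resulting weights are random:

```python
from transformers import AutoModelForUniversalSegmentation, Mask2FormerConfig

# Default configuration; from_config builds the architecture only (no download).
config = Mask2FormerConfig()
model = AutoModelForUniversalSegmentation.from_config(config)
print(type(model).__name__)  # Mask2FormerForUniversalSegmentation
```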
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- detr — DetrForSegmentation (DetrConfig model)
- eomt — EomtForUniversalSegmentation (EomtConfig model)
- eomt_dinov3 — EomtDinov3ForUniversalSegmentation (EomtDinov3Config model)
- mask2former — Mask2FormerForUniversalSegmentation (Mask2FormerConfig model)
- maskformer — MaskFormerForInstanceSegmentation (MaskFormerConfig model)
- oneformer — OneFormerForUniversalSegmentation (OneFormerConfig model)
- videomt — VideomtForUniversalSegmentation (VideomtConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic")
>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForZeroShotImageClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - AlignConfig configuration class: AlignModel (AlignConfig model)
  - AltCLIPConfig configuration class: AltCLIPModel (AltCLIPConfig model)
  - Blip2Config configuration class: Blip2ForImageTextRetrieval (Blip2Config model)
  - BlipConfig configuration class: BlipModel (BlipConfig model)
  - CLIPConfig configuration class: CLIPModel (CLIPConfig model)
  - CLIPSegConfig configuration class: CLIPSegModel (CLIPSegConfig model)
  - ChineseCLIPConfig configuration class: ChineseCLIPModel (ChineseCLIPConfig model)
  - MetaClip2Config configuration class: MetaClip2Model (MetaClip2Config model)
  - Siglip2Config configuration class: Siglip2Model (Siglip2Config model)
  - SiglipConfig configuration class: SiglipModel (SiglipConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
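As a minimal sketch (assuming transformers and torch are installed), from_config() with a default CLIPConfig builds the architecture only, with randomly initialized weights and no download:

```python
from transformers import AutoModelForZeroShotImageClassification, CLIPConfig

# Default CLIP configuration; dispatch on the config class yields a CLIPModel.
config = CLIPConfig()
model = AutoModelForZeroShotImageClassification.from_config(config)
print(type(model).__name__)  # CLIPModel
```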
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
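The kwargs routing described above can be sketched in a few lines of plain Python (a toy illustration; ToyConfig and split_kwargs are hypothetical names, not transformers APIs): keys matching configuration attributes update the config, the rest are forwarded to the model's __init__.

```python
# Toy sketch of the kwargs routing described above (not the real
# transformers code): keys that match configuration attributes update
# the config; the remaining keys are forwarded to the model __init__.

class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 4

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # override config attribute
        else:
            model_kwargs[key] = value     # pass through to the model
    return config, model_kwargs

config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo="bar")
print(config.output_attentions)  # True
print(model_kwargs)              # {'foo': 'bar'}
```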
Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- align — AlignModel (AlignConfig model)
- altclip — AltCLIPModel (AltCLIPConfig model)
- blip — BlipModel (BlipConfig model)
- blip-2 — Blip2ForImageTextRetrieval (Blip2Config model)
- chinese_clip — ChineseCLIPModel (ChineseCLIPConfig model)
- clip — CLIPModel (CLIPConfig model)
- clipseg — CLIPSegModel (CLIPSegConfig model)
- metaclip_2 — MetaClip2Model (MetaClip2Config model)
- siglip — SiglipModel (SiglipConfig model)
- siglip2 — Siglip2Model (Siglip2Config model)
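The fallback selection above amounts to a lookup from the config's model_type to a model class. A toy sketch of that dispatch (stub classes, not the real transformers registry):

```python
# Toy sketch of how an auto class dispatches on the config's
# model_type (illustrative only; the real mapping lives in
# transformers.models.auto).

class CLIPModelStub:        # stand-in for CLIPModel
    pass

class SiglipModelStub:      # stand-in for SiglipModel
    pass

MODEL_MAPPING = {
    "clip": CLIPModelStub,
    "siglip": SiglipModelStub,
}

def auto_model_for(model_type):
    try:
        return MODEL_MAPPING[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model type: {model_type!r}")

print(auto_model_for("clip").__name__)  # CLIPModelStub
```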
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
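The eval/train toggle mentioned above can be mimicked by a toy module (a plain-Python stand-in for torch.nn.Module, shown only to illustrate the semantics):

```python
# Toy stand-in for the eval()/train() toggle on torch.nn.Module:
# in eval mode, modules such as dropout are deactivated; in train
# mode they are active again.

class ToyModule:
    def __init__(self):
        self.training = True   # torch modules start in training mode

    def eval(self):
        self.training = False  # from_pretrained() leaves the model here
        return self

    def train(self):
        self.training = True   # call this before fine-tuning
        return self

model = ToyModule().eval()     # mimics the state after from_pretrained()
print(model.training)          # False
print(model.train().training)  # True
```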
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")
>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForZeroShotObjectDetection
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- GroundingDinoConfig configuration class: GroundingDinoForObjectDetection (GroundingDinoConfig model)
- MMGroundingDinoConfig configuration class: MMGroundingDinoForObjectDetection (MMGroundingDinoConfig model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDetTurboConfig model)
- OwlViTConfig configuration class: OwlViTForObjectDetection (OwlViTConfig model)
- Owlv2Config configuration class: Owlv2ForObjectDetection (Owlv2Config model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
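The default choice of attention implementation described in the parameters above (SDPA for recent torch, otherwise eager) can be sketched as a simple fallback; pick_attn_implementation and the version tuple below are illustrative, not the library's actual code:

```python
# Illustrative fallback for choosing an attention implementation:
# honor an explicit request, otherwise prefer "sdpa" when the
# (simulated) torch version is >= 2.1.1, else fall back to "eager".
# Sketch only; the real check lives inside transformers.

def pick_attn_implementation(requested=None, torch_version=(2, 1, 1)):
    supported = {"eager", "sdpa", "flash_attention_2", "flash_attention_3"}
    if requested is not None:
        if requested not in supported:
            raise ValueError(f"Unknown attention implementation: {requested!r}")
        return requested
    # default: SDPA for torch >= 2.1.1, otherwise the manual eager path
    return "sdpa" if torch_version >= (2, 1, 1) else "eager"

print(pick_attn_implementation())                         # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 0)))  # eager
```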
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- grounding-dino — GroundingDinoForObjectDetection (GroundingDinoConfig model)
- mm-grounding-dino — MMGroundingDinoForObjectDetection (MMGroundingDinoConfig model)
- omdet-turbo — OmDetTurboForObjectDetection (OmDetTurboConfig model)
- owlv2 — Owlv2ForObjectDetection (Owlv2Config model)
- owlvit — OwlViTForObjectDetection (OwlViTConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")
>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

Audio
The following auto classes are available for the audio tasks listed below.
AutoModelForAudioClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTForAudioClassification (ASTConfig model)
- Data2VecAudioConfig configuration class: Data2VecAudioForSequenceClassification (Data2VecAudioConfig model)
- HubertConfig configuration class: HubertForSequenceClassification (HubertConfig model)
- SEWConfig configuration class: SEWForSequenceClassification (SEWConfig model)
- SEWDConfig configuration class: SEWDForSequenceClassification (SEWDConfig model)
- UniSpeechConfig configuration class: UniSpeechForSequenceClassification (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForSequenceClassification (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForSequenceClassification (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForSequenceClassification (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForSequenceClassification (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForSequenceClassification (WavLMConfig model)
- WhisperConfig configuration class: WhisperForAudioClassification (WhisperConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an audio classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- audio-spectrogram-transformer — ASTForAudioClassification (ASTConfig model)
- data2vec-audio — Data2VecAudioForSequenceClassification (Data2VecAudioConfig model)
- hubert — HubertForSequenceClassification (HubertConfig model)
- sew — SEWForSequenceClassification (SEWConfig model)
- sew-d — SEWDForSequenceClassification (SEWDConfig model)
- unispeech — UniSpeechForSequenceClassification (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForSequenceClassification (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForSequenceClassification (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForSequenceClassification (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForSequenceClassification (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForSequenceClassification (WavLMConfig model)
- whisper — WhisperForAudioClassification (WhisperConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base")
>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForAudioFrameClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForAudioFrameClassification (Data2VecAudioConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForAudioFrameClassification (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForAudioFrameClassification (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForAudioFrameClassification (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForAudioFrameClassification (WavLMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForAudioFrameClassification (Data2VecAudioConfig model)
- unispeech-sat — UniSpeechSatForAudioFrameClassification (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForAudioFrameClassification (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForAudioFrameClassification (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForAudioFrameClassification (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base")
>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForCTC
This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
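As background for the CTC heads listed below: decoding a CTC model's per-frame predictions typically takes the argmax label per frame, collapses consecutive repeats, and removes the blank token. A minimal greedy decoder sketch (illustrative only, not the transformers processor):

```python
# Minimal greedy CTC decoding sketch: collapse consecutive repeated
# labels, then drop the blank token. This is background for the CTC
# heads listed in this section, not transformers code.

def ctc_greedy_decode(frame_labels, blank=0):
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# frames: h h _ e l l _ l o  (0 is the blank; a=1 ... z=26)
print(ctc_greedy_decode([8, 8, 0, 5, 12, 12, 0, 12, 15]))  # [8, 5, 12, 12, 15]
```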
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForCTC (Data2VecAudioConfig model)
- HubertConfig configuration class: HubertForCTC (HubertConfig model)
- LasrCTCConfig configuration class: LasrForCTC (LasrCTCConfig model)
- ParakeetCTCConfig configuration class: ParakeetForCTC (ParakeetCTCConfig model)
- SEWConfig configuration class: SEWForCTC (SEWConfig model)
- SEWDConfig configuration class: SEWDForCTC (SEWDConfig model)
- UniSpeechConfig configuration class: UniSpeechForCTC (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForCTC (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForCTC (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForCTC (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForCTC (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForCTC (WavLMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForCTC (Data2VecAudioConfig model)
- hubert — HubertForCTC (HubertConfig model)
- lasr_ctc — LasrForCTC (LasrCTCConfig model)
- parakeet_ctc — ParakeetForCTC (ParakeetCTCConfig model)
- sew — SEWForCTC (SEWConfig model)
- sew-d — SEWDForCTC (SEWDConfig model)
- unispeech — UniSpeechForCTC (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForCTC (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForCTC (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForCTC (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForCTC (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForCTC (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
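The kwargs routing described above can be sketched offline. The tiny configuration values and the temporary directory below are purely illustrative, not a recommended setup; the point is that without an explicit `config` argument, `output_attentions=True` overrides the saved configuration:

```python
import tempfile

from transformers import AutoModelForCTC, Wav2Vec2Config, Wav2Vec2ForCTC

# A deliberately tiny, hypothetical config so the sketch runs with random
# weights and no download.
config = Wav2Vec2Config(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)
model = Wav2Vec2ForCTC(config)

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    # No `config` argument is passed, so the kwarg is routed to the
    # configuration and overrides the saved value.
    reloaded = AutoModelForCTC.from_pretrained(tmp, output_attentions=True)

print(type(reloaded).__name__)            # Wav2Vec2ForCTC
print(reloaded.config.output_attentions)  # True
```

The reload also demonstrates the model_type dispatch: the saved config.json carries `"model_type": "wav2vec2"`, so AutoModelForCTC resolves to Wav2Vec2ForCTC.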
Examples:
>>> from transformers import AutoConfig, AutoModelForCTC
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")
>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForSpeechSeq2Seq
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - CohereAsrConfig configuration class: CohereAsrForConditionalGeneration (CohereAsrConfig model)
  - DiaConfig configuration class: DiaForConditionalGeneration (DiaConfig model)
  - GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
  - KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
  - MoonshineConfig configuration class: MoonshineForConditionalGeneration (MoonshineConfig model)
  - MoonshineStreamingConfig configuration class: MoonshineStreamingForConditionalGeneration (MoonshineStreamingConfig model)
  - Pop2PianoConfig configuration class: Pop2PianoForConditionalGeneration (Pop2PianoConfig model)
  - SeamlessM4TConfig configuration class: SeamlessM4TForSpeechToText (SeamlessM4TConfig model)
  - SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2Config model)
  - Speech2TextConfig configuration class: Speech2TextForConditionalGeneration (Speech2TextConfig model)
  - SpeechEncoderDecoderConfig configuration class: SpeechEncoderDecoderModel (SpeechEncoderDecoderConfig model)
  - SpeechT5Config configuration class: SpeechT5ForSpeechToText (SpeechT5Config model)
  - VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
  - VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
  - VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
  - WhisperConfig configuration class: WhisperForConditionalGeneration (WhisperConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch >= 2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- cohere_asr — CohereAsrForConditionalGeneration (CohereAsrConfig model)
- dia — DiaForConditionalGeneration (DiaConfig model)
- granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- kyutai_speech_to_text — KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
- moonshine — MoonshineForConditionalGeneration (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingForConditionalGeneration (MoonshineStreamingConfig model)
- pop2piano — Pop2PianoForConditionalGeneration (Pop2PianoConfig model)
- seamless_m4t — SeamlessM4TForSpeechToText (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2Config model)
- speech-encoder-decoder — SpeechEncoderDecoderModel (SpeechEncoderDecoderConfig model)
- speech_to_text — Speech2TextForConditionalGeneration (Speech2TextConfig model)
- speecht5 — SpeechT5ForSpeechToText (SpeechT5Config model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- whisper — WhisperForConditionalGeneration (WhisperConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-small")
>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-small", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForAudioXVector
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio retrieval via x-vector head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - Data2VecAudioConfig configuration class: Data2VecAudioForXVector (Data2VecAudioConfig model)
  - UniSpeechSatConfig configuration class: UniSpeechSatForXVector (UniSpeechSatConfig model)
  - Wav2Vec2BertConfig configuration class: Wav2Vec2BertForXVector (Wav2Vec2BertConfig model)
  - Wav2Vec2Config configuration class: Wav2Vec2ForXVector (Wav2Vec2Config model)
  - Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForXVector (Wav2Vec2ConformerConfig model)
  - WavLMConfig configuration class: WavLMForXVector (WavLMConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch >= 2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
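An x-vector head pools the encoder's hidden states into a single fixed-size speaker embedding per utterance. A minimal sketch, assuming a tiny hypothetical Wav2Vec2Config so the randomly initialized model builds quickly (the embedding values are meaningless; the dispatch and shapes are what matter):

```python
import torch

from transformers import AutoModelForAudioXVector, Wav2Vec2Config

# Hypothetical tiny config; from_config builds random weights, no download.
config = Wav2Vec2Config(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)
model = AutoModelForAudioXVector.from_config(config)
model.eval()

waveform = torch.randn(1, 16000)  # one second of fake 16 kHz mono audio
with torch.no_grad():
    outputs = model(input_values=waveform)

print(type(model).__name__)  # Wav2Vec2ForXVector
# The utterance-level embedding has config.xvector_output_dim features.
print(outputs.embeddings.shape[-1] == config.xvector_output_dim)  # True
```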
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with an audio retrieval via x-vector head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForXVector (Data2VecAudioConfig model)
- unispeech-sat — UniSpeechSatForXVector (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForXVector (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForXVector (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForXVector (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForXVector (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioXVector
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("anton-l/wav2vec2-base-superb-sv")
>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("anton-l/wav2vec2-base-superb-sv", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForTextToSpectrogram
AutoModelForTextToWaveform
Multimodal
The following auto classes are available for the multimodal tasks below.
AutoModelForTableQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - TapasConfig configuration class: TapasForQuestionAnswering (TapasConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch >= 2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
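A minimal sketch of the note above, using a hypothetical tiny TapasConfig: the TapasConfig class alone is enough for from_config to select TapasForQuestionAnswering, and the result carries random (not pretrained) weights.

```python
from transformers import AutoModelForTableQuestionAnswering, TapasConfig

# Hypothetical tiny config values; nothing is downloaded.
config = TapasConfig(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)
model = AutoModelForTableQuestionAnswering.from_config(config)
print(type(model).__name__)  # TapasForQuestionAnswering
```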
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- tapas — TapasForQuestionAnswering (TapasConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForDocumentQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) — The model class to instantiate is selected based on the configuration class:
  - LayoutLMConfig configuration class: LayoutLMForQuestionAnswering (LayoutLMConfig model)
  - LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
  - LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch >= 2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a document question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- layoutlm — LayoutLMForQuestionAnswering (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForVisualQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (Blip2Config model)
- BlipConfig configuration class: BlipForQuestionAnswering (BlipConfig model)
- ViltConfig configuration class: ViltForQuestionAnswering (ViltConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- blip — BlipForQuestionAnswering (BlipConfig model)
- blip-2 — Blip2ForConditionalGeneration (Blip2Config model)
- vilt — ViltForQuestionAnswering (ViltConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForImageTextToText
This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AriaConfig configuration class: AriaForConditionalGeneration (AriaConfig model)
- AyaVisionConfig configuration class: AyaVisionForConditionalGeneration (AyaVisionConfig model)
- Blip2Config configuration class: Blip2ForConditionalGeneration (Blip2Config model)
- BlipConfig configuration class: BlipForConditionalGeneration (BlipConfig model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (ChameleonConfig model)
- Cohere2VisionConfig configuration class: Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- DeepseekVLConfig configuration class: DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- DeepseekVLHybridConfig configuration class: DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- Emu3Config configuration class: Emu3ForConditionalGeneration (Emu3Config model)
- Ernie4_5_VLMoeConfig configuration class: Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- EvollaConfig configuration class: EvollaForProteinText2Text (EvollaConfig model)
- FastVlmConfig configuration class: FastVlmForConditionalGeneration (FastVlmConfig model)
- Florence2Config configuration class: Florence2ForConditionalGeneration (Florence2Config model)
- FuyuConfig configuration class: FuyuForCausalLM (FuyuConfig model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nConfig model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- GitConfig configuration class: GitForCausalLM (GitConfig model)
- Glm46VConfig configuration class: Glm46VForConditionalGeneration (Glm46VConfig model)
- Glm4vConfig configuration class: Glm4vForConditionalGeneration (Glm4vConfig model)
- Glm4vMoeConfig configuration class: Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- GlmOcrConfig configuration class: GlmOcrForConditionalGeneration (GlmOcrConfig model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GotOcr2Config model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2Config model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3Config model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IdeficsConfig model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBlipConfig model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- InternVLConfig configuration class: InternVLForConditionalGeneration (InternVLConfig model)
- JanusConfig configuration class: JanusForConditionalGeneration (JanusConfig model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (Kosmos2Config model)
- Kosmos2_5Config configuration class: Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- Lfm2VlConfig configuration class: Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- LightOnOcrConfig configuration class: LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4Config model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3Config model)
- Mistral4Config configuration class: Mistral4ForCausalLM (Mistral4Config model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (MllamaConfig model)
- Ovis2Config configuration class: Ovis2ForConditionalGeneration (Ovis2Config model)
- PI0Config configuration class: PI0ForConditionalGeneration (PI0Config model)
- PPChart2TableConfig configuration class: GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- PaddleOCRVLConfig configuration class: PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- PerceptionLMConfig configuration class: PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2StructConfig model)
- QianfanOCRConfig configuration class: QianfanOCRForConditionalGeneration (QianfanOCRConfig model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- Qwen2_5OmniThinkerConfig configuration class: Qwen2_5OmniThinkerForConditionalGeneration (Qwen2_5OmniThinkerConfig model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- Qwen3OmniMoeThinkerConfig configuration class: Qwen3OmniMoeThinkerForConditionalGeneration (Qwen3OmniMoeThinkerConfig model)
- Qwen3VLConfig configuration class: Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- Qwen3VLMoeConfig configuration class: Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- ShieldGemma2Config configuration class: Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- SmolVLMConfig configuration class: SmolVLMForConditionalGeneration (SmolVLMConfig model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- UdopConfig configuration class: UdopForConditionalGeneration (UdopConfig model)
- VideoLlama3Config configuration class: VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlavaConfig model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- aria — AriaForConditionalGeneration (AriaConfig model)
- aya_vision — AyaVisionForConditionalGeneration (AyaVisionConfig model)
- blip — BlipForConditionalGeneration (BlipConfig model)
- blip-2 — Blip2ForConditionalGeneration (Blip2Config model)
- chameleon — ChameleonForConditionalGeneration (ChameleonConfig model)
- cohere2_vision — Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- deepseek_vl — DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- emu3 — Emu3ForConditionalGeneration (Emu3Config model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- evolla — EvollaForProteinText2Text (EvollaConfig model)
- fast_vlm — FastVlmForConditionalGeneration (FastVlmConfig model)
- florence2 — Florence2ForConditionalGeneration (Florence2Config model)
- fuyu — FuyuForCausalLM (FuyuConfig model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma3n — Gemma3nForConditionalGeneration (Gemma3nConfig model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- git — GitForCausalLM (GitConfig model)
- glm46v — Glm46VForConditionalGeneration (Glm46VConfig model)
- glm4v — Glm4vForConditionalGeneration (Glm4vConfig model)
- glm4v_moe — Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- glm_ocr — GlmOcrForConditionalGeneration (GlmOcrConfig model)
- got_ocr2 — GotOcr2ForConditionalGeneration (GotOcr2Config model)
- idefics — IdeficsForVisionText2Text (IdeficsConfig model)
- idefics2 — Idefics2ForConditionalGeneration (Idefics2Config model)
- idefics3 — Idefics3ForConditionalGeneration (Idefics3Config model)
- instructblip — InstructBlipForConditionalGeneration (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- internvl — InternVLForConditionalGeneration (InternVLConfig model)
- janus — JanusForConditionalGeneration (JanusConfig model)
- kosmos-2 — Kosmos2ForConditionalGeneration (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- lfm2_vl — Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- lighton_ocr — LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- llama4 — Llama4ForConditionalGeneration (Llama4Config model)
- llava — LlavaForConditionalGeneration (LlavaConfig model)
- llava_next — LlavaNextForConditionalGeneration (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- mistral3 — Mistral3ForConditionalGeneration (Mistral3Config model)
- mistral4 — Mistral4ForCausalLM (Mistral4Config model)
- mllama — MllamaForConditionalGeneration (MllamaConfig model)
- ovis2 — Ovis2ForConditionalGeneration (Ovis2Config model)
- paddleocr_vl — PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- paligemma — PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- perception_lm — PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- pi0 — PI0ForConditionalGeneration (PI0Config model)
- pix2struct — Pix2StructForConditionalGeneration (Pix2StructConfig model)
- pp_chart2table — GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- qianfan_ocr — QianfanOCRForConditionalGeneration (QianfanOCRConfig model)
- qwen2_5_omni_thinker — Qwen2_5OmniThinkerForConditionalGeneration (Qwen2_5OmniThinkerConfig model)
- qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- qwen3_5 — Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- qwen3_omni_moe_thinker — Qwen3OmniMoeThinkerForConditionalGeneration (Qwen3OmniMoeThinkerConfig model)
- qwen3_vl — Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- shieldgemma2 — Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- smolvlm — SmolVLMForConditionalGeneration (SmolVLMConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- udop — UdopForConditionalGeneration (UdopConfig model)
- video_llama_3 — VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- video_llava — VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- vipllava — VipLlavaForConditionalGeneration (VipLlavaConfig model)
- vision-encoder-decoder — VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForImageTextToText
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf")
>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf", output_attentions=True)
>>> model.config.output_attentions
True
Time Series
AutoModelForTimeSeriesPrediction
This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- TimesFm2_5Config configuration class: TimesFm2_5ModelForPrediction (TimesFm2_5Config model)
- TimesFmConfig configuration class: TimesFmModelForPrediction (TimesFmConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info (
bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only (
bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model). - revision (
str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded: - If a configuration is provided with
config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided,
kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a time-series prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- timesfm — TimesFmModelForPrediction (TimesFmConfig model)
- timesfm2_5 — TimesFm2_5ModelForPrediction (TimesFm2_5Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> # Update configuration during loading
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch", output_attentions=True)
>>> model.config.output_attentions
True