Supported Models
Other models with similar architectures may also work successfully even if not explicitly validated. Consider testing any unlisted models to verify compatibility with your specific use case.
Large Language Models (LLMs)
LoRA adapters are supported.
The pipeline can work with other similar topologies produced by optimum-intel
with the same model signature.
The model is required to have the following inputs after the conversion:
input_ids contains the tokens.
attention_mask is filled with 1.
beam_idx selects beams.
position_ids (optional) encodes the position of the token currently being generated in the sequence.
The model must also have a single logits output.
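The signature requirement above can be checked before a converted model is handed to the pipeline. The following is a minimal sketch, not part of the library: `check_signature` and `names_from_ir` are hypothetical helper names, and `names_from_ir` assumes the `openvino` Python package is installed.

```python
# Names taken from the requirement above; position_ids is optional, so it is not checked.
REQUIRED_INPUTS = ("input_ids", "attention_mask", "beam_idx")
EXPECTED_OUTPUTS = ("logits",)


def check_signature(input_names, output_names):
    """Return a list of problems; an empty list means the names match."""
    problems = []
    present = set(input_names)
    for name in REQUIRED_INPUTS:
        if name not in present:
            problems.append(f"missing required input: {name}")
    if list(output_names) != list(EXPECTED_OUTPUTS):
        problems.append(
            f"expected a single 'logits' output, got: {sorted(output_names)}"
        )
    return problems


def names_from_ir(xml_path):
    """Read port names from a converted model (assumes openvino is installed)."""
    import openvino as ov  # imported lazily so check_signature works without it

    model = ov.Core().read_model(xml_path)
    inputs = [port.get_any_name() for port in model.inputs]
    outputs = [port.get_any_name() for port in model.outputs]
    return inputs, outputs
```

For a model exported to a hypothetical directory `model_dir`, `check_signature(*names_from_ir("model_dir/openvino_model.xml"))` would return an empty list when the required names are present.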
Models should belong to the same family and have the same tokenizers.
Image Generation Models
Visual Language Models (VLMs)
Architecture | Models | LoRA Support | Example HuggingFace Models |
---|---|---|---|
InternVLChat | InternVLChatModel (Notes) | ❌ | |
LLaVA | LLaVA-v1.5 | ❌ | |
LLaVA-NeXT | LLaVA-v1.6 | ❌ | |
MiniCPMV | MiniCPM-V-2_6 | ❌ | |
Phi3VForCausalLM | phi3_v (Notes) | ❌ | |
Phi4MMForCausalLM | phi4mm (Notes) | ❌ | |
Qwen2-VL | Qwen2-VL | ❌ | |
Qwen2.5-VL | Qwen2.5-VL | ❌ | |
Gemma3ForConditionalGeneration | gemma3 | ❌ | |
InternVL2
To convert InternVL2 models, timm and einops are required:
pip install timm einops
phi3_v
The model configs are inconsistent, so the default eos_token_id must be overridden with the one from the tokenizer:
generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
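In context, the override above might look like the sketch below. It assumes `pipe` is an `openvino_genai.VLMPipeline` and that the updated config is written back with `set_generation_config`; the helper names `apply_tokenizer_eos` and `load_phi3v` are hypothetical and used only for illustration.

```python
def apply_tokenizer_eos(pipe):
    """Copy the tokenizer's EOS token id into the pipeline's generation config."""
    generation_config = pipe.get_generation_config()
    generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
    # Write the modified config back so later generate() calls use it.
    pipe.set_generation_config(generation_config)
    return generation_config


def load_phi3v(model_dir, device="CPU"):
    """Build a VLM pipeline and apply the override (assumes openvino_genai is installed)."""
    import openvino_genai as ov_genai  # lazy import: only needed for a real model

    pipe = ov_genai.VLMPipeline(model_dir, device)
    apply_tokenizer_eos(pipe)
    return pipe
```

Keeping the override in its own function means it can be applied to an already constructed pipeline as well.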
phi4mm
Apply the changes from https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/78/files to fix the model export for transformers>=4.50.
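Whether the installed transformers release falls in the affected range can be checked with a small helper. This is a sketch only: `needs_phi4mm_export_fix` is a hypothetical name, and it compares just the major and minor components of the version string.

```python
import re


def needs_phi4mm_export_fix(transformers_version):
    """Return True when the version is 4.50 or newer, the range named in the note above."""
    match = re.match(r"(\d+)\.(\d+)", transformers_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {transformers_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (4, 50)
```

The installed version string can be obtained with `importlib.metadata.version("transformers")` from the standard library.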
Speech Recognition Models (Whisper-based)
Architecture | Models | LoRA Support | Example HuggingFace Models |
---|---|---|---|
WhisperForConditionalGeneration | Whisper | ❌ | |
WhisperForConditionalGeneration | Distil-Whisper | ❌ | |
Text Embeddings Models
Architecture | LoRA Support | Example HuggingFace Models |
---|---|---|
BertModel | ❌ | |
MPNetForMaskedLM | ❌ | |
RobertaForMaskedLM | ❌ | |
XLMRobertaModel | ❌ | |
Speech Generation Models
Architecture | Models | LoRA Support | Example HuggingFace Models |
---|---|---|---|
SpeechT5ForTextToSpeech | SpeechT5 TTS | ❌ | |
Text Rerank Models
Architecture | Example HuggingFace Models |
---|---|
BertForSequenceClassification | |
XLMRobertaForSequenceClassification | |
GemmaForCausalLM | |
ModernBertForSequenceClassification | |
ModernBertForMaskedLM | |
LoRA adapters are not supported.
Some models require submitting an access request on their Hugging Face page before they can be downloaded.
If https://huggingface.co/ is down, the conversion step won't be able to download the models.