Convert to HF format

#55
by cyrilvallez (HF Staff) - opened
No description provided.
cyrilvallez changed pull request title from Upload folder using huggingface_hub to Convert to HF format

This PR converts the weights and configs to HF format following the PR I just merged directly in Transformers https://github.com/huggingface/transformers/pull/36939

We also added multimodality support in chat template, which comes in a separate PR (https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/56)
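For reference, a minimal sketch of using the converted checkpoint with the native classes. It assumes transformers>=4.51.0 (with the Phi4Multimodal support from the Transformers PR above) and uses the Hub's refs/pr/&lt;n&gt; convention to point at this PR before it is merged; the image URL is only a placeholder.

from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-4-multimodal-instruct"
revision = "refs/pr/55"  # drop once this PR is merged

# With the native classes, trust_remote_code is no longer needed
processor = AutoProcessor.from_pretrained(model_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    model_id, revision=revision, torch_dtype="auto", device_map="auto"
)

# Multimodal chat template: each content entry carries a type, so images
# can be passed alongside text (the URL is a placeholder)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/image.png"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])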

@cyrilvallez @RaushanTurganbay Thank you for your contributions! In the PR, I see the checkpoints have been modified. However, since the HF checkpoint is used by vLLM (supported in vLLM 0.7.3+), I wonder whether your changes to the checkpoint / configs are still compatible with vLLM? Thanks in advance!

cc: @nguyenbh

@cyrilvallez When I tried the sample_inference_phi4mm.py in your PR with transformers==4.51.0, it shows the error below. Could you help check the issue? Thanks!

Traceback (most recent call last):
  File "/home/weijianxu/code/phi-o/Phi-4-multimodal-instruct-for-pr/sample_inference_phi4mm.py", line 13, in <module>
    processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
  File "/home/weijianxu/anaconda3/envs/phi4mm_hf/lib/python3.13/site-packages/transformers/models/auto/processing_auto.py", line 347, in from_pretrained
    return processor_class.from_pretrained(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/weijianxu/anaconda3/envs/phi4mm_hf/lib/python3.13/site-packages/transformers/processing_utils.py", line 1082, in from_pretrained
    return cls.from_args_and_dict(args, processor_dict, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/weijianxu/anaconda3/envs/phi4mm_hf/lib/python3.13/site-packages/transformers/processing_utils.py", line 876, in from_args_and_dict
    processor = cls(*args, **processor_dict)
  File "/home/weijianxu/anaconda3/envs/phi4mm_hf/lib/python3.13/site-packages/transformers/models/phi4_multimodal/processing_phi4_multimodal.py", line 74, in __init__
    super().__init__(image_processor, audio_processor, tokenizer, **kwargs)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/weijianxu/anaconda3/envs/phi4mm_hf/lib/python3.13/site-packages/transformers/processing_utils.py", line 464, in __init__
    raise TypeError(f"Unexpected keyword argument {key}.")
TypeError: Unexpected keyword argument fake_audio_token_pattern.

@xwjabc for vLLM some weights might need to be adapted; I'm not sure if Cyril has specific keys that need to be changed. It won't be hard, since vLLM internally has a mapping from HF weights to vLLM weights, so we'll just need to change that mapping.
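A quick way to exercise that mapping (a smoke test, not an authoritative compatibility check) is to try building the engine against the PR revision; a mismatched key would surface as a weight-loading error. This assumes vLLM 0.7.3+ as mentioned above and the Hub's refs/pr/&lt;n&gt; convention:

from vllm import LLM

# Building the engine runs vLLM's HF-to-vLLM weight mapping, so an
# incompatible key would fail here at load time
llm = LLM(
    model="microsoft/Phi-4-multimodal-instruct",
    revision="refs/pr/55",
    trust_remote_code=True,
)
print(llm.generate("Hello")[0].outputs[0].text)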

For the error, did you try merging both PRs (the current one and the linked one)? The processor in transformers==4.51.0 was updated to accommodate the chat template modifications as well. Unfortunately I don't have access to commit to existing PRs, so I made a separate one.

@RaushanTurganbay Thank you for your reply! I will try merging both PRs.

In addition, I see your PR 56 removes processor_config.json and adds a chat template, so I believe your PR depends on the processor in PR 55 & transformers==4.51.0. Is my understanding correct? Thanks!

You should merge this PR first, then @RaushanTurganbay 's PR! That way the processor will be up-to-date!

Yeah, I rebased on #55 before making changes, and the config should indeed be deleted to work with transformers==4.51.0.
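Since #56 was rebased on #55, pulling the refs/pr/56 revision should be enough to test both changes together (a sketch relying on the Hub's refs/pr/&lt;n&gt; convention):

from transformers import AutoProcessor

# refs/pr/56 includes the #55 changes after the rebase, so the updated
# processor (without the stale processor_config.json keys such as
# fake_audio_token_pattern) loads cleanly
processor = AutoProcessor.from_pretrained(
    "microsoft/Phi-4-multimodal-instruct", revision="refs/pr/56"
)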

cyrilvallez changed pull request status to closed