VLM-RLAIF

updated Aug 6, 2024

Repository for the ACL 2024 paper "Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback"

  • vlm-rlaif-demo

    Paused

  • Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback

    Paper • 2402.03746 • Published Feb 6, 2024

  • SNUMPR/vlm_rlaif_video_llava_7b

    Text Generation • Updated Jun 28, 2024 • 1

  • SNUMPR/vlm_sft_video_llava_7b

    Updated Jul 18, 2024 • 1

  • SNUMPR/vlm_sft_video_llava_13b

    Updated Jul 18, 2024 • 2

  • SNUMPR/vlm_rlaif_datasets

    Preview • Updated Jul 20, 2024 • 4 • 1

  • SNUMPR/vlm_rlaif_train_anet_frames

    Updated Jul 19, 2024 • 1

  • SNUMPR/vlm_rlaif_eval_datasets

    Updated Jul 19, 2024 • 2

  • SNUMPR/vlm_rm_13b_lora

    Updated Aug 6, 2024

  • SNUMPR/vlm_policy_init_7b_lora

    Updated Aug 6, 2024