
Can we get just the pretrained version?

#1
by nicolollo - opened

Is it possible?

Owner

It’s already a pre-trained VL model. Due to limited resources, we trained it on only 20B tokens, but it still performs well when fine-tuned on robotics tasks.

hongyuw changed discussion status to closed
