Multi-modal processors
Any multi-modal model requires an object to encode or decode data that combines several modalities (among text,
vision and audio).
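
As a rough illustration of the idea, the sketch below groups a text component and a vision component behind a single object. Every name in it (`ToyTokenizer`, `ToyImageProcessor`, `ToyProcessor`, and the `input_ids` / `pixel_values` keys) is a hypothetical stand-in chosen for this example, not the library's actual API, and the tokenization and normalization are deliberately simplistic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTokenizer:
    """Hypothetical text component: maps whitespace-split words to integer ids."""
    vocab: dict = field(default_factory=dict)

    def encode(self, text):
        # Give each unseen word the next free id; reuse ids for known words.
        return [self.vocab.setdefault(w, len(self.vocab)) for w in text.lower().split()]

    def decode(self, ids):
        inverse = {i: w for w, i in self.vocab.items()}
        return " ".join(inverse[i] for i in ids)


@dataclass
class ToyImageProcessor:
    """Hypothetical vision component: scales 8-bit pixel values to [0, 1]."""

    def preprocess(self, pixels):
        return [[value / 255.0 for value in row] for row in pixels]


@dataclass
class ToyProcessor:
    """Hypothetical processor: one entry point for both modalities."""
    tokenizer: ToyTokenizer
    image_processor: ToyImageProcessor

    def __call__(self, text=None, images=None):
        if text is None and images is None:
            raise ValueError("Provide at least one of `text` or `images`.")
        encoded = {}
        if text is not None:
            encoded["input_ids"] = self.tokenizer.encode(text)
        if images is not None:
            encoded["pixel_values"] = self.image_processor.preprocess(images)
        return encoded

    def decode(self, ids):
        # Decoding only applies to the text modality in this sketch.
        return self.tokenizer.decode(ids)


processor = ToyProcessor(ToyTokenizer(), ToyImageProcessor())
batch = processor(text="a photo of a cat", images=[[0, 255], [51, 204]])
print(sorted(batch))                         # keys produced for the two modalities
print(processor.decode(batch["input_ids"]))  # round-trips the text input
```

The point of the design is that the model receives one dictionary covering every modality, while each modality keeps its own encoding logic inside a dedicated component.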