In addition to classification labels, Perceiver IO can produce (for example) language, optical flow, and multimodal videos with audio.
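
The sentence above does not spell out the mechanism, so the following is only a minimal sketch of one way such flexible outputs can arise: a fixed latent array is decoded by cross-attention against an array of output queries, so the number of queries sets the output size. All names here (`decode`, `random_matrix`, the chosen sizes) are hypothetical, and the learned projections, multiple heads, and task-specific query construction of a real model are deliberately omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode(latents, queries):
    """One cross-attention step: every output query attends over all latents.

    Assumption: queries and latents share one dimension and no learned
    projections are applied. This is a simplification for illustration.
    """
    dim = len(latents[0])
    keys_t = [list(col) for col in zip(*latents)]      # (dim, num_latents)
    scores = matmul(queries, keys_t)                   # (num_queries, num_latents)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, latents)                    # (num_queries, dim)


def random_matrix(rows, cols, rng):
    """Stand-in for learned or position-derived values."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
latents = random_matrix(8, 16, rng)         # same latent array for both tasks

class_query = random_matrix(1, 16, rng)     # one query -> one classification vector
flow_queries = random_matrix(64, 16, rng)   # one query per pixel of an 8x8 grid

class_out = decode(latents, class_query)
flow_out = decode(latents, flow_queries)

print(len(class_out), len(flow_out))
```

In this sketch the same latents yield a single vector for classification and a per-pixel array for a dense task such as optical flow; only the query array changes.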