The authors pass these regions through a pre-trained CNN such as ResNet and use the
resulting features as visual embeddings.