`Image_transform` does not seem to support extreme width/height ratios

#21
by alexyywwdd - opened

See https://huggingface.co/internlm/internlm-xcomposer2d5-7b/blob/4aa81f2bbf20a9ddd4137dfe847c142adf07b652/ixc_utils.py#L36-L39:

def Image_transform(img, hd_num=25):
    # ...
    scale = 1
    # bug?
    while scale*np.ceil(scale/ratio) <= hd_num:
        scale += 1
    scale -= 1

Try an image with an extreme width/height ratio, such as (375, 15000), and see what results you get.
Adding an `if` check before the loop seems to help:

def Image_transform(img, hd_num=25):
    # ...
    scale = 1
    # bug fix?
    if scale*np.ceil(scale/ratio) <= hd_num:
        while scale*np.ceil(scale/ratio) <= hd_num:
            scale += 1
        scale -= 1
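
To make the difference concrete, here is a minimal standalone sketch of the two loops. It is not the code from `ixc_utils.py`: the helper names `scale_original` and `scale_guarded` are hypothetical, `math.ceil` stands in for `np.ceil`, and `ratio` is assumed to be `width / height` taken directly from the image size, which the elided `# ...` part above does not show.

```python
import math


def scale_original(ratio, hd_num=25):
    """Loop as posted: always decrements once after the while loop."""
    scale = 1
    while scale * math.ceil(scale / ratio) <= hd_num:
        scale += 1
    scale -= 1
    return scale


def scale_guarded(ratio, hd_num=25):
    """Proposed variant: only search (and decrement) if scale == 1 already fits."""
    scale = 1
    if scale * math.ceil(scale / ratio) <= hd_num:
        while scale * math.ceil(scale / ratio) <= hd_num:
            scale += 1
        scale -= 1
    return scale


# Assumption: ratio is width / height for the (375, 15000) example.
extreme = 375 / 15000

print(scale_original(extreme))      # 0: the while loop never runs, then scale -= 1
print(scale_guarded(extreme))       # 1: the guard skips the decrement
print(scale_original(1.0))          # 5: a square image is unaffected
print(scale_guarded(1.0))           # 5
print(scale_original(15000 / 375))  # 25: same image with the sides swapped
```

Under these assumptions, the zero scale only appears when `ratio` is small enough that `np.ceil(1/ratio)` already exceeds `hd_num`. With the sides swapped (`ratio` of 40) both versions return 25, so whether the problem shows up depends on how the elided code computes `ratio`.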

The original version might treat images with an extreme width/height ratio as "(0, 0)" images, which are then padded to (560, 560). I'm worried that this could induce hallucination, since training might force the model to predict targets given a dummy image.

alexyywwdd changed discussion status to closed
