I am using DINOv2 with FAISS to do similarity search across a database of images. See these DINOv2-related issues on this topic.
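For context, here is a minimal sketch of that setup, assuming the `torch.hub` DINOv2 backbone and a flat inner-product FAISS index; the `embed` helper and the `database_batch`/`query_batch` names are illustrative, not from the repository:

```python
import faiss
import numpy as np
import torch

# Load a DINOv2 backbone from torch.hub (ViT-S/14 here; larger variants exist).
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

@torch.no_grad()
def embed(batch: torch.Tensor) -> np.ndarray:
    """Return L2-normalized CLS embeddings for a preprocessed image batch."""
    feats = model(batch)  # (N, 384) for ViT-S/14
    feats = torch.nn.functional.normalize(feats, dim=-1)
    return feats.cpu().numpy()

# A flat inner-product index; with normalized embeddings this is
# equivalent to cosine-similarity search.
index = faiss.IndexFlatIP(384)
# index.add(embed(database_batch))                 # index the database images
# scores, ids = index.search(embed(query_batch), k=5)  # top-5 nearest neighbors
```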
In this thread @patricklabatut mentions:

> @Suhail To generate features from the pre-trained backbones, just use a transform similar to the standard one used for evaluating on image classification, with the typical ImageNet normalization mean and std (see what's used in the code). As noted in the model card, the model can also use image sizes that are a multiple of the patch size.
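For reference, the standard evaluation transform being described looks roughly like this in torchvision (a sketch based on the quote; see the repository for the exact implementation):

```python
from torchvision import transforms

# Standard ImageNet-style evaluation transform: resize the shorter side
# to 256, take a 224x224 center crop, then normalize with the usual
# ImageNet statistics.
preprocess = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
```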
The transforms linked in the corresponding code apply a resize to 256 followed by a center crop to 224. Does this mean that, for a 256x256 input, 32 pixels are discarded in each dimension (16 from each border), since the crop is smaller than the resized image?

If the images in my dataset have distinctive imagery at their borders, does this mean these default transforms will crop that information out?
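In case it helps, the workaround I am considering (my own assumption, not an official recommendation) is to skip the center crop and resize both dimensions directly to a multiple of the patch size (14 for DINOv2), so no border pixels are thrown away, at the cost of some aspect-ratio distortion for non-square images:

```python
from torchvision import transforms

# Resize both dimensions directly to 224 (a multiple of the 14-pixel patch
# size) instead of resize-then-center-crop, so border content is preserved.
# Note: forcing a square resize distorts the aspect ratio of non-square images.
no_crop_preprocess = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
```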