|
|
|
Backbone |
|
A backbone is a model used for feature extraction for higher level computer vision tasks such as object detection and image classification. Transformers provides an [AutoBackbone] class for initializing a Transformers backbone from pretrained model weights, and two utility classes: |
|
|
|
[~utils.BackboneMixin] enables initializing a backbone from Transformers or timm and includes functions for returning the output features and indices. |
|
[~utils.BackboneConfigMixin] sets the output features and indices of the backbone configuration. |
|
|
|
timm models are loaded with the [TimmBackbone] and [TimmBackboneConfig] classes. |
|
Backbones are supported for the following models: |
|
|
|
BEiT |
|
BiT |
|
ConvNet |
|
ConvNextV2 |
|
DiNAT |
|
DINOV2 |
|
FocalNet |
|
MaskFormer |
|
NAT |
|
ResNet |
|
Swin Transformer |
|
Swin Transformer v2 |
|
ViTDet |
|
|
|
AutoBackbone |
|
[[autodoc]] AutoBackbone |
|
BackboneMixin |
|
[[autodoc]] utils.BackboneMixin |
|
BackboneConfigMixin |
|
[[autodoc]] utils.BackboneConfigMixin |
|
TimmBackbone |
|
[[autodoc]] models.timm_backbone.TimmBackbone |
|
TimmBackboneConfig |
|
[[autodoc]] models.timm_backbone.TimmBackboneConfig |