Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Created by MG96

External Public cs.CV cs.LG

Statistics

Citations
9999
References
84
Authors

Ze Liu Yutong Lin Yue Cao Han Hu Yixuan Wei Zheng Zhang Stephen Lin Baining Guo
Project Resources

Name Type Source
ArXiv Paper Paper arXiv
Semantic Scholar Paper Semantic Scholar
GitHub Repository Code Repository GitHub
microsoft/swin-tiny-patch4-window7-224 Model Hugging Face
microsoft/swin-base-patch4-window7-224-in22k Model Hugging Face
microsoft/swin-base-patch4-window7-224 Model Hugging Face
BVRA/MegaDescriptor-L-384 Model Hugging Face
timm/swin_base_patch4_window7_224.ms_in22k_ft_in1k Model Hugging Face
openmmlab/upernet-swin-small Model Hugging Face
microsoft/swin-large-patch4-window12-384-in22k Model Hugging Face
microsoft/swin-base-patch4-window12-384 Model Hugging Face
MurmanskY/swin_b Model Hugging Face
openmmlab/upernet-swin-tiny Model Hugging Face
keras-io/swin-transformers Model Hugging Face
innat/videoswin Model Hugging Face
facebook/hiera_base_224.mae_in1k_ft_in1k Model Hugging Face
Mitsua/swin-base-multi-fractal-1k Model Hugging Face
openmmlab/upernet-swin-base Model Hugging Face
timm/swin_s3_base_224.ms_in1k Model Hugging Face
microsoft/swin-large-patch4-window7-224-in22k Model Hugging Face
facebook/hiera-tiny-224-in1k-hf Model Hugging Face
facebook/hiera-base-224-in1k-hf Model Hugging Face
facebook/hiera-huge-224-in1k-hf Model Hugging Face
timm/swin_large_patch4_window12_384.ms_in22k Model Hugging Face
microsoft/swin-small-patch4-window7-224 Model Hugging Face
microsoft/swin-large-patch4-window12-384 Model Hugging Face
microsoft/swin-large-patch4-window7-224 Model Hugging Face
microsoft/swin-base-patch4-window12-384-in22k Model Hugging Face
kadirnar/timm_model_list Model Hugging Face
ZJF-Thunder/Swin-Transformer-Object-Detection Model Hugging Face
keras-io/shiftvit Model Hugging Face
mccaly/test2 Model Hugging Face
BVRA/MegaDescriptor-S-224 Model Hugging Face
facebook/hiera_small_224.mae_in1k_ft_in1k Model Hugging Face
facebook/hiera-tiny-224-mae-hf Model Hugging Face
facebook/hiera-large-224-in1k-hf Model Hugging Face
facebook/hiera-large-224-hf Model Hugging Face
facebook/hiera-huge-224-hf Model Hugging Face
facebook/hiera-huge-224-mae-hf Model Hugging Face
qualcomm/Swin-Tiny Model Hugging Face
qninhdt/MiDaS Model Hugging Face
aoxo/RealFormer Model Hugging Face
openmmlab/upernet-swin-large Model Hugging Face
timm/swin_small_patch4_window7_224.ms_in22k_ft_in1k Model Hugging Face
timm/swin_base_patch4_window7_224.ms_in1k Model Hugging Face
timm/swin_base_patch4_window12_384.ms_in22k_ft_in1k Model Hugging Face
timm/swin_base_patch4_window7_224.ms_in22k Model Hugging Face
timm/swin_tiny_patch4_window7_224.ms_in22k Model Hugging Face
timm/swin_large_patch4_window12_384.ms_in22k_ft_in1k Model Hugging Face
timm/swin_small_patch4_window7_224.ms_in22k Model Hugging Face
timm/swin_tiny_patch4_window7_224.ms_in1k Model Hugging Face
timm/swin_s3_tiny_224.ms_in1k Model Hugging Face
timm/swin_small_patch4_window7_224.ms_in1k Model Hugging Face
timm/swin_base_patch4_window12_384.ms_in22k Model Hugging Face
timm/swin_s3_small_224.ms_in1k Model Hugging Face
timm/swin_tiny_patch4_window7_224.ms_in22k_ft_in1k Model Hugging Face
timm/swin_large_patch4_window7_224.ms_in22k Model Hugging Face
timm/swin_large_patch4_window7_224.ms_in22k_ft_in1k Model Hugging Face
timm/swin_base_patch4_window12_384.ms_in1k Model Hugging Face
BVRA/MegaDescriptor-B-224 Model Hugging Face
BVRA/MegaDescriptor-T-224 Model Hugging Face
BVRA/MegaDescriptor-L-224 Model Hugging Face
yesidcanoc/image-captioning-swin-tiny-distilgpt2 Model Hugging Face
igotech/text2image Model Hugging Face
mlx-vision/swin_v2_tiny_patch4_window8_256-mlxim Model Hugging Face
mlx-vision/swin_v2_base_patch4_window8_256-mlxim Model Hugging Face
mlx-vision/swin_small_patch4_window7_224-mlxim Model Hugging Face
mlx-vision/swin_base_patch4_window7_224-mlxim Model Hugging Face
mlx-vision/swin_v2_small_patch4_window8_256-mlxim Model Hugging Face
mlx-vision/swin_tiny_patch4_window7_224-mlxim Model Hugging Face
subhuatharva/swim-224-base-satellite-image-classification Model Hugging Face
NeuronZero/MRI-Reader Model Hugging Face
facebook/hiera-tiny-224-hf Model Hugging Face
NeuronZero/SkinCancerClassifier Model Hugging Face
facebook/hiera-base-224-hf Model Hugging Face
facebook/hiera-base-224-mae-hf Model Hugging Face
facebook/hiera-small-224-in1k-hf Model Hugging Face
facebook/hiera-small-224-mae-hf Model Hugging Face
facebook/hiera-base-plus-224-mae-hf Model Hugging Face
facebook/hiera-base-plus-224-hf Model Hugging Face
facebook/hiera-small-224-hf Model Hugging Face
facebook/hiera-large-224-mae-hf Model Hugging Face
facebook/hiera-base-plus-224-in1k-hf Model Hugging Face
qualcomm/Swin-Small Model Hugging Face
qualcomm/Swin-Base Model Hugging Face
qninhdt/det Model Hugging Face
merve/vision_papers Space/Demo Hugging Face
Nuno-Tome/simple_image_classifier Space/Demo Hugging Face
akhaliq/Swin-Transformer Space/Demo Hugging Face
juliensimon/battle_of_image_classifiers Space/Demo Hugging Face
NeoPy/advanced-rvc-inference Space/Demo Hugging Face
tom-doerr/logo_generator Space/Demo Hugging Face
FritsLyneborg/kunstnerfrits Space/Demo Hugging Face
subhuatharva/Satellite_Image_Classification Space/Demo Hugging Face
innat/VideoSwin Space/Demo Hugging Face
musadac/VilanOCR-Urdu-English-Chinese Space/Demo Hugging Face
ASesYusuf1/Jhfhnrqgx-Gxeelqj-Vwxglr Space/Demo Hugging Face
spark-nlp/SwinForImageClassification Space/Demo Hugging Face
poiqazwsx/pytorch-music-source-seperation Space/Demo Hugging Face
Jaehan/Image-Classification-Using-a-Vision-Transformer-1 Space/Demo Hugging Face
talantbekdeveloper/check-answer Space/Demo Hugging Face
duongve/Spatial_Control_for_SD Space/Demo Hugging Face
platzi/platzi-curso-gradio-clasificacion-imagenes Space/Demo Hugging Face
SahilJ2/VQA_Model Space/Demo Hugging Face
Gilvan/XRaySwinGen Space/Demo Hugging Face
keras-io/shiftvit Space/Demo Hugging Face
stryker/VilanOCR-Urdu-English-Chinese Space/Demo Hugging Face
Skipeat/demo_platzi Space/Demo Hugging Face
Oscar-Hernandez/demo_prueba Space/Demo Hugging Face
tjtrebat/swin-image-classification Space/Demo Hugging Face
JamesNJ/NN Space/Demo Hugging Face
Nymbo/simple_image_classifier Space/Demo Hugging Face
jdr2dev/demo Space/Demo Hugging Face
maurope/demo_clase_platzi Space/Demo Hugging Face
Daniel-Lagos/demo_test Space/Demo Hugging Face
Frorozcol/demo_clase_platzi Space/Demo Hugging Face
jdgalvan/demo_clase_platzi Space/Demo Hugging Face
Shakir60/Construction_Defect_Analyzer Space/Demo Hugging Face
Carlos31/demo_clase Space/Demo Hugging Face
JuandaBula/demo_clase_platzi Space/Demo Hugging Face
Manuela/ejemplo_demo Space/Demo Hugging Face
keras-io/Swin-transformers Space/Demo Hugging Face
ulichovick/demo_clase_platzi Space/Demo Hugging Face
machves/Demo_Clase Space/Demo Hugging Face
MediPlusPlus/FINAL_VQA Space/Demo Hugging Face
MediPlusPlus/VQA_new Space/Demo Hugging Face
xferdie/ViT Space/Demo Hugging Face
amorales2075/demo1 Space/Demo Hugging Face
shandong1970/listingA1 Space/Demo Hugging Face
EliseoBaquero/demo_platzi Space/Demo Hugging Face
jomanher/demo_clase_platzi Space/Demo Hugging Face
Girotzu/Prueba Space/Demo Hugging Face
shadownada/uff Space/Demo Hugging Face
onlymodels007/keras-io-swin-transformers Space/Demo Hugging Face
tousin23/X-RayDemo Space/Demo Hugging Face
Hugomartinezg/ImagentoText Space/Demo Hugging Face
smoothjazzuser/AI_Model_Explainability Space/Demo Hugging Face
rogerkoranteng/Mental-Sage Space/Demo Hugging Face
cristian-rivera/demo_personal-space Space/Demo Hugging Face
franco1102/demo_img_classification Space/Demo Hugging Face
Johnometalman/demo-clasificacion Space/Demo Hugging Face
LRascon/demo Space/Demo Hugging Face
javiergrandat/jgrandat_demo1 Space/Demo Hugging Face
DAVID316GARCIA/Demo_Platzi Space/Demo Hugging Face
nestorxyz/platzi_demo_img_classification Space/Demo Hugging Face
smjain/vit_ms Space/Demo Hugging Face
lenderlucas/Demo_Clase_Platzi Space/Demo Hugging Face
Shakir60/Test Space/Demo Hugging Face
ronalleiva/Identificar_imagen Space/Demo Hugging Face
Frorozcol/course-platzi Space/Demo Hugging Face
marcela9409/gradio-blocks-practice Space/Demo Hugging Face
marcela9409/demo_clase_platzi Space/Demo Hugging Face
frandak2/Demo_clase_platzi Space/Demo Hugging Face
cmglezpdev/demo_clase_platzi Space/Demo Hugging Face
Chamin09/sustainable_content_moderator Space/Demo Hugging Face
EloiCampeny/demo_clase_platzi Space/Demo Hugging Face
Ichcanziho/Demo_mi_primer_space Space/Demo Hugging Face
pablodatadev/demo_ Space/Demo Hugging Face
requiem108/demo_platzi_v4 Space/Demo Hugging Face
fahpcy/model Space/Demo Hugging Face
gottdammer/multi_modelo Space/Demo Hugging Face
platzi/demo_clase_plazi Space/Demo Hugging Face
RaymundoSGlz/first_demo Space/Demo Hugging Face
jeraldflowers/demo_class_image Space/Demo Hugging Face
mpterradillos/image_classification Space/Demo Hugging Face
Ichcanziho/espacio_multi_modelo Space/Demo Hugging Face
MediPlusPlus/VQA_Model_Original Space/Demo Hugging Face
andreslozano1313/demo_clase_platzi Space/Demo Hugging Face
leovale14/image-classification-microsoft Space/Demo Hugging Face
SivaResearch/Opensource_imageClassifer_combo Space/Demo Hugging Face
shandong1970/a2 Space/Demo Hugging Face
Santenana/Demo_Platzi_clase Space/Demo Hugging Face
wgcv/image-classification-swin-tiny Space/Demo Hugging Face
EliseoBaquero/Space-Multimodelo Space/Demo Hugging Face
Dannel/gender Space/Demo Hugging Face
Davies/demo_portfolio_Davies Space/Demo Hugging Face
Lau771/EvaText Space/Demo Hugging Face
Gabrieltf85/demo_clase Space/Demo Hugging Face
joacorf33/Demos Space/Demo Hugging Face
Segizu/demo_platzi Space/Demo Hugging Face
Metantropia/Gradio-Tabs Space/Demo Hugging Face
rriverar75/demo_clase_platzi Space/Demo Hugging Face
kenkina/demo_image_recognition Space/Demo Hugging Face
PhilHolst/microsoft-swin-base-patch4-window7-224-in22k Space/Demo Hugging Face
mgutierrez/demo_platzi_class Space/Demo Hugging Face
joelorellana/demo_clase_platzi Space/Demo Hugging Face
Abstract

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these differences, we propose a hierarchical Transformer whose representation is computed with Shifted windows. The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connection. This hierarchical architecture has the flexibility to model at various scales and has linear computational complexity with respect to image size. These qualities of Swin Transformer make it compatible with a broad range of vision tasks, including image classification (87.3 top-1 accuracy on ImageNet-1K) and dense prediction tasks such as object detection (58.7 box AP and 51.1 mask AP on COCO test-dev) and semantic segmentation (53.5 mIoU on ADE20K val). Its performance surpasses the previous state-of-the-art by a large margin of +2.7 box AP and +2.6 mask AP on COCO, and +3.2 mIoU on ADE20K, demonstrating the potential of Transformer-based models as vision backbones. The hierarchical design and the shifted window approach also prove beneficial for all-MLP architectures. The code and models are publicly available at https://github.com/microsoft/Swin-Transformer.
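
The two ideas in the abstract, attention restricted to non-overlapping windows and a shifted partition that links neighbouring windows, can be illustrated with a small pure-Python sketch. It is not taken from the linked repository: the function names `window_partition` and `attention_pairs`, the 8x8 grid, and the shift of half a window are all assumptions chosen for illustration. It also leaves out the masking that a real implementation would likely need for tokens that wrap around the grid edge.

```python
def window_partition(height, width, window, shift=0):
    """Group the token coordinates of a height x width grid into
    non-overlapping window x window groups.

    With shift > 0 the grid is first shifted cyclically, so the window
    borders fall in different places than in the unshifted partition.
    (Hypothetical helper for illustration only.)
    """
    if height % window or width % window:
        raise ValueError("grid size must be divisible by the window size")
    windows = {}
    for r in range(height):
        for c in range(width):
            # Coordinates of token (r, c) after the cyclic shift.
            sr, sc = (r - shift) % height, (c - shift) % width
            key = (sr // window, sc // window)
            windows.setdefault(key, []).append((r, c))
    return list(windows.values())


def attention_pairs(height, width, window=None):
    """Count query-key pairs for one attention layer.

    window=None models global attention (every token attends to every
    token); otherwise each token attends only within its own window.
    """
    tokens = height * width
    if window is None:
        return tokens * tokens
    return tokens * window * window


def same_window(partition, a, b):
    """Return True if tokens a and b fall in the same window."""
    return any(a in group and b in group for group in partition)


regular = window_partition(8, 8, window=4)
shifted = window_partition(8, 8, window=4, shift=2)

# Tokens (3, 3) and (4, 4) sit on opposite sides of a window border in the
# regular partition, but share a window once the partition is shifted.
print(same_window(regular, (3, 3), (4, 4)))
print(same_window(shifted, (3, 3), (4, 4)))

# Doubling the side length of the grid quadruples the token count.
print(attention_pairs(112, 112) // attention_pairs(56, 56))
print(attention_pairs(112, 112, 7) // attention_pairs(56, 56, 7))
```

Under these assumptions, alternating the regular and shifted partitions lets information cross window borders. The pair count for windowed attention grows in proportion to the number of tokens, while the global count grows with its square.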

Note:

No note available for this project.

Contact:

No contact available for this project.
