WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images

Hong Liu, Haosen Yang, Paul J. van Diest, Josien P. W. Pluim, Mitko Veta
Computer Science, Computer Vision and Pattern Recognition (cs.CV)
2024-03-14
The Segment Anything Model (SAM) marks a significant advancement in segmentation models, offering powerful zero-shot capabilities and dynamic prompting. However, existing medical SAMs are not suited to the multi-scale nature of whole-slide images (WSIs), which restricts their effectiveness. To address this drawback, we present WSI-SAM, which enhances SAM with precise object segmentation for histopathology images using multi-resolution patches, while preserving its original prompt-driven design, efficiency, and zero-shot adaptability. To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen and introduce only minimal additional parameters and computation. In particular, we introduce a High-Resolution (HR) token, a Low-Resolution (LR) token, and a dual mask decoder. This decoder combines the original SAM mask decoder with a lightweight fusion module that merges features across scales. Instead of predicting each mask independently, we integrate the HR and LR tokens at an intermediate layer to jointly learn features of the same object across multiple resolutions. Experiments show that WSI-SAM outperforms state-of-the-art SAM and its variants. In particular, our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task (CAMELYON16 dataset), respectively. The code will be available at
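The abstract's fusion idea can be illustrated with a minimal numpy sketch. All shapes, the concatenate-and-project fusion, and the dot-product mask head are assumptions for illustration, not the paper's actual implementation; in the real model the fusion module is trainable and sits inside SAM's frozen decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256  # token embedding dimension (hypothetical)

# Two mask tokens for the same prompted object, one per patch resolution
# (e.g. a high-magnification and a low-magnification view of the WSI).
hr_token = rng.standard_normal(d)  # High-Resolution (HR) token
lr_token = rng.standard_normal(d)  # Low-Resolution (LR) token

# Lightweight fusion at an intermediate layer: concatenate both tokens
# and project back to d dims, so the two views are learned jointly
# rather than each predicting a mask independently.
W_fuse = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)  # trainable in practice
fused_token = np.concatenate([hr_token, lr_token]) @ W_fuse

# SAM-style mask head: the fused token scores every spatial location
# of a per-pixel feature map via a dot product, yielding mask logits.
mask_feats = rng.standard_normal((64, 64, d))  # hypothetical feature map
mask_logits = mask_feats @ fused_token         # shape (64, 64)
print(mask_logits.shape)
```

The sketch only shows the data flow: two resolution-specific tokens are merged into one object representation before mask prediction, which is the property the dual mask decoder is built around.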