Virchow2G

Paige
Version: 1
Virchow2G is a self-supervised vision transformer pretrained on 3.1M whole-slide histopathology images. It supports both hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stained slides, so a single model can serve a range of pathology tasks. The model can be used as a tile-level feature extractor, either frozen or fine-tuned, to achieve state-of-the-art results on a wide variety of downstream computational pathology use cases, particularly those that benefit from higher model capacity and richer feature representations.
Virchow2G is based on a ViT-G/14 architecture with 1.8 billion parameters. It processes input images at a native size of 224×224 pixels, split into 14×14-pixel patches, and uses SwiGLU activations, LayerScale stabilization, and register tokens to increase representational power. Pretraining used a modified DINOv2 self-supervised objective that replaces the original KoLeo regularizer with a kernel density estimator and adopts an extended-context translation augmentation strategy. Training tiles were sampled at four magnifications (5x, 10x, 20x, and 40x). The model was trained in mixed precision (fp16) to improve efficiency while preserving accuracy, and it is designed to support flexible downstream fine-tuning or prompting for varied pathology tasks.
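To make the input geometry above concrete, the following minimal sketch works out how many patch tokens a 224×224 tile produces with 14×14-pixel patches. The function names are hypothetical, and the class-token and register-token counts are illustrative assumptions: the description above states that register tokens are used but does not say how many.

```python
def patch_token_count(image_size: int = 224, patch_size: int = 14) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


def sequence_length(
    image_size: int = 224,
    patch_size: int = 14,
    num_class_tokens: int = 1,      # assumption: one class token
    num_register_tokens: int = 4,   # assumption: count is not given in the text
) -> int:
    """Total transformer sequence length: special tokens plus patch tokens."""
    return (
        num_class_tokens
        + num_register_tokens
        + patch_token_count(image_size, patch_size)
    )


if __name__ == "__main__":
    print(patch_token_count())  # 16 patches per side, 16 * 16 = 256
    print(sequence_length())    # 1 + 4 + 256 = 261 under the assumptions above
```

The patch count follows directly from the stated input and patch sizes. The total sequence length changes if the real model uses a different number of register tokens.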

Quick facts

Model provider: Paige
Type: Image feature extraction
Lifecycle: Generally available (GA)
Input type: image
Output type: text
Context window: 2048