microsoft-cvt-13-384
Version: 1
Convolutional Vision Transformer (CvT)
CvT-13 model pre-trained on ImageNet-1k at resolution 384x384. It was introduced in the paper CvT: Introducing Convolutions to Vision Transformers by Wu et al. and first released in this repository.

Disclaimer: The team releasing CvT did not write a model card for this model, so this model card has been written by the Hugging Face team.

### Usage
Here is how to use this model to classify an image from the COCO 2017 validation set into one of the 1,000 ImageNet classes:

```python
from transformers import AutoFeatureExtractor, CvtForImageClassification
from PIL import Image
import requests

# Download a sample image from the COCO 2017 validation set
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Load the preprocessing configuration and the pre-trained classifier
feature_extractor = AutoFeatureExtractor.from_pretrained('microsoft/cvt-13-384')
model = CvtForImageClassification.from_pretrained('microsoft/cvt-13-384')

# Preprocess the image into PyTorch tensors and run a forward pass
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits

# model predicts one of the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
```
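The snippet above reports only the single most likely class. As a minimal sketch of how the logits could be turned into a ranked list of probabilities, the helper below applies a softmax and keeps the top `k` entries. The function name `top_k_predictions` and the toy labels are hypothetical and used only for illustration; with the real model you would presumably pass `logits[0].tolist()` and `model.config.id2label`.

```python
import math


def top_k_predictions(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs.

    `logits` is a plain list of floats for one image and `id2label`
    maps integer class indices to label strings.
    """
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy stand-ins (hypothetical values, not real model output).
toy_logits = [2.0, 0.5, 1.0]
toy_labels = {0: "class_a", 1: "class_b", 2: "class_c"}

for label, prob in top_k_predictions(toy_logits, toy_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

The helper uses only the standard library, so it can be checked without downloading the model weights.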
---
`microsoft/cvt-13-384` powered by Hugging Face Inference Toolkit
- [Original Model Card](https://huggingface.co/microsoft/cvt-13-384)
- [`image-classification` Task on Hugging Face](https://huggingface.co/tasks/image-classification)
### Send Request
You can use cURL or any REST Client to send a request to the AzureML endpoint with your AzureML token.
```bash
curl <AZUREML_ENDPOINT_URL> \
  -X POST \
  -H "Authorization: Bearer <AZUREML_TOKEN>" \
  -H "Content-Type: image/jpeg" \
  --data-binary @"image.jpg"
```

### Supported Parameters
- `inputs` (string): The input image data as a base64-encoded string. If no parameters are provided, you can also provide the image data as a raw bytes payload.
- `parameters` (object):
  - `function_to_apply` (enum): Possible values: `sigmoid`, `softmax`, `none`.
  - `top_k` (integer): When specified, limits the output to the top K most probable classes.
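The parameters above can be combined into a JSON request body. The following is a minimal sketch using only the Python standard library. It assumes the endpoint accepts a JSON object with `inputs` and `parameters` keys under a `Content-Type: application/json` header, which this card does not state explicitly. The function name `build_json_request` and the example URL and token are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_json_request(endpoint_url, token, image_bytes,
                       function_to_apply="softmax", top_k=5):
    """Build (but do not send) a POST request carrying a base64-encoded image."""
    payload = {
        # `inputs` is the image encoded as a base64 string.
        "inputs": base64.b64encode(image_bytes).decode("ascii"),
        "parameters": {
            "function_to_apply": function_to_apply,  # sigmoid, softmax, or none
            "top_k": top_k,
        },
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: JSON payloads are sent as application/json.
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own endpoint URL, token, and image bytes.
request = build_json_request(
    "https://example.invalid/score", "<AZUREML_TOKEN>", b"\xff\xd8\xff", top_k=3
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` and read the response body. The response schema is not documented here, so inspect it before relying on specific fields.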
### Model Specifications
- License: Apache-2.0
- Last Updated: August 2025
- Provider: HuggingFace