
failure of SigLIP2 FP32 to FP16 #4373

Open
yijun02 opened this issue Mar 4, 2025 · 6 comments
Labels
Module:Accuracy (Output mismatch between TensorRT and other frameworks), triaged (Issue has been triaged by maintainers)

Comments

yijun02 commented Mar 4, 2025

I am trying to convert a SigLIP2 model to TensorRT with FP16, but the cosine similarity between the ONNX and TensorRT outputs is only 0.6463.

I used the following code to convert the model to ONNX.

import subprocess
from urllib.request import urlopen

import numpy as np
import torch
import torch.nn as nn
from open_clip import create_model_from_pretrained
from PIL import Image

model_path = "model"

# Load the pretrained SigLIP2 model and its preprocessing transform.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-B-16-SigLIP2-256', device=device)
model.eval()

# Wrap the image encoder so normalization to [-1, 1] and the NHWC -> NCHW
# permute are baked into the exported graph.
class ImageEncoder(nn.Module):
    def __init__(self, model) -> None:
        super().__init__()
        self.model = model

    @torch.no_grad()
    def forward(self, image):
        image = (image - 127.5) / 127.5
        image = image.permute(0, 3, 1, 2)
        image_features = self.model.encode_image(image)
        return image_features

image_encoder = ImageEncoder(model)

# Build a dummy NHWC float32 input from a sample image.
dummy_img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
dummy_img = np.array(dummy_img.resize((256, 256)).convert('RGB')).astype(np.float32)
dummy_img = torch.from_numpy(dummy_img).unsqueeze(0).to(device)

# Export the wrapped encoder to ONNX at opset 16.
torch.onnx.export(image_encoder,
                  (dummy_img,),
                  f"{model_path}/img_en_ori.onnx",
                  export_params=True,
                  opset_version=16,
                  do_constant_folding=True,
                  input_names=['img'],
                  output_names=['image_feature'])

# Simplify the exported graph in place with onnx-simplifier.
subprocess.run(["onnxsim", f"{model_path}/img_en_ori.onnx", f"{model_path}/img_en_ori.onnx"], check=True)

and the following command to build the FP16 TensorRT engine:

/usr/src/tensorrt/bin/trtexec --onnx=model/img_en_ori.onnx --saveEngine=model/img_en_ori.engine --fp16
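
For reference, a minimal sketch of how the ONNX-vs-TRT cosine similarity can be computed. The reference output comes from ONNX Runtime; trt_out is assumed to be the FP16 engine's output on the same input (obtained e.g. via the TensorRT Python API or polygraphy) and is not shown here:

import numpy as np
import onnxruntime as ort

# Reference output from ONNX Runtime on the same dummy input as above.
sess = ort.InferenceSession("model/img_en_ori.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"img": dummy_img.cpu().numpy()})[0]

def cosine_similarity(a, b):
    a, b = a.ravel().astype(np.float32), b.ravel().astype(np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# trt_out: the engine's output on the same input (assumed, not shown here).
# print(cosine_similarity(onnx_out, trt_out))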

Environment

AGX with dustynv/l4t-pytorch:r36.4.0

NX with dustynv/l4t-pytorch:2.2-r35.4.1

Ubuntu 22.04, RTX 3090 with nvcr.io/nvidia/pytorch:25.01-py3

kevinch-nv added the triaged and Module:Accuracy labels on Mar 7, 2025
kevinch-nv (Collaborator) commented

Are the FP32 results good? Can you try exporting a model without simplifying it (i.e. remove the subprocess.run(["onnxsim", f"{model_path}/img_en_ori.onnx", f"{model_path}/img_en_ori.onnx"], check=True) line) and check whether the results are the same?

yijun02 (Author) commented Mar 10, 2025

Yes, the FP32 results are good.
I computed the cosine similarity between ONNX and TRT; the result is as follows.

[screenshot: cosine similarity results]

And the following shows the difference in the ONNX model between opset_version 16 (left) and 17 (right).

[screenshots: ONNX graphs exported with opset_version 16 (left) and 17 (right)]

I think dustynv/l4t-pytorch:2.2-r35.4.1 and nvcr.io/nvidia/pytorch:25.01-py3 have the same problem with the Layer Normalization op at opset_version 16.
I think dustynv/l4t-pytorch:r36.4.0 has the same problem as #4333, and it also shares the opset_version 16 layer-normalization problem of dustynv/l4t-pytorch:2.2-r35.4.1.
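
For context: at opset 16 the exporter decomposes LayerNorm into a chain of reduction and elementwise nodes, while opset 17 keeps it as a single LayerNormalization node, so re-exporting only needs the opset bumped (a sketch reusing the export code above; the output file name here is made up):

# Same export as above, just targeting opset 17 so LayerNorm stays one node.
torch.onnx.export(image_encoder,
                  (dummy_img,),
                  f"{model_path}/img_en_op17.onnx",  # hypothetical file name
                  export_params=True,
                  opset_version=17,
                  do_constant_folding=True,
                  input_names=['img'],
                  output_names=['image_feature'])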

lix19937 commented

Can you run /usr/src/tensorrt/bin/trtexec --onnx=model/img_en_ori.onnx --saveEngine=model/img_en_ori.engine --fp16 --verbose and upload the log here?

yijun02 (Author) commented Mar 21, 2025

The log of NX with opset_version 16:
nx_log.txt

The log of AGX with opset_version 16:
agx_log.txt

The log of RTX 3090 with opset_version 16:
RTX3090_log.txt

lix19937 commented

The logs show no problem. You can export the ONNX with opset 17 and build with --noTF32.
BTW, use polygraphy to profile which layer's output begins to diverge; for example, the command sketched below.
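
A sketch of such a per-layer comparison (the --fp16 flag is an assumption here, added so the TensorRT side is built in the same precision as the failing engine):

polygraphy run model/img_en_ori.onnx --trt --fp16 --onnxrt --onnx-outputs mark all --trt-outputs mark all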

yijun02 (Author) commented Mar 21, 2025

The log of AGX with opset_version 17; the cosine similarity is 0.6441.
agx_log_noTF32.txt

polygraphy run model/img_en_ori.onnx --trt --onnxrt --onnx-outputs mark all --trt-outputs mark all > comparison_results.txt
comparison_results.txt
