feat: TensorRT AOT Plugin #3504

bowang007 · 2025-05-05T05:52:07Z

Description

This PR demonstrates how to use an AOT (ahead-of-time) plugin in Torch-TensorRT.

Fixes # (issue)

Type of change


Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/aot_plugin.py	2025-05-05 05:52:23.878918+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/aot_plugin.py	2025-05-05 05:52:44.176344+00:00
@@ -23,13 +23,11 @@
    output = x + 1
    tl.store(y_ptr + offsets, output, mask=mask)


@torch.library.custom_op("my::add_one", mutates_args=())  # type: ignore[misc]
-def add_one(
-    X: torch.Tensor
-) -> torch.Tensor:
+def add_one(X: torch.Tensor) -> torch.Tensor:
    # Ensure the tensors are on the GPU
    assert X.is_cuda

    # Create output tensor
    Y = torch.empty_like(X)
@@ -53,19 +51,22 @@

# torch_tensorrt.dynamo.conversion.plugins.generate_plugin(
#     "my::add_one"
# )

+
@trtp.register("my::add_one")
def add_plugin_desc(X: trtp.TensorDesc) -> Tuple[trtp.TensorDesc]:
    return X.like()

+
@trtp.aot_impl("my::add_one")
def add_plugin_aot_impl(
    X: trtp.TensorDesc, outputs: Tuple[trtp.TensorDesc], tactic: int
-) -> Tuple[Union[str, bytes], Union[str, bytes], trtp.KernelLaunchParams, trtp.SymExprs]:
-
+) -> Tuple[
+    Union[str, bytes], Union[str, bytes], trtp.KernelLaunchParams, trtp.SymExprs
+]:

    type_str = "fp32" if X.dtype == trt.float32 else "fp16"

    block_size = 256
    src = triton.compiler.ASTSource(
@@ -101,10 +102,11 @@
        compiled_kernel.asm["ptx"],
        launch_params,
        extra_args,
    )

+
torch_tensorrt.dynamo.conversion.plugins.generate_plugin_converter(
    "my::add_one",
    supports_dynamic_shapes=False,
    requires_output_allocator=False,
    aot=True,
@@ -127,18 +129,15 @@
    parser.add_argument(
        "--aot", action="store_true", help="Try to use AOT compilation", default=False
    )
    args = parser.parse_args()

-
-    
    my_model = MyModel().to("cuda")
    m = torch.full((64, 64), 2, device="cuda", dtype=torch.float)

    # This works!
    assert my_model(X=m)[0][0] == 3.0
-

    with torch_tensorrt.logging.debug():
        trt_inputs = [m]
        model_trt = torch_tensorrt.compile(
            my_model,
@@ -151,6 +150,6 @@
        for i in range(10):
            res = model_trt(m)
            assert torch.allclose(res, my_model(m)), "Results do not match!"

    print("Inference successful!")
-    print(res)
\ No newline at end of file
+    print(res)
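
For readers following the launch parameters in the example, here is a minimal pure-Python sketch of how a blocked, masked add-one kernel covers its input. It only emulates the indexing. It is not the Triton kernel from `aot_plugin.py`, and it assumes the mask has the usual `offsets < n_elements` form, which the diff context above does not show. The helper names `launch_grid` and `add_one_blocked` are hypothetical.

```python
from typing import List


def launch_grid(n_elements: int, block_size: int) -> int:
    """Number of program instances needed to cover n_elements (ceiling division)."""
    return (n_elements + block_size - 1) // block_size


def add_one_blocked(x: List[float], block_size: int) -> List[float]:
    """Emulate a blocked add-one: each program instance handles one block of lanes."""
    n_elements = len(x)
    y = [0.0] * n_elements
    for pid in range(launch_grid(n_elements, block_size)):
        block_start = pid * block_size
        for lane in range(block_size):
            offset = block_start + lane
            # Assumed mask: skip lanes that fall past the end of the input.
            if offset < n_elements:
                y[offset] = x[offset] + 1
    return y


# The example uses a 64 x 64 input and block_size = 256.
grid = launch_grid(64 * 64, 256)
result = add_one_blocked([2.0] * (64 * 64), 256)
```

With the 64 x 64 input and `block_size = 256` from the example, the input divides evenly into 16 blocks, so the mask only matters for sizes that are not a multiple of the block size.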

narendasan · 2025-05-06T02:58:48Z

examples/dynamo/aot_plugin.py

+#     "my::add_one"
+# )
+
+@trtp.register("my::add_one")


Can we not use torch_tensorrt.dynamo.conversion.custom_op here?

narendasan · 2025-05-06T03:03:19Z

examples/dynamo/aot_plugin.py

+    "my::add_one",
+    supports_dynamic_shapes=False,
+    requires_output_allocator=False,
+    aot=True,


I think we need two things here: (1) a flag, something like use_aot_if_available, and (2) a function in generate_plugin_converter that checks for an aot_impl registration.
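
A rough sketch of how that suggestion could fit together, assuming a plain dictionary stands in for wherever AOT implementations are recorded. Everything here is hypothetical: `_AOT_IMPLS`, `register_aot_impl`, `has_aot_impl`, and `resolve_aot` are illustrative names, and the code does not query TensorRT's real plugin registry.

```python
from typing import Callable, Dict

# Hypothetical stand-in registry; the real lookup would go through TensorRT.
_AOT_IMPLS: Dict[str, Callable] = {}


def register_aot_impl(op_name: str) -> Callable[[Callable], Callable]:
    """Record an AOT implementation for op_name (illustrative decorator)."""

    def decorator(fn: Callable) -> Callable:
        _AOT_IMPLS[op_name] = fn
        return fn

    return decorator


def has_aot_impl(op_name: str) -> bool:
    """Utility check: is an AOT implementation registered for this op?"""
    return op_name in _AOT_IMPLS


def resolve_aot(op_name: str, use_aot_if_available: bool = True) -> bool:
    """Use AOT only when the caller allows it and an implementation exists."""
    return use_aot_if_available and has_aot_impl(op_name)


@register_aot_impl("my::add_one")
def add_one_aot_impl() -> str:
    return "placeholder for the compiled kernel"


aot_for_add_one = resolve_aot("my::add_one")
aot_for_unregistered = resolve_aot("my::mul")
aot_when_disabled = resolve_aot("my::add_one", use_aot_if_available=False)
```

With this shape, the caller states a preference rather than a hard `aot=True`, and the converter falls back to the non-AOT path when no implementation is registered.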

narendasan · 2025-05-06T03:04:05Z

py/torch_tensorrt/dynamo/conversion/plugins/_generate_plugin_converter.py

@@ -80,7 +81,7 @@ def custom_kernel_converter(
            if isinstance(v, torch.fx.immutable_collections.immutable_list):
                kwargs[k] = np.array(v)

-        layer = ctx.net.add_plugin(plugin(*itensor_args, **kwargs))
+        layer = ctx.net.add_plugin(plugin(*itensor_args, **kwargs), aot=aot)


There should be a utility function that checks for aot_impl registrations.

feat: enable AOT tensorrt plugin example

c3cd651

facebook-github-bot added the cla signed label May 5, 2025

github-actions bot added the labels component: conversion (Issues re: Conversion stage), component: api [Python] (Issues re: Python API), and component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths) May 5, 2025

github-actions bot requested a review from gs-olive May 5, 2025 05:52

github-actions bot requested changes May 5, 2025

narendasan reviewed May 6, 2025
