CUDA graph capture succeeds in Debug mode but fails in Release mode #4376
When I run this code in a Release build, the following statement fails:
err = cudaStreamEndCapture(context_->stream_, &graph_)
After it executes, err is cudaErrorStreamCaptureInvalidated (901) and execute_result is false.
When I run the same code in a Debug build, everything works: err is cudaSuccess (0) and execute_result is true.
Environment:
TensorRT Version: 8.4.2.4
NVIDIA GPU: RTX 3060
NVIDIA Driver Version: 472.12
CUDA Version: 11.2
Operating System: Windows 10
How can I solve this? If you need more information, please let me know.
Thanks in advance!