[BUG] Error was raised when importing model in v1.0.x #40
Comments
BMInf will request 512 MB of memory before loading the model. From your screenshot, it seems that the error is happening here. I'm going to spend some time trying to reproduce this error.
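To make the comment above concrete, here is a hedged sketch (not BMInf's actual code) of the kind of up-front pinned-memory request being described. `POOL_BYTES` and `reserve_pinned_pool` are hypothetical names for illustration only.

```python
# Hedged sketch of an up-front page-locked (pinned) host allocation,
# assuming cupy. This is NOT BMInf's real implementation; it only
# illustrates where an import-time CUDA error could originate.
import numpy as np

POOL_BYTES = 512 * 1024 * 1024  # the 512 MB requested before model load


def reserve_pinned_pool(nbytes=POOL_BYTES):
    # cupy.cuda.alloc_pinned_memory calls cudaHostAlloc under the hood.
    # On some setups, pinning a large region can fail even with plenty
    # of free GPU memory, which would surface as a CUDA error at import.
    import cupy

    mem = cupy.cuda.alloc_pinned_memory(nbytes)
    # Expose the pinned region as a numpy array over the raw bytes.
    return np.frombuffer(mem, dtype=np.uint8, count=nbytes)
```

If `cudaHostAlloc` fails here, the traceback would appear during model import, before any download or inference starts.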
@a710128 Thanks for the quick response. Please keep me updated.
I ran the examples with my GTX 1070 on Windows. Everything turned out fine. Could the conda environment be interfering?
@a710128 I tried to import all 3 models. Surprisingly, CPM1 is fine. It started downloading after
Seems like your env is Windows. Could you try it under Linux?
I've tested it under Ubuntu 20.04 using a 1080 Ti, a 2080 Ti, and a V100, and it works fine.
@a710128 Could you share your installation script and CUDA version?
I'm confused: importing CPM1 and importing CPM2 run almost the same code, yet importing CPM2 raises the error at line 55 (commit 45d0af9). For comparison, see CPM1, lines 26 to 51, and CPM2, lines 31 to 56, both in 45d0af9.
@a710128 Actually, the previous logs don't match my latest runs. The error actually comes from the T5 model file, during the first init of the pinned decoder layer. Here is my script:
And the output is:
@sdjksdafji Try BMInf 1.0.2
@a710128 Thanks for the fix. I tried 1.0.2. The import works fine for me but the inference does not. Here is the latest error:
BTW, I have some questions regarding the fix. It seems the actual fix is to fall back to a non-CUDA-pinned numpy array if the CUDA malloc operation fails. Even if that works, would it affect inference performance? I assume the computation now happens on the CPU instead of the GPU, right? Rather than a fix, this sounds to me like a workaround that sacrifices performance. Shall we try to figure out the root cause of the failed CUDA malloc? My 3080 has 16 GB of GPU memory, so an OOM error definitely does not make sense.
Even if a regular (non-pinned) numpy array is used, the computation still happens on the GPU. The difference is that non-pinned memory spends more time transferring data from the CPU to the GPU.
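The fallback described above can be sketched as follows, assuming cupy. `alloc_host_buffer` is a hypothetical helper for illustration, not BMInf's actual API; the point is that either buffer is later copied to the GPU, so compute stays on the GPU and only the host-to-device transfer speed differs.

```python
# Hedged sketch of the pinned-or-pageable fallback pattern discussed
# above. `alloc_host_buffer` is a hypothetical name, not BMInf's API.
import numpy as np


def alloc_host_buffer(nbytes):
    """Return (buffer, is_pinned).

    Try page-locked (pinned) host memory first; fall back to an
    ordinary pageable numpy array if pinning fails or cupy is
    unavailable. Compute still runs on the GPU either way; pinned
    memory only makes the host-to-device copy faster, because the
    driver can DMA directly from a page-locked region.
    """
    try:
        import cupy

        mem = cupy.cuda.alloc_pinned_memory(nbytes)  # cudaHostAlloc
        return np.frombuffer(mem, dtype=np.uint8, count=nbytes), True
    except Exception:
        # Pageable fallback: slower transfers, identical results.
        return np.empty(nbytes, dtype=np.uint8), False


buf, pinned = alloc_host_buffer(1024)  # small size for the demo
```

With this pattern, a failed `cudaHostAlloc` degrades transfer throughput instead of crashing the import, which matches the behavior change reported for 1.0.2.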
Describe the bug
A CUDA error is raised when importing models. This issue only happens with BMInf 1.0.x; BMInf 0.0.5 runs successfully. Any help would be appreciated. Thanks.
Minimal steps to reproduce
Tried the following on both:
WSL2 Ubuntu 20.04 with an RTX 3080 (16 GB)
and native Ubuntu 18.04 with a GTX 1070 (8 GB)
Then run
Expected behavior
Start downloading the model.
Screenshots
Environment:
Tried with various CUDA versions, including 10.2, 11.0, and 11.3.