[OMPT] Overlapping device_num when using multiple offloading architectures #65104
mhalk added a commit to mhalk/llvm-project that referenced this issue on Sep 7, 2023:
Fixes: llvm#65104
When a user assigns devices to target regions, different identifiers may map onto the same ID within different plugins. This makes callbacks much harder to read, because ambiguous identifiers are reported. We fix this by collecting the index offset upon general RTL initialization, which in turn allows the unique, user-observable device ID to be calculated.
mhalk added a commit that referenced this issue on Sep 11, 2023 (Fixes: #65104, same commit message as above).
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this issue on Sep 19, 2023 (…#65595), Fixes: llvm#65104, same commit message as above).
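The commit message describes the fix as collecting an index offset per plugin during RTL initialization and deriving a unique, user-observable device ID from it. The following is a minimal sketch of that idea only, not the code from the commit: the `Plugin` struct, `assignIndexOffsets`, and `userDeviceId` are hypothetical names introduced for illustration, and the device counts in the usage note are assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an offload plugin (RTL); this is NOT the
// libomptarget data structure, only an illustration.
struct Plugin {
  std::string Name; // e.g. "x86_64" or "nvptx64"
  int NumDevices;   // number of devices this plugin reports
  int IndexOffset;  // first user-visible device id owned by this plugin
};

// Assign each plugin a running offset once, at initialization time.
// Plugin k starts where plugins 0..k-1 end, so the id ranges never overlap.
void assignIndexOffsets(std::vector<Plugin> &Plugins) {
  int Next = 0;
  for (Plugin &P : Plugins) {
    P.IndexOffset = Next;
    Next += P.NumDevices;
  }
}

// Map a plugin-local device id to the unique, user-observable id.
int userDeviceId(const Plugin &P, int LocalId) {
  assert(LocalId >= 0 && LocalId < P.NumDevices && "local id out of range");
  return P.IndexOffset + LocalId;
}
```

With an assumed setup of one plugin reporting four devices followed by one plugin reporting a single device, local ID 0 of the second plugin would map to user-visible ID 4 instead of colliding with ID 0 of the first plugin.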
Description
This issue was initially found during discussion of #64487. However, it should be handled as a separate issue, especially since the other one is closed now. It doesn't affect me directly, but some users might run into it, especially on mixed-GPU systems.
When a user offloads to multiple architectures in a program (for example, with -fopenmp-targets=x86_64,nvptx64), the OMPT interface receives an ompt_callback_device_initialize callback for each device used. While this works fine, the device numbers passed for two different architectures may overlap, which makes it hard for tools to tell the devices apart.
Users can select their device by the device number, for example
This would translate to the following in the OMPT interface
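To illustrate why overlapping numbers are a problem for tools, here is a minimal sketch of a tool-side check. It is not the snippet or output from this issue and does not use the real OMPT headers: `DeviceInit` is a hypothetical, simplified record holding only the device number and type string that a tool might store from each device-initialize callback, and the type strings in the usage note are made up.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified record of one device-initialize callback:
// only the device number and the device type string are kept.
struct DeviceInit {
  int DeviceNum;
  std::string Type;
};

// Return every device number that was reported with more than one
// distinct type string, i.e. numbers a tool cannot resolve to one device.
std::set<int> findAmbiguousDeviceNums(const std::vector<DeviceInit> &Events) {
  std::map<int, std::string> FirstType; // first type seen per device number
  std::set<int> Ambiguous;
  for (const DeviceInit &E : Events) {
    auto It = FirstType.find(E.DeviceNum);
    if (It == FirstType.end())
      FirstType[E.DeviceNum] = E.Type; // first time this number is seen
    else if (It->second != E.Type)
      Ambiguous.insert(E.DeviceNum);   // same number, different architecture
  }
  return Ambiguous;
}
```

If, say, a host-side device and a GPU were both reported with device number 0, this check would flag 0 as ambiguous, which is the situation a tool keyed purely on the device number cannot handle.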
Reproducer
This small code should show the issue. I have tested it with nvptx64 + x86_64 and nvptx64 + amdgcn-amd-amdhsa, but it should also affect other combinations.
Running the tool on a system with a single NVIDIA MX550, we can see the following output:
Previous discussion
Here are a few related comments by @mhalk and me regarding this issue:
#64487 (comment)
#64487 (comment)
#64487 (comment)