Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GraphBolt][CUDA] Add gb.index_select and fix example inferencing. #7051

Merged
merged 8 commits into from
Feb 4, 2024

Conversation

mfbalin
Copy link
Collaborator

@mfbalin mfbalin commented Jan 31, 2024

Description

The existing example was moving the feature memory to the device memory. We change it to the storage_device instead.

When I started this PR, I thought I would use gb.index_select to load features for inferencing. Then I realized that I can just update the features and rely on the dataloader to fetch the features for inferencing. Otherwise, the copies for all the different modes become too complicated and ugly, which the dataloader automatically handles for us. However, I am sure that gb.index_select will be useful to us down the road, it is a very simple wrapper over torch.ops.graphbolt.index_select.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature]])
  • I've leverage the tools to beautify the python and c++ code.
  • The PR is complete and small, read the Google eng practice (CL equals to PR) to understand more about small PR. In DGL, we consider PRs with less than 200 lines of core code change are small (example, test and documentation could be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@mfbalin mfbalin linked an issue Jan 31, 2024 that may be closed by this pull request
@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 1260e0ae1855b7c53ba23c9e68de66121607202b

Build ID: 1

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 6d015ffea8edfab14410cad1b11143ed9b9ea5d4

Build ID: 2

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 22b3e10be3a4b75238ccf811d4f609a8a3824f80

Build ID: 3

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 620049e7c949b2755b93df879e3872f3596c547d

Build ID: 4

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: e6620a8

Build ID: 5

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@mfbalin mfbalin changed the title [GraphBolt][CUDA] Add gb.index_select for example inferencing. [GraphBolt][CUDA] Add gb.index_select and fix example inferencing. Jan 31, 2024
@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 4d112bf

Build ID: 6

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jan 31, 2024

Commit ID: 1971922

Build ID: 7

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin mfbalin requested a review from frozenbugs February 1, 2024 22:29
@mfbalin
Copy link
Collaborator Author

mfbalin commented Feb 4, 2024

Let's wait for #7083 to be merged first so that I can test the link prediction example in this PR properly.

@mfbalin mfbalin changed the title [GraphBolt][CUDA] Add gb.index_select and fix example inferencing. [Do Not Merge][GraphBolt][CUDA] Add gb.index_select and fix example inferencing. Feb 4, 2024
@dgl-bot
Copy link
Collaborator

dgl-bot commented Feb 4, 2024

Commit ID: 56d69614dfb56a08736bd2cf169f178c04d40a4e

Build ID: 8

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin
Copy link
Collaborator Author

mfbalin commented Feb 4, 2024

Just tested all 4 different modes for the node_prediction example, all work as expected.

@mfbalin
Copy link
Collaborator Author

mfbalin commented Feb 4, 2024

Just tested the link_prediction example with the fix in #7083, all the 4 different modes work as expected.

@mfbalin mfbalin changed the title [Do Not Merge][GraphBolt][CUDA] Add gb.index_select and fix example inferencing. [GraphBolt][CUDA] Add gb.index_select and fix example inferencing. Feb 4, 2024
@dgl-bot
Copy link
Collaborator

dgl-bot commented Feb 4, 2024

Commit ID: fd13cb11175d5aa8fb8a7c194c924df68485964f

Build ID: 9

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin mfbalin merged commit 346197c into dmlc:master Feb 4, 2024
@mfbalin mfbalin deleted the gb_index_select_inference_fix branch February 4, 2024 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[GraphBolt][CUDA] Layerwise inference moves whole tensor to GPU
3 participants