[R] Memory leak when storing Booster objects in a list? #11355
It's the memory held by the Booster object for things like the gradient and prediction cache. We can use the reset method (#11042) upon returning from …
@trivialfis thanks! Looking forward to the linked solution in a future release. In the meantime, I didn't understand what the workaround is for now in the R case, if any. I see that in the case of Julia's interface you mentioned:
But that does not seem to work in R.
Opened a PR implementing the reset function for R: #11357.
I have tested the memory usage. Please use the nightly build for now.
Hi there,
Apologies if this was already addressed somewhere else, but I could not find an Issue here that touches exactly on this.
So, while trying XGBoost 3.0 in R with GPU support enabled, I noticed that storing the Booster returned by xgboost() inside a list leads to what seems to be a GPU memory leak. This appears to be related to Boosters now being R 'ALTLIST' objects with external pointers. To illustrate the problem, consider the following set-up:
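The original set-up snippet was not preserved in this copy of the issue; a minimal sketch of the kind of data it describes might look like the following. The data sizes here are purely illustrative assumptions.

```r
library(xgboost)

# Illustrative synthetic regression data (sizes are assumptions, not the
# original poster's figures).
set.seed(1)
n <- 1e5
p <- 100
x <- matrix(rnorm(n * p), nrow = n)
y <- rnorm(n)
```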
Now, if I simply re-train a model on those data in a loop, here's the GPU memory accumulation that I get:
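The training loop itself was also lost in this copy; a hedged reconstruction, assuming the new xgboost() x/y interface and a `device` parameter for GPU training (both assumptions about the exact API used), would be roughly:

```r
# Sketch of the reported pattern: each iteration trains on the GPU and keeps
# the resulting Booster in a list, so its external pointers stay reachable
# and the GPU buffers they reference are never released.
models <- list()
for (i in 1:20) {
  model <- xgboost(x, y, nrounds = 10, device = "cuda")  # parameter names assumed
  models[[i]] <- model
}
```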
Calling garbage collection explicitly, as suggested here, does not help:
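Assuming the same hypothetical loop as above, the variant with explicit garbage collection would look like this sketch; per the report, it does not change the outcome:

```r
models <- list()
for (i in 1:20) {
  models[[i]] <- xgboost(x, y, nrounds = 10, device = "cuda")  # names assumed
  gc()  # explicit garbage collection; GPU memory reportedly still accumulates,
        # since the stored Boosters keep their external pointers alive
}
```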
Note that calling garbage collection at the end makes no difference. Also, note that if model is NOT stored in a list AND garbage collection is called, the problem almost entirely disappears (we are still left with 36% more GPU memory used, but in absolute terms it's a small amount).

For reference, the issue does not happen in any of the above examples when using XGBoost 1.5.0 instead of 3.0.0.
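The partial workaround described above (dropping the reference to the Booster and then collecting) might be sketched as follows, again assuming the same hypothetical xgboost() call:

```r
for (i in 1:20) {
  model <- xgboost(x, y, nrounds = 10, device = "cuda")  # names assumed
  # ... use the model here, but do not keep it in a container ...
  rm(model)
  gc()  # with no surviving R reference, most GPU memory is reportedly freed
}
```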
Given how common it is to store Boosters in data containers, I wonder if there is a way to work around this? If not, I think this side effect of storing Boosters should be made very clear in the docs.
Thanks!
Note: this was tested on an NVIDIA GeForce RTX 3080 12 GB GPU, on a system running Ubuntu 22.04.