
Factorized linear supports implementation switch and gradient checkpoint #26

Merged — 6 commits merged into tensorly:main on Jun 23, 2022

Conversation

JeremieMelo (Contributor)

- support switching the implementation between factorized and reconstructed forward passes
- gradient checkpointing for a memory-efficient training-mode forward function
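For readers skimming the review below, here is a minimal usage sketch based on the constructor signature discussed in this PR. The tltorch import path and keyword names follow the code reviewed below (using the final name checkpointing rather than the original with_cp); the concrete shapes and values are illustrative assumptions only.

```python
# Minimal usage sketch of the options added in this PR; shapes and values are
# illustrative, and the keyword names follow the signature discussed below.
import torch
import tltorch

# A 16 -> 64 linear layer, with input/output features given in tensorized form.
layer = tltorch.FactorizedLinear(
    in_tensorized_features=(4, 4),
    out_tensorized_features=(8, 8),
    bias=True,
    factorization='cp',
    rank='same',
    implementation='factorized',  # or 'reconstructed' to rebuild the full weight
    checkpointing=True,           # recompute activations in backward to save memory
)

x = torch.randn(2, 16)
y = layer(x)                      # shape: (2, 64)
```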

@JeanKossaifi (Member) left a comment:

Thanks @JeremieMelo, this looks great, just made a few suggestions on the API.

n_layers : int, default is 1
    number of linear layers to be parametrized with a single factorized tensor
bias : bool, default is True
with_cp : bool
@JeanKossaifi (Member):

I think we can just call it checkpointing to be more explicit?

@JeremieMelo (Contributor, Author) · Jun 23, 2022:

Updated in the newest commit.

device : PyTorch device to use, default is None
dtype : PyTorch dtype, default is None
"""

  def __init__(self, in_tensorized_features, out_tensorized_features, bias=True,
-              factorization='cp', rank='same', n_layers=1, device=None, dtype=None):
+              factorization='cp', rank='same', implementation='factorized', n_layers=1,
+              with_cp=False, device=None, dtype=None):
@JeanKossaifi (Member):

Instead of with_cp, we can use checkpointing

@JeremieMelo (Contributor, Author):

Updated in the newest commit.
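To make the implementation switch reviewed above concrete, here is a self-contained sketch of the two forward strategies. It uses a plain two-factor low-rank weight rather than the actual tltorch factorizations, so the function names and shapes are illustrative assumptions, not the library's code path.

```python
# Illustrative sketch of the two forward strategies the `implementation` flag
# selects; a simple two-factor (low-rank) weight stands in for the real
# tensor factorizations.
import torch
import torch.nn.functional as F

def forward_reconstructed(x, a, b, bias=None):
    # 'reconstructed': rebuild the full (out, in) weight once, then a plain linear.
    weight = a @ b                       # (out, rank) @ (rank, in) -> (out, in)
    return F.linear(x, weight, bias)

def forward_factorized(x, a, b, bias=None):
    # 'factorized': contract the input with the factors directly,
    # never materializing the full weight matrix.
    y = x @ b.t()                        # (batch, in) @ (in, rank) -> (batch, rank)
    y = y @ a.t()                        # (batch, rank) @ (rank, out) -> (batch, out)
    return y if bias is None else y + bias

a = torch.randn(64, 8)   # (out_features, rank)
b = torch.randn(8, 16)   # (rank, in_features)
x = torch.randn(2, 16)
assert torch.allclose(forward_reconstructed(x, a, b),
                      forward_factorized(x, a, b), atol=1e-5)
```

Both paths compute the same result; the factorized path trades the cost of reconstructing the weight for a sequence of smaller contractions.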

@JeremieMelo replied: Sure, could you create a suggested change to replace all 'with_cp' with 'checkpointing', so I can directly add it to the batched commit. Thanks.

JeremieMelo and others added 2 commits on June 23, 2022:
- Apply review suggestion (Co-authored-by: Jean Kossaifi <jean.kossaifi@gmail.com>)
- with_cp to checkpointing, move weight out of _inner_forward
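For reference, the second commit message follows the common pattern of checkpointing an inner forward whose inputs, including the weight, are passed in explicitly. The sketch below illustrates that pattern with torch.utils.checkpoint; the class and method names are hypothetical and this is not the actual tltorch implementation.

```python
# Sketch of training-mode gradient checkpointing around an inner forward,
# in the spirit of this PR; the module here is hypothetical.
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

class CheckpointedLinear(torch.nn.Module):
    def __init__(self, in_features, out_features, checkpointing=False):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))
        self.checkpointing = checkpointing

    def _inner_forward(self, x, weight):
        # The (possibly reconstructed) weight is passed in explicitly rather than
        # being recreated inside, so it is materialized once per forward call.
        return F.linear(x, weight, self.bias)

    def forward(self, x):
        weight = self.weight  # for a factorized layer this could be the reconstruction
        if self.checkpointing and self.training:
            # Recompute the inner forward during backward to save activation memory.
            return checkpoint(self._inner_forward, x, weight, use_reentrant=False)
        return self._inner_forward(x, weight)
```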
@JeanKossaifi (Member):

Looks good, thanks @JeremieMelo, merging!

@JeanKossaifi JeanKossaifi merged commit b8bf48d into tensorly:main Jun 23, 2022