Factorized linear supports implementation switch and gradient checkpoint #26
Conversation
Thanks @JeremieMelo, this looks great, just made a few suggestions on the API.
n_layers : int, default is 1
    number of linear layers to be parametrized with a single factorized tensor
bias : bool, default is True
with_cp : bool
I think we can just call it checkpointing to be more explicit?
Updated in the newest commit.
device : PyTorch device to use, default is None
dtype : PyTorch dtype, default is None
"""
 def __init__(self, in_tensorized_features, out_tensorized_features, bias=True,
-             factorization='cp', rank='same', n_layers=1, device=None, dtype=None):
+             factorization='cp', rank='same', implementation='factorized', n_layers=1,
+             with_cp=False, device=None, dtype=None):
Instead of with_cp, we can use checkpointing.
Updated in the newest commit.
Sure, could you create a suggested change to replace all 'with_cp' with 'checkpointing', so I can directly add it to the batched commit? Thanks.
Co-authored-by: Jean Kossaifi <jean.kossaifi@gmail.com>
with_cp to checkpointing, move weight out of _inner_forward
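For context, a minimal sketch of the forward pattern that commit message describes: the weight is materialized outside _inner_forward and passed in as an argument, with torch.utils.checkpoint wrapping only the contraction when checkpointing is enabled. The to_matrix() helper and attribute names here are assumptions for illustration, not the code under review.

import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

def _inner_forward(self, x, weight):
    # Only the linear contraction sits inside the (optionally checkpointed) region.
    return F.linear(x, weight, self.bias)

def forward(self, x):
    # Materialize the weight once, outside _inner_forward, and pass it in;
    # to_matrix() is an assumed helper for reconstructing the full weight matrix.
    weight = self.weight.to_matrix()
    if self.checkpointing and self.training:
        # Trade compute for memory: recompute the contraction during backward
        # instead of storing its intermediate activations.
        return checkpoint(self._inner_forward, x, weight)
    return self._inner_forward(x, weight)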
Looks good, thanks @JeremieMelo, merging!
Support switching the implementation between factorized and reconstructed.
Gradient checkpointing for a memory-efficient training-mode forward function.
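A rough usage sketch of the two new arguments, assuming the layer under review is tensorly-torch's tltorch.FactorizedLinear and the final keyword follows the rename agreed on above (checkpointing rather than with_cp); the tensorized shapes are illustrative values only.

import torch
import tltorch

layer = tltorch.FactorizedLinear(
    in_tensorized_features=(4, 4),    # 16 input features, tensorized as 4 x 4
    out_tensorized_features=(4, 8),   # 32 output features, tensorized as 4 x 8
    factorization='cp',
    rank='same',
    implementation='reconstructed',   # or 'factorized'
    checkpointing=True,               # gradient checkpointing in training mode
)

x = torch.randn(2, 16)
y = layer(x)                          # expected shape: (2, 32)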