Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ILQL for T5 model, Fix PPO T5 for refactored code #290

Merged
merged 38 commits into from
Mar 6, 2023

Conversation

PhungVanDuy
Copy link
Collaborator

@PhungVanDuy PhungVanDuy commented Feb 8, 2023

Adapted ILQL implementation for Causal models, this PR support ILQL for the T5 model.

  • Related issue: Support for T5 for ILQL #204

  • Add ILQL for the T5 model

  • Run Summarization example by OpenAI dataset

  • Add sentiment examples for PPO T5

  • Add tests

ILQL T5 on OpenAI Dataset: https://wandb.ai/pvduy/trlx/runs/sm5gug89
Sentiment T5: https://wandb.ai/pvduy/trlx/runs/vgvph8cc

@PhungVanDuy PhungVanDuy marked this pull request as draft February 8, 2023 06:49
@PhungVanDuy PhungVanDuy marked this pull request as ready for review February 14, 2023 02:33
@PhungVanDuy PhungVanDuy marked this pull request as draft February 14, 2023 02:34
Copy link
Collaborator

@maxreciprocate maxreciprocate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad that it works overall, but we still have to find a way to refactor many repetitions here, at least up to the same level as it's done in PPO

@PhungVanDuy PhungVanDuy marked this pull request as ready for review February 20, 2023 09:16
Copy link
Collaborator

@jon-tow jon-tow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @PhungVanDuy, great work! I've left some feedback to look at when you get a chance 👍 👍


input_ids: TensorType["query_size"]
attention_mask: TensorType["query_size"]
decoder_input_ids: TensorType["reward_size"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can decoder_input_ids be moved into ILQLElement, possibly with a None default? Seems like a large amount of duplicate code for 1 extra field (this propagates down elsewhere in the package like to ilql_seq2seq_collate_fn). Same question for ILQLSeq2SeqBatch

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to separate this one for easier maintenance. We can re-design this one along when we find a way to seperate "seq2seq" like you said here: #290 (comment)

Tokenizes samples and shapes rewards into proper tensors and then inserts the resulting dataset into the trainer
"""

if self.config.model.model_arch_type == "seq2seq":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: At some point, we need to design a better way to handle these arch-type conditionals 😅

@PhungVanDuy PhungVanDuy changed the title Support ILQL for T5 model Support ILQL for T5 model, Fix PPO T5 for refactored code Mar 2, 2023
Copy link
Collaborator

@jon-tow jon-tow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on my end! Great work :)

@PhungVanDuy
Copy link
Collaborator Author

Looks good on my end! Great work :)

Thank you so much for your comment!

@PhungVanDuy PhungVanDuy merged commit 7a947dd into CarperAI:main Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants