-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
initial commit for trlx LORA support #110
Conversation
This should take a similar form to how hydra models are built. It shouldn't be required and directly integrated into ilql or PPO model |
#80 Relevant issue. |
@ethankim00 just a gentle push on when you expect to finish this? |
cc @Sayanc93 |
I can get to it tomorrow or Monday. I'm wondering what the API should be to avoid modifying the model definitions? |
I think it would be like, instead of modifying the |
merge upstream changes
Circling back around on this. |
@cat-state does it make sense to do lora + hydra or just have lora be entirely separate... |
We could have a function to modify the base model of each different model type rather than creating subclasses. |
Hm... I differ to better software engineers haha @jon-tow your input would be great here too |
@ethankim00 This looks great! I've made a few changes based on some testing on our cluster. Here's the summary:
Overall things look very promising. Check out these runs from the PPO sentiments task here. I'm going to begin testing on ILQL and once that's cleared up, we can get this ready for review and a merge. Reports: |
This looks fantastic! Definitely worth including for the 0.4 release next week. Let's get this merged :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Basic support for low rank adaptation.