Add Cosine LR Scheduler for Fine-Tuning #273
Conversation
src/together/cli/api/finetune.py
Outdated
"--num-cycles", | ||
type=float, | ||
default=0.5, | ||
help="Number of cycles for cosine learning rate scheduler.", |
help="Number of cycles for cosine learning rate scheduler.", | |
help="Number of cycles for the cosine learning rate scheduler.", |
nit: Maybe also add what fractional values mean.
Updated to "Number or fraction of cycles".
src/together/resources/finetune.py
Outdated
    lr_scheduler_args=FinetuneLinearLRSchedulerArgs(min_lr_ratio=min_lr_ratio),
)
if lr_scheduler_type == "cosine":
    if num_cycles <= 0.0:
nit: maybe also add some meaningful upper bound?
Hard to know what a reasonable upper bound would be without knowing the number of steps, afaik. I think it makes sense to follow the HF implementation and let the cosine alias if the user inputs something unreasonably large for their job.
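For context on the "let it alias" point: the Hugging Face transformers cosine schedule computes a per-step multiplier roughly like the sketch below (a simplified reconstruction, not the library's exact code) and places no upper bound on num_cycles. A very large value simply produces a fast oscillation that the discrete step grid samples coarsely, rather than an error.

```python
import math

def hf_style_cosine_factor(step: int, total_steps: int,
                           num_cycles: float = 0.5) -> float:
    """LR multiplier in the style of transformers' cosine schedule.

    No upper bound is enforced on num_cycles; the factor is clamped
    at 0 instead of the input being rejected.
    """
    progress = step / max(1, total_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))
```

With `num_cycles=0.5` the factor reaches 0 at the final step; with a large value like `num_cycles=2.0` the schedule has already oscillated back to 1.0 there, which is the aliasing behavior described above.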
src/together/resources/finetune.py
Outdated
    min_lr_ratio: float = 0.0,
    num_cycles: float = 0.5,
Let's call it scheduler_num_cycles for clarity?
The backend expects the field in the request to be num_cycles, but inside the scheduler args. Would just changing the CLI arg be ok?
Changed CLI arg to scheduler_num_cycles. Worth noting FinetuneCosineLRSChedulerArgs still has the field as num_cycles.
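The split the thread settles on, a CLI flag named scheduler_num_cycles feeding an args object whose field stays num_cycles for the backend, can be sketched like this. The class and helper names below mirror the discussion but are hypothetical, not the SDK's actual definitions.

```python
from dataclasses import dataclass

@dataclass
class CosineLRSchedulerArgs:
    """Hypothetical stand-in for the scheduler-args payload.

    The request field keeps the backend's name, num_cycles, even though
    the user-facing CLI flag is scheduler_num_cycles.
    """
    min_lr_ratio: float = 0.0
    num_cycles: float = 0.5

def build_scheduler_args(scheduler_num_cycles: float,
                         min_lr_ratio: float = 0.0) -> CosineLRSchedulerArgs:
    # Mirrors the validation in the diff above: non-positive cycle
    # counts are rejected before the request is built.
    if scheduler_num_cycles <= 0.0:
        raise ValueError("Number of cycles must be greater than 0.")
    return CosineLRSchedulerArgs(min_lr_ratio=min_lr_ratio,
                                 num_cycles=scheduler_num_cycles)
```

Renaming only the CLI flag keeps the user-facing interface unambiguous while leaving the wire format untouched, so no backend change is needed.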
Have you read the Contributing Guidelines?
Yes
Issue # ENG-23270
Adds support for the Cosine LR Scheduler in the fine-tuning API, and simplifies future support for additional schedulers (https://git.1-hub.cntogethercomputer/together-finetune/pull/1349)