Model shapes config #2116


Merged (15 commits, Apr 24, 2025)

Conversation

jainapurva
Contributor

This pull request introduces significant updates to the microbenchmarking framework, focusing on new model types and enhanced shape generation options. The changes expand functionality and enable more extensive benchmarking configurations.

Enhancements to Model Types and Shape Generation

  • Added support for new model types, including ln_linear_<activation> (e.g., sigmoid, relu, gelu) and transformer_block with self-attention and MLP. These are documented in benchmarks/microbenchmarks/README.md.
  • Introduced multiple shape generation options (custom, llama, pow2, pow2_extended, sweep) to support diverse matrix shapes for benchmarking. These options are implemented in benchmark_runner.py and documented in the README.
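The steps above can be illustrated with a hedged sketch of how an option such as pow2 could expand a min_power/max_power range into (m, k, n) matrix shapes. The function name and the exact expansion semantics are assumptions for illustration, not the actual benchmark_runner.py implementation:

```python
from itertools import product

def gen_pow2_shapes(min_power, max_power):
    """Illustrative pow2 expansion: every (m, k, n) triple whose
    dimensions are powers of two in [2**min_power, 2**max_power]."""
    dims = [2**p for p in range(min_power, max_power + 1)]
    return [list(shape) for shape in product(dims, repeat=3)]

shapes = gen_pow2_shapes(8, 9)  # dims 256 and 512 -> 2**3 = 8 triples
print(len(shapes))   # 8
print(shapes[0])     # [256, 256, 256]
```

A sweep option would behave similarly but over a much denser grid, which is why the sample config below warns that it generates many shapes.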

Refactoring and Code Simplification

  • Refactored model creation logic by replacing create_model_and_input with create_model_and_input_data, now imported from torchao.testing.model_architectures. This centralizes model definitions and input data generation.
  • Removed redundant model definitions (ToyLinearModel, LNLinearSigmoid) from utils.py, consolidating them into torchao.testing.model_architectures.

Future TODO: Refactor Torchao to use model definitions from torchao.testing.model_architectures, #2078
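The centralization described above can be sketched as a factory that maps a model_type string to a model plus matching input data. Everything here except the function name create_model_and_input_data is illustrative; the real signature and supported model types live in torchao.testing.model_architectures, and the benchmark configs use torch.bfloat16 on CUDA, swapped for float32 on CPU here for portability:

```python
import torch
import torch.nn as nn

def create_model_and_input_data(model_type, m, k, n, dtype=torch.float32, device="cpu"):
    """Illustrative factory (assumed signature, not torchao's actual one):
    build the requested model and a random (m, k) input to feed it."""
    if model_type == "linear":
        model = nn.Linear(k, n, dtype=dtype, device=device)
    elif model_type == "ln_linear_sigmoid":
        model = nn.Sequential(
            nn.LayerNorm(k, dtype=dtype, device=device),
            nn.Linear(k, n, dtype=dtype, device=device),
            nn.Sigmoid(),
        )
    else:
        raise ValueError(f"Unknown model_type: {model_type}")
    input_data = torch.randn(m, k, dtype=dtype, device=device)
    return model, input_data

model, x = create_model_and_input_data("ln_linear_sigmoid", m=32, k=64, n=16)
print(model(x).shape)  # torch.Size([32, 16])
```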

Updates to Configuration

  • Expanded benchmark_config.yml to include configurations for new model types and shape generation options, such as llama and pow2.

Documentation Improvements

  • Updated README.md to provide detailed descriptions of new model types and shape generation options, ensuring users can easily understand and utilize the new features.

These changes collectively enhance the flexibility, maintainability, and usability of the benchmarking framework.

Sample configuration.yml for inference benchmarks

benchmark_mode: "inference"
quantization_config_recipe_names:
  - "float8dq"
  - "float8wo"
output_dir: "benchmarks/microbenchmarks/results"
model_params:
  - name: "ln_linear_sigmoid_cuda"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "ln_linear_sigmoid"
    enable_profiler: true

  - name: "bf16_transformer_block"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],  # For transformer_block, k is the hidden dimension
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "transformer_block" # TODO: Add a custom model (Figure out how to do this, maybe pass a .py file with model definition)
    enable_profiler: true

  - name: "large_bf16_ln_linear"
    matrix_shapes:
      - name: "llama"  # Example of using LLaMa shapes
      - name: "pow2"  # Example of using power of 2 shapes
        min_power: 10  # 1024
        max_power: 12  # 4096
      - name: "pow2_extended"  # Example of using extended power of 2 shapes
        min_power: 10  # 1024
        max_power: 11  # 2048
      - name: "sweep"  # Example of using sweep shapes (commented out as it generates many shapes)
        min_power: 8   # 256
        max_power: 9   # 512
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "linear"
    enable_profiler: true  # Enable profiling for this model

@jainapurva added the `topic: for developers` and `topic: performance` labels on Apr 23, 2025

pytorch-bot bot commented Apr 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2116

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit a750555 with merge base dd5c7b0:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label on Apr 23, 2025

@Copilot Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jainapurva jainapurva requested a review from Copilot April 23, 2025 18:43

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request refactors the microbenchmarking framework by centralizing model creation into torchao.testing.model_architectures, adds new model types with varied activation functions, and extends shape generation options for benchmarking. It also updates tests and documentation accordingly.

  • Centralizes model definitions and input generation via create_model_and_input_data.
  • Implements new shape generation options (custom, llama, pow2, pow2_extended, sweep).
  • Updates tests and documentation to reflect refactoring and new functionality.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file:
  • torchao/testing/model_architectures.py: New model classes and a unified model creation function added/refined
  • test/test_model_architecture.py: Tests updated to use the new model creation function
  • benchmarks/microbenchmarks/utils.py: Removed redundant model definitions and a deprecated function
  • benchmarks/microbenchmarks/test/*: Updated tests to work with refactored models and shape generation
  • benchmarks/microbenchmarks/benchmark_runner.py: Enhanced shape generation and improved error messaging
  • benchmarks/microbenchmarks/benchmark_inference.py: Updated to use the refactored model creation function
  • benchmarks/microbenchmarks/README.md: Documentation updated with new model types and shape generation options
Comments suppressed due to low confidence (2)

torchao/testing/model_architectures.py:164

  • [nitpick] Consider using a stricter regex (e.g., r"ln_linear_(\w+)$") to extract the activation type more precisely from model_type, ensuring it only matches valid activations.
match = re.search(r"ln_linear_?(\w+)?", model_type)
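To make the reviewer's point concrete, a small comparison of the two patterns (both taken verbatim from the comment above; the input strings are illustrative):

```python
import re

loose = re.compile(r"ln_linear_?(\w+)?")   # pattern currently in the PR
strict = re.compile(r"ln_linear_(\w+)$")   # reviewer's stricter suggestion

# Both extract the activation when one is present:
print(loose.search("ln_linear_sigmoid").group(1))   # sigmoid
print(strict.search("ln_linear_sigmoid").group(1))  # sigmoid

# For bare "ln_linear" the loose pattern matches with group(1) == None,
# which every caller must then handle; the strict pattern simply
# fails to match, making the "no activation" case explicit:
print(loose.search("ln_linear").group(1))  # None
print(strict.search("ln_linear"))          # None
```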

torchao/testing/model_architectures.py:141

  • Update the docstring for create_model_and_input_data to reflect the actual parameter names (e.g., 'm' instead of 'batch_size') for clarity and to avoid confusion.
def create_model_and_input_data(

@jainapurva
Copy link
Contributor Author

Duplicates PR: #2036

@jainapurva jainapurva marked this pull request as ready for review April 23, 2025 20:30
@jainapurva jainapurva merged commit 2fcab01 into main Apr 24, 2025
15 of 21 checks passed