Model shapes config #2116


Merged (15 commits, Apr 24, 2025)

Conversation

jainapurva
Contributor

This pull request introduces significant updates to the microbenchmarking framework, focusing on new model types and enhanced shape generation options. The changes expand functionality and enable more extensive benchmarking configurations.

Enhancements to Model Types and Shape Generation

  • Added support for new model types, including ln_linear_<activation> (e.g., sigmoid, relu, gelu) and transformer_block with self-attention and MLP. These are documented in benchmarks/microbenchmarks/README.md.
  • Introduced multiple shape generation options (custom, llama, pow2, pow2_extended, sweep) to support diverse matrix shapes for benchmarking. These options are implemented in benchmark_runner.py and documented in the README.
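The steps above can be illustrated with a hedged sketch of how an option such as pow2 could expand a min_power/max_power range into (m, k, n) matrix shapes. The function name and the exact expansion semantics are assumptions for illustration, not the actual benchmark_runner.py implementation:

```python
from itertools import product

def gen_pow2_shapes(min_power, max_power):
    """Illustrative pow2 expansion: every (m, k, n) triple whose
    dimensions are powers of two in [2**min_power, 2**max_power]."""
    dims = [2**p for p in range(min_power, max_power + 1)]
    return [list(shape) for shape in product(dims, repeat=3)]

shapes = gen_pow2_shapes(8, 9)  # dims 256 and 512 -> 2**3 = 8 triples
print(len(shapes))   # 8
print(shapes[0])     # [256, 256, 256]
```

A sweep option would behave similarly but over a much denser grid, which is why the sample config below warns that it generates many shapes.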

Refactoring and Code Simplification

  • Refactored model creation logic by replacing create_model_and_input with create_model_and_input_data, now imported from torchao.testing.model_architectures. This centralizes model definitions and input data generation.
  • Removed redundant model definitions (ToyLinearModel, LNLinearSigmoid) from utils.py, consolidating them into torchao.testing.model_architectures.

Future TODO: Refactor Torchao to use model definitions from torchao.testing.model_architectures, #2078
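The centralization described above can be sketched as a factory that maps a model_type string to a model plus matching input data. Everything here except the function name create_model_and_input_data is illustrative; the real signature and supported model types live in torchao.testing.model_architectures, and the benchmark configs use torch.bfloat16 on CUDA, swapped for float32 on CPU here for portability:

```python
import torch
import torch.nn as nn

def create_model_and_input_data(model_type, m, k, n, dtype=torch.float32, device="cpu"):
    """Illustrative factory (assumed signature, not torchao's actual one):
    build the requested model and a random (m, k) input to feed it."""
    if model_type == "linear":
        model = nn.Linear(k, n, dtype=dtype, device=device)
    elif model_type == "ln_linear_sigmoid":
        model = nn.Sequential(
            nn.LayerNorm(k, dtype=dtype, device=device),
            nn.Linear(k, n, dtype=dtype, device=device),
            nn.Sigmoid(),
        )
    else:
        raise ValueError(f"Unknown model_type: {model_type}")
    input_data = torch.randn(m, k, dtype=dtype, device=device)
    return model, input_data

model, x = create_model_and_input_data("ln_linear_sigmoid", m=32, k=64, n=16)
print(model(x).shape)  # torch.Size([32, 16])
```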

Updates to Configuration

  • Expanded benchmark_config.yml to include configurations for new model types and shape generation options, such as llama and pow2.

Documentation Improvements

  • Updated README.md to provide detailed descriptions of new model types and shape generation options, ensuring users can easily understand and utilize the new features.

These changes collectively enhance the flexibility, maintainability, and usability of the benchmarking framework.

Sample configuration.yml for inference benchmarks

benchmark_mode: "inference"
quantization_config_recipe_names:
  - "float8dq"
  - "float8wo"
output_dir: "benchmarks/microbenchmarks/results"
model_params:
  - name: "ln_linear_sigmoid_cuda"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "ln_linear_sigmoid"
    enable_profiler: true

  - name: "bf16_transformer_block"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],  # For transformer_block, k is the hidden dimension
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "transformer_block" # TODO: Add a custom model (Figure out how to do this, maybe pass a .py file with model definition)
    enable_profiler: true

  - name: "large_bf16_ln_linear"
    matrix_shapes:
      - name: "llama"  # Example of using LLaMa shapes
      - name: "pow2"  # Example of using power of 2 shapes
        min_power: 10  # 1024
        max_power: 12  # 4096
      - name: "pow2_extended"  # Example of using extended power of 2 shapes
        min_power: 10  # 1024
        max_power: 11  # 2048
      - name: "sweep"  # Example of using sweep shapes (commented out as it generates many shapes)
        min_power: 8   # 256
        max_power: 9   # 512
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "linear"
    enable_profiler: true  # Enable profiling for this model

@jainapurva added the `topic: for developers` and `topic: performance` labels on Apr 23, 2025

pytorch-bot bot commented Apr 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2116

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit a750555 with merge base dd5c7b0:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label on Apr 23, 2025

@Copilot Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jainapurva jainapurva requested a review from Copilot April 23, 2025 18:43

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request refactors the microbenchmarking framework by centralizing model creation into torchao.testing.model_architectures, adds new model types with varied activation functions, and extends shape generation options for benchmarking. It also updates tests and documentation accordingly.

  • Centralizes model definitions and input generation via create_model_and_input_data.
  • Implements new shape generation options (custom, llama, pow2, pow2_extended, sweep).
  • Updates tests and documentation to reflect refactoring and new functionality.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file:
  • torchao/testing/model_architectures.py: New model classes and a unified model creation function added/refined
  • test/test_model_architecture.py: Tests updated to use the new model creation function
  • benchmarks/microbenchmarks/utils.py: Removed redundant model definitions and a deprecated function
  • benchmarks/microbenchmarks/test/*: Updated tests to work with refactored models and shape generation
  • benchmarks/microbenchmarks/benchmark_runner.py: Enhanced shape generation and improved error messaging
  • benchmarks/microbenchmarks/benchmark_inference.py: Updated to use the refactored model creation function
  • benchmarks/microbenchmarks/README.md: Documentation updated with new model types and shape generation options
Comments suppressed due to low confidence (2)

torchao/testing/model_architectures.py:164

  • [nitpick] Consider using a stricter regex (e.g., r"ln_linear_(\w+)$") to extract the activation type more precisely from model_type, ensuring it only matches valid activations.
match = re.search(r"ln_linear_?(\w+)?", model_type)
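To make the reviewer's point concrete, a small comparison of the two patterns (both taken verbatim from the comment above; the input strings are illustrative):

```python
import re

loose = re.compile(r"ln_linear_?(\w+)?")   # pattern currently in the PR
strict = re.compile(r"ln_linear_(\w+)$")   # reviewer's stricter suggestion

# Both extract the activation when one is present:
print(loose.search("ln_linear_sigmoid").group(1))   # sigmoid
print(strict.search("ln_linear_sigmoid").group(1))  # sigmoid

# For bare "ln_linear" the loose pattern matches with group(1) == None,
# which every caller must then handle; the strict pattern simply
# fails to match, making the "no activation" case explicit:
print(loose.search("ln_linear").group(1))  # None
print(strict.search("ln_linear"))          # None
```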

torchao/testing/model_architectures.py:141

  • Update the docstring for create_model_and_input_data to reflect the actual parameter names (e.g., 'm' instead of 'batch_size') for clarity and to avoid confusion.
def create_model_and_input_data(

@jainapurva
Copy link
Contributor Author

Duplicates PR: #2036

@jainapurva jainapurva marked this pull request as ready for review April 23, 2025 20:30
@jainapurva jainapurva merged commit 2fcab01 into main Apr 24, 2025
15 of 21 checks passed