# Add LlamaSafetyOptimizer for Runtime Safety Checks and Performance Optimization (#1326)
## Changes Made and Why
I've implemented a new module, `LlamaSafetyOptimizer`, that wraps the existing Llama model to provide safety checks, performance monitoring, and memory optimization. The specific changes are:

- Added a new file `safety/wrapper.py` containing (see the sketch after this list):
  - A `LlamaSafetyOptimizer` class for wrapping Llama models
  - A `PerformanceMetrics` dataclass for tracking performance statistics
  - Methods for safety validation, memory tracking, and batch-size optimization
- Created unit tests to verify the functionality of the new module:
  - Tests for initialization
  - Tests for memory tracking
  - Tests for the safety check mechanisms
  - Tests for the safe forward pass
- Provided a simple example showing how to use the optimizer with an existing Llama model
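For concreteness, here is a minimal sketch of the shape `safety/wrapper.py` takes. Apart from `LlamaSafetyOptimizer` and `PerformanceMetrics`, the field and method names (`track_memory`, `check_output_safety`, `safe_forward`) and the specific thresholds are illustrative, not the exact implementation:

```python
# safety/wrapper.py (abridged sketch; names and thresholds are illustrative)
import time
from dataclasses import dataclass, field
from typing import List

import torch


@dataclass
class PerformanceMetrics:
    """Accumulates per-call performance statistics."""
    inference_times: List[float] = field(default_factory=list)
    memory_usage_mb: List[float] = field(default_factory=list)
    gpu_utilization: List[float] = field(default_factory=list)  # populated when GPU stats are available


class LlamaSafetyOptimizer:
    """Wraps a Llama model with safety validation and performance tracking."""

    def __init__(self, model: torch.nn.Module, enable_safety_checks: bool = True):
        self.model = model
        self.enable_safety_checks = enable_safety_checks
        self.metrics = PerformanceMetrics()

    def track_memory(self) -> float:
        """Return current allocated GPU memory in MB, or 0 when CUDA is absent."""
        if torch.cuda.is_available():
            return torch.cuda.memory_allocated() / (1024 ** 2)
        return 0.0

    def check_output_safety(self, logits: torch.Tensor) -> bool:
        """Flag NaN/Inf values or implausibly large magnitudes in the output."""
        if torch.isnan(logits).any() or torch.isinf(logits).any():
            return False
        return logits.abs().max().item() < 1e4  # heuristic threshold

    @torch.no_grad()
    def safe_forward(self, *args, **kwargs) -> torch.Tensor:
        """Forward pass with timing, memory tracking, and output validation."""
        start = time.perf_counter()
        output = self.model(*args, **kwargs)
        self.metrics.inference_times.append(time.perf_counter() - start)
        self.metrics.memory_usage_mb.append(self.track_memory())
        if self.enable_safety_checks and not self.check_output_safety(output):
            raise RuntimeError("Safety check failed: anomalous model output")
        return output
```

The included example then amounts to wrapping whatever model object you already have; a tiny linear layer stands in for the real Llama model here:

```python
import torch
from safety.wrapper import LlamaSafetyOptimizer

model = torch.nn.Linear(128, 32000)  # stand-in for the real Llama model
wrapped = LlamaSafetyOptimizer(model, enable_safety_checks=True)

logits = wrapped.safe_forward(torch.randn(4, 128))
times = wrapped.metrics.inference_times
print(f"mean inference time: {sum(times) / len(times) * 1000:.2f} ms")
```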
These changes were necessary to enhance the safety and performance monitoring capabilities of Llama models in production environments, where both safety guardrails and resource optimization are critical concerns.
## Project Improvements
This PR improves the project in several key ways:

- **Enhanced Safety:** Adds runtime validation of model outputs to detect potentially problematic generation patterns
- **Resource Optimization:** Automatically finds the optimal batch size based on available memory (see the sketch after this list)
- **Performance Monitoring:** Tracks and reports inference time, memory usage, and GPU utilization
- **Easy Integration:** Designed as a wrapper that can be added to existing models with minimal code changes
- **Testability:** Includes comprehensive unit tests to ensure reliability
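The batch-size search can be pictured roughly like this: grow the batch until a forward pass exhausts memory, then back off. The function name, doubling strategy, and OOM handling below are a sketch, not necessarily the exact implementation:

```python
import torch

def find_optimal_batch_size(model: torch.nn.Module,
                            sample_input: torch.Tensor,
                            max_batch_size: int = 256) -> int:
    """Double the batch size until a forward pass runs out of memory, then back off.

    `sample_input` is expected to have a leading batch dimension of 1.
    """
    batch_size = 1
    while batch_size <= max_batch_size:
        try:
            # Replicate the single sample along the batch dimension.
            batch = sample_input.repeat(batch_size, *([1] * (sample_input.dim() - 1)))
            with torch.no_grad():
                model(batch)
            batch_size *= 2
        except RuntimeError:  # CUDA OOM surfaces as a RuntimeError on most PyTorch versions
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
            break
    return max(1, batch_size // 2)
```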
## Testing Performed
I've conducted the following tests to ensure the new module works correctly:

- **Unit Tests:** Created pytest-based tests for all main components (a sketch follows this list):
  - Initialization with different parameters
  - Memory tracking (CPU, and GPU when available)
  - Safety check algorithms
  - Performance monitoring accuracy
- **Integration Testing:**
  - Tested with a simplified Llama model to verify correct behavior
  - Verified that performance metrics are collected accurately
  - Confirmed that batch-size optimization works as expected
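To give a flavor of the unit tests, they look roughly like this; the test names and the stand-in model are illustrative and the actual test file may differ:

```python
# tests/test_safety_wrapper.py (abridged; names are illustrative)
import pytest
import torch

from safety.wrapper import LlamaSafetyOptimizer, PerformanceMetrics


@pytest.fixture
def optimizer():
    # A small linear layer stands in for the real Llama model.
    return LlamaSafetyOptimizer(torch.nn.Linear(16, 16))


def test_initialization(optimizer):
    assert optimizer.enable_safety_checks is True
    assert isinstance(optimizer.metrics, PerformanceMetrics)


def test_memory_tracking_returns_non_negative(optimizer):
    assert optimizer.track_memory() >= 0.0


def test_safety_check_rejects_nan(optimizer):
    bad = torch.full((2, 4), float("nan"))
    assert optimizer.check_output_safety(bad) is False


def test_safe_forward_records_metrics(optimizer):
    optimizer.safe_forward(torch.randn(3, 16))
    assert len(optimizer.metrics.inference_times) == 1
```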
All tests pass successfully, demonstrating that the module performs as intended.
## Additional Notes
This implementation is designed to be non-intrusive and can be enabled or disabled based on the specific deployment needs. The safety checks are currently based on simple statistical analysis of model outputs, but the framework is extensible to incorporate more sophisticated safety mechanisms in the future.
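As an illustration of the kind of statistical analysis meant here, a check might reject non-finite values, extreme magnitudes, or a collapsed output distribution; the thresholds and the entropy heuristic below are examples, not the exact checks in the PR:

```python
import torch

def output_statistics_look_safe(logits: torch.Tensor,
                                max_abs_value: float = 1e4,
                                min_entropy: float = 0.5) -> bool:
    """Illustrative statistical checks on model outputs."""
    if not torch.isfinite(logits).all():
        return False
    if logits.abs().max() > max_abs_value:
        return False
    # A near-zero-entropy distribution suggests degenerate, repetitive generation.
    probs = torch.softmax(logits.float(), dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=-1)
    return bool(entropy.mean() >= min_entropy)
```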
The memory tracking components are compatible with both CPU-only and GPU environments, with appropriate fallbacks when CUDA is not available.
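The CPU fallback can be as simple as reading the process's resident set size; `psutil` is assumed here as an optional dependency, and the PR may handle this differently:

```python
import torch

def current_memory_mb() -> float:
    """Report allocated GPU memory when CUDA is available, else fall back to process RSS."""
    if torch.cuda.is_available():
        return torch.cuda.memory_allocated() / (1024 ** 2)
    try:
        import psutil  # optional dependency, assumed for the CPU-only fallback
        return psutil.Process().memory_info().rss / (1024 ** 2)
    except ImportError:
        return 0.0
```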
I welcome feedback on:

- The safety metrics implementation: are there additional checks that would be valuable?
- Performance optimization strategies: any suggestions for further reducing memory overhead?
- Any edge cases I might have missed in testing