
Add LlamaSafetyOptimizer for Runtime Safety Checks and Performance Optimization #1326

Open
wants to merge 1 commit into
base: main

Conversation

dhawalagarwal10

Changes Made and Why
I've implemented a new module, LlamaSafetyOptimizer, that wraps an existing Llama model to provide safety checks, performance monitoring, and memory optimization. The specific changes include:

Added a new file safety/wrapper.py containing:

LlamaSafetyOptimizer class for wrapping Llama models
PerformanceMetrics dataclass for tracking performance statistics
Methods for safety validation, memory tracking, and batch size optimization
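The file layout above suggests roughly the following shape. This is a minimal sketch assuming the model is a generic callable; beyond the `LlamaSafetyOptimizer` and `PerformanceMetrics` names taken from the PR, the field and method names are assumptions, not the actual contents of `safety/wrapper.py`:

```python
import time
from dataclasses import dataclass, field

@dataclass
class PerformanceMetrics:
    """Per-call inference statistics (sketch)."""
    inference_times: list = field(default_factory=list)

    @property
    def mean_inference_time(self) -> float:
        times = self.inference_times
        return sum(times) / len(times) if times else 0.0

class LlamaSafetyOptimizer:
    """Hypothetical wrapper: times each forward pass of the wrapped model."""

    def __init__(self, model, enabled: bool = True):
        self.model = model        # any callable, e.g. a Llama model's forward
        self.enabled = enabled    # allows the wrapper to be switched off
        self.metrics = PerformanceMetrics()

    def safe_forward(self, *args, **kwargs):
        if not self.enabled:
            return self.model(*args, **kwargs)
        start = time.perf_counter()
        output = self.model(*args, **kwargs)
        self.metrics.inference_times.append(time.perf_counter() - start)
        return output
```

Usage would then be a one-line change at the call site: construct the wrapper around the existing model and call `safe_forward` instead of the model directly.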

Created unit tests to verify the functionality of the new module:

Tests for initialization
Tests for memory tracking capabilities
Tests for safety check mechanisms
Tests for the safe forward pass

Provided a simple example implementation showing how to use the optimizer with an existing Llama model

These changes were necessary to enhance the safety and performance monitoring capabilities of Llama models in production environments, where both safety guardrails and resource optimization are critical concerns.
Project Improvements
This PR improves the project in several key ways:

Enhanced Safety: Adds runtime validation of model outputs to detect potentially problematic generation patterns
Resource Optimization: Automatically finds the optimal batch size based on available memory
Performance Monitoring: Tracks and reports on inference time, memory usage, and GPU utilization
Easy Integration: Designed as a wrapper that can be added to existing models with minimal code changes
Testability: Includes comprehensive unit tests to ensure reliability
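The batch-size search mentioned under "Resource Optimization" isn't shown in the PR text; one common approach is a doubling search that grows the batch until a probe runs out of memory. The sketch below assumes that strategy, and the `try_batch` probe (which simulates a memory budget) is hypothetical:

```python
def find_max_batch_size(try_batch, start: int = 1, limit: int = 4096) -> int:
    """Double the batch size until try_batch raises MemoryError or limit is hit.

    Returns the largest batch size that succeeded (0 if none did).
    """
    best = 0
    size = start
    while size <= limit:
        try:
            try_batch(size)   # in practice: run one forward pass at this size
            best = size
            size *= 2
        except MemoryError:
            break
    return best

# Simulated probe: batches above 64 "run out of memory".
def try_batch(size: int) -> None:
    if size > 64:
        raise MemoryError
```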

Testing Performed
I've conducted the following tests to ensure the new module works correctly:

Unit Tests: Created pytest-based tests for all main components:

Initialization with different parameters
Memory tracking functionality (CPU and GPU when available)
Safety check algorithms
Performance monitoring accuracy
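The test categories above could be exercised with pytest roughly as follows. Since the real `LlamaSafetyOptimizer` internals aren't reproduced in the PR text, this sketch uses a minimal stand-in with assumed method names:

```python
import math

class StubOptimizer:
    """Stand-in for LlamaSafetyOptimizer with assumed behavior."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold
        self.memory_samples = []

    def track_memory(self, mb: float) -> float:
        """Record a sample and return the peak seen so far."""
        self.memory_samples.append(mb)
        return max(self.memory_samples)

    def is_safe(self, logits) -> bool:
        """Flag outputs containing NaN or infinity as unsafe."""
        return all(math.isfinite(x) for x in logits)

def test_initialization():
    assert StubOptimizer(threshold=2.5).threshold == 2.5

def test_memory_tracking():
    opt = StubOptimizer()
    assert opt.track_memory(100) == 100
    assert opt.track_memory(80) == 100   # peak is retained

def test_safety_check():
    opt = StubOptimizer()
    assert opt.is_safe([0.1, 0.2])
    assert not opt.is_safe([float("nan")])
```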

Integration Testing:

Tested with a simplified Llama model to verify correct behavior
Verified that performance metrics are collected accurately
Confirmed that batch size optimization works as expected

All tests pass successfully, demonstrating that the module performs as intended.
Additional Notes
This implementation is designed to be non-intrusive and can be enabled or disabled to suit specific deployment needs. The safety checks are currently based on simple statistical analysis of model outputs, but the framework is extensible to incorporate more sophisticated safety mechanisms in the future.
The memory tracking components are compatible with both CPU-only and GPU environments, with appropriate fallbacks when CUDA is not available.
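As a sketch of what "simple statistical analysis of model outputs" and the CUDA fallback might look like, under the assumption that outputs are flagged when they contain non-finite values or have an implausible mean or spread (function names and thresholds here are illustrative, not the PR's actual code):

```python
import math
import statistics

def output_is_safe(values, max_abs_mean: float = 10.0, max_std: float = 100.0) -> bool:
    """Flag empty, non-finite, or statistically extreme outputs as unsafe."""
    if not values or not all(math.isfinite(v) for v in values):
        return False
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return abs(mean) <= max_abs_mean and std <= max_std

def gpu_memory_mb() -> float:
    """Report allocated GPU memory in MB, falling back to 0.0 without CUDA."""
    try:
        import torch
        if torch.cuda.is_available():
            return torch.cuda.memory_allocated() / 1e6
    except ImportError:
        pass  # CPU-only environment: no GPU memory to report
    return 0.0
```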
I welcome feedback on:

The safety metrics implementation - are there additional checks that would be valuable?
Performance optimization strategies - any suggestions for further reducing memory overhead?
Any edge cases I might have missed in the testing

@facebook-github-bot

Hi @dhawalagarwal10!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 17, 2025
@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@dhawalagarwal10
Author

All checks have passed. Could someone review and merge this when possible? Thanks!
