
Faster reduce_add on avx #1112


Merged
merged 1 commit into from
Apr 27, 2025

Conversation

serge-sans-paille
Contributor

On AVX, reduce_add now just forwards to the SSE implementation after splitting the register into two halves and summing them. As a side effect, the generic reduce_add is improved too.
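For readers unfamiliar with the "split and sum" idea, here is a minimal sketch using raw intrinsics. It is not the code from this PR: the helper names `reduce_add_sse`, `reduce_add_avx` and `reduce_add_8` are hypothetical, and the 128-bit reduction shown is just one plausible way to write the SSE step. When the compiler is not targeting AVX, a scalar fallback models the same steps so the snippet still builds.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// NOTE: all function names below are hypothetical, for illustration only.

#if defined(__AVX__)
// Stand-in for the SSE path: horizontal sum of 4 floats in a 128-bit register.
inline float reduce_add_sse(__m128 v)
{
    __m128 hi = _mm_movehl_ps(v, v);               // [v2, v3, v2, v3]
    __m128 sum2 = _mm_add_ps(v, hi);               // lane 0 = v0+v2, lane 1 = v1+v3
    __m128 odd = _mm_shuffle_ps(sum2, sum2, 0x1);  // lane 0 = v1+v3
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd));   // (v0+v2) + (v1+v3)
}

// AVX path: split the 256-bit register, add the halves, forward to SSE.
inline float reduce_add_avx(__m256 v)
{
    __m128 low = _mm256_castps256_ps128(v);     // lower 4 floats
    __m128 high = _mm256_extractf128_ps(v, 1);  // upper 4 floats
    return reduce_add_sse(_mm_add_ps(low, high));
}
#endif

// Sums 8 contiguous floats using the split-and-sum scheme.
inline float reduce_add_8(const float* data)
{
#if defined(__AVX__)
    return reduce_add_avx(_mm256_loadu_ps(data));
#else
    // Scalar model of the same steps, used when AVX is not enabled.
    float half[4];
    for (std::size_t i = 0; i < 4; ++i)
        half[i] = data[i] + data[i + 4];             // sum of low and high halves
    return (half[0] + half[2]) + (half[1] + half[3]); // 4-lane reduction
#endif
}
```

The point of this shape is that the 256-bit case needs only one extra add before reusing the 128-bit reduction, so both widths share the same final code path.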
@serge-sans-paille serge-sans-paille force-pushed the feature/reduce-add-float branch from 222c647 to ad31228 Compare April 27, 2025 14:35
@serge-sans-paille
Contributor Author

serge-sans-paille commented Apr 27, 2025

The optimized generated code looks similar, but at least we've improved the generic case and we now have the same code path.

@serge-sans-paille serge-sans-paille merged commit fb25021 into master Apr 27, 2025
120 checks passed
@JohanMabille JohanMabille deleted the feature/reduce-add-float branch May 6, 2025 08:45