Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter unused fragments from peak_df #89

Merged
merged 4 commits into from
Mar 21, 2025
Merged

Filter unused fragments from peak_df #89

merged 4 commits into from
Mar 21, 2025

Conversation

GeorgWa
Copy link
Collaborator

@GeorgWa GeorgWa commented Mar 19, 2025

Add remove_unused_peaks method to MSData_Base for efficient peak dataframe cleanup after spectrum filtering. Includes a numba-accelerated mask generator for high performance and supports both in-place and copy operations with full class inheritance preservation.

This PR was 1.87USD

@GeorgWa GeorgWa requested review from mschwoer and mo-sameh March 19, 2025 17:03
@jalew188
Copy link
Collaborator

Any redundencies with alphabase's remove unused fragments?

Copy link

The implementation adds functionality to remove unused peaks from MS data, which is useful for memory efficiency. The code is generally well-structured with good documentation. The main issues are:

  1. The use of Numba without error handling if Numba is not available (consider adding a fallback implementation)
  2. The get_peaks_to_keep_mask function could be more efficient by using vectorized operations instead of loops
  3. Consider adding parameter validation to the remove_unused_peaks method to check if the required columns exist in the dataframes
  4. The test cases are thorough, but they don't test error cases or edge cases like empty dataframes

Copy link

Number of tokens: input_tokens=11268 output_tokens=2153 max_tokens=4096
review_instructions=''
config={}
thinking: ```
[]

@GeorgWa GeorgWa requested a review from mlorenz49 March 20, 2025 13:41
Copy link

@mlorenz49 mlorenz49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this out on test file. Function is working and properly removing peaks from peak_df upon filtering spectrum_df.

@GeorgWa
Copy link
Collaborator Author

GeorgWa commented Mar 20, 2025

Any redundencies with alphabase's remove unused fragments?

Yes, potentially. This is something which could be improved in the future.

@GeorgWa GeorgWa merged commit d24bd2f into main Mar 21, 2025
2 checks passed
@GeorgWa GeorgWa deleted the remove-unused-peaks branch March 21, 2025 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants