Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow dispatch() during symbolication - 8 seconds of 100% CPU usage checking selectors #5403

Open
mstange opened this issue Mar 25, 2025 · 4 comments

Comments

@mstange
Copy link
Contributor

mstange commented Mar 25, 2025

Profile: https://share.firefox.dev/41V11ym
Profile with JS execution tracer (short excerpt): https://share.firefox.dev/4hKqDnz

I was loading and symbolicating this profile with samply load perf.data: https://share.firefox.dev/4iYLfJN
During symbolication, the same 21 tracks were visible which are visible when you open the uploaded profile.

The profile has 360 libs that need to be symbolicated. requestSymbolsWithCallback does one call to dispatch(requestingSymbolTable(lib)) for each of the 360 libs. And each of those dispatch calls appears to take a full 10ms! That means 3.6 seconds of CPU usage for dispatching all these requestingSymbolTable actions! And another 3.6 seconds once the results have come in.

10ms per dispatch is too much. All the time is spent checking whether we need to re-run selectors.

I haven't dug into which selectors we're checking and how many times which level is being checked.

┆Issue is synchronized with this Jira Task

@mstange
Copy link
Contributor Author

mstange commented Mar 25, 2025

Oh I see, our React tree contains TimelineGlobalTrack and TimelineLocalTrack nodes for every track, not just for visible tracks. And they're all connected components so we need to check if their props changed. So on every dispatch we do some work for every track including hidden tracks.

@mstange
Copy link
Contributor Author

mstange commented Mar 25, 2025

Oh right, and after every action, the redux state object has changed (it now contains the updated list of pending libraries), so all the selectors used in mapStateToProps of the connected track components need to have all their dependencies recursively checked, because every selector caches its previous value based on the old state object. One of the selectors that runs is the getThreadProcessDetails selector for every thread, which depends on the getThread selector, and the getThread selector recently gained some new dependencies. Changing getThreadProcessDetails to use getRawThread instead of getThread improves the performance by more than 2x.

I think the real fix is to stop having a connected component for every thread. We can have a regular, not-connected, component as the child of the <Reorderable>, and then we can have a connected component in there only if the track is actually visible. So the unconnected wrapper component needs to have a prop passed to it that tells it whether it's visible.

But even with 2000 connected components, it's silly how much time we're spending checking dependencies of memoized selectors. We're also paying the cost of checking the cache based on a generic arguments list when we know that the only argument we get is the single redux state object. I think we could write a special cased version of createSelector which only works for that case and it should be more efficient.

@julienw
Copy link
Contributor

julienw commented Mar 25, 2025

Oh right, and after every action, the redux state object has changed (it now contains the updated list of pending libraries), so all the selectors used in mapStateToProps of the connected track components need to have all their dependencies recursively checked, because every selector caches its previous value based on the old state object. One of the selectors that runs is the getThreadProcessDetails selector for every thread, which depends on the getThread selector, and the getThread selector recently gained some new dependencies. Changing getThreadProcessDetails to use getRawThread instead of getThread improves the performance by more than 2x.

I think the real fix is to stop having a connected component for every thread. We can have a regular, not-connected, component as the child of the <Reorderable>, and then we can have a connected component in there only if the track is actually visible. So the unconnected wrapper component needs to have a prop passed to it that tells it whether it's visible.

This was my plan: have the connected component only rendered when its container is marked visible by the intersection observer.
However it wasn't straightforward when I tried it. I don't fully remember what the problem was though (maybe related to the WithSize HOC).

But even with 2000 connected components, it's silly how much time we're spending checking dependencies of memoized selectors. We're also paying the cost of checking the cache based on a generic arguments list when we know that the only argument we get is the single redux state object. I think we could write a special cased version of createSelector which only works for that case and it should be more efficient.

The new version of reselect allows to specify different versions of memoize for the output selector (the result of createSelector) and the result function (the one computing the output from the results of the input selectors). What you suggest is to change the memoize for the output selector only I think? Now that you mention it, I wonder why they don't special-case the case arguments.length == 1 indeed.

Note that we don't use that new version of reselect currently, because they switched to a weak map-based memoize function now, and I decided this would possibly leak data in our cases, and didn't want to look at it.

@mstange
Copy link
Contributor Author

mstange commented Mar 26, 2025

Changing getThreadProcessDetails to use getRawThread instead of getThread improves the performance by more than 2x.

I've made that change in #5405.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants