Blazor - rendering metrics and tracing #61609
# Conflicts: # src/Components/Components/src/PublicAPI.Unshipped.txt
- add tracing
You're adding a lot of metrics here. I think you should do some performance testing. Metrics have a performance overhead: they require some synchronization when incrementing counters and recording values. Having many low-level metrics could cause performance issues.
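One common way to limit that overhead is to skip the timing work entirely when nobody is listening. The following is a minimal sketch of that idea, not the code in this PR: the `EventMetrics` type and the `Example.Components` / `example.components.event.duration` names are placeholders invented for illustration. It relies on `Instrument.Enabled` and on `Stopwatch.GetElapsedTime`, which assumes .NET 7 or later.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical wrapper; the meter and instrument names are placeholders.
public sealed class EventMetrics : IDisposable
{
    private readonly Meter _meter = new("Example.Components");
    private readonly Histogram<double> _eventDuration;

    public EventMetrics()
    {
        _eventDuration = _meter.CreateHistogram<double>(
            "example.components.event.duration",
            unit: "s",
            description: "Duration of processing a browser event.");
    }

    // True only while at least one listener has subscribed to the instrument.
    public bool IsEventDurationEnabled => _eventDuration.Enabled;

    public void Handle(Action handler)
    {
        if (!_eventDuration.Enabled)
        {
            // Fast path: no listener, so no timestamps and no recording.
            handler();
            return;
        }

        long start = Stopwatch.GetTimestamp();
        try
        {
            handler();
        }
        finally
        {
            // Record the elapsed time even when the handler throws.
            _eventDuration.Record(Stopwatch.GetElapsedTime(start).TotalSeconds);
        }
    }

    public void Dispose() => _meter.Dispose();
}
```

This only removes the cost when the instrument is disabled; the cost with a listener attached still needs to be measured with a benchmark.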
I removed a few and kept only the most useful ones. I have 2 remaining issues
I don't know how Blazor circuits are created, but if it's from a Hub method then Activity.Current won't be the HTTP activity. We hop off the HTTP activity on purpose in SignalR: aspnetcore/src/SignalR/server/Core/src/Internal/DefaultHubDispatcher.cs Lines 398 to 403 in 9f2b088
Is that because the HTTP request is still running? I don't think activities show up in the dashboard until they're stopped, and if you're using SignalR you're likely using a websocket request, which is long-running.
I'm capturing
This is it, thank you @BrennanConroy!
It's also a topic to discuss for long-running activities in Blazor.
I think we have 2 ways to deal with them
Right now I have a short+links implementation. I guess developers use OTEL mostly in production, so even the long-running traces would already be recorded. But maybe developers also use it in the inner dev loop? In that case it would be great to have a "trace preview" for things that have started but not stopped yet, so that others don't get confused the same way I did.
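To illustrate what "short + links" can look like with the `System.Diagnostics` APIs, here is a minimal sketch, not the PR's actual code. It starts a short activity as the root of its own trace and attaches an `ActivityLink` to a previously captured context (for example the HTTP activity's). The `LinkedActivitySketch` type and the `Example.Components` source name are assumptions made for the example.

```csharp
using System.Diagnostics;

public static class LinkedActivitySketch
{
    // Hypothetical source name, used only for this illustration.
    public static readonly ActivitySource Source = new("Example.Components");

    // Starts a short activity in a new trace that links back to linkTarget.
    public static Activity? StartLinkedRoot(string name, ActivityContext linkTarget)
    {
        // Detach from the ambient activity. With a default parent context,
        // StartActivity would otherwise parent the new activity to Activity.Current.
        Activity.Current = null;

        return Source.StartActivity(
            name,
            ActivityKind.Internal,
            parentContext: default,
            links: new[] { new ActivityLink(linkTarget) });
    }
}
```

`StartActivity` returns `null` unless an `ActivityListener` (or an OpenTelemetry tracer provider) is listening to the source and samples the activity, so callers need to handle that case. Because the activity is stopped quickly, it shows up in tooling right away, and the link preserves the relationship to the long-running request.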
- cleanup
Adding a general naming one here - |
description: "Total number of exceptions during browser event processing.");

_parametersDuration = _meter.CreateHistogram(
"aspnetcore.components.parameters.duration",
As a neophyte Blazor developer, I don't quite understand why you'd want metrics broken down to the level of parameters. I am probably misunderstanding what this represents?
Blazor "parameters" are properties of a component that can receive values from its parent component, marked with the [Parameter] attribute. They enable data to flow down from parent to child components. When Blazor parameters change, the component goes through a re-rendering cycle. I think it is a well-defined term.
The duration measured here is the act of parameter propagation and the user business logic that is triggered by it.
I think your feedback is that the meaning of each diagnostic instrument needs to be documented after we are done here.
cc @guardrex
Ideally an experienced Blazor developer should be able to make a good guess at the meaning of the metric given just its name. I'm fine to assume those devs understand the meaning of 'parameters'. As described above it sounded like 'parameters' is a noun that doesn't inherently have a notion of time duration associated with it? Perhaps we could name this something like aspnetcore.components.update_parameters.duration? I'm not sure if there is a better term than update that Blazor uses.
aspnetcore.components.update_parameters.duration
Sounds good to me.
I think the granularity here is probably too fine. I don't think we need to be tracking this on a per-component/parameter basis: there could potentially be hundreds of those on a page. I would suggest that we focus on what the end user will see, which is that they take an action and that results in an update to the page. That will admittedly include a network round-trip, but understanding it at the server level is probably sufficient, as that is what is in the developer's control (unlike the network from the browser).
The user code that runs in the triggered events could make async HTTP calls or database calls. If it follows the SELECT N+1 anti-pattern, that would be visible here.
Those problems are currently not easy to diagnose, especially if the components are from different vendors or teams.
I think it's good to know which component was rendered when state changed. How many times and how long it took.
The action they could take based on this data, is to cache/redesign data acquisition in their components or reduce number of components or tree depth. Maybe also reduce percentage of cases that the sub-tree is re-rendered on data propagation.
Maybe we could have a separate meter called Microsoft.AspNetCore.Components.Lifecycle, which would have this finer granularity, and Microsoft.AspNetCore.Components could be for the big events.
@danroth27 thoughts ?
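The appeal of two meters is that a listener can opt in to the coarse one and leave the fine-grained one disabled. A minimal sketch of that opt-in with `MeterListener` follows. The `Example.*` meter and instrument names are placeholders standing in for the names proposed above, and `TwoMeterSketch` is a hypothetical type; real apps would more likely configure this through OpenTelemetry's `AddMeter`.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class TwoMeterSketch
{
    // Placeholder names mirroring the coarse/fine split discussed above.
    public const string CoarseMeterName = "Example.Components";
    public const string FineMeterName = "Example.Components.Lifecycle";

    public static readonly Meter Coarse = new(CoarseMeterName);
    public static readonly Meter Fine = new(FineMeterName);

    public static readonly Counter<long> Navigations =
        Coarse.CreateCounter<long>("example.components.navigation.count");

    public static readonly Histogram<double> UpdateParameters =
        Fine.CreateHistogram<double>("example.components.update_parameters.duration", unit: "s");

    // Subscribes to the coarse meter only and appends the name of every
    // instrument that reports a measurement to 'seen'.
    public static MeterListener ListenToCoarseOnly(List<string> seen)
    {
        var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == CoarseMeterName)
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => seen.Add(instrument.Name));
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => seen.Add(instrument.Name));
        listener.Start();
        return listener;
    }
}
```

With this listener attached, `Navigations.Add(1)` is observed, while `UpdateParameters.Record(...)` stays disabled and is never delivered to the callback.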
I think it's good to know which component was rendered when state changed. How many times and how long it took.
I agree
I'm a bit confused by the distributed tracing part of this. Partly that might be my limited experience with Blazor. I'm not sure what period of time is being measured by the different OnCircuit, OnRoute, OnEvent spans. For example, I know what a Blazor circuit is, but I don't know what 'OnCircuit' is measuring. Is this a span that measures the entire duration of one Blazor circuit? I'll probably have more questions once I understand what each of the spans represents.
I don't think we currently define in our public Blazor docs what a render batch is. The only public mention of render batches that I could find is the CircuitOptions.MaxBufferedUnacknowledgedRenderBatches property. So, it's not clear to me what this value represents or whether it is useful. Should we be measuring something else that is more directly correlated with the publicly documented component lifecycle? Or do we want to introduce the concept of a render batch in our docs?
Again, since render batch isn't currently a publicly defined concept, should we be counting the number of exceptions per some other period?
I assume this is an average duration of all browser event handlers across the entire app regardless of render mode. That seems reasonable as a high-level view of the responsiveness of the app. But what does the "asynchronously" imply? Are synchronous event handlers not included in this metric?
I'm not sure what's included in the "processing" of component parameters. Is this the duration of the
Is this a total count of all page navigations across the entire app regardless of render mode? What would that be used for?
I was under the (perhaps mistaken?) impression that the scenarios using the metrics had already been looked at and appropriate metrics identified. If not, perhaps a good starting point is to identify what diagnostic questions we'd like users to be able to solve here. Usually I'd recommend:
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
P0-P2 - This is a useful angle, thanks!
Note, I also mention
This goes back to my questions about long running activities. We can definitely improve naming.
Right now, the short circuit and route activities mostly serve as something that click event activities can link to, for context.
Maybe we just need to rename it? Anyway, this is more on the troubleshooting side, for a misbehaving component. Producing long diffs/batches leads to network traffic, latency and slow rendering. As I suggested above, we could have a separate namespace for it with a separate opt-in.
We also count exceptions per click/event. But I need to see if the exceptions from batch related problems would appear there.
At the moment this works only for SignalR interactive. I think we could also make it work for form-submit.
I already renamed this and dropped "async". It means including your DB request or whatever async business logic.
Yes, or
Except WASM.
It has the route pattern as a tag/dimension that you can use as a filter. It's a more business-oriented KPI: which of my pages are hot?
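As a rough illustration of using the route as a dimension, the sketch below tallies a navigation counter per route with a `MeterListener`. It is an example under assumptions: the tag key `route` is borrowed from the tracing tags in the PR description and may not match the metric's real tag name, and `RouteHitTally` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical helper that answers "which of my pages are hot?" in-process.
public sealed class RouteHitTally : IDisposable
{
    private readonly MeterListener _listener = new();
    private readonly object _gate = new();
    private readonly Dictionary<string, long> _hitsByRoute = new();

    public RouteHitTally(Instrument navigationCounter)
    {
        _listener.InstrumentPublished = (instrument, l) =>
        {
            // Subscribe to the one counter we were given and nothing else.
            if (ReferenceEquals(instrument, navigationCounter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        _listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            foreach (var tag in tags)
            {
                // "route" is an assumed tag key, see the note above.
                if (tag.Key == "route" && tag.Value is string route)
                {
                    // Measurements can arrive on any thread, so guard the dictionary.
                    lock (_gate)
                    {
                        _hitsByRoute[route] = _hitsByRoute.GetValueOrDefault(route) + value;
                    }
                }
            }
        });
        _listener.Start();
    }

    public long GetHits(string route)
    {
        lock (_gate)
        {
            return _hitsByRoute.GetValueOrDefault(route);
        }
    }

    public void Dispose() => _listener.Dispose();
}
```

A counter would feed it with calls such as `counter.Add(1, new KeyValuePair<string, object?>("route", "/counter"))`. In production the same grouping is normally done by the metrics backend rather than in-process.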
Making the circuit/route activity/trace long-lived has trouble with re-installing them into
If we keep them short, maybe they should be literally 0 ms long: just a context anchor, grouping other traces.
Re Activity names: they are not very visible in the Aspire UI, and
Circuit Activity/trace is created in internal
Route Activity/trace is created in
Regarding click/event: we already have the concept of an event. The activity should be active through the whole duration of
I would like the event Activity to also trigger for form submit, interop calls from JS, and enhanced navigation. Maybe we can change it to
Better rendering metrics
new meter Microsoft.AspNetCore.Components
- aspnetcore.components.navigation.count - Total number of route changes.
- aspnetcore.components.event.duration - Duration of processing browser event asynchronously.
- aspnetcore.components.event.exceptions - Total number of exceptions during browser event processing.
new meter Microsoft.AspNetCore.Components.Lifecycle
- aspnetcore.components.update_parameters.duration - Duration of processing component parameters asynchronously.
- aspnetcore.components.update_parameters.exceptions - Total number of exceptions during processing of component parameters.
- aspnetcore.components.rendering.batch.duration - Duration of rendering batch.
- aspnetcore.components.rendering.batch.exceptions - Total number of exceptions during batch rendering.
Blazor activity tracing
Microsoft.AspNetCore.Components
- Microsoft.AspNetCore.Components.OnCircuit: CIRCUIT {circuitId}
  - tags: circuit.id
- Microsoft.AspNetCore.Components.OnRoute: ROUTE {route} -> {componentType}
  - tags: circuit.id, component.type, route
- Microsoft.AspNetCore.Components.OnEvent: EVENT {attributeName} -> {componentType}.{methodName}
  - tags: circuit.id, component.type, component.method, attribute.name
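For readers unfamiliar with how a name, a display name and tags fit together on an `Activity`, here is a small sketch modeled on the OnEvent entry above. It assumes the first string is the operation name and the `EVENT ...` pattern is the display name, which the description does not state explicitly. The `Example.Components` names and the `EventActivitySketch` type are placeholders, not the PR's code.

```csharp
using System.Diagnostics;

public static class EventActivitySketch
{
    // Placeholder source name for illustration only.
    public static readonly ActivitySource Source = new("Example.Components");

    public static Activity? StartEvent(
        string circuitId, string componentType, string methodName, string attributeName)
    {
        // Returns null when no listener is interested in this source.
        var activity = Source.StartActivity("Example.Components.OnEvent", ActivityKind.Internal);
        if (activity is null)
        {
            return null;
        }

        // Human-readable name shown by trace viewers.
        activity.DisplayName = $"EVENT {attributeName} -> {componentType}.{methodName}";

        // Only pay for tags when the sampler asked for full data.
        if (activity.IsAllDataRequested)
        {
            activity.SetTag("circuit.id", circuitId);
            activity.SetTag("component.type", componentType);
            activity.SetTag("component.method", methodName);
            activity.SetTag("attribute.name", attributeName);
        }

        return activity;
    }
}
```

The caller is expected to dispose the returned activity when the event handler finishes, so that the span's duration covers the handler, including any awaited work.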
Feedback
IMeterFactory to be available in DI
TODO - Metrics need to be documented at https://learn.microsoft.com/en-us/aspnet/core/log-mon/metrics/built-in
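For context on the `IMeterFactory` point, the sketch below shows one way a metrics class can take the factory from DI and how the factory gets registered. It assumes .NET 8 and the `Microsoft.Extensions.Diagnostics` package (part of the ASP.NET Core shared framework). `ComponentsMetricsSketch`, `AddComponentsMetricsSketch` and the `Example.*` names are hypothetical and are not the types added by this PR.

```csharp
using System.Diagnostics.Metrics;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical metrics holder that resolves its Meter through IMeterFactory.
public sealed class ComponentsMetricsSketch
{
    private readonly Counter<long> _navigations;

    public ComponentsMetricsSketch(IMeterFactory meterFactory)
    {
        // The factory owns the Meter's lifetime, so this class does not dispose it.
        Meter meter = meterFactory.Create("Example.Components");
        _navigations = meter.CreateCounter<long>(
            "example.components.navigation.count",
            description: "Total number of route changes.");
    }

    public void Navigated() => _navigations.Add(1);
}

public static class ComponentsMetricsSketchExtensions
{
    public static IServiceCollection AddComponentsMetricsSketch(this IServiceCollection services)
    {
        // AddMetrics registers the default IMeterFactory implementation.
        services.AddMetrics();
        services.AddSingleton<ComponentsMetricsSketch>();
        return services;
    }
}
```

Resolving the meter through the factory, rather than calling `new Meter(...)`, keeps meters scoped to the service provider, which helps isolate measurements in tests that build their own container.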
Out of scope
Contributes to #53613
Contributes to #29846
Feedback for #61516