Production - [Alerting] Servicing jobs in R&D queues alert #5277
Comments
Kusto shows all results relate to use of
I lean toward (1) but am unsure. thoughts on which path to take @premun, @mmitche, @ilyas1974❓
I think the "better" question to ask is why they are running jobs related to servicing in this pool at all. This pool has been around for a couple of years, and this is the first time it was used for this purpose. @mmitche please correct me if I'm wrong, but I seem to remember this pool being created to speed up publishing during release. In my opinion, it shouldn't be used for jobs outside that scope.
the jobs are all publishing-related:
actually, build jobs aren't the issue here. please ignore everything I said above. this is about Helix queues and the problem is perhaps threefold:
current rule shows the following for the past week:
the current query for this rule is
// Note: If you are changing how we filter jobs, remember to make the same changes in the graph and table
let UntrackedQueues = Jobs
| project QueueName = tolower(QueueName)
| where QueueName has_any ('osx','perf','armarch','arm64','arcade','coreappcompat','iot','ppc64le.experimental','s390x')
or QueueName matches regex 'windows.*amd64.android.open'
or QueueName == 'windows.10.amd64.x86.rt'
| distinct QueueName;
Jobs
| where $__timeFilter(Queued)
| where tolower(QueueName) !in (UntrackedQueues)
| extend TargetBranch=parse_json(Properties)["System.PullRequest.TargetBranch"]
| where (Branch contains "/release/" or Branch startswith "release/" or TargetBranch startswith "release/" or TargetBranch contains "/release/") and QueueName !endswith ".svc"
| project JobId, Queued, Repository, Branch, TargetBranch, QueueName
note
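To reason about which jobs trip this rule without going to Kusto, the filter can be mirrored as a small predicate. This is only a rough sketch, not the rule itself: `has_term` approximates Kusto's `has_any` as whole-term matching on lowercased names, the function names are made up for illustration, and `ubuntu.1804.amd64` and `osx.1100.amd64` below are placeholder queue names rather than results from the alert.

```python
import re

# Terms copied from the UntrackedQueues definition in the query above.
UNTRACKED_TERMS = ('osx', 'perf', 'armarch', 'arm64', 'arcade', 'coreappcompat',
                   'iot', 'ppc64le.experimental', 's390x')


def _terms(text):
    """Split a name into lowercase alphanumeric terms."""
    return [t for t in re.split(r'[^a-z0-9]+', text.lower()) if t]


def has_term(queue_name, needle):
    """Approximation of Kusto `has`: the needle's terms must appear as a
    contiguous run of whole terms in the queue name (assumption, not exact)."""
    q, n = _terms(queue_name), _terms(needle)
    return any(q[i:i + len(n)] == n for i in range(len(q) - len(n) + 1))


def is_untracked(queue_name):
    """Mirror of the UntrackedQueues filter."""
    q = queue_name.lower()
    return (any(has_term(q, t) for t in UNTRACKED_TERMS)
            or re.search(r'windows.*amd64.android.open', q) is not None
            or q == 'windows.10.amd64.x86.rt')


def is_release(branch):
    """Mirror of the branch checks; a missing branch never matches."""
    if not branch:
        return False
    b = branch.lower()
    return '/release/' in b or b.startswith('release/')


def would_alert(queue_name, branch, target_branch=None):
    """True when a job would show up in the rule's result set."""
    if is_untracked(queue_name) or queue_name.lower().endswith('.svc'):
        return False
    return is_release(branch) or is_release(target_branch)


if __name__ == '__main__':
    print(would_alert('ubuntu.1804.amd64', 'refs/heads/release/6.0'))
    print(would_alert('ubuntu.1804.amd64.svc', 'refs/heads/release/6.0'))
    print(would_alert('osx.1100.amd64', 'release/6.0'))
```

Read this way, a job only counts when it targets a release branch, runs on a queue that is not in the untracked list, and that queue does not end in `.svc`.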
I don't have a problem with us implementing your above suggestions.
basically, logic in dotnet-helix-service does the redirect and only warns if the new queue name doesn't exist. so, minimal action here unless we update
for now, we can add |
I created a few sub-issues and expect #5291 will clear this alert, allowing us to close it. the other two sub-issues need a bit more work and/or discussion.
/fyi @ilyas1974 I'm leaving this as assigned to me since #5291 is assigned to me and waiting for rollout |
💔 Metric state changed to alerting
Go to rule
@dotnet/dnceng, @dotnet/prodconsvcs, please investigate
Automation information below, do not change
Grafana-Automated-Alert-Id-5aa74f27ef6445ce9d3d8d3d382e7e35