[external API] alerts: the renamening #8169

hawkw · 2025-05-15T19:11:03Z

In conversation with @ahl, we have determined that the external API for webhooks added in #7277 should be changed to focus on "alerts" as the first-class user-facing concept, with "webhooks" as one delivery mechanism for alerts. This way, we can talk about alerts as entities in the API that exist independently of the webhooks that deliver them, and the same alert types can be shared with other alert delivery mechanisms if any are added in the future.

What we currently refer to as "webhook events" and "webhook event classes" are therefore renamed to "alerts" and "alert classes". The current concept of "webhook receivers" is generalized to an "alert receiver" resource, of which webhook receivers are (currently) the only subtype. This way, if we add other mechanisms of delivering alerts in the future (email, first-class Slack integration, etc), we can introduce new subtypes of alert receivers. I've restructured the API to have both /v1/alert-receivers/... and /v1/webhook-receivers/... routes, with operations common to all alert receivers (list, view, add/remove subscriptions, delete) under the alert-receivers route, and operations related to webhook-specific configuration (add/remove secrets, probe, deliveries) under the webhook-receivers route. I've also changed the AlertReceiver view to have a "kind" enum that stores the subtype-specific configuration; currently, this will only ever be "webhook", but I thought it was worth doing this now to make future additions cause less breakage for API consumers.
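As a rough illustration of the "kind" enum described above, here is a minimal sketch of how a receiver view could carry subtype-specific configuration. The type and field names (`AlertReceiverKind`, `WebhookReceiverConfig`, `endpoint`, `describe`) are assumptions made for this example and are not taken from the actual Nexus view types.

```rust
// Hypothetical shapes for illustration only; the names and fields here are
// assumptions, not the real Nexus view types.

/// Subtype-specific configuration carried by every alert receiver.
#[derive(Debug, Clone, PartialEq)]
pub enum AlertReceiverKind {
    /// Currently the only subtype.
    Webhook(WebhookReceiverConfig),
    // A future delivery mechanism would be added as a new variant here.
}

/// Webhook-only configuration (assumed field).
#[derive(Debug, Clone, PartialEq)]
pub struct WebhookReceiverConfig {
    /// URL that alert payloads are delivered to.
    pub endpoint: String,
}

/// Fields common to all receivers, plus the `kind` enum.
#[derive(Debug, Clone, PartialEq)]
pub struct AlertReceiver {
    pub name: String,
    /// Alert classes this receiver is subscribed to.
    pub subscriptions: Vec<String>,
    pub kind: AlertReceiverKind,
}

/// Consumers match on `kind` instead of assuming every receiver is a webhook.
pub fn describe(rx: &AlertReceiver) -> String {
    match &rx.kind {
        AlertReceiverKind::Webhook(cfg) => {
            format!("{}: webhook -> {}", rx.name, cfg.endpoint)
        }
    }
}
```

Because consumers already have to match on `kind`, adding a second variant later extends an existing enum rather than changing the shape of the view.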

This is, admittedly, a somewhat large diff, but fortunately, most of it is just renaming stuff and moving it around. Reviewers can focus more or less exclusively on the changes to the external API routes and models, and maybe the database migrations. Any mistakes while renaming and moving things around have already been caught by the Rust compiler. :)

had to go back and un-rename some columns since apparently CRDB can't do that (sad face!)

i gotta stop forgetting expectorate tests

david-crespo · 2025-05-15T22:35:54Z

I think this is good. One thing that throws me off (but I could definitely get used to it) is that you create a webhook receiver with POST /v1/webhook-receivers but then you list and view it with /v1/alert-receivers. I can see why it works that way, and maybe with two kinds of receiver, the structure would be more obvious. On the other hand it's rare for end users to be looking at a list like what's in nexus_tags.txt. The docs site sidebar is the closest thing, but the structure there is not as visible as I'd like, mostly because we're using plain English for the titles, which I don't think we can avoid.

smklein · 2025-05-16T16:36:34Z

nexus/db-model/src/alert_subscription.rs

+    Clone, Debug, Queryable, Selectable, Insertable, Serialize, Deserialize,
+)]
+#[diesel(table_name = alert_subscription)]
+pub struct AlertRxSubscription {


Note for myself: this was moved from nexus/db-model/src/webhook_rx.rs

nexus/db-queries/src/db/datastore/saga.rs

nexus/external-api/src/lib.rs

nexus/tests/integration_tests/endpoints.rs

nexus/types/src/external_api/shared.rs

Co-authored-by: Sean Klein <sean@oxide.computer>

hawkw · 2025-05-16T17:16:04Z

@david-crespo re:

One thing that throws me off (but I could definitely get used to it) is that you create a webhook receiver with POST /v1/webhook-receivers but then you list and view it with /v1/alert-receivers. I can see why it works that way, and maybe with two kinds of receiver, the structure would be more obvious.

Yeah, I agree that this feels a bit weird.

One thing I considered doing is also having a GET /v1/webhook-receivers route to list/view webhook receivers only, in addition to the GET routes for /v1/alert-receivers. That would return the webhook-specific models, and the view route would 404 if you requested a receiver ID/name that was a type other than webhook. I can see a couple advantages of this: it makes the API feel a bit more "complete", and it also provides you a way to get a webhook-receiver-specific model without having to handle the enum if you know the receiver you want is a webhook receiver (and similarly, it gives you a way to list only webhook receivers). On the other hand, it means we have two separate routes that list/view the same entities, which could be confusing for users, and it requires us to maintain more endpoints. What do you think? Is it worth adding routes like that?
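To make the trade-off concrete, here is a minimal sketch of what a webhook-only view alongside the generic view might look like, including the 404 for non-webhook receivers. Everything in it is hypothetical: the `Email` variant exists only to exercise the not-found path, and `alert_receiver_view`, `webhook_receiver_view`, and `LookupError` are illustrative names, not the real Nexus handlers.

```rust
// Illustrative sketch only; these types and functions are assumptions.

#[derive(Debug, Clone, PartialEq)]
pub enum ReceiverKind {
    Webhook { endpoint: String },
    /// Hypothetical second subtype, present only to show the 404 case.
    Email { address: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct AlertReceiver {
    pub name: String,
    pub kind: ReceiverKind,
}

/// Webhook-specific model: no enum for the caller to handle.
#[derive(Debug, Clone, PartialEq)]
pub struct WebhookReceiver {
    pub name: String,
    pub endpoint: String,
}

#[derive(Debug, PartialEq)]
pub enum LookupError {
    /// Would be surfaced to the client as HTTP 404.
    NotFound,
}

/// Generic view: finds a receiver of any kind by name.
pub fn alert_receiver_view<'a>(
    all: &'a [AlertReceiver],
    name: &str,
) -> Result<&'a AlertReceiver, LookupError> {
    all.iter().find(|rx| rx.name == name).ok_or(LookupError::NotFound)
}

/// Webhook-only view: a receiver of another kind is reported as not found.
pub fn webhook_receiver_view(
    all: &[AlertReceiver],
    name: &str,
) -> Result<WebhookReceiver, LookupError> {
    let rx = alert_receiver_view(all, name)?;
    match &rx.kind {
        ReceiverKind::Webhook { endpoint } => Ok(WebhookReceiver {
            name: rx.name.clone(),
            endpoint: endpoint.clone(),
        }),
        ReceiverKind::Email { .. } => Err(LookupError::NotFound),
    }
}
```

The sketch also shows the cost being weighed: both functions resolve the same underlying entity, so the two routes would need to be kept consistent.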

hawkw · 2025-05-16T17:26:22Z

Oh, @smklein, one other thing: there are a couple of places where we now have tables that have a few columns with unfortunate names ("event_class" and "event_id" rather than "alert_class" and "alert_id") because CRDB doesn't support renaming columns idempotently. Do you think it's worth changing the migrations to drop those tables and create new ones with nicer names, instead of just renaming the table?

schema/crdb/alerts-renamening/up04.sql

ahl

took a pass comparing what we had determined in the CLI with the changes you're proposing to the API

ahl · 2025-05-16T17:28:54Z

nexus/external-api/src/lib.rs

@@ -3660,79 +3693,48 @@ pub trait NexusExternalApi {
    /// queued for re-delivery.
    #[endpoint {
        method = POST,
-        path = "/v1/webhooks/receivers/{receiver}/probe",
-        tags = ["system/webhooks"],
+        path = "/v1/webhook-receivers/{receiver}/probe",


did you consider nesting webhook-receivers under alerts?

yeah, I had initially wanted to do /v1/alert-receivers/webhooks. however, we can't nest routes under a route which can also look up a resource by name. because there's a /v1/alert-receivers/{name-or-id} route, we can't also have a /v1/alert-receivers/webhooks/{name-or-id} route, since it's unclear whether /webhooks should be treated as a receiver name or as a routable path segment.

i did also consider nesting all of this stuff under a top-level /v1/alerts, so /v1/alerts/receivers, /v1/alerts/webhook-receivers, and so on. however, /v1/alerts is also the route for looking up the actual alert resource (currently used for resend but maybe also for actually getting the payload etc in future). alerts are only looked up by UUID and never by name, so we could nest other routes under /v1/alerts, but it felt a bit weird, and i didn't want to put the alert-lookup route under /v1/alerts/alerts because...that's gross.

also, @david-crespo had previously told me that we try to keep the public API as "flat" as possible rather than nesting, and i think the /v1/alerts, /v1/alert-receivers, /v1/alert-deliveries etc structure is closer to what we've done elsewhere? if either of you have suggestions for a better structure given all that, though, i'm all ears!

@david-crespo can you comment? It seems like "as flat as possible... but not flatter" might be the addendum.

In this case I think the middle paragraph is the real constraint, rather than the aesthetic norm about flatness. We don't have anywhere else in the API where we support both /v1/alerts/1f733f7b-b2eb-429c-8b3a-69203cb55cd7 and /v1/alerts/receivers/.... I think that's worse than what we currently have in this PR. Though overall the difficulty here I think is the same one I was pointing to in #8169 (comment): the structure doesn't feel quite natural, though I concede that's an initial impression and I am totally open to the idea that a) it's not yet worth the effort required to get it to feel intuitive, since we don't know how the feature will evolve, and b) initial impression is less important than whether it's usable after spending a little time with it.

There is also the option of double-nesting, i.e. having /v1/alerts/alerts/1f733f7b-b2eb-429c-8b3a-69203cb55cd7 and /v1/alerts/receivers/my-cool-receiver, I suppose. But, the "alerts/alerts" feels unpleasant.

in my humble opinion, no to alerts/alerts

yeah, i really didn't like it either...

ahl · 2025-05-16T17:32:57Z

nexus/external-api/src/lib.rs

    #[endpoint {
-        method = PUT,
-        path = "/v1/webhooks/receivers/{receiver}",
-        tags = ["system/webhooks"],
+        method = POST,
+        path = "/v1/alert-receivers/{receiver}/subscriptions",
+        tags = ["system/alerts"],
    }]
-    async fn webhook_receiver_update(
+    async fn alert_receiver_subscription_add(
        rqctx: RequestContext<Self::Context>,
-        path_params: Path<params::WebhookReceiverSelector>,
-        params: TypedBody<params::WebhookReceiverUpdate>,
-    ) -> Result<HttpResponseUpdatedNoContent, HttpError>;
+        path_params: Path<params::AlertReceiverSelector>,
+        params: TypedBody<params::AlertSubscriptionCreate>,
+    ) -> Result<HttpResponseCreated<views::AlertSubscriptionCreated>, HttpError>;

-    /// Delete webhook receiver
+    /// Remove alert receiver subscription
    #[endpoint {
        method = DELETE,
-        path = "/v1/webhooks/receivers/{receiver}",
-        tags = ["system/webhooks"],
+        path = "/v1/alert-receivers/{receiver}/subscriptions/{subscription}",
+        tags = ["system/alerts"],
    }]


you made what I thought was a good suggestion in the CLI to call these oxide alert receiver subscribe and oxide alert receiver unsubscribe. Do you now prefer subscription add/remove? I think I prefer the former, but my only strong preference is that CLI and API match in this regard.

i like the "subscribe"/"unsubscribe" naming for the CLI. per this comment from @david-crespo we prefer to use verbs like "add"/"remove" in API routes because they're consistent with other API operations, rather than descriptive verbs like "subscribe"/"unsubscribe". personally i think i would strongly prefer the more descriptive verbs in the CLI and don't have strong preferences about whether we should also use that in the API...perhaps it's more important for both to be consistent with each other than to use the more descriptive verbs, i dunno...

Sounds fine; just wanted to make sure we're making the decision eyes open.

i definitely think there's value in using the same verbs in the CLI and the API, because then the CLI becomes a sort of teaching tool for the API: if you've done something manually, you know exactly where to look if you want to do it programmatically. but, i'm not totally sure how strongly we've weighed that against other concerns in the past. are there any cases where we've intentionally chosen to use different verbs (or nouns!) in the CLI, or are they always the same?
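For reference, the correspondence being discussed can be written down as a small lookup. This is only a sketch: the CLI command strings are the ones proposed in this thread, the HTTP methods and paths are the ones shown in the diff above, and the `ApiOperation` type and `api_operation_for` function are hypothetical names introduced for the example.

```rust
/// An HTTP operation in the external API.
#[derive(Debug, PartialEq)]
pub struct ApiOperation {
    pub method: &'static str,
    pub path: &'static str,
}

/// Map a CLI command (as proposed in this thread) to the API operation it
/// would presumably call. The mapping itself is an assumption.
pub fn api_operation_for(cli_command: &str) -> Option<ApiOperation> {
    match cli_command {
        // CLI verb "subscribe" corresponds to the API's "subscription add".
        "oxide alert receiver subscribe" => Some(ApiOperation {
            method: "POST",
            path: "/v1/alert-receivers/{receiver}/subscriptions",
        }),
        // CLI verb "unsubscribe" corresponds to the API's "subscription remove".
        "oxide alert receiver unsubscribe" => Some(ApiOperation {
            method: "DELETE",
            path: "/v1/alert-receivers/{receiver}/subscriptions/{subscription}",
        }),
        _ => None,
    }
}
```

If the CLI and API used the same verbs, a table like this would be unnecessary, which is the "teaching tool" argument made above.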

ahl · 2025-05-16T17:38:39Z

@david-crespo re:

One thing that throws me off (but I could definitely get used to it) is that you create a webhook receiver with POST /v1/webhook-receivers but then you list and view it with /v1/alert-receivers. I can see why it works that way, and maybe with two kinds of receiver, the structure would be more obvious.

What about creation via /v1/alert-receivers/webhook POST?

One thing I considered doing is also having a GET /v1/webhook-receivers route to list/view webhook receivers only, in addition to the GET routes for /v1/alert-receivers.

My suggestion is that we do one or the other for now i.e. "list all receivers" (which happen to just be webhooks) or "list webhook receivers" (which happen to be all receivers). In the future, I can't imagine the utility for listing JUST the webhooks, but I have been known to lack imagination!

That would return the webhook-specific models, and the view route would 404 if you requested a receiver ID/name that was a type other than webhook. I can see a couple advantages of this: it makes the API feel a bit more "complete", and it also provides you a way to get a webhook-receiver-specific model without having to handle the enum if you know the receiver you want is a webhook receiver (and similarly, it gives you a way to list only webhook receivers). On the other hand, it means we have two separate routes that list/view the same entities, which could be confusing for users, and it requires us to maintain more endpoints. What do you think? Is it worth adding routes like that?

Given that a second flavor of webhook is speculation at this point, I would suggest we just pick something and re-evaluate when we have something more concrete in the future.

hawkw · 2025-05-16T18:41:03Z

@david-crespo re:

One thing that throws me off (but I could definitely get used to it) is that you create a webhook receiver with POST /v1/webhook-receivers but then you list and view it with /v1/alert-receivers. I can see why it works that way, and maybe with two kinds of receiver, the structure would be more obvious.

What about creation via /v1/alert-receivers/webhook POST?

Unfortunately we can't have a route like that, as receivers are looked up by name, and there's an ambiguity as to whether /webhook is interpreted as a name of a receiver to look up or a fixed path segment (as i discussed in #8169 (comment))
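The ambiguity can be illustrated with a toy matcher. This is not how the real router is implemented and it ignores HTTP methods entirely; `Route` and `matching_routes` are hypothetical names, and the sketch only shows that a `{receiver}` path parameter and a literal `webhook` segment both match the same path.

```rust
/// How a single-segment path under `/v1/alert-receivers/` could be read.
#[derive(Debug, PartialEq)]
pub enum Route {
    /// `/v1/alert-receivers/{receiver}`: look up a receiver by name or ID.
    ViewReceiver { receiver: String },
    /// Hypothetical `/v1/alert-receivers/webhook` with a fixed segment.
    WebhookCollection,
}

/// Return every route pattern that matches `path` (toy model, methods ignored).
pub fn matching_routes(path: &str) -> Vec<Route> {
    let mut matches = Vec::new();
    if let Some(rest) = path.strip_prefix("/v1/alert-receivers/") {
        // Only single-segment remainders are considered in this sketch.
        if !rest.is_empty() && !rest.contains('/') {
            // The `{receiver}` parameter accepts any single segment...
            matches.push(Route::ViewReceiver { receiver: rest.to_string() });
            // ...so the literal segment `webhook` matches a second pattern too.
            if rest == "webhook" {
                matches.push(Route::WebhookCollection);
            }
        }
    }
    matches
}
```

An ordinary receiver name yields exactly one match, while `/v1/alert-receivers/webhook` yields two, which is the conflict described above.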

hawkw added 16 commits May 8, 2025 14:47

WHEW

b8fdb3a

more pain

0f3c38b

reticulating views

6694d25

rename some internal files

4f4f371

split "webhook" and "alert" fns into separate files

9f6b866

rename absolutely E•V•E•R•Y•T•H•I•N•G

84fd31e

more fixy

57168cf

omdb expectorate update

01cc404

polar should use correct name

b1dd690

update authz endpoint tests

2385fda

update webhook tests

b850c5f

migration to rename DB tables

33abcdd

had to go back and un-rename some columns since apparently CRDB can't do that (sad face!)

rename typed uuid kind i forgot about

904c9e1

more internal renaming

571abf1

Merge branch 'main' into eliza/s/webhook/alert

3e62416

post merge fixup

47b19dc

hawkw requested review from ahl, smklein and david-crespo May 15, 2025 19:11

hawkw added 5 commits May 15, 2025 12:18

upadte expectorate query tests

52c4d77

make clippy happy

63352e6

docs embetterment

953196c

forgot to commit some of the docs embetterment

ac3e846

OH GOD THERES MORE OF THEM

9a6ba64

i gotta stop forgetting expectorate tests

smklein reviewed May 16, 2025

hawkw and others added 3 commits May 16, 2025 09:53

Update shared.rs

ea8feae

Co-authored-by: Sean Klein <sean@oxide.computer>

Update shared.rs

cd89bce

Co-authored-by: Sean Klein <sean@oxide.computer>

Update shared.rs

858c231

Co-authored-by: Sean Klein <sean@oxide.computer>

hawkw added 4 commits May 16, 2025 10:19

fix accidentally renamed tests

a05697c

fix gross attribute formatting

1df6022

remove commented out code

fdadcf3

fix name of migration

e5267f8

hawkw commented May 16, 2025

schema/crdb/alerts-renamening/up04.sql (outdated)

ahl reviewed May 16, 2025

hawkw added 6 commits May 16, 2025 13:17

MAKE THE MIGRATIONS ACTUALLY WORK

10168a0

update openapi again

31b01fd

whoopsie

affced5

make delivery list return an enum

8e8b6d9

move delivery list impl to alerts.rs

e06cf1f

make delivery list and probe APIs not webhook specific

eae40e0

hawkw requested a review from ahl May 19, 2025 18:29

hawkw added 2 commits May 19, 2025 12:16

fix urls in integration tests

f1cd64d

oops the status for timeouts is supposed to be none

56399ec

hawkw force-pushed the eliza/s/webhook/alert branch from 9d384ea to 56399ec May 19, 2025 19:23
