Add docs for the changefeed kafka header option #19588
Conversation
cc @rohan-joshi
{% include_cached new-in.html version="v25.2" %} Use the `headers_json_column_name` option to specify a [`JSONB`]({% link {{ page.version.version }}/jsonb.md %}) column that the changefeed will emit as a Kafka header for each row. You can send metadata, such as routing or tracing information, at the protocol level in the header, separate from the message payload. This allows Kafka brokers or routers to filter on the header metadata without deserializing the payload.
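As a minimal sketch of how the option is wired up, assuming the `customer_updates` table used in the examples below has a `JSONB` column named `kafka_headers` (the column name and schema here are illustrative assumptions, not from the PR diff):

{% include_cached copy-clipboard.html %}
~~~ sql
-- Hypothetical table: the JSONB column holds per-row Kafka headers.
CREATE TABLE customer_updates (
  update_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  customer_id UUID NOT NULL,
  change_description STRING,
  kafka_headers JSONB  -- emitted as Kafka headers, not in the message payload
);

-- Point the changefeed at Kafka and name the headers column.
CREATE CHANGEFEED FOR TABLE customer_updates
  INTO 'kafka://localhost:9092'
  WITH headers_json_column_name = 'kafka_headers';
~~~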
{{site.data.alerts.callout_info}}
The `headers_json_column_name` option is supported with changefeeds emitting [JSON](#json) or [Avro](#avro) messages to [Kafka sinks]({% link {{ page.version.version }}/changefeed-sinks.md %}).
{{site.data.alerts.end}}
> **Review comment:** It's Kafka-only, and Kafka only supports those two formats, but the formats have nothing to do with the headers option.
The Kafka topic receives the message payload containing the row-level change:
~~~json
...
(5 rows)
~~~

> **Review comment:** Can you also show the headers that would be received by Kafka, rather than a query that simulates them? Perhaps something like:
>
> | key | value | headers  |
> |-----|-------|----------|
> | A   | {..}  | x=y, z=q |
You may need to duplicate fields between the message envelope and the headers to support efficient routing and filtering by intermediate systems, such as Kafka brokers, stream processors, or observability tools, while still maintaining the full context of the change in the message for downstream applications.
> **Review comment:** What is this referring to?

> **Review comment:** Can you also note use cases like distributed tracing?
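To make the tracing use case concrete, here is a hedged sketch (reusing the hypothetical `kafka_headers` column from above) of a row that duplicates a routing field into the headers and also carries W3C Trace Context for distributed tracing:

{% include_cached copy-clipboard.html %}
~~~ sql
-- source_system is duplicated into the headers for broker-side routing;
-- traceparent propagates distributed-tracing context (W3C Trace Context).
INSERT INTO customer_updates (customer_id, change_description, kafka_headers)
VALUES (
  '5896dc90-a972-43e8-b69b-8b5a52691ce2',
  'Updated phone number',
  '{"source_system": "crm_mobile", "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}'
);
~~~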
{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO customer_updates (
  ...
~~~
> **Review comment:** Can you make an example where the data isn't as duplicated?
The Kafka topic receives the message payload containing the row-level change:
~~~json
{"after": {"change_description": "Updated phone number", "change_version": "v2", "customer_id": "5896dc90-a972-43e8-b69b-8b5a52691ce2", "operation_type": "update", "source_system": "crm_mobile", "update_id": "39a7bb4c-ee3b-4897-88fd-cfed94558e72", "updated_at": "2025-05-06T14:57:42.378814Z"}}
~~~
> **Review comment:** Can you call out that the headers column is omitted from the payload?
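For illustration, a hedged sketch of what a consumer might see for this message, assuming `update_id` is the primary key (and therefore the Kafka message key) and the hypothetical headers column held `{"source_system": "crm_mobile"}`:

| key | value | headers |
|-----|-------|---------|
| `["39a7bb4c-ee3b-4897-88fd-cfed94558e72"]` | `{"after": {...}}` | `source_system=crm_mobile` |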
@@ -122,6 +122,7 @@ Option | Value | Description
<a name="format"></a>`format` | `json` / `avro` / `csv` / `parquet` | Format of the emitted message. <br><br>`avro`: For mappings of CockroachDB types to Avro types, [refer to the table]({% link {{ page.version.version }}/changefeed-messages.md %}#avro-types) and detail on [Avro limitations]({% link {{ page.version.version }}/changefeed-messages.md %}#avro-limitations). **Note:** [`confluent_schema_registry`](#confluent-schema-registry) is required with `format=avro`. <br><br>`csv`: You cannot combine `format=csv` with the [`diff`](#diff) or [`resolved`](#resolved) options. Changefeeds use the same CSV format as the [`EXPORT`](export.html) statement. Refer to [Export data with changefeeds]({% link {{ page.version.version }}/export-data-with-changefeeds.md %}) for details on using these options to create a changefeed as an alternative to `EXPORT`. **Note:** [`initial_scan = 'only'`](#initial-scan) is required with `format=csv`. <br><br>`parquet`: Cloud storage is the only supported sink. The [`topic_in_value`](#topic-in-value) option is not compatible with `parquet` format.<br><br>Default: `format=json`.
<a name="full-table-name"></a>`full_table_name` | N/A | Use fully qualified table name in topics, subjects, schemas, and record output instead of the default table name. This can prevent unintended behavior when the same table name is present in multiple databases.<br><br>**Note:** This option cannot modify existing table names used as topics, subjects, etc., as part of an [`ALTER CHANGEFEED`]({% link {{ page.version.version }}/alter-changefeed.md %}) statement. To modify a topic, subject, etc., to use a fully qualified table name, create a new changefeed with this option. <br><br>Example: `CREATE CHANGEFEED FOR foo... WITH full_table_name` will create the topic name `defaultdb.public.foo` instead of `foo`.
<a name="gc-protect-expires-after"></a>`gc_protect_expires_after` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Automatically expires protected timestamp records that are older than the defined duration. In the case where a changefeed job remains paused, `gc_protect_expires_after` will trigger the underlying protected timestamp record to expire and cancel the changefeed job to prevent accumulation of protected data.<br><br>Refer to [Protect Changefeed Data from Garbage Collection]({% link {{ page.version.version }}/protect-changefeed-data.md %}) for more detail on protecting changefeed data.
<a name="headers-json-column-name"></a><span class="version-tag">New in v25.2:</span> `headers_json_column_name` | [`STRING`]({% link {{ page.version.version }}/string.md %}) | Specify a [`JSONB`]({% link {{ page.version.version }}/jsonb.md %}) column that the changefeed will emit as a Kafka header for each row. Supported for changefeeds emitting JSON or Avro messages to Kafka sinks. For more details, refer to [Specify a column as a Kafka header]({% link {{ page.version.version }}/changefeed-messages.md %}#specify-a-column-as-a-kafka-header).
> **Review comment:** Ditto re: format.
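As a hedged sketch tying these table rows together (the sink URI and schema registry address are placeholders, and `kafka_headers` is the hypothetical column from above): per the table, `format=avro` requires `confluent_schema_registry`, and the new headers option can be combined with either supported format:

{% include_cached copy-clipboard.html %}
~~~ sql
-- format=avro requires a schema registry, per the options table above.
CREATE CHANGEFEED FOR TABLE customer_updates
  INTO 'kafka://localhost:9092'
  WITH format = avro,
       confluent_schema_registry = 'http://localhost:8081',
       headers_json_column_name = 'kafka_headers';
~~~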
Fixes DOC-12596.

This PR adds docs for the new Kafka header option: a new section on the changefeed messages page, and a row in the options table on the `CREATE CHANGEFEED` page, which links back to the new section example on the messages page.

Preview: https://deploy-preview-19588--cockroachdb-docs.netlify.app/docs/v25.2/changefeed-messages.html#specify-a-column-as-a-kafka-header