sql: distSQLSpecExecFactory.ConstructVectorSearch not using InitAllowingExternalRowData #146125

Open
michae2 opened this issue May 5, 2025 · 2 comments
Labels
A-vector-index branch-master Failures and bugs on the master branch. branch-release-25.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa T-sql-queries SQL Queries Team

Comments

michae2 commented May 5, 2025

In this Slack conversation, @andy-kimball pointed out that sql.(*distSQLSpecExecFactory).ConstructVectorSearch calls span.(*Builder).Init instead of span.(*Builder).InitAllowingExternalRowData, so it hits an internal error when the table uses external row data.
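
To make the difference between the two initializers concrete, here is a minimal toy model in Go. It is only a sketch: `tableDesc`, `spanBuilder`, `initStrict`, and `initAllowingExternalRowData` are hypothetical stand-ins, not the real `span.Builder` API or its signatures, and the real `Init` may well raise an assertion failure rather than return an error. The only behavior taken from this issue is that the strict path fails with "abc uses external row data".

```go
package main

import "fmt"

// tableDesc is a hypothetical stand-in for a table descriptor.
// It is NOT CockroachDB's catalog.TableDescriptor.
type tableDesc struct {
	name            string
	externalRowData bool // true when the rows live outside this tenant's keyspace
}

// spanBuilder is a toy model of a span builder, reduced to the one
// check that matters for this issue.
type spanBuilder struct {
	table *tableDesc
}

// initStrict models the path seen in the stack trace: it refuses
// tables that are backed by external row data.
func (b *spanBuilder) initStrict(t *tableDesc) error {
	if t.externalRowData {
		return fmt.Errorf("internal error: %s uses external row data", t.name)
	}
	b.initAllowingExternalRowData(t)
	return nil
}

// initAllowingExternalRowData models the lenient path: it performs
// the same setup but skips the external-row-data check.
func (b *spanBuilder) initAllowingExternalRowData(t *tableDesc) {
	b.table = t
}

func main() {
	abc := &tableDesc{name: "abc", externalRowData: true}

	var strict spanBuilder
	fmt.Println(strict.initStrict(abc)) // the strict path reports the error

	var lenient spanBuilder
	lenient.initAllowingExternalRowData(abc)
	fmt.Println(lenient.table.name) // the lenient path accepts the table
}
```

In this model, the fix described above corresponds to having the vector search construction call the lenient initializer, as the other read paths that support external row data presumably already do.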

This is pretty tough to hit: it requires SET experimental_distsql_planning = always, a vector search, and external row data. Here's a repro using cockroach demo:

-- 1. create a table with a vector index

SET CLUSTER SETTING feature.vector_index.enabled = on;

CREATE TABLE abc (
  a INT PRIMARY KEY,
  b INT,
  c VECTOR(3),
  VECTOR INDEX (b, c)
);

INSERT INTO abc VALUES (1, 2, '[1, 2, 3]');
ANALYZE abc;
SELECT * FROM abc WHERE b = 2 ORDER BY c <-> '[1, 2, 3]' LIMIT 1;

-- 2. connect to the system tenant and start replication to a new tenant

\demo ls
\connect '<url of system tenant here>'

CREATE VIRTUAL CLUSTER standby FROM REPLICATION OF demoapp ON 'demo://system' WITH READ VIRTUAL CLUSTER;
-- wait until the standby-readonly tenant data_state is ready
SHOW VIRTUAL CLUSTERS;

-- 3. connect to the standby-readonly tenant and try querying the table

\connect '<url of demoapp tenant with demoapp replaced with standby-readonly>'

SHOW TABLES;
SET experimental_distsql_planning = always;
SELECT * FROM abc WHERE b = 2 ORDER BY c <-> '[1, 2, 3]' LIMIT 1;

The internal error looks like this:

demo@[local:/Users/michae2/.cockroach-demo]:26257/standby-readonly/defaultdb> SELECT * FROM abc WHERE b = 2 ORDER BY c <-> '[1, 2, 3]' LIMIT 1;
ERROR: internal error: abc uses external row data
SQLSTATE: XX000
DETAIL: stack trace:
pkg/sql/span/span_builder.go:49: Init()
pkg/sql/distsql_spec_exec_factory.go:1471: ConstructVectorSearch()
bazel-out/darwin_arm64-fastbuild/bin/pkg/sql/opt/exec/explain/plan_gist_factory.og.go:1223: ConstructVectorSearch()
bazel-out/darwin_arm64-fastbuild/bin/pkg/sql/opt/exec/explain/explain_factory.og.go:2108: ConstructVectorSearch()
pkg/sql/opt/exec/execbuilder/relational.go:3979: buildVectorSearch()
pkg/sql/opt/exec/execbuilder/relational.go:273: buildRelational()
pkg/sql/opt/exec/execbuilder/relational.go:2733: buildLookupJoin()
pkg/sql/opt/exec/execbuilder/relational.go:228: buildRelational()
pkg/sql/opt/exec/execbuilder/relational.go:1152: buildProject()
pkg/sql/opt/exec/execbuilder/relational.go:203: buildRelational()
pkg/sql/opt/exec/execbuilder/relational.go:2153: buildTopK()
pkg/sql/opt/exec/execbuilder/relational.go:213: buildRelational()
pkg/sql/opt/exec/execbuilder/builder.go:380: build()
pkg/sql/opt/exec/execbuilder/builder.go:298: Build()
pkg/sql/plan_opt.go:936: runExecBuilder()
pkg/sql/plan_opt.go:285: runExecBuild()
pkg/sql/plan_opt.go:268: makeOptimizerPlan()
pkg/sql/conn_executor_exec.go:3308: makeExecPlan()
pkg/sql/conn_executor_exec.go:2855: dispatchToExecutionEngine()
pkg/sql/conn_executor_exec.go:1080: execStmtInOpenState()
pkg/sql/conn_executor_exec.go:172: func2()
pkg/sql/conn_executor_exec.go:4477: execWithProfiling()
pkg/sql/conn_executor_exec.go:171: execStmt()
pkg/sql/conn_executor.go:2342: func1()
pkg/sql/conn_executor.go:2347: execCmd()
pkg/sql/conn_executor.go:2264: run()
pkg/sql/conn_executor.go:1048: ServeConn()
pkg/sql/pgwire/conn.go:252: processCommands()
pkg/sql/pgwire/server.go:1197: func4()
src/runtime/asm_arm64.s:1223: goexit()

HINT: You have encountered an unexpected error.

Please check the public issue tracker to check whether this problem is
already tracked. If you cannot find it there, please report the error
with details by creating a new issue.

If you would rather not post publicly, please contact us directly
using the support form.

We appreciate your feedback.

Jira issue: CRDB-50399

@michae2 michae2 added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label May 5, 2025

@michae2 michae2 added branch-master Failures and bugs on the master branch. T-sql-queries SQL Queries Team branch-release-25.2 labels May 5, 2025
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries May 5, 2025
@michae2 michae2 added A-cross-cluster-replication Related to cross-cluster replication (PCR or LDR) A-vector-index and removed A-cross-cluster-replication Related to cross-cluster replication (PCR or LDR) labels May 5, 2025

michae2 commented May 5, 2025

(as part of fixing this we should rename span.(*Builder).Init and InitAllowingExternalRowData to something easier to understand...)

@michae2 michae2 moved this from Triage to Bugs to Fix in SQL Queries May 6, 2025
@michae2 michae2 added the O-qa label May 13, 2025