Cluster Joins part 2 - global mode #1527

Open
ianton-ru wants to merge 5 commits into antalya-26.1 from feature/antalya-26.1/json_part2
Conversation

@ianton-ru

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Cluster Joins part 2 - global mode

Documentation entry for user-facing changes

Adds the value global for the setting object_storage_cluster_join_mode.
In queries like

SELECT * FROM iceberg_table(...) JOIN local_table(...) ON ...

when the left table is executed on a cluster (s3Cluster, Iceberg with the object_storage_cluster setting, etc.), data from the right table is extracted and sent to the swarm nodes as temporary tables. The JOIN is then executed on the swarm nodes.
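The data flow described above can be sketched as a toy model. This is not the ClickHouse implementation: the row types and the functions joinOnSwarmNode and globalJoin are hypothetical names invented for illustration, and the "network transfer" is modeled by copying a vector. It only shows the idea that the right table is read once, the same copy is shipped to every swarm node as a temporary table, and each node joins it against its own share of the left table.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical row types, for illustration only.
struct LeftRow { std::uint64_t key; std::string payload; };
struct RightRow { std::uint64_t key; std::string value; };
struct JoinedRow { std::uint64_t key; std::string payload; std::string value; };

// What one swarm node does in this model: build a hash index over the
// temporary table it received, then probe it with its part of the left table.
std::vector<JoinedRow> joinOnSwarmNode(
    const std::vector<LeftRow> & left_part,
    const std::vector<RightRow> & temporary_table)
{
    std::unordered_multimap<std::uint64_t, const RightRow *> index;
    for (const auto & row : temporary_table)
        index.emplace(row.key, &row);

    std::vector<JoinedRow> result;
    for (const auto & left : left_part)
    {
        auto [begin, end] = index.equal_range(left.key);
        for (auto it = begin; it != end; ++it)
            result.push_back({left.key, left.payload, it->second->value});
    }
    return result;
}

// What the initiator does in this model: the right table is read once and
// the same copy is sent to every node; node results are concatenated.
std::vector<JoinedRow> globalJoin(
    const std::vector<std::vector<LeftRow>> & left_parts_per_node,
    const std::vector<RightRow> & right_table)
{
    std::vector<JoinedRow> all;
    for (const auto & left_part : left_parts_per_node)
    {
        // The copy stands in for sending the temporary table over the network.
        std::vector<RightRow> temporary_table = right_table;
        auto rows = joinOnSwarmNode(left_part, temporary_table);
        all.insert(all.end(), rows.begin(), rows.end());
    }
    return all;
}
```

In this model the cost of the mode is one copy of the right table per swarm node, which is why it suits a right table that is small relative to the left one.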

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • Tiered Storage (2h)

@ianton-ru
Author

@codex review

@github-actions
github-actions bot commented Mar 13, 2026

Workflow [PR], commit [e15af07]

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 38e89f4657

Comment on lines +384 to +385
    if (send_external_tables)
        external_tables = query_info.planner_context->getMutableQueryContext()->getExternalTables();

P1: Reset external-table forwarding state per query

send_external_tables is persistent state on the storage object and is only ever set to true in the GLOBAL-join path, so once one query enables it, later reads in other modes still take this branch. In those later reads query_info.planner_context may be null (old-analyzer path), so query_info.planner_context->getMutableQueryContext() can crash; even when it is non-null, external tables are forwarded for unrelated queries. The flag needs to be scoped or reset per read instead of persisting across requests.
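One way to scope the decision per read is sketched below. It is only an illustration of the suggestion, not code from this PR: QueryContext, PlannerContext, QueryInfo, the global_join_mode field, and externalTablesToForward are hypothetical stand-ins rather than the ClickHouse types. The point is that the decision is derived from the QueryInfo of the current read, so nothing on the storage object can leak from one query into the next, and the planner context is checked before it is dereferenced. Throwing when global mode is requested without a planner context is an assumption of this sketch.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration; not the ClickHouse types.
using ExternalTables = std::map<std::string, std::string>;

struct QueryContext { ExternalTables external_tables; };
struct PlannerContext { std::shared_ptr<QueryContext> query_context; };

struct QueryInfo
{
    std::shared_ptr<PlannerContext> planner_context; // may be null
    bool global_join_mode = false;                   // hypothetical per-query flag
};

// Decide what to forward from the state of *this* read only.
// No member flag on the storage object is consulted or modified.
ExternalTables externalTablesToForward(const QueryInfo & query_info)
{
    // Reads in any other mode never forward external tables.
    if (!query_info.global_join_mode)
        return {};

    // Guard the dereference instead of assuming the planner context exists.
    if (!query_info.planner_context || !query_info.planner_context->query_context)
        throw std::logic_error("global join mode requires a planner context");

    return query_info.planner_context->query_context->external_tables;
}
```

With this shape a read in a non-global mode returns an empty set even if an earlier query on the same storage object used global mode.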

Author

send_external_tables can be true only when object_storage_cluster_join_mode='global'. And if allow_experimental_analyzer=0 is combined with object_storage_cluster_join_mode='global', an exception must be thrown in getQueryProcessingStage.
