
Timezone for partitioning of Iceberg tables#1349

Merged
zvonand merged 9 commits into antalya-25.8 from feature/antalya-25.8/timezone_for_partitioning
Mar 17, 2026

@ianton-ru ianton-ru commented Jan 27, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added the setting iceberg_partition_timezone, which specifies the time zone used to compute Iceberg table partitions.

Documentation entry for user-facing changes

Resolves #1299.
When an Iceberg table is created with third-party tools, its data is split into partitions using a specific time zone, UTC in most cases.
ClickHouse, however, performs partition pruning based on the server time zone, so when the server time zone is not UTC, some data can be incorrectly pruned during SELECT queries.
This PR introduces a new setting, iceberg_partition_timezone, which specifies the time zone used for partitioning. This time zone can differ from the server time zone, the session time zone, or the column time zone.
The default value is empty for backward compatibility. An empty value means "use the current time zone", as before.
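
To illustrate the mismatch described above, here is a minimal sketch (not code from this PR). It assumes fixed UTC offsets instead of real time zone rules, and the function name dayOrdinal is hypothetical. It computes the day-since-epoch value that a day partition transform would produce for the same timestamp under two different offsets.

```cpp
#include <cstdint>

// Seconds in one day; a day partition value counts whole days since 1970-01-01.
constexpr std::int64_t kSecondsPerDay = 86400;

// Hypothetical helper: day ordinal of a timestamp as seen in a zone with a
// fixed UTC offset (an assumption; real zones may have DST transitions).
// Floor division keeps timestamps before 1970 correct as well.
std::int64_t dayOrdinal(std::int64_t epoch_seconds, std::int64_t utc_offset_seconds)
{
    const std::int64_t local = epoch_seconds + utc_offset_seconds;
    std::int64_t days = local / kSecondsPerDay;
    if (local % kSecondsPerDay < 0)
        --days; // C++ division truncates toward zero, so adjust negatives
    return days;
}
```

For 2024-01-01 22:30:00 UTC (epoch 1704148200), dayOrdinal(1704148200, 0) is 19723, while dayOrdinal(1704148200, 3 * 3600) is 19724. A row written into the UTC partition for day 19723 can therefore be skipped if pruning evaluates the day in a UTC+3 server time zone.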

Support for writing has been added but is not tested, because writing does not work in the antalya branch or in upstream master (not related to this PR):
#1350
To be investigated later.

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • Tiered Storage (2h)

@ianton-ru
Author

@codex review

@github-actions

github-actions bot commented Jan 27, 2026

Workflow [PR], commit [f30b2b2]

@chatgpt-codex-connector

Codex Review: Something went wrong. Try again later by commenting “@codex review”.

We were unable to download your code in a timely manner.
ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ianton-ru ianton-ru changed the title Feature/antalya 25.8/timezone for partitioning Timezone for partitioning of Iceberg tables Jan 27, 2026
@ianton-ru
Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Chef's kiss.

arthurpassos previously approved these changes Jan 29, 2026
Collaborator

@arthurpassos arthurpassos left a comment

I feel like we should submit this to upstream and get their opinion as well. Otherwise, as far as I can evaluate this PR, it looks good to me.

{
String transform_name;
std::optional<size_t> argument;
std::optional<String> time_zone;
Collaborator

As far as I understand, the TransformAndArgument struct is a generic struct that holds a function name and a single optional argument. It can hold any function, not only date specific functions. I suggest you add a comment explaining why this optional time_zone field exists.
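
A sketch of what such a comment could look like is shown below. This is an illustration, not the code from the PR: the String alias stands in for ClickHouse's own type, and the helpers isTimeBasedTransform and makeTransform, as well as the transform names they check, are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <string>

using String = std::string; // stand-in for ClickHouse's String alias

struct TransformAndArgument
{
    String transform_name;
    std::optional<std::size_t> argument;
    /// Set only for time-based transforms, where the partition value depends
    /// on the time zone in which the timestamp is interpreted. Left empty for
    /// all other transforms and when no explicit partition time zone is given.
    std::optional<String> time_zone;
};

// Hypothetical helper: which transforms depend on a time zone.
bool isTimeBasedTransform(const String & name)
{
    return name == "year" || name == "month" || name == "day" || name == "hour";
}

// Hypothetical helper: attach the time zone only where it is meaningful.
TransformAndArgument makeTransform(
    const String & name, std::optional<std::size_t> argument, const String & partition_timezone)
{
    TransformAndArgument result{name, argument, std::nullopt};
    if (isTimeBasedTransform(name) && !partition_timezone.empty())
        result.time_zone = partition_timezone;
    return result;
}
```

Keeping time_zone optional means non-temporal transforms are unaffected and an empty setting leaves the previous behavior unchanged.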

DECLARE(Bool, serialize_string_in_memory_with_zero_byte, true, R"(
Serialize String values during aggregation with zero byte at the end. Enable to keep compatibility when querying cluster of incompatible versions.
)", 0) \
DECLARE(Timezone, iceberg_partition_timezone, "", R"(
Collaborator

Can you also add a short explanation on how this interacts with server / session timezone settings?

Author

They do not interact. The server or session time zone is used only when this setting is not set, that is, when it has the empty value described below.
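
The fallback rule in this reply can be sketched as follows. This is only an illustration of the described behavior; the function name effectivePartitionTimezone and its parameters are hypothetical and do not come from the PR.

```cpp
#include <string>

// Hypothetical helper: pick the time zone used for partition computations.
// A non-empty iceberg_partition_timezone always wins; an empty value falls
// back to the current (server or session) time zone, as before the PR.
std::string effectivePartitionTimezone(
    const std::string & iceberg_partition_timezone, const std::string & current_time_zone)
{
    return iceberg_partition_timezone.empty() ? current_time_zone : iceberg_partition_timezone;
}
```

With the setting set to "UTC" and a current time zone of "Asia/Istanbul", the sketch returns "UTC"; with the setting empty, it returns "Asia/Istanbul".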

arthurpassos previously approved these changes Feb 3, 2026
@ianton-ru ianton-ru changed the title Timezone for partitioning of Iceberg tables https://github.com/Altinity/ClickHouse/pull/1453 Feb 26, 2026
@ianton-ru ianton-ru changed the title https://github.com/Altinity/ClickHouse/pull/1453 #1453 Feb 26, 2026
@ianton-ru ianton-ru changed the title #1453 Timezone for partitioning of Iceberg tables Feb 27, 2026
@CarlosFelipeOR
Collaborator

CarlosFelipeOR commented Mar 17, 2026

QA Verification

PR Summary

This PR ports the iceberg_partition_timezone setting to antalya-25.8, enabling correct date/time partitioning in Iceberg tables when the server timezone differs from the data timezone (forward-port from antalya-26.1 PR #1453).

Integration Tests (Altinity/ClickHouse CI)

CI run #22498397932

PR test passed:

  • test_database_iceberg/test_partition_timezone.py::test_partition_timezone PASSED in Integration tests (amd_binary, 2/5) (json.html)

Regression tests (release build) also passed:

  • RegressionTestsRelease / Iceberg (1) — ✅ PASSED
  • RegressionTestsRelease / Iceberg (2) — ✅ PASSED

Unrelated failures (confirmed via CI database):

| Test / Job | Status | Root Cause |
| --- | --- | --- |
| Stateless tests (amd_tsan, parallel, 2/2) | 1 FAIL | 03212_variant_dynamic_cast_or_default: pre-existing flaky test. Confirmed failing across PRs #1339, #1201, #1150, #1082, #1097, and master (PR=0) over the past months. No connection to Iceberg or timezone. |
| Stateless tests (amd_tsan, s3 storage, parallel) | ERROR | failure: Start ClickHouse Server. Persistent infrastructure issue on this specific job. Confirmed failing on master (PR=0) continuously throughout March 2026, including on the same day as this PR's CI run. Unrelated to code changes. |

Test Coverage Notes

The integration test covers the primary scenario: partition pruning with iceberg_partition_timezone using DayTransform on TimestampType with a positive timezone offset (Asia/Istanbul, UTC+3). This matches the coverage introduced in PR #1453 for antalya-26.1.

Note: issue #1487 (unquoted timezone in sort key expression) does not affect antalya-25.8, as the sort key code path using getSortingKeyDescriptionFromMetadata() was not present in this version. That bug is specific to antalya-26.1 and addressed by PR #1526.

Conclusion

test_partition_timezone passed in CI. All other failures are confirmed pre-existing/infrastructure issues unrelated to the PR changes. Approved.

@CarlosFelipeOR CarlosFelipeOR added the verified Verified by QA label Mar 17, 2026
@zvonand zvonand merged commit 366fb38 into antalya-25.8 Mar 17, 2026
506 of 510 checks passed
@CarlosFelipeOR
Collaborator

AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Audit update for PR #1349 (Timezone for partitioning of Iceberg tables):

Confirmed defects:

  • No confirmed defects in reviewed scope.

Coverage summary:

  • Scope reviewed: all changed files in PR #1349, focused on Iceberg transform/timezone propagation for partition pruning and writes (parseTransformAndArgument, getASTFromTransform, ChunkPartitioner in IcebergWrites), plus setting declaration/history and new integration test/config.
  • Categories failed: none.
  • Categories passed: setting validation/registration, transform dispatch and argument construction paths, partition-pruning AST generation with timezone literal, write-path partition function execution with timezone argument, error-contract consistency for unsupported transforms, and C++ risk checks in touched paths (lifetime/race/deadlock/exception-safety/UB) with no confirmed issues.
  • Assumptions/limits: static audit only (no runtime execution in this session); conclusions are limited to the PR #1349 diff and directly connected code paths.
