Setting object_storage_remote_initiator #756
Merged
Enmk reviewed May 7, 2025
}
else
{
LOG_TEST(
Member
Is that the case where we request the whole object?
Author
Yes, Iceberg metadata for example:
:) CREATE DATABASE datalake ENGINE = Iceberg('http://rest:8181/v1', 'minio', 'minio123') SETTINGS catalog_type = 'rest', storage_endpoint = 'http://minio:9000/warehouse', warehouse = 'iceberg'
:) SELECT * FROM datalake.`iceberg.bids`
Query id: d1bb9862-c077-403f-9843-94fd28173760
┌───────────────────datetime─┬─symbol─┬────bid─┬────ask─┐
1. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.23 │ 195.45 │
2. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.25 │ 198.5 │
3. │ 2019-08-07 08:35:00.000000 │ AAPL │ 195.23 │ 195.28 │
4. │ 2019-08-07 08:35:00.000000 │ AAPL │ 195.22 │ 195.28 │
5. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.23 │ 195.45 │
6. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.25 │ 198.5 │
└────────────────────────────┴────────┴────────┴────────┘
:) select ProfileEvents['S3GetObject'] from system.query_log where type='QueryFinish' and query_id='d1bb9862-c077-403f-9843-94fd28173760'
┌─arrayElement⋯GetObject')─┐
1. │ 8 │
└──────────────────────────┘
...
grep "Read S3 object" /var/log/clickhouse-server/clickhouse-server.log
2025.05.07 22:38:10.414791 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/00003-ad725ef4-c28e-4ed4-aa4b-2e2aae0716d4.metadata.json, Version: Latest
2025.05.07 22:38:10.416600 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/snap-182060351258856937-0-ff436521-29e9-4437-be5b-eb60f209baa9.avro, Version: Latest
2025.05.07 22:38:10.418360 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/ff436521-29e9-4437-be5b-eb60f209baa9-m0.avro, Version: Latest
2025.05.07 22:38:10.420138 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/6f3e6993-47c9-4556-b70c-c6c48d2ced6f-m0.avro, Version: Latest
2025.05.07 22:38:10.421658 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd-m0.avro, Version: Latest
2025.05.07 22:38:10.426911 [ 767 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-ff436521-29e9-4437-be5b-eb60f209baa9.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427003 [ 762 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-6f3e6993-47c9-4556-b70c-c6c48d2ced6f.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427050 [ 770 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-07/00000-0-f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd.parquet, Version: Latest, Range: 0-1635
I added this for consistency: all requests are counted in ProfileEvents['S3GetObject'], but only some of them showed up in the logs.
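The check above (comparing the S3GetObject counter with the "Read S3 object" log lines) can be scripted. The following is a minimal sketch, assuming the log line format shown in the grep output; the function name count_s3_reads and the SAMPLE list are illustrative and not part of ClickHouse.

```python
import re

# Matches the "Read S3 object" lines from the grep output above.
# The optional trailing "Range: a-b" separates ranged reads from whole-object reads.
LINE_RE = re.compile(
    r"\{(?P<query_id>[0-9a-f-]+)\} <Test> ReadBufferFromS3: Read S3 object\. "
    r"Bucket: (?P<bucket>[^,]+), Key: (?P<key>[^,]+), Version: (?P<version>[^,]+)"
    r"(?:, Range: (?P<range>\d+-\d+))?\s*$"
)


def count_s3_reads(lines, query_id):
    """Count logged S3 reads for one query, split by whole-object vs ranged."""
    counts = {"whole": 0, "ranged": 0}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None or m.group("query_id") != query_id:
            continue
        counts["ranged" if m.group("range") else "whole"] += 1
    counts["total"] = counts["whole"] + counts["ranged"]
    return counts


# Two lines copied from the log excerpt above: one metadata read, one data read.
QUERY_ID = "d1bb9862-c077-403f-9843-94fd28173760"
SAMPLE = [
    "2025.05.07 22:38:10.414791 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> "
    "ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: "
    "data/metadata/00003-ad725ef4-c28e-4ed4-aa4b-2e2aae0716d4.metadata.json, Version: Latest",
    "2025.05.07 22:38:10.426911 [ 767 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> "
    "ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: "
    "data/data/datetime_day=2019-08-09/00000-0-ff436521-29e9-4437-be5b-eb60f209baa9.parquet, "
    "Version: Latest, Range: 0-1643",
]

if __name__ == "__main__":
    print(count_s3_reads(SAMPLE, QUERY_ID))
```

Run against the eight log lines shown above, the "total" value should match the S3GetObject counter: five reads without a Range (metadata) plus three ranged reads (Parquet data).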
ianton-ru pushed a commit that referenced this pull request, Jun 3, 2025: …nitiator Setting object_storage_remote_initiator
ianton-ru pushed a commit that referenced this pull request, Jun 3, 2025: …nitiator Setting object_storage_remote_initiator
ianton-ru pushed a commit that referenced this pull request, Jun 4, 2025: …nitiator Setting object_storage_remote_initiator
Enmk added a commit that referenced this pull request, Jun 4, 2025: …rage_remote_initiator 25.3 Antalya port of #756 - object storage cluster function
ianton-ru pushed a commit that referenced this pull request, Sep 9, 2025: …nitiator Setting object_storage_remote_initiator
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Make remote call of object storage cluster function.
Documentation entry for user-facing changes
Execute query
as
where
swarm_node is a random node from the swarm cluster.
Requirements: the swarm cluster must know about the cluster named swarm. In the 'classic' old way, only the local initiator had to know about swarm.
Also, the method getDataFiles is restored (it was removed as unused in ClickHouse#78775).
And a small optimization: reusing sample_path in StorageObjectStorage (obtained once in StorageObjectStorageCluster), and getting sample_path from metadata in resolveSchemaAndFormat.
The optimization was removed because of strange side effects: inconsistent column type detection (LowCardinality instead of Nullable in some cases).
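The node selection and the requirement described above can be illustrated with a short sketch. This is not the ClickHouse implementation; the function pick_remote_initiator, its inputs, and the host names are hypothetical and only model the two stated rules: the initiator is a random node of the cluster, and that node must itself know the cluster by name.

```python
import random


def pick_remote_initiator(cluster_name, nodes, known_clusters, rng=random):
    """Pick a random node of `cluster_name` to act as the remote initiator.

    nodes: host names that belong to the cluster (hypothetical input).
    known_clusters: mapping of host name -> set of cluster names configured there.
    rng: any object with a `choice` method (the `random` module by default).
    """
    if not nodes:
        raise ValueError(f"cluster {cluster_name!r} has no nodes")
    node = rng.choice(nodes)
    # Requirement from the description: the chosen node must know the cluster
    # itself, because it, not the local server, fans the query out.
    if cluster_name not in known_clusters.get(node, set()):
        raise RuntimeError(f"node {node!r} does not know cluster {cluster_name!r}")
    return node


if __name__ == "__main__":
    # Hypothetical two-node cluster where both nodes have `swarm` configured.
    nodes = ["swarm-node-1", "swarm-node-2"]
    known = {n: {"swarm"} for n in nodes}
    print(pick_remote_initiator("swarm", nodes, known))
```

In the 'classic' mode described above the second check would not apply to the remote nodes, since only the local initiator needs the cluster definition.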