Metrics points and tags/partition index runtime binding#85
Draft
gene-bordegaray wants to merge 13 commits intobranch-52from
Draft
Metrics points and tags/partition index runtime binding#85gene-bordegaray wants to merge 13 commits intobranch-52from
gene-bordegaray wants to merge 13 commits intobranch-52from
Conversation
Includes both lines of work needed for metrics-points-and-tags: - Generic PhysicalExpr runtime binding API (bind_runtime + traversal helpers) - Partition-aware dynamic filter routing for file-partitioned joins (Hash/PartitionIndex/Global) - Datasource wiring for partition-specific DynamicFilter evaluation - Associated tests and sqllogictest updates
…ization and deserialization processes (apache#19437) - Closes apache#18477 This PR adds a new trait for converting to and from Protobuf objects and Physical expressions and plans. - Add `PhysicalExtensionProtoCodec` and default implementation. - Update all methods in the physical encoding/decoding methods to use this trait. - Added two examples - Added unit test Two examples and round trip unit test are added. If users are going through the recommended interfaces in the documentation, `logical_plan_to_bytes` and `logical_plan_from_bytes` they will have no user facing change. If they are instead calling into the inner methods `PhysicalPlanNode::try_from_physical_plan` and so on, then they will need to provide a proto converter. A default implementation is provided. --------- Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
… unique identifiers (apache#20037) Replaces apache#18192 using the APIs in apache#19437. Similar to apache#18192 the end goal here is specifically to enable deduplication of `DynamicFilterPhysicalExpr` so that distributed query engines can get one step closer to using dynamic filters. Because it's actually simpler we apply this deduplication to all `PhysicalExpr`s with the added benefit that we more faithfully preserve the original expression tree (instead of adding new duplicate branches) which will have the immediate impact of e.g. not duplicating large `InListExpr`s.
Informs: datafusion-contrib/datafusion-distributed#180 Closes: apache#20418 Consider this scenario 1. You have a plan with a `HashJoinExec` and `DataSourceExec` 2. You run the physical optimizer and the `DataSourceExec` accepts `DynamicFilterPhysicalExpr` pushdown from the `HashJoinExec` 3. You serialize the plan, deserialize it, and execute it What should happen is that the dynamic filter should "work", meaning 1. When you deserialize the plan, both the `HashJoinExec` and `DataSourceExec` should have pointers to the same `DynamicFilterPhysicalExpr` 2. The `DynamicFilterPhysicalExpr` should be updated during execution by the `HashJoinExec` and the `DataSourceExec` should filter out rows This does not happen today for a few reasons, a couple of which this PR aims to address 1. `DynamicFilterPhysicalExpr` is not survive round-tripping. The internal exprs get inlined (ex. it may be serialized as `Literal`) 2. Even if `DynamicFilterPhysicalExpr` survives round-tripping, during pushdown, it's often the case that the `DynamicFilterPhysicalExpr` is rewritten. In this case, you have two `DynamicFilterPhysicalExpr` which are different `Arc`s but share the same `Inner` dynamic filter state. The current `DeduplicatingProtoConverter` does not handle this specific form of deduping. This PR aims to fix those problems by adding serde for `DynamicFilterPhysicalExpr` and deduping logic for the inner state of dynamic filters. It does not yet add a test for the `HashJoinExec` and `DataSourceExec` filter pushdown case, but this is relevant follow up work. I tried to keep the PR small for reviewers. Yes, via unit tests. `DynamicFilterPhysicalExpr` are now serialized by the default codec
Fixups for the cherry-picked commits from PRs apache#19437, apache#20037, apache#20416, and jayshrivastava#2 to work with branch-52's partition-index APIs: - Update remap_children callers to use instance method signature - Adapt DynamicFilterUpdate::Global enum for new code paths - Add missing partitioned_exprs/runtime_partition fields to new constructors - Remove null_aware field (not on branch-52) - Replace FilterExecBuilder with FilterExec::try_new - Remove non-compiling tests that depend on upstream-only APIs - Fix duplicate imports in roundtrip test file Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Regenerate pbjson.rs from proto definition to fix stale artifacts - Fix cargo fmt issues - Remove unfulfilled #[expect(deprecated)] on CoalesceBatchesExec - Add missing #[test] attribute on partition test - Remove unused EmptyExec import in aggregates tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mic filter deduping to work in custom plan nodes
Remove temporary proto debug logging while wiring routing mode through protobuf encode/decode.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.