GH-35500: [C++][Go][Java][FlightRPC] Add support for result set expiration#36009
GH-35500: [C++][Go][Java][FlightRPC] Add support for result set expiration#36009kou merged 91 commits intoapache:mainfrom
Conversation
|
|
|
It's not completed yet. See TODO list in the description. |
format/Flight.proto
Outdated
There was a problem hiding this comment.
We may want to remove Action:
| message ActionCancelFlightInfoResult { | |
| message CancelFlightInfoResult { |
format/Flight.proto
Outdated
There was a problem hiding this comment.
We may want to remove Cancel:
| enum CancelResult { | |
| enum Result { |
Or we may want to move this to top-level and return this enum directly instead of wrapping by ActionCancelFlightInfoResult message.
There was a problem hiding this comment.
At least in Protobuf/gRPC itself, you can only return messages and not bare enums
format/Flight.proto
Outdated
There was a problem hiding this comment.
CANCEL_RESULT_ prefix may be redundant.
| CANCEL_RESULT_UNSPECIFIED = 0; | |
| UNSPECIFIED = 0; |
There was a problem hiding this comment.
We should use prefix.
https://protobuf.dev/programming-guides/style/#enums
Prefer prefixing enum values instead of surrounding them in an enclosing message.
There was a problem hiding this comment.
Yes, the reason (in case you didn't find it) is that enums in Protobuf use C-style namespacing, which is carried over into generated code. So it will end up as ActionCancelFlightInfoResult::[CANCEL_RESULT_]UNSPECIFIED, and so if you ever need a second enum, you need to prefix the name.
There was a problem hiding this comment.
I moved CancelResult to top-level to reduce ActionCancelFlightInfoResult::... prefix.
Is it OK?
There was a problem hiding this comment.
Can we use CancelStatus or something instead of CancelResult to use Result in CancelResult and ActionCancelFlightInfoResult?
There was a problem hiding this comment.
Ah, to avoid confusion with Arrow Result?
I guess Arrow also uses Status, so both are the same to me. If using CancelStatus is clearer to you, then it sounds good to me.
There was a problem hiding this comment.
Ah, to avoid confusion with Arrow
Result?
No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.
There was a problem hiding this comment.
BTW, we have message Result in Flight.proto. It may confuse us with Arrow Result...
There was a problem hiding this comment.
BTW, we have
message ResultinFlight.proto. It may confuse us with ArrowResult...
Yes, the Flight one actually happened to come first, but it's unfortunate...
No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.
Ok. I think status is also good then.
|
@zeroshade Thanks for the offer! I have the Go implementation in local and I just need to implement one more integration test to cover all cases implemented in the C++ implementation. I'll be able to complete it and push to this branch today. But the Go implementation is needed to be cleaned up. Please review and/or push improvements to this branch after I push the Go implementation. |
ec1d2ad to
c2f2f05
Compare
|
The vote carried: https://lists.apache.org/thread/no26s310qn3v0n5x830d50k598fh0pvr I'll merge this. |
|
Conbench analyzed the 6 benchmark runs on commit There were 7 benchmark results indicating a performance regression:
The full Conbench report has more details. |
… expiration (apache#36009) ### Rationale for this change Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever. ### What changes are included in this PR? Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests. This is not a full retry protocol. Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction: * CancelFlightInfo: Asynchronously cancel the execution of a distributed query. (Replaces the equivalent Flight SQL action.) * RefreshFlightEndpoint: Request an extension of the expiration of a FlightEndpoint. This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions. #### Backward compatibility Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely. If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server). A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes: apache#35500 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This PR removes C++ APIs deprecated in releases before January 1, 2025, following Arrow's deprecation policy (>1 year old deprecations). **Rationale for This Change** Cleans up deprecated APIs per maintainer request in apacheGH-49153. **What Changes Are Included in This PR?** 1. **ServerAuthHandler::IsValid() (old signature)** - **PR**: apache#35378 (merged May 2, 2023) - **Released**: v13.0.0 (August 17, 2023) - **Age**: 2.5 years old - **Removed**: Old IsValid(token, peer_identity) method - **Made pure virtual**: IsValid(context, token, peer_identity) signature - **Replacement**: Implement the new signature with ServerCallContext parameter 2. **decimal() factory function** - **PR**: apache#43957 (merged September 30, 2024) - **Released**: v18.0.0 (October 16, 2024) - **Age**: 1.3 years old - **Removed**: �rrow::decimal(precision, scale) function - **Replacement**: Use smallest_decimal(precision, scale) instead 3. **FlightSqlServerBase::CancelQuery()** - **PR**: apache#36009 (merged July 3, 2023) - **Released**: v13.0.0 (August 17, 2023) - **Age**: 2.5 years old - **Removed**: - CancelQuery() method - ActionCancelQueryRequest struct - CancelResult enum - kCancelQueryActionType action type - **Replacement**: Use CancelFlightInfo() instead 4. **FlightSqlClient::CancelQuery()** - **PR**: apache#36009 (merged July 3, 2023) - **Released**: v13.0.0 (August 17, 2023) - **Age**: 2.5 years old - **Removed**: Client-side CancelQuery() method - **Replacement**: Use CancelFlightInfo() instead **Are These Changes Tested?** Yes. Verified via codebase search that: - All removed symbols have zero remaining internal usages - Derived ServerAuthHandler classes use the new IsValid() signature - Tests already use the replacement APIs **Are There Any User-Facing Changes?** Yes (breaking changes for C++ users): - Code using deprecated APIs must migrate to documented replacements - No behavior changes for code already using replacement APIs Closes apache#49153
Remove 4 deprecated C++ APIs: - decimal() function (v18.0.0, PR apache#43957) - FlightSql CancelQuery() server/client (v13.0.0, PR apache#36009) - cuda::DefaultMemoryMapper() (v16.0.0, PR apache#40699) All deprecated before 2025. Tests updated to use replacement APIs.
Remove 4 deprecated C++ APIs: - decimal() function (v18.0.0, PR apache#43957) - FlightSql CancelQuery() server/client (v13.0.0, PR apache#36009) - cuda::DefaultMemoryMapper() (v16.0.0, PR apache#40699) All deprecated before 2025. Tests updated to use replacement APIs.
Rationale for this change
Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever.
What changes are included in this PR?
Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests.
This is not a full retry protocol.
Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction:
This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions.
Backward compatibility
Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely.
If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server).
A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature.
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes.