Skip to content

feat: add json serde for expressions#553

Open
evindj wants to merge 1 commit intoapache:mainfrom
evindj:unbound_term_serde
Open

feat: add json serde for expressions#553
evindj wants to merge 1 commit intoapache:mainfrom
evindj:unbound_term_serde

Conversation

@evindj
Copy link
Contributor

@evindj evindj commented Feb 2, 2026

This is the second PR in addressing serde for expression for predicate push down. It focuses on unbound transforms and and named references.
Added unit tests to ensure that serialization happens correctly. while name references will map just names like schema columns, unbound transforms will make sure take care of transform serde. for example it will ensure

{ "type": "transform", "transform": "bucket[16]", "term": "id" }

is converted to an object and vice versa.

@evindj evindj force-pushed the unbound_term_serde branch from 602bef1 to 684d85c Compare February 2, 2026 16:51

Result<std::shared_ptr<NamedReference>> NamedReferenceFromJson(
const nlohmann::json& json) {
if (!json.is_string()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a [[unlikely]] hint?

* under the License.
*/

#include <cctype>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this header?

nlohmann::json ToJson(const UnboundTransform& transform) {
auto& mutable_transform = const_cast<UnboundTransform&>(transform);
nlohmann::json json;
json[kType] = std::string(kTransform);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
json[kType] = std::string(kTransform);
json[kType] = kTransform;

I don't think the string ctor around kTransform is necessary. The same applies to other places as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows does not seem to like it when we don't do the explicit copy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm not entirely sure, but I noticed a similar usage here: https://github.com/apache/iceberg-cpp/blob/main/src/iceberg/json_serde.cc#L1295

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the error message in the failed windows ci? Can we figure it out how to fix this since other places are good with it?


Result<std::shared_ptr<UnboundTransform>> UnboundTransformFromJson(
const nlohmann::json& json) {
if (json.is_object() && json.contains(kType) && json[kType] == kTransform &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the build error msg, it seems windows don't like json[kType] == kTransform, you might change this back.

nlohmann::json json;
json[kType] = kTransform;
json[kTransform] = transform.transform()->ToString();
json[kTerm] = std::string(mutable_transform.reference()->name());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
json[kTerm] = std::string(mutable_transform.reference()->name());
json[kTerm] = mutable_transform.reference()->name();

line 146 seems ok, so the same apply to line 148?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I donl think like 146 is ok, after some research, I think what is happening here is the following: MSVC is having hard times deciding what overload for operators to use between
nlohmann::json::operator== (from nlohmann_json library)
std::operator==<char, std::char_traits> for string_view (from MSVC's STL)

a similar confusion is happening for constructors. I can push a another diff to confirm, I will push another version to confirm

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. It seems that Compiler Explorer doesn't currently support libraries with MSVC[1], so we have to rely on the CI to confirm that.

[1] https://godbolt.org/z/4nsKx3o5o

@evindj evindj force-pushed the unbound_term_serde branch 2 times, most recently from 2c7a523 to 6e6031b Compare February 5, 2026 21:56
@evindj evindj force-pushed the unbound_term_serde branch from 6e6031b to f7a9960 Compare February 5, 2026 22:14
class Literal;
class Term;
class UnboundPredicate;
class NamedReference;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's sort them alphabetically.

EXPECT_FALSE(IsUnaryOperation(Expression::Operation::kTrue));
}

TEST(ExpressionJsonTest, NameReferenceRoundTrip) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try to reuse TestJsonConversion to avoid boilerplate of round trip test like this?

void TestJsonConversion(const T& obj, const nlohmann::json& expected_json) {
auto json = ToJson(obj);
EXPECT_EQ(expected_json, json) << "JSON conversion mismatch.";
// Specialize FromJson based on type (T)
auto obj_ex = FromJsonHelper<T>(expected_json);
EXPECT_TRUE(obj_ex.has_value()) << "Failed to deserialize JSON.";
EXPECT_EQ(obj, *obj_ex.value()) << "Deserialized object mismatch.";
}
? We can use a simple for-loop or even parameterized test pattern so we don't need to repeat this when adding new expr serde later.

nlohmann::json ToJson(const UnboundTransform& transform) {
auto& mutable_transform = const_cast<UnboundTransform&>(transform);
nlohmann::json json;
json[kType] = std::string(kTransform);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the error message in the failed windows ci? Can we figure it out how to fix this since other places are good with it?

Comment on lines +139 to +140
ICEBERG_ASSIGN_OR_RAISE(auto ref, NamedReference::Make(json.get<std::string>()));
return std::shared_ptr<NamedReference>(std::move(ref));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ICEBERG_ASSIGN_OR_RAISE(auto ref, NamedReference::Make(json.get<std::string>()));
return std::shared_ptr<NamedReference>(std::move(ref));
return NamedReference::Make(json.get<std::string>());

return json;
}

Result<std::shared_ptr<UnboundTransform>> UnboundTransformFromJson(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please call this in the ExpressionFromJson in this PR and verify it works in the test case? It is still unclear to me whether the checks below is worth because we should already know that json is an unbound transform and those checks are performed in the ExpressionFromJson already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants