perf: Optimize scalar path for ltrim function by kumarUjjawal · Pull Request #20032 · apache/datafusion

kumarUjjawal · 2026-01-27T16:06:52Z

Which issue does this PR close?

Part of [EPIC] Optimize performance for slow expressions datafusion-comet#2986

Rationale for this change

ltrim currently routes scalar inputs through make_scalar_function, which converts scalar values into size-1 arrays and then converts the result back. This adds avoidable overhead in constant-folding / scalar evaluation scenarios.

What changes are included in this PR?

Add a match-based scalar fast path in LtrimFunc::invoke_with_args
Handles Utf8, Utf8View, and LargeUtf8
Handles early null returns for scalar null inputs
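To make the intent of these changes concrete, here is a minimal sketch that contrasts the round trip described in the rationale with a direct match on the scalar. It does not use DataFusion's types: `Value`, `ltrim_array`, `ltrim_via_array`, and `ltrim_fast` are hypothetical stand-ins, the only string type is `String`, and trimming is simplified to stripping leading spaces.

```rust
/// Hypothetical, strings-only stand-in for a columnar value:
/// either one (nullable) scalar or a column of nullable values.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(Option<String>),
    Array(Vec<Option<String>>),
}

/// Array kernel: left-trim spaces from every non-null element.
fn ltrim_array(values: &[Option<String>]) -> Vec<Option<String>> {
    values
        .iter()
        .map(|v| v.as_deref().map(|s| s.trim_start_matches(' ').to_string()))
        .collect()
}

/// Generic path: a scalar is viewed as a size-1 array, run through the
/// array kernel, and the single result is unwrapped again.
pub fn ltrim_via_array(arg: &Value) -> Value {
    match arg {
        Value::Scalar(s) => {
            let out = ltrim_array(std::slice::from_ref(s));
            Value::Scalar(out.into_iter().next().flatten())
        }
        Value::Array(a) => Value::Array(ltrim_array(a)),
    }
}

/// Fast path: match on the scalar directly, returning early on null,
/// and fall back to the array kernel only for array input.
pub fn ltrim_fast(arg: &Value) -> Value {
    match arg {
        Value::Scalar(None) => Value::Scalar(None),
        Value::Scalar(Some(s)) => {
            Value::Scalar(Some(s.trim_start_matches(' ').to_string()))
        }
        Value::Array(a) => Value::Array(ltrim_array(a)),
    }
}
```

Both functions should return the same result for any input; the fast path only skips the intermediate one-element vector, which is why any gain is expected to be small.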

Benchmark	Before	After	Change
`ltrim/scalar_utf8`	116.12 ns	112.58 ns	−3.1%
`ltrim/scalar_utf8view`	116.37 ns	112.05 ns	−3.7%

Are these changes tested?

Yes

Are there any user-facing changes?

No

kumarUjjawal · 2026-01-27T17:58:51Z

Made changes to common.rs as I wanted to work on the other trim functions in separate PRs, and it would be handy to have the shared code.

Jefffrey · 2026-01-28T13:37:19Z

datafusion/functions/src/string/ltrim.rs

+            .iter()
+            .any(|v| matches!(v, ColumnarValue::Scalar(s) if s.is_null()))
+        {
+            if args.iter().any(|v| matches!(v, ColumnarValue::Array(_))) {


We shouldn't return a columnar array here; it should just be columnar scalar only
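One way to read this suggestion, sketched with hypothetical stand-in types rather than the actual DataFusion API: when any scalar argument is null, the early return produces a null scalar whether or not another argument is an array. The sketch assumes the caller can broadcast a scalar result to the batch length where needed; `Value` and `null_short_circuit` are illustrative names only.

```rust
/// Hypothetical, strings-only stand-in for a columnar value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(Option<String>),
    Array(Vec<Option<String>>),
}

/// Returns `Some(null scalar)` if any argument is a null scalar,
/// otherwise `None` so the caller continues with normal evaluation.
/// No array of nulls is built, even when another argument is an array.
pub fn null_short_circuit(args: &[Value]) -> Option<Value> {
    let has_null_scalar = args.iter().any(|v| matches!(v, Value::Scalar(None)));
    has_null_scalar.then(|| Value::Scalar(None))
}
```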

Jefffrey · 2026-01-28T13:41:12Z

datafusion/functions/src/string/ltrim.rs

+        match args[0].data_type() {
            DataType::Utf8 | DataType::Utf8View => make_scalar_function(
                ltrim::<i32>,
                vec![Hint::Pad, Hint::AcceptsSingular],


It seems a bit surprising we get such a speedup when make_scalar_function with hints should already ensure we don't expand the arrays 🤔

I will run another benchmark to make sure I didn't mix up μs and ns.

Maybe benchmark the null fast path changes separately from this scalar fast path change as well

I ran another benchmark; I had messed up the previous one. It now shows only a ~3% speedup.

Could you amend the results in the PR body in that case, since they seem to misrepresent the performance improvement of this PR?

Jefffrey · 2026-01-28T13:41:38Z

datafusion/functions/src/string/ltrim.rs

+        }
+
+        // Scalar fast path
+        if args.iter().all(|v| matches!(v, ColumnarValue::Scalar(_))) {


Using an iter check on this when we have at most 2 arguments is overkill; just use a match
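As a rough illustration of this suggestion, a slice pattern can cover the one-argument and two-argument all-scalar cases directly, with no iterator scan. The types and the function below are hypothetical stand-ins, not the code from this PR; trimming is simplified to `String` values, with a space as the default trim character.

```rust
/// Hypothetical, strings-only stand-in for a columnar value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(Option<String>),
    Array(Vec<Option<String>>),
}

/// Scalar-only ltrim. Returns `None` when any argument is an array
/// (or the arity is unexpected) so the caller can use the array path.
pub fn ltrim_scalar(args: &[Value]) -> Option<Value> {
    match args {
        // ltrim(str): strip leading spaces.
        [Value::Scalar(s)] => Some(Value::Scalar(
            s.as_deref().map(|s| s.trim_start_matches(' ').to_string()),
        )),
        // ltrim(str, chars): strip any leading character found in `chars`.
        [Value::Scalar(s), Value::Scalar(chars)] => {
            let out = match (s, chars) {
                (Some(s), Some(chars)) => {
                    let set: Vec<char> = chars.chars().collect();
                    Some(s.trim_start_matches(set.as_slice()).to_string())
                }
                // A null in either position yields a null result.
                _ => None,
            };
            Some(Value::Scalar(out))
        }
        _ => None,
    }
}
```

The match arms double as the arity check, so the "all arguments are scalars" condition and the argument destructuring happen in one step.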

Would subsequent PRs for the other trim functions be worthwhile, or should I leave those?

Can we identify which changes in this PR actually lead to the speedup: the null handling or the new scalar fast path? I'm hesitant about this PR because, although we have small gains, the scalar fast path in particular is confusing: it tacks on a new scalar fast path while the existing fast path still remains (but would go unused?).

I benchmarked all the possible combinations: the scalar path provides negligible speedup, and the common.rs changes are net negative. We are better off dropping this PR.

Thanks for checking this 👍

Should I close this?

Yes I think that would be best

Thanks @Jefffrey for your time. I should have done the benchmark properly from the start.

kumarUjjawal added 3 commits January 27, 2026 16:01

perf: Optimize scalar path for ltrim function

8833749

Early return for scalar null arguments

d8da31c

remove unused code

9f94ecf

github-actions bot added the functions Changes to functions implementation label Jan 27, 2026

Jefffrey reviewed Jan 28, 2026

use match and remove columan return

9707a1e

kumarUjjawal closed this Jan 30, 2026
