perf: some python optimizations #21

Draft
jonmmease wants to merge 5 commits into fix/reindent-quadratic-perf from jonmmease/more-optimizations

jonmmease commented Feb 24, 2026

Had Claude do some more performance investigation on top of #18. It found some wins, but nothing that improved any benchmark by more than 33%, so I'm on the fence about whether it's worth the review effort. I wanted to document it here anyway.


Summary

Five independent pure-Python performance optimizations, each in its own commit for easy bisecting:

  1. OPT-1: Combined regex lexer (lexer.py) — Replace per-pattern inner loop with single alternation regex. Dollar-quoted strings handled separately due to backreference.
  2. OPT-13: Batch list.insert in reindent (reindent.py) — Replace repeated list.insert/del mutations with two-pass scan-then-rebuild.
  3. OPT-6: Skip str(self) in TokenList.__init__ (sql.py) — Eliminate O(n) flatten during construction; value becomes a computed @property.
  4. OPT-8: Deferred compaction in group_tokens (sql.py, grouping.py) — Use None sentinels instead of O(n) list slicing per group operation; compact once per pass.
  5. OPT-11: Specialize imt() at hot call sites (grouping.py, sql.py) — Inline isinstance/ttype checks at hot call sites to eliminate function call overhead.

Benchmark results (base → optimized)

| Benchmark | Base (ms) | Current (ms) | Change |
|---|---:|---:|---:|
| large_in_list | 5785.8 | 3861.3 | -33.3% |
| large_insert | 2357.8 | 1896.2 | -19.6% |
| many_joins | 74.7 | 61.0 | -18.3% |
| insert_scaling[25k] | 2279.9 | 1914.5 | -16.0% |
| insert_scaling[5k] | 357.6 | 326.5 | -8.7% |
| wide_select | 147.1 | 142.9 | -2.9% |
| insert_scaling[10k] | 764.1 | 747.3 | -2.2% |
| heavy_formatting | 49.1 | 48.3 | -1.7% |
| complex_where | 569.4 | 560.3 | -1.6% |
| mixed_batch | 121.2 | 119.4 | -1.5% |

Biggest wins are on large flat-list benchmarks where OPT-8 (deferred compaction) and OPT-13 (batch reindent) eliminate O(n²) behavior.

🤖 Generated with Claude Code

jonmmease and others added 5 commits February 24, 2026 14:48
Replace the per-position inner loop over ~43 individual regex patterns
with a single combined alternation regex using named groups. This
eliminates millions of Python-level re.Pattern.match calls on large inputs.

The dollar-quoted string pattern (which uses a backreference) is handled
separately since backreferences break in combined alternation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
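
For readers who don't know the technique, here is a minimal sketch of a combined-alternation lexer. It is not the code in this commit: `TOKEN_SPECS`, `DOLLAR_QUOTED`, and `tokenize` are hypothetical names, and the five patterns stand in for the real lexer's ~43. Each pattern is wrapped in a named group so that `Match.lastgroup` reports which alternative matched, and the dollar-quoted pattern stays separate because it relies on a backreference.

```python
import re

# Hypothetical, heavily reduced token table (illustration only).
TOKEN_SPECS = [
    ("WHITESPACE", r"\s+"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("STRING", r"'(?:[^']|'')*'"),
    ("PUNCT", r"[(),;*=<>.]"),
]

# One alternation regex: a single match call per position instead of
# trying each pattern in a Python-level loop. Alternatives are tried in
# order, so the first matching pattern still wins.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPECS)
)

# Kept out of the alternation: \1 must refer to the opening $tag$.
DOLLAR_QUOTED = re.compile(r"(\$\w*\$).*?\1", re.DOTALL)


def tokenize(text):
    """Return a list of (kind, value) pairs for the given text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match, kind = None, None
        if text[pos] == "$":
            match = DOLLAR_QUOTED.match(text, pos)
            kind = "DOLLAR_STRING"
        if match is None:
            match = COMBINED.match(text, pos)
            kind = match.lastgroup if match else "ERROR"
        if match is None:
            # No pattern matched: emit one error character and move on.
            tokens.append((kind, text[pos]))
            pos += 1
        else:
            tokens.append((kind, match.group()))
            pos = match.end()
    return tokens
```

Calling `tokenize("SELECT $t$a b$t$, 42")` exercises both the combined regex and the separate dollar-quoted path.
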
Replace O(n) insert_before + del mutations in _split_kwds and
_split_statements with a two-pass scan-then-rebuild approach. Pass 1
scans the unmodified token list to collect all edit operations. Pass 2
builds the new token list in a single O(n) sweep.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
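
The scan-then-rebuild idea can be illustrated on a plain list. This is a sketch under simplifying assumptions, not the reindent code itself: tokens are strings, and `insert_before_naive`, `insert_before_batched`, and `should_split` are hypothetical names. The naive version pays an O(n) shift for every `list.insert`, while the batched version records edit positions first and then builds the result in one sweep.

```python
def insert_before_naive(tokens, should_split, sep):
    """Insert sep before every matching token by mutating in place."""
    out = list(tokens)
    i = 0
    while i < len(out):
        if should_split(out[i]):
            out.insert(i, sep)  # shifts every later element: O(n) per insert
            i += 1              # skip over the separator just inserted
        i += 1
    return out


def insert_before_batched(tokens, should_split, sep):
    """Same result, built in two passes with no in-place insertion."""
    # Pass 1: scan the unmodified list and record where edits belong.
    edit_points = {i for i, tok in enumerate(tokens) if should_split(tok)}
    # Pass 2: rebuild the list in a single O(n) sweep.
    out = []
    for i, tok in enumerate(tokens):
        if i in edit_points:
            out.append(sep)
        out.append(tok)
    return out


KEYWORDS = {"FROM", "WHERE"}
example = ["SELECT", "a", "FROM", "t", "WHERE", "x"]
naive = insert_before_naive(example, KEYWORDS.__contains__, "\n")
batched = insert_before_batched(example, KEYWORDS.__contains__, "\n")
```

Both functions produce the same list; only the cost on long inputs with many edit points differs.
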
Make TokenList.value a computed property instead of a stale cache.
Previously, __init__ called str(self), which flattened all descendants
in O(n) time. Now value is computed lazily, only when accessed.

- Add value property to TokenList that computes from children
- Use str(token) instead of token.value at sites with TokenList args
- Fix StripCommentsFilter to not recurse into Comment groups
  (previously relied on stale .value cache)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
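
A rough sketch of the cached-versus-computed difference, using simplified stand-in classes rather than the actual `sql.py` definitions (the constructor signatures here are assumptions made for illustration). The point is that construction no longer walks the tree, and the value cannot go stale when children change.

```python
class Token:
    """Leaf token: the value is stored directly."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value


class TokenList(Token):
    """Group token: the value is derived from the children on demand."""

    def __init__(self, tokens):
        # Deliberately no str(self) here, so construction does not
        # flatten all descendants.
        self.tokens = list(tokens)

    @property
    def value(self):
        # Computed lazily on each access, so it always reflects the
        # current children.
        return "".join(str(tok) for tok in self.tokens)


inner = TokenList([Token("a"), Token(" ")])
outer = TokenList([inner, Token("b")])
before = outer.value
inner.tokens.append(Token("!"))  # mutate a nested group after construction
after = outer.value
```

The trade-off is that every access to `value` on a group now costs a traversal, which is why call sites that read it repeatedly matter.
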
Replace O(n) list slicing in group_tokens with None sentinels.
Consumed tokens are set to None instead of deleted, keeping list
length stable. Compact all sentinels after each grouping pass.

- group_tokens: use None sentinels instead of slice assignment
- Reverse-scan optimization for extend case avoids O(n^2)
- _compact_all: recursive cleanup after each pass in group()
- Add None guards in _token_matching, flatten, get_sublists
- _group_matching/_group: use live-list iteration with None skips

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
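
The sentinel approach can be sketched on a flat list. This is an illustration only: `group_range` and `compact` are hypothetical helpers, tokens are plain strings, and groups are tuples. Grouping overwrites consumed slots with `None` so the list never shifts, and one compaction at the end of the pass removes all sentinels together.

```python
def group_range(tokens, start, end, make_group):
    """Collapse tokens[start:end] into one group stored at index start.

    Consumed slots become None, so the list length and all later
    indices stay stable during the pass.
    """
    members = [tok for tok in tokens[start:end] if tok is not None]
    tokens[start] = make_group(members)
    for i in range(start + 1, end):
        tokens[i] = None  # sentinel: cleaned up later by compact()


def compact(tokens):
    """Drop all sentinels in a single O(n) sweep, in place."""
    tokens[:] = [tok for tok in tokens if tok is not None]


toks = ["(", "a", ")", "(", "b", ")"]
group_range(toks, 0, 3, tuple)
group_range(toks, 3, 6, tuple)  # index 3 is still valid: nothing shifted
length_before_compaction = len(toks)
compact(toks)
```

Any code that walks the list between grouping and compaction has to skip `None` entries, which is what the added guards in this commit are for.
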
Replace generic imt() dispatch with inline isinstance/ttype checks at
hot call sites in grouping.py and sql.py. Remove imt import from
grouping.py. imt() is kept in utils.py for cold paths and external use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
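
To show what inlining means here, the sketch below compares a generic helper with the equivalent inlined check. `imt_like` is a simplified stand-in written for this example, not the real `imt()` from utils.py, and the `Token`/`Identifier` classes and string ttypes are likewise assumptions.

```python
class Token:
    def __init__(self, ttype, value):
        self.ttype = ttype
        self.value = value


class Identifier(Token):
    pass


def imt_like(token, i=None, t=None):
    """Generic check: is token an instance of i, or is its ttype in t?"""
    if token is None:
        return False
    if i is not None and isinstance(token, i):
        return True
    if t is not None:
        ttypes = t if isinstance(t, (list, tuple)) else (t,)
        if token.ttype in ttypes:
            return True
    return False


def count_generic(tokens):
    # One extra function call plus argument handling per token.
    return sum(1 for tok in tokens if imt_like(tok, i=Identifier, t="Name"))


def count_inlined(tokens):
    # Same condition written out at the call site: no helper call.
    count = 0
    for tok in tokens:
        if isinstance(tok, Identifier) or tok.ttype == "Name":
            count += 1
    return count


sample = [
    Token("Name", "a"),
    Token("Punctuation", ","),
    Identifier(None, "b"),
    Token("Keyword", "FROM"),
]
```

The inlined form gives up generality for speed, so it only pays off in loops that run once per token on large inputs; timing both versions with `timeit` on representative data is the way to check.
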
@jonmmease jonmmease changed the base branch from master to fix/reindent-quadratic-perf February 24, 2026 20:58
@jonmmease jonmmease changed the title from "perf: 5 pure-Python optimizations (up to 33% faster)" to "perf: some python optimizations" Feb 24, 2026
@jonmmease

cc @stevephodgson and @glentakahashi. I'm not convinced this is beneficial enough to be worth a detailed review, but I wanted to at least document it in case you see it differently.
