Fix lexing for unterminated strings/heredocs etc. by Earlopain · Pull Request #3924 · ruby/prism

Earlopain · 2026-02-13T09:56:22Z

When we hit EOF and still have lex modes left, it means some content was unterminated. Heredocs in particular have logic that must run once their body has finished lexing. If we don't reset the mode back to how it was before, the lexer will not continue at the correct place.

Follow-up to #3918. We can't call into `parser_lex` since it resets token locations, so I went back to `goto`.

Closes #3911
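The unwinding described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not prism's actual code: every name here (`toy_lexer_t`, `toy_lexer_handle_eof`, `resume_at`, and so on) is hypothetical, and the real lexer tracks far more state. It shows a lexer that reaches EOF with modes still on its stack, pops them, and, for a heredoc, moves the cursor back to the saved position so lexing continues from there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these names come from prism itself. */
typedef enum { TOY_MODE_DEFAULT, TOY_MODE_STRING, TOY_MODE_HEREDOC } toy_mode_kind_t;

typedef struct {
    toy_mode_kind_t kind;
    size_t resume_at; /* heredoc only: offset where lexing continues after the body */
} toy_mode_t;

#define TOY_MAX_MODES 8

typedef struct {
    toy_mode_t modes[TOY_MAX_MODES]; /* modes[0] is always the default mode */
    size_t depth;                    /* number of modes currently on the stack */
    size_t cursor;                   /* current offset into the source */
    size_t length;                   /* total source length */
    bool unterminated;               /* set once any mode is unwound at EOF */
} toy_lexer_t;

void toy_lexer_init(toy_lexer_t *lexer, size_t length) {
    lexer->modes[0] = (toy_mode_t) { TOY_MODE_DEFAULT, 0 };
    lexer->depth = 1;
    lexer->cursor = 0;
    lexer->length = length;
    lexer->unterminated = false;
}

void toy_lexer_push(toy_lexer_t *lexer, toy_mode_kind_t kind, size_t resume_at) {
    assert(lexer->depth < TOY_MAX_MODES);
    lexer->modes[lexer->depth++] = (toy_mode_t) { kind, resume_at };
}

/*
 * Called when the cursor has reached the end of the source. Pops leftover
 * modes; when a heredoc mode is popped, the cursor is moved back to the
 * heredoc's saved position. Returns true if there is more input to lex.
 */
bool toy_lexer_handle_eof(toy_lexer_t *lexer) {
    assert(lexer->cursor == lexer->length);
    while (lexer->depth > 1) {
        toy_mode_t mode = lexer->modes[--lexer->depth];
        lexer->unterminated = true; /* a leftover mode means missing terminator */
        if (mode.kind == TOY_MODE_HEREDOC) {
            lexer->cursor = mode.resume_at;
            return true;
        }
    }
    return false;
}
```

In this model, a caller would invoke `toy_lexer_handle_eof` each time the cursor hits the end of the input and keep lexing while it returns true. Skipping the pop for the heredoc mode would leave the cursor at EOF and the text after the heredoc opener would never be lexed, which is the kind of misplacement the description refers to.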

Earlopain · 2026-02-13T09:59:50Z

src/prism.c

            pm_statements_node_t *statements = NULL;

-            if (!match1(parser, PM_TOKEN_EMBEXPR_END)) {
+            if (!match3(parser, PM_TOKEN_EMBEXPR_END, PM_TOKEN_HEREDOC_END, PM_TOKEN_EOF)) {


`#{` currently gives the interpolation a statement containing a MissingNode. That's not correct: the missing end token already leads to a syntax error. It also messes up the locations when the parser finds the synthetic heredoc end token. I don't think there are any other tokens to consider here; hopefully these are all of them.
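To make the effect of the changed condition concrete, here is a toy comparison of the old and new checks. It is a sketch under assumptions: the `demo_` types and helpers are hypothetical stand-ins that only mirror the shape of the `match1`/`match3` calls in the diff, and they do not reproduce prism's parser state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token kinds, loosely named after the tokens in the diff. */
typedef enum {
    DEMO_TOKEN_EMBEXPR_END,
    DEMO_TOKEN_HEREDOC_END,
    DEMO_TOKEN_EOF,
    DEMO_TOKEN_IDENTIFIER
} demo_token_type_t;

typedef struct {
    demo_token_type_t current; /* type of the token the parser is looking at */
} demo_parser_t;

/* True if the current token has the given type. */
bool demo_match1(const demo_parser_t *parser, demo_token_type_t type) {
    return parser->current == type;
}

/* True if the current token has any of the three given types. */
bool demo_match3(const demo_parser_t *parser, demo_token_type_t a,
                 demo_token_type_t b, demo_token_type_t c) {
    return demo_match1(parser, a) || demo_match1(parser, b) || demo_match1(parser, c);
}

/* Old condition: statements are parsed unless the embexpr closes right away. */
bool demo_parses_statements_before(const demo_parser_t *parser) {
    return !demo_match1(parser, DEMO_TOKEN_EMBEXPR_END);
}

/* New condition: a heredoc end or EOF also means there is nothing to parse. */
bool demo_parses_statements_after(const demo_parser_t *parser) {
    return !demo_match3(parser, DEMO_TOKEN_EMBEXPR_END, DEMO_TOKEN_HEREDOC_END,
                        DEMO_TOKEN_EOF);
}
```

With the current token at EOF, the old check still tries to parse statements (which is where the MissingNode came from), while the new check skips them. For an ordinary token such as an identifier, both checks agree.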

# Before
Prism.parse("\"\#{").value.statements.body[0]
=> @ InterpolatedStringNode (location: (1,0)-(1,3))
├── flags: newline
├── opening_loc: (1,0)-(1,1) = "\""
├── parts: (length: 1)
│   └── @ EmbeddedStatementsNode (location: (1,1)-(1,3))
│       ├── flags: ∅
│       ├── opening_loc: (1,1)-(1,3) = "\#{"
│       ├── statements:
│       │   @ StatementsNode (location: (1,1)-(1,3))
│       │   ├── flags: ∅
│       │   └── body: (length: 1)
│       │       └── @ MissingNode (location: (1,1)-(1,3))
│       │           └── flags: ∅
│       └── closing_loc: (1,3)-(1,3) = ""
└── closing_loc: ∅

# After
Prism.parse("\"\#{").value.statements.body[0]
=> @ InterpolatedStringNode (location: (1,0)-(1,3))
├── flags: newline
├── opening_loc: (1,0)-(1,1) = "\""
├── parts: (length: 1)
│   └── @ EmbeddedStatementsNode (location: (1,1)-(1,3))
│       ├── flags: ∅
│       ├── opening_loc: (1,1)-(1,3) = "\#{"
│       ├── statements: ∅
│       └── closing_loc: (1,3)-(1,3) = ""
└── closing_loc: ∅

Earlopain mentioned this pull request Feb 13, 2026

Prism.lex_compat creates wrong on_sp token when used with heredoc and unclosed embexpr #3911

Closed

kddnewton approved these changes Feb 13, 2026

kddnewton merged commit 27c24fd into ruby:main Feb 13, 2026
67 checks passed

kddnewton merged 1 commit into ruby:main from Earlopain:unterminated-heredoc-v2

Earlopain commented Feb 13, 2026

Uh oh!

Earlopain Feb 13, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Earlopain commented Feb 13, 2026

Uh oh!

Earlopain Feb 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Earlopain Feb 13, 2026 •

edited

Loading