HTML API: Use locked tokens to implement safe fragment parsing #7912
sirreal wants to merge 26 commits into WordPress:trunk from
This approach makes
step should continue to work as before
When trying to control contexts like this, the insertion mode is also dangerous, and it is problematic in this PR as implemented. The fragment implementation currently on trunk is correct where a
In this PR, these checks do not behave as the spec expects, so they could cause the parser to incorrectly move into the after-body or after-after-body insertion modes.
Address problems with following the HTML specification for fragment parsing, which can lead to documents that are impossible to represent in HTML.
As an HTML processor, documents that cannot be represented in HTML should be rejected. This includes any document fragment (a document with a context) where the fragment could leak out of its context.
This PR includes a number of tests with examples of problematic HTML, but a simple example is the HTML `<p>` in the context of a `P` element. If this is parsed naively, it leads to a tree like `P > P`, which cannot be represented in HTML. Parsing this document would bail with an unsupported error upon encountering the `<p>` tag in the context of the `P` element.

Implementation
Instead of using a simple context element in the fragment parser, this change moves a copy of the stack of open elements, the stack of active formatting elements, and the head and form element pointers into the context processor. These elements have a new `locked` property set. The implementation is adapted to prevent locked items on the stacks from being modified and to prevent the element pointers from being changed.

The goal is to maintain a coherent HTML structure of the fragment document inside its context.
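The locking idea can be illustrated with a minimal sketch. This is not the PR's code: it is written in Python rather than WordPress's PHP, and the names `Token`, `OpenElements`, `UnsupportedError`, and `open_paragraph` are hypothetical. It also simplifies the specification by treating any open `P` as being in scope, ignoring the real button-scope boundaries.

```python
class UnsupportedError(Exception):
    """Raised when parsing would have to modify a locked context element."""


class Token:
    def __init__(self, tag_name, locked=False):
        self.tag_name = tag_name
        self.locked = locked  # True for elements copied in from the context


class OpenElements:
    """Toy stack of open elements whose context entries are locked."""

    def __init__(self, context_tags):
        # Context elements are copied onto the stack and marked as locked.
        self.stack = [Token(tag, locked=True) for tag in context_tags]

    def push(self, tag_name):
        # Elements created by the fragment itself are never locked.
        self.stack.append(Token(tag_name))

    def pop(self):
        # Refuse to remove context elements: doing so would let the
        # fragment leak out of its context.
        if not self.stack or self.stack[-1].locked:
            raise UnsupportedError("cannot pop a locked context element")
        return self.stack.pop()


def open_paragraph(open_elements):
    """Handle a <p> start tag.

    Simplification (assumption): any open P counts as "in scope".
    """
    if any(token.tag_name == "P" for token in open_elements.stack):
        # Close the open P by popping up to and including it.
        while open_elements.pop().tag_name != "P":
            pass
    open_elements.push("P")
```

Under these assumptions, calling `open_paragraph` on a stack built from the context `HTML > BODY > P` raises `UnsupportedError` instead of producing `P > P`, while the same call in a `DIV` context simply opens a new `P`. Checking the lock before popping also leaves the context stack untouched when the parse is rejected.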
Trac ticket: https://core.trac.wordpress.org/ticket/62584
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.