Skip to content

Conversation

@sideshowbarker
Copy link
Member

@sideshowbarker sideshowbarker commented Dec 30, 2025

This change, in order to support the “customizable” <select> element:

  • “relaxes” parsing rules in <select> (to allow more elements as children)
  • adds the <selectedcontent> element for cloning selected <option> content

ElementName.java

  • Add <selectedcontent> element with TreeBuilder.SELECTEDCONTENT group

TreeBuilder.java

  • Add SELECTEDCONTENT group constant and IN_SELECT_IN_CONTENT mode
  • Track selectedContentPointer and activeOptionStackPos for cloning
  • Allow <div>, <span>, and other elements in <select> in “relaxed” mode
  • When <option> is popped, deep-clone its content to <selectedcontent> via an overridable cloneOptionContentToSelectedContent() hook — operating on the live DOM, not parser state (so adoption-agency restructuring is correctly reflected)
  • Handle <selectedcontent> special-casing (store pointer, no stack push)
  • Add cloning hook to all pop variants (pop(), popForeign(), silentPop(), popOnEof())

SAXTreeBuilder.java

  • Implement cloneOptionContentToSelectedContent(): clears <selectedcontent> children, then deep-clones from the live DOM via getFirstChild()/getNextSibling() traversal
  • Avoid intermediate String allocation when cloning Characters nodes (use getBuffer() directly)

DOMTreeBuilder.java

  • Implement cloneOptionContentToSelectedContent() using DOM cloneNode(true)

XOMTreeBuilder.java

  • Implement cloneOptionContentToSelectedContent() using XOM’s Node.copy()

BrowserTreeBuilder.java

  • Implement cloneOptionContentToSelectedContent() using cloneNodeDeep()

CharBufferNode.java

  • Add getBuffer() accessor for direct char[] access during cloning

ParentNode.java

  • Add clearChildren() method, to support clearing <selectedcontent>

@hsivonen
Copy link
Member

Sorry I missed this over the holidays.

@sideshowbarker
Copy link
Member Author

Sorry I missed this over the holidays.

No worries — there's no real rush on it

Copy link
Member

@hsivonen hsivonen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delay.

As discussed on Matrix, I think the way this patch makes the cloning work on what the parser saw is incorrect and the spec instead requires the cloning to work on what's in the live DOM, which may be different if the option element opens, the network stalls, setTimeout modifies the DOM, and network resumes.

Further remarks in an inline comment.

@sideshowbarker sideshowbarker force-pushed the sideshowbarker/relaxed-select-parsing branch 4 times, most recently from 020921d to f7b4c0f Compare February 2, 2026 05:31
This change, in order to support the “customizable” <select> element:

  - “relaxes” parsing rules in <select> (to allow more elements as children)
  - adds the <selectedcontent> element for cloning selected <option> content

ElementName.java
----------------
  - Add <selectedcontent> element with TreeBuilder.SELECTEDCONTENT group

TreeBuilder.java
----------------
  - Add SELECTEDCONTENT group constant and IN_SELECT_IN_CONTENT mode
  - Track selectedContentPointer and activeOptionStackPos for cloning
  - Allow <div>, <span>, and other elements in <select> in “relaxed” mode
  - Deep clone <option> content to <selectedcontent> when <option> closes
  - Handle <selectedcontent> special-casing (store pointer, no stack push)

SAXTreeBuilder.java
-------------------
  - Implement clearSelectedContentChildren() for clearing before clone
  - Implement deepCloneChildren() and deepCloneNode() for <option> cloning

ParentNode.java
---------------
  - Add clearChildren() method, to support clearing <selectedcontent>

html5lib-tests submodule
------------------------
  - Updated to pull in corresponding tests for “relaxed” <select> parsing
The initial implementation did inline cloning during parsing (mirroring
elements/characters into <selectedcontent> as tokens were processed) AND
deep-cloned from the DOM when option was popped.

The inline cloning was wrong per spec — cloning should operate on the
DOM, not on what the parser saw (the adoption agency algorithm can
restructure the tree) — and redundant, since pop() cleared
<selectedcontent> and deep-cloned from the DOM anyway.

This replaces all inline cloning machinery with a single overridable
hook method, cloneOptionContentToSelectedContent(), called when <option>
is popped. The tree builder tracks which <option> is active and where
<selectedcontent> is, but delegates the actual DOM cloning to subclasses.
Add necessary corresponding implementations to the other tree builders
that construct persistent trees:

  - DOMTreeBuilder: uses the DOM cloneNode method for deep cloning
  - XOMTreeBuilder: uses XOM’s Node.copy() for deep cloning
  - BrowserTreeBuilder: uses the existing cloneNodeDeep method
Add a getBuffer() accessor to CharBufferNode so that deepCloneNode()
can copy directly from char[] to char[] via the Characters constructor,
instead of going through toString() and toCharArray().
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/relaxed-select-parsing branch from f7b4c0f to 87fe37a Compare February 2, 2026 05:35
attributes = null; // CPP
break starttagloop;
case HR:
// Check if select is in scope
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec seems to implicitly close "p" before the "select" check.

// Check if select is in scope
if (findLastInScope("select") != TreeBuilder.NOT_FOUND_ON_STACK) {
// Close any open option or optgroup first
if (isCurrent("option")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are "option" and "optgroup" popped separately instead of calling generateImpliedEndTags();?

attributes = null; // CPP
break starttagloop;
case INPUT:
// Check if select is in scope and close it (for compatibility)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the fragment-case special case if context is "select" is missing.

while (currentPtr >= eltPos) {
pop();
}
resetTheInsertionMode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment that explains this.

attributes = null; // CPP
break starttagloop;
case TEXTAREA:
// Check if select is in scope and close it (for compatibility)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find spec text corresponding to this. Why is this needed?

if (eltPos != TreeBuilder.NOT_FOUND_ON_STACK) {
if (errorHandler != null && !isCurrent("optgroup")) {
errUnclosedElementsImplied(eltPos, "optgroup");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that also "option" being in scope is an error per spec.

case TBODY_OR_THEAD_OR_TFOOT:
case TR:
case TD_OR_TH:
// Check if we're inside a select inside a table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this map to the spec?

attributes, formPointer);
attributes = null; // CPP
break starttagloop;
case SELECTEDCONTENT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me the tree builder shouldn't know about selectedcontent and looking for it is a DOM-side operation.

}
}
break endtagloop;
case OPTION:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that "Any other end tag" should continue to handle these, except poping an "option" (regardless of tag) should trigger the DOM-side cloning.

case TEMPLATE:
// fall through to IN_HEAD;
break;
case SELECT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this handled in a select-specific way instead of being bundled together with various block-ish elements?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants