Description
It would be helpful to add explicit pretty-printing serialization properties to the Lexbor-based API, for easier debugging and inspection of the DOM tree.
Motivation
selectolax already exposes properties like:
- LexborNode.html and LexborHTMLParser.html
- LexborNode.inner_html and LexborHTMLParser.inner_html
These properties already use Lexbor’s serialization helpers to produce HTML output. In many cases, however, it is useful to have a dedicated “pretty” view that is stable and easy to read, with clear indentation and line breaks. This is especially helpful when:
- inspecting complex or deeply nested trees
- comparing output in tests
- logging debug views of specific nodes or subtrees
Lexbor already provides specific functions for pretty printing (nicely formatted output with indentation), so selectolax can expose these directly as separate properties.
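To illustrate the test-comparison case, here is a minimal sketch using only the standard library. The two strings are hand-written stand-ins for pretty-printed output (they are not produced by Lexbor or selectolax), and `changed_lines` is a hypothetical helper name. The point is that with one element per line, a line-based diff pinpoints the element that changed:

```python
import difflib

# Hand-written stand-ins for pretty-printed output; NOT generated by Lexbor.
expected = "<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>"
actual = "<ul>\n  <li>one</li>\n  <li>2</li>\n</ul>"


def changed_lines(expected, actual):
    """Return only the removed/added lines of a line-based diff."""
    diff = list(
        difflib.unified_diff(
            expected.splitlines(), actual.splitlines(), lineterm="", n=0
        )
    )
    # Skip the two file header lines, then drop the "@@ ... @@" hunk headers.
    return [line for line in diff[2:] if not line.startswith("@@")]


for line in changed_lines(expected, actual):
    print(line)
```

With single-line HTML output, the same diff would report the whole document as one changed line.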
Proposal
Add four new properties on the Lexbor based API:
- LexborNode.pretty_html and LexborHTMLParser.pretty_html
  - Return pretty printed HTML of the node plus its entire tree.
  - Internally use lxb_html_serialize_pretty_tree_str.
- LexborNode.pretty_inner_html and LexborHTMLParser.pretty_inner_html
  - Return pretty printed HTML for the children of the node and their trees.
  - Internally use lxb_html_serialize_pretty_deep_str.
These properties would live next to the existing html and inner_html properties, but would be clearly marked as “pretty” and debug friendly.
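As a rough illustration of the kind of output meant here, the sketch below indents a fragment using Python's standard `html.parser`. It is only a stand-in: `sketch_pretty_html` and `_Indenter` are hypothetical names, the exact layout Lexbor's pretty serializers produce may differ, and the actual proposal is to call the Lexbor functions rather than format in Python.

```python
from html import escape
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not increase the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _Indenter(HTMLParser):
    """Hypothetical stand-in: emits one indented line per tag or text node."""

    def __init__(self, indent="  "):
        super().__init__(convert_charrefs=True)
        self.indent = indent
        self.depth = 0
        self.lines = []

    def _emit(self, text):
        self.lines.append(self.indent * self.depth + text)

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
        )
        self._emit(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # e.g. "<br/>" also triggers an end-tag callback; ignore it
        self.depth = max(self.depth - 1, 0)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only text between tags
            self._emit(escape(text, quote=False))


def sketch_pretty_html(html, indent="  "):
    """Return an indented, one-node-per-line rendering of an HTML fragment."""
    parser = _Indenter(indent)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


print(sketch_pretty_html('<div id="a"><p>Hello<br>world</p></div>'))
```

With the proposed properties, the equivalent call would presumably be a plain attribute access such as `node.pretty_html`, with the formatting done inside Lexbor.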
Benefits
- Makes it simple to get a debug friendly, pretty printed representation of any node or parser root with a clear and explicit API.
- Provides stable, indented output that is well suited for tests and log output.
- Reuses Lexbor’s existing pretty serialization functions instead of reimplementing formatting in Python.
Overall, pretty_html and pretty_inner_html on both LexborHTMLParser and LexborNode would be a small but very practical addition for debugging and development.