Add pretty serialization properties: pretty_html, pretty_inner_html for LexborHTMLParser and LexborNode. #196

@pygarap

It would be helpful to add explicit pretty serialization properties to the Lexbor-based API, for easier debugging and inspection of the DOM tree.

Motivation

selectolax already exposes properties like:

  • LexborNode.html and LexborHTMLParser.html
  • LexborNode.inner_html and LexborHTMLParser.inner_html

These properties already use Lexbor’s serialization helpers to produce HTML output. In many cases, however, it is useful to have a dedicated “pretty” view that is stable and easy to read, with clear indentation and line breaks. This is especially helpful when:

  • inspecting complex or deeply nested trees
  • comparing output in tests
  • logging debug views of specific nodes or subtrees

Lexbor already provides specific functions for pretty printing (nicely formatted output with indentation), so selectolax can expose these directly as separate properties.

Proposal

Add four new properties to the Lexbor-based API (two names, each on both classes):

  1. LexborNode.pretty_html and LexborHTMLParser.pretty_html

    • Return pretty-printed HTML of the node and all of its descendants.
    • Internally use lxb_html_serialize_pretty_tree_str.

  2. LexborNode.pretty_inner_html and LexborHTMLParser.pretty_inner_html

    • Return pretty-printed HTML for the children of the node and their subtrees, without the node's own tag.
    • Internally use lxb_html_serialize_pretty_deep_str.

These properties would live next to the existing html and inner_html properties, but would be clearly marked as “pretty” and debug-friendly.
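
To make the intended difference between the two properties concrete, here is a minimal stdlib-only sketch. It does not use selectolax or Lexbor: the functions `pretty_html`, `pretty_inner_html`, and `parse_fragment` below are hypothetical stand-ins built on Python's `html.parser`, and the exact whitespace and text formatting produced by Lexbor's pretty serializers will likely differ. It only illustrates "node plus subtree" versus "children only".

```python
from html.parser import HTMLParser

# Small subset of void elements, enough for this illustration.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class _Node:
    """Element node (tag set) or text node (text set)."""

    def __init__(self, tag=None, attrs=(), text=None):
        self.tag, self.attrs, self.text = tag, list(attrs), text
        self.children = []


class _TreeBuilder(HTMLParser):
    """Builds a tiny tree; assumes well-formed input."""

    def __init__(self):
        super().__init__()
        self.root = _Node("#root")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = _Node(tag, attrs)
        self._stack[-1].children.append(node)
        if tag not in VOID:
            self._stack.append(node)

    def handle_endtag(self, tag):
        if len(self._stack) > 1 and self._stack[-1].tag == tag:
            self._stack.pop()

    def handle_data(self, data):
        if data.strip():
            self._stack[-1].children.append(_Node(text=data.strip()))


def _render(node, depth, indent):
    pad = " " * (indent * depth)
    if node.text is not None:
        return [pad + node.text]
    attrs = "".join(f" {k}" if v is None else f' {k}="{v}"' for k, v in node.attrs)
    lines = [f"{pad}<{node.tag}{attrs}>"]
    if node.tag in VOID:
        return lines
    for child in node.children:
        lines.extend(_render(child, depth + 1, indent))
    lines.append(f"{pad}</{node.tag}>")
    return lines


def pretty_html(node, indent=2):
    """Hypothetical analogue: the node itself plus its whole subtree."""
    return "\n".join(_render(node, 0, indent))


def pretty_inner_html(node, indent=2):
    """Hypothetical analogue: only the node's children and their subtrees."""
    lines = []
    for child in node.children:
        lines.extend(_render(child, 0, indent))
    return "\n".join(lines)


def parse_fragment(markup):
    """Parse a fragment and return its first top-level node."""
    builder = _TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root.children[0]


div = parse_fragment('<div id="a"><p>Hello <b>world</b></p><br></div>')
print(pretty_html(div))        # starts with the <div id="a"> line
print(pretty_inner_html(div))  # starts with the <p> line, no <div> wrapper
```

In the proposed API the same distinction would presumably be reached as `node.pretty_html` and `node.pretty_inner_html`, with the formatting delegated to Lexbor rather than done in Python.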

Benefits

  • Makes it simple to get a debug-friendly, pretty-printed representation of any node or parser root through a clear and explicit API.
  • Provides stable, indented output that is well suited for tests and log output.
  • Reuses Lexbor’s existing pretty serialization functions instead of reimplementing formatting in Python.
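
As a rough illustration of the testing benefit, the sketch below diffs two multi-line, indented HTML strings with the standard library's `difflib`. The strings are hand-written stand-ins for what a pretty serializer might return, not actual Lexbor output. Because each node sits on its own line, the diff isolates the single changed text node instead of flagging one long line.

```python
import difflib

# Hand-written stand-ins for pretty-printed output (assumed shape, not Lexbor's).
expected = "<ul>\n  <li>\n    one\n  </li>\n  <li>\n    two\n  </li>\n</ul>"
actual = "<ul>\n  <li>\n    one\n  </li>\n  <li>\n    2\n  </li>\n</ul>"

diff = list(
    difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    )
)

# Keep only the changed lines, skipping the "---"/"+++" file headers.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(changed)
```

A test helper could include such a diff in its assertion message, so a failing comparison points at the exact line that differs.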

Overall, pretty_html and pretty_inner_html on both LexborHTMLParser and LexborNode would be a small but very practical addition for debugging and development.
