Does not work on Turkish (tr-TR) locales #86

@soundasleep

It seems a lot of this project uses String.toLowerCase() instead of String.toLowerCase(Locale.ROOT), which causes issues on Turkish locales.

In tr-TR, lowercasing maps I to the dotless ı, so title becomes tıtle and (more importantly) Xdiv becomes Xdıv, which I think is breaking layout or parsing of elements.
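To illustrate the case mapping described above, here is a minimal sketch using only `java.util.Locale` and `String`. The class name `TurkishCaseDemo` and the helper `lowerIn` are hypothetical, and the uppercase inputs `TITLE` and `XDIV` are an assumption, since the exact casing of the strings inside the library is not shown here.

```java
import java.util.Locale;

public class TurkishCaseDemo {
    // Hypothetical helper: lowercases s using an explicit locale.
    static String lowerIn(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish rules, 'I' (U+0049) lowercases to dotless 'ı' (U+0131).
        System.out.println(lowerIn("TITLE", turkish));     // tıtle
        System.out.println(lowerIn("XDIV", turkish));      // xdıv

        // Locale.ROOT gives the locale-neutral mapping expected for HTML/CSS names.
        System.out.println(lowerIn("TITLE", Locale.ROOT)); // title
        System.out.println(lowerIn("XDIV", Locale.ROOT));  // xdiv
    }
}
```

The same asymmetry applies in the other direction: `"title".toUpperCase(turkish)` yields `TİTLE` with a dotted capital İ, so comparisons that normalise via `toUpperCase()` are affected as well.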

For the same document, for my BoxRenderer, on an English locale I get the following elements sent through startElementContents:

  • Xdiv, html, body, label, Xspan, select, option, ...

However, if I set -Duser.country=TR -Duser.language=tr, I instead get the following:

  • Xdiv, html, body, Xdiv, label, ...

Clearly a different code path is taken under the Turkish locale. I suspect it has to do with toLowerCase() and/or toUpperCase() calls throughout either CSSBox or jStyleParser, for example in cz.vutbr.web.csskit.ElementMatcherSafeCI#matchesClass or org.fit.cssbox.css.HTMLNorm#attributesToStyles.
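A possible shape for the fix, as a sketch only: the class `LocaleSafeCase` and its methods `lower` and `sameName` are hypothetical and are not the actual code of `ElementMatcherSafeCI` or `HTMLNorm`. It switches the JVM default locale with `Locale.setDefault`, which I am assuming has the same effect on `toLowerCase()` as the `-Duser.language=tr -Duser.country=TR` flags, and then compares the no-argument call with a `Locale.ROOT` variant.

```java
import java.util.Locale;

public class LocaleSafeCase {
    // Lowercases with Locale.ROOT so the result does not depend on the default locale.
    static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Case-insensitive comparison of element or attribute names, independent of locale.
    static boolean sameName(String a, String b) {
        return lower(a).equals(lower(b));
    }

    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(Locale.forLanguageTag("tr-TR"));

            // The no-argument overload uses the default locale: "DIV" becomes "dıv".
            System.out.println("DIV".toLowerCase().equals("div")); // false

            // The Locale.ROOT variant still matches.
            System.out.println(sameName("DIV", "div"));            // true
        } finally {
            Locale.setDefault(saved); // restore the original default locale
        }
    }
}
```

`String.equalsIgnoreCase` is another option for plain comparisons, since it does not consult the default locale; an explicit `Locale.ROOT` is still needed wherever the lowercased string itself is stored or used as a map key.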

Related: radkovo/jStyleParser#29
