Use the Unicode CLDR Transliteration Guidelines to generate slugs #2

@jayjun

Description

A simple lookup table is currently used to transliterate Unicode codepoints into Latin characters. While this is acceptably accurate for some languages (e.g. Chinese characters to pinyin), it fails badly for languages with more complex transliteration rules.
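
To make the limitation concrete, here is a minimal sketch of a per-codepoint lookup. It is in Python purely for illustration, and the table entries and the `transliterate` and `slugify` names are hypothetical, not this library's actual data or API. Each codepoint is mapped on its own, with no knowledge of the source language or the neighbouring characters.

```python
import re

# Hypothetical per-codepoint table; the entries are illustrative only
# and are not taken from this library's data.
TABLE = {
    "ä": "a",
    "ö": "o",
    "ü": "u",
    "ß": "ss",
    "中": "zhong",
    "文": "wen",
}


def transliterate(text: str) -> str:
    """Map each codepoint independently; unknown codepoints pass through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


def slugify(text: str, separator: str = "-") -> str:
    """Lowercase, transliterate, then join the ASCII alphanumeric runs."""
    latin = transliterate(text.lower())
    words = re.findall(r"[a-z0-9]+", latin)
    return separator.join(words)


print(slugify("Müller Straße"))
print(transliterate("中文"))
```

Because the table holds exactly one output per codepoint, it cannot also produce the German convention of writing "ü" as "ue", and it has no way to express rules that depend on surrounding characters. Rule-based transliteration is meant to cover those cases.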

Thankfully, the Unicode CLDR Transliteration Guidelines have already done the heavy lifting.

Most platforms tap into this data through the ICU4C library. Personally, I think the API is excellent, but calling C code from Erlang/Elixir has always been iffy.

I’m keen to hear what others have to say.
