Latest Script Hack: utf8lint

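Perl: double-encoding is a frequent problem when dealing with UTF-8 text. It happens when a string that is already UTF-8 is treated as (typically) ISO Latin-1 and then encoded to UTF-8 a second time, so every non-ASCII character turns into two or more junk characters.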
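To make the failure concrete, here is a small Python sketch of how double-encoding arises. It is an illustration only, not code from utf8lint, and the function name `double_encode` is made up for this example.

```python
def double_encode(text: str) -> bytes:
    """Reproduce the double-encoding mistake for a given string."""
    # Step 1: the correct encoding. "é" (U+00E9) becomes the two bytes C3 A9.
    utf8_bytes = text.encode("utf-8")
    # Step 2: the mistake. The bytes are misread as Latin-1, so each byte
    # becomes one character: C3 -> "Ã", A9 -> "©".
    misread = utf8_bytes.decode("latin-1")
    # Step 3: the misread text is encoded to UTF-8 again.
    return misread.encode("utf-8")


# "café" should be b"caf\xc3\xa9"; double-encoded it is b"caf\xc3\x83\xc2\xa9",
# which displays as "cafÃ©" when decoded as UTF-8.
mangled = double_encode("café")
```

ASCII-only text passes through all three steps unchanged, which is why the problem often goes unnoticed until an accented character shows up.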

utf8lint is a quick hack script that uses Perl’s Encode module to detect this. Feed it your data on STDIN, and it’ll flag any lines containing text that may be doubly-encoded UTF-8, in a lintish way.
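The detection idea can be sketched in a few lines. This is a Python approximation under my own assumptions, not the actual utf8lint source (which is Perl and uses Encode); the names `looks_double_encoded` and `lint` are hypothetical. The heuristic: decode the line as UTF-8, squeeze the resulting characters back into single Latin-1 bytes, and see whether those bytes are themselves valid UTF-8.

```python
def looks_double_encoded(line: bytes) -> bool:
    """Return True if `line` may be UTF-8 that was re-encoded via Latin-1."""
    try:
        text = line.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8 at all, so a different problem
    if text.isascii():
        return False  # pure ASCII is unchanged by double-encoding
    try:
        # Undo the suspected extra step: one character back to one byte...
        raw = text.encode("latin-1")
        # ...then check whether those bytes form valid UTF-8 on their own.
        raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return True


def lint(lines):
    """Yield (line_number, line) for each suspicious line of bytes."""
    for number, line in enumerate(lines, start=1):
        if looks_double_encoded(line):
            yield number, line
```

To use it as a filter, iterate over `lint(sys.stdin.buffer)` and print the line numbers. Like any heuristic of this kind it can only say "may be": correctly encoded text that happens to contain a sequence such as "Ã©" will be flagged too.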
