from Hacker News

Perl Time::Piece Unicode Issue

by jjjbokma on 6/3/21, 10:31 AM with 2 comments

  • by nanis on 6/3/21, 10:57 AM

    One might want to use Unicode::UTF8[1] instead of the hand-rolled helper:

        #!/usr/bin/perl
    
        use strict;
        use warnings FATAL => 'utf8';
    
        use open ':std', ':encoding(UTF-8)';
    
        use Time::Piece;
        use Unicode::UTF8 qw( decode_utf8 );
    
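        for my $month (1 .. 12) {
            my $date = sprintf '2021-%02d-01', $month;
            my $tp = Time::Piece->strptime($date, '%Y-%m-%d');
            # strftime returns octets; decode them to characters (assumes a UTF-8 locale).
            # The repetition adds a separating space after every month except the last.
            print decode_utf8($tp->strftime('%B')), ' ' x ($month != 12);
        }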
        print "\n";
    
    > Here is a summary of features for comparison with Encode's UTF-8 implementation:

    > * Simple API which makes use of Perl's standard warning categories.

    > * Recognizes all noncharacters regardless of Perl version

    > * Implements Unicode's recommended practice for using U+FFFD.

    > * Better diagnostics in warning messages

    > * Detects and reports inconsistency in Perl's internal representation of wide characters (UTF-X)

    > * Preserves taintedness of decoded $octets or encoded $string

    > * Better performance ~ 600% - 1200% (JA: 600%, AR: 700%, SV: 900%, EN: 1200%, see benchmarks directory in git repository)

    [1]: https://metacpan.org/pod/Unicode::UTF8
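
For anyone who cannot install Unicode::UTF8, here is a minimal sketch of the same decode step using the core Encode module instead. This swaps in a different technique from the one in the comment above. The helper name decode_month_name and the hard-coded sample octets are hypothetical and only there for illustration; the assumption is that the month name arrives as UTF-8 octets.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper (not from the thread): strict UTF-8 decode with core Encode.
sub decode_month_name {
    my ($octets) = @_;    # work on a copy, since decode() may modify its source when CHECK is set
    # FB_CROAK makes decode() die on malformed input instead of silently substituting.
    return decode('UTF-8', $octets, Encode::FB_CROAK);
}

# Sample input (assumption): the UTF-8 octets for the French month name "février".
my $bytes = "f\xC3\xA9vrier";
my $chars = decode_month_name($bytes);

printf "%d octets -> %d characters\n", length($bytes), length($chars);
```

The point of the sketch is only that decoding turns the two octets of "é" into a single character, and that malformed input raises an error rather than passing through unnoticed.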