[cgiapp] Re: utf8 form processing

Mon Oct 20 10:24:57 EDT 2008

Mark Stosberg wrote:
> On Wed, 15 Oct 2008 17:11:34 +0200
> Rhesa Rozendaal <perl at rhesa.com> wrote:
> 
>> Mike Tonks wrote:
>>> Hi All,
>>>
>>> I recently encountered the dreaded utf8 funny characters, again.  This
>>> time on the input data coming from form entry fields.
>>>
>> Here's what I use:

[...]
>>          my $might_decode = sub {
>>              my $p = shift;
>>              return ( !$p || ( ref $p && fileno($p) ) )
>>                  ? $p
>>                  : eval { decode_utf8($p) } || $p;
>>          };
> 
> That looks useful, Rhesa.
> 
> Is there a variation of it that makes sense to submit as patch for CGI.pm?

I hadn't considered that. The more recent "-utf8" pragma in CGI.pm looks like
it does the same thing:

# in CGI->param
   my @result = @{$self->{param}{$name}};

   if ($PARAM_UTF8) {
     eval "require Encode; 1;" unless Encode->can('decode'); # bring in these functions
     @result = map {ref $_ ? $_ : Encode::decode(utf8=>$_) } @result;
   }

The only differences I can see are that
* I don't try to decode false values
* I do try to decode values that are references, unless they have a fileno (i.e. are filehandles)
* I wrap the decode in an eval
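
To make the three differences above concrete, here is a minimal side-by-side sketch. It is not the code from either CGI.pm or my module: decode_like_pragma is a hypothetical name that only mirrors the map expression quoted above, and might_decode is a standalone version of my closure. One deliberate deviation, labelled in the comments: the fileno check is wrapped in eval and tested with defined, which is my assumption about what a more defensive version might look like.

```perl
use strict;
use warnings;
use Encode qw(decode decode_utf8);

# Hypothetical helper mirroring the quoted CGI.pm "-utf8" logic:
# references pass through untouched, everything else is decoded,
# and there is no eval around the decode.
sub decode_like_pragma {
    my ($v) = @_;
    return ref $v ? $v : decode( 'utf8', $v );
}

# Standalone version of the $might_decode closure from earlier in
# the thread.
sub might_decode {
    my ($p) = @_;

    # Difference 1: false values ('', 0, undef) are returned as-is.
    return $p if !$p;

    # Difference 2: only references that are filehandles are skipped.
    # ASSUMPTION (not in the original): eval + defined, so that a
    # non-handle reference or file descriptor 0 does not trip the check.
    return $p if ref $p && defined( eval { fileno($p) } );

    # Difference 3: the decode is wrapped in eval; if it dies, or
    # returns something false, fall back to the original value.
    return eval { decode_utf8($p) } || $p;
}

my $octets = "caf\xc3\xa9";    # UTF-8 encoded bytes: 5 octets
my $chars  = might_decode($octets);
printf "%d octets in, %d characters out\n", length $octets, length $chars;
```

For plain UTF-8 form values both helpers should give the same characters back, so any difference in behaviour would have to come from the edge cases: false values, non-filehandle references, or a decode that dies.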

I have a hard time imagining the first two would break Mike's code, but he 
said it didn't work for him. Would it have been the lack of eval?

rhesa