[cgiapp] Problem displaying French, sometimes

Ron Savage ron at savage.net.au
Sun Sep 7 19:31:19 EDT 2008


Hi Mike

On Sun, 2008-09-07 at 09:33 +0100, Mike Tonks wrote:
> I don't have a full explanation, but the second character looks like a
> wrongly encoded double-byte utf8 character, i.e. a utf8 character (double
> byte) being displayed as two characters.  Do you see 'Wide character
> in print' anywhere?

Nope. No such warning appears.

> This can happen when you concatenate two strings together if the utf8
> flags are not set correctly or if corruption has occurred.  For
> example I recently had a mysql table with badly encoded utf8 stored in
> it, which caused similar things to appear.
> 
> The DBI function data_string_desc may be useful to debug the status of
> your strings.

Opens another can of worms. From the database log:

 id  | level | message                                                                | timestamp
-----+-------+------------------------------------------------------------------------+----------------------------
 470 | info  | CGIApp: ------------------------------                                 | 2008-09-08 09:27:01.123344
 471 | info  | CGIApp: http://127.0.0.1/search/sites.fcgi                             | 2008-09-08 09:27:01.127116
 472 | info  | CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes | 2008-09-08 09:27:01.185259
 473 | info  | CGIApp: ------------------------------                                 | 2008-09-08 09:27:03.845652
 474 | info  | CGIApp: http://127.0.0.1/test/sites                                    | 2008-09-08 09:27:03.852502
 475 | info  | CGIApp: CÔTE D'IVOIRE. Encoding: UTF8 off, ASCII, 3 characters 3 bytes | 2008-09-08 09:27:03.887059

3 chars? WTF?

The code is:

$self -> log($_ -> name() . '. Encoding: '
    . DBI -> data_string_desc($_ -> name() ) );
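
A possible explanation for the "3 characters 3 bytes" result: the DBI
documentation shows data_string_desc as a plain function, called as
DBI::data_string_desc($string), not as a class method. With the arrow
syntax, Perl passes the class name as the first argument, so the function
would be describing the 3-character ASCII string 'DBI' rather than the
country name. The sketch below shows the same effect without needing DBI
installed. The Demo package and its describe function are hypothetical
stand-ins, and the name is written without the circumflex to keep the
example pure ASCII.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a plain function such as DBI::data_string_desc:
# it reports on its FIRST argument only.
package Demo;

sub describe {
    my ($string) = @_;
    return sprintf '%d characters', length $string;
}

package main;

my $name = "COTE D'IVOIRE";

# Method-call syntax passes the class name 'Demo' as the first argument,
# so the function describes 'Demo' and never looks at $name.
my $via_method = Demo->describe($name);

# Function-call syntax passes $name itself.
my $via_function = Demo::describe($name);

print "method call:   $via_method\n";
print "function call: $via_function\n";
```

If that is what is happening here, writing the call as
DBI::data_string_desc($_ -> name() ) should make the log describe the
actual string.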

> What character set is your database using?

My code uses 'PerlSetEnv   PGCLIENTENCODING LATIN1' to set the client
character set encoding.
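
For reference, the kind of mismatch Mike describes can be reproduced in a
few lines. This is only a sketch using Perl's core Encode module, not code
from the application: it takes the single character U+00D4 (the second
letter of the country name), encodes it as UTF-8 and as Latin-1, and then
misreads the UTF-8 bytes as Latin-1. The helper hex_of is a hypothetical
name introduced for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show a byte string as upper-case hex digits.
sub hex_of {
    my ($bytes) = @_;
    return uc(unpack('H*', $bytes));
}

# U+00D4, LATIN CAPITAL LETTER O WITH CIRCUMFLEX.
my $char = "\x{D4}";

my $utf8_bytes   = encode('UTF-8',      $char);   # two bytes
my $latin1_bytes = encode('ISO-8859-1', $char);   # one byte

# Treating the UTF-8 bytes as if they were Latin-1 yields two characters
# where there should be one.
my $misread = decode('ISO-8859-1', $utf8_bytes);

printf "UTF-8 bytes:    %s\n", hex_of($utf8_bytes);
printf "Latin-1 bytes:  %s\n", hex_of($latin1_bytes);
printf "misread length: %d\n", length $misread;
```

With a LATIN1 client encoding the database would be expected to hand back
the one-byte form, so whether the page displays correctly depends on which
encoding the rest of the pipeline (templates, HTTP headers) assumes.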

-- 
Ron Savage
ron at savage.net.au
http://savage.net.au/index.html



