[cgiapp] Solved: Problem displaying French, sometimes

Ron Savage ron at savage.net.au
Wed Sep 10 21:23:14 EDT 2008


Hi Folks

Here's a report of at least one way of getting non-ASCII data into a
database, and then from there to a web page.

Warning: The different uses (spellings) of utf-8 and utf8 etc are not
accidental. Specifically, users of Encode absolutely must read the
section in the docs titled 'UTF-8 vs. utf8 vs. UTF8'. All uses here have
been copy-and-pasted from the original docs.

(1) Environment:

OS: Debian 2.6.26-1-686
Database server: Postgres 8.2.5
Programming language: Perl v5.10.0
Web server: Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.8g
mod_apreq2-20051231/2.6.0 mod_perl/2.0.4 Perl/v5.10.0
Web client: iceweasel (FireFox) 3.0.1

(2) Creating the database:

psql: create database contacts owner ron encoding 'utf8';

Nothing special was done when defining the column attributes.

(3) Populating the database:

(a) The data - country names and state/region/province/etc names per
country - come from Locale::SubCountry.

Note: The US Postal Service delivers to many army/etc outposts using
fake states not listed in that module. I subscribed to a reseller of US
postal data for 1 month for < $20 to get an updated list...

(b) Telling DBD::Pg to use UTF8:
$ENV{'PGCLIENTENCODING'} = 'UTF8';

(c) Telling DBD::Pg (via Rose in my case) to store the data:
use Encode; # V 2.23
and then insert a string using:
encode('UTF-8', $name, Encode::FB_CROAK);

Warning: With a CHECK value such as FB_CROAK, encode() can overwrite its
second parameter during this call, so wrap it in a little sub which
works on a copy.
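
A minimal sketch of such a wrapper, assuming only the core Encode
module. The sub name encode_utf8_copy is made up for illustration.
Because the sub copies @_ into $string, encode() can only overwrite the
copy:

```perl
use strict;
use warnings;
use Encode;

# Hypothetical helper: encode a private copy, so the caller's string
# survives whatever encode() does to its input.
sub encode_utf8_copy
{
    my($string) = @_; # This assignment makes the copy.

    return encode('UTF-8', $string, Encode::FB_CROAK);
}

my($name)  = "C\x{d4}TE D'IVOIRE";     # Character string, \x{d4} is the O with a ^.
my($bytes) = encode_utf8_copy($name);  # UTF-8 bytes, ready to insert.

print length($name), ' characters, ', length($bytes), " bytes\n";
```

Encode also documents a LEAVE_SRC flag which can be OR-ed into the check
value (Encode::FB_CROAK | Encode::LEAVE_SRC) to leave the source
untouched. That is an alternative to the wrapper, not something tested
in this setup.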

(4) Retrieving and outputting the data:

(a) httpd.conf (for mod_perl handlers):
PerlSetEnv   PGCLIENTENCODING UTF8

(b) FCGI::ProcManager ignores setting this env var in httpd.conf, so in
FCGID scripts, use:
$ENV{'PGCLIENTENCODING'} = 'UTF8';
Specifically, I used this in sites.fcgi.

The net effect of (a) and (b) is that I did not set this env var in my
CGI::App-based module.

I'm not convinced setting it in the module would work automatically,
since the code in my startup.pl may pre-load my modules, Rose modules,
and DBD::Pg in such a way as to pre-empt any attempt to set it during
the execution of my module.

Also, I did not try setting it in startup.pl.

(c) No need to use Encode in the programs which retrieve data.
(Well, not for this anyway.)
That, in turn, means no need to call decode(), i.e. the reverse of the
encode() above.

(d) An HTML template (the DOCTYPE is not necessary):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Countries</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<h3 align="center" style="color: #80c0ff;">Countries</h3>
<table align="center">
  <tmpl_loop name=country_loop><tr><td><tmpl_var name=name></td></tr></tmpl_loop>
</table>
</body>
</html>

(e) Populating the template:
my($country) = Local::Sites::Rose::Countries::Manager -> get_countries();
my($template) = $self -> load_tmpl('country.tmpl');
$template -> param(country_loop => [map{ {name => $_ -> name()} } @$country]);
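
The map in that param() call turns an arrayref of row objects into the
arrayref of hashrefs which a tmpl_loop expects, one hashref per row. A
small sketch of just that step, with a made-up Demo::Country class
standing in for the Rose::DB::Object rows, so it runs without a
database:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database row object: just a name() accessor.
package Demo::Country;

sub new  { my($class, %arg) = @_; return bless {%arg}, $class; }
sub name { my($self) = @_; return $self -> {name}; }

package main;

# Stand-in for what get_countries() returns: an arrayref of objects.
my($country) =
[
    Demo::Country -> new(name => 'AUSTRALIA'),
    Demo::Country -> new(name => 'FRANCE'),
];

# Same shape as the country_loop value: one hashref per loop iteration.
# The leading + tells Perl the inner braces are a hashref, not a block.
my($loop) = [map{ +{name => $_ -> name()} } @$country];

print "$$_{name}\n" for @$loop;
```

Each key in the hashref (name here) has to match a tmpl_var name used
inside the tmpl_loop in the template.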

(f) Outputting UTF-8 via an app based on CGI::Application:
$self -> header_add(-charset => 'utf-8');
before your run mode method returns the content of the template:
return $template -> output();

(5) Checking the output:
The country also known as The Ivory Coast should be displayed as:
CÔTE D'IVOIRE
(assuming this email displays the first O with a ^ over it :-).

This works for both FCGID scripts and mod_perl handlers.
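
If the page shows something else, one way to narrow it down is to look
at the raw bytes. A small sketch, assuming only the core Encode module
(utf8_hex is a made-up helper name). A correctly encoded O with a ^ is
the two bytes c3 94:

```perl
use strict;
use warnings;
use Encode;

# Hypothetical helper: UTF-8 encode a character string and return the
# bytes as space-separated hex pairs.
sub utf8_hex
{
    my($string) = @_;
    my($bytes)  = encode('UTF-8', $string);

    return join(' ', map{ sprintf('%02x', ord($_) ) } split(//, $bytes) );
}

my($hex) = utf8_hex("C\x{d4}TE");

print "$hex\n"; # 43 c3 94 54 45
```

A lone d4 in that position would mean Latin-1 bytes went out, and
c3 83 c2 94 would suggest the string was encoded twice.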

(6) Credits:
Thanx to Mike Tonks, Cees Hek and Peter Karman for suggestions, via the
CGI::App and Rose::DB::Object mailing lists, and directly.

(7) Other factors:

(a) CGI::Simple:
I use CGI::Simple 1.105 as a light-weight version of CGI, but it has not
been properly patched to handle Apache V 2, so you'll need to patch the
source as per my 2 bug reports on RT:
http://rt.cpan.org/Ticket/Display.html?id=38931
http://rt.cpan.org/Ticket/Display.html?id=39146

(b) Yahoo! User Interface Library:
http://developer.yahoo.com/yui/
During this time I ran hundreds of tests incorporating YUI, although
it's not needed for the simple code above.
The 2 worst problems I had were:
- Editing a template and clicking the Reload button usually causes
DBD::Pg to abort, and
- Right-clicking on the page to Show Source, and then clicking the
Reload button causes DBD::Pg to abort (as unbelievable as that sounds).
The message in Apache's log (appearing twice per error) is:
DBD::Pg::st execute failed: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
at /usr/local/share/perl/5.10.0/Rose/DB/Object/Manager.pm line 1963.

Restarting Apache is the solution.

I can't say if this is just a YUI problem, or only occurs due to my
specific combination of packages.

Just re-clicking the Reload button many times, without doing anything
else, never produces this error.

It does mean I'm deeply uneasy about YUI, though.

(8) Background reading:
http://www.oreillynet.com/onlamp/blog/2005/10/repeat_after_me_lack_of__outpu.html






Hint: This would make a good addition to the wiki.




-- 
Ron Savage
ron at savage.net.au
http://savage.net.au/index.html




