[cgiapp] [Fwd: Re: ValidateRM not PP]

Michael Peters mpeters at plusthree.com
Mon Jan 26 09:22:11 EST 2009


Lyle wrote:

> I've looked for one, the only one I could find is HTML::TagParser but it 
> isn't suitable as it can't be used to recreate the page. Also looking at 
> the source it uses regexp.

Just looking at the source code briefly, it seems that it's using regexes as part of its 
lexing/tokenizing, which is completely appropriate.
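
To illustrate what regex-based tokenizing means here, a minimal sketch in plain Perl. This is not 
HTML::TagParser's code; tokenize_html is a hypothetical helper, and the pattern is deliberately 
naive (it does not handle a ">" inside an attribute value, comments, or a stray "<").

```perl
use strict;
use warnings;

# Hypothetical helper: split HTML into tag and text tokens with one regex.
# \G with /gc continues each match where the previous one stopped.
sub tokenize_html {
    my ($html) = @_;
    my @tokens;
    while ($html =~ /\G(?:(<[^>]*>)|([^<]+))/gc) {
        push @tokens, defined $1 ? [ tag => $1 ] : [ text => $2 ];
    }
    return @tokens;
}

my $html   = '<p>Name: <input type="text" name="user"></p>';
my @tokens = tokenize_html($html);
print "$_->[0]: $_->[1]\n" for @tokens;

# Joining the token values in order gives back the original markup.
my $rebuilt = join '', map { $_->[1] } @tokens;
```

Because every character ends up in some token, the page can be recreated from the token list, 
at least for well-behaved input like the sample above.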

> As much as the idea of writing a Pure Perl parser intrigues me, I don't 
> have the time :( Especially as at this time I wouldn't actually be using 
> it (my script is generating all the html input tags and parsing them 
> into the html template).

Have you thought about maybe using an XML module? XML::SAX has a pure-Perl driver. Maybe if your 
HTML is XHTML it could work. Or if you want, you can probably use a libxml-based module. libxml is 
extremely common (installed on most systems) and has a forgiving HTML mode.
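
A rough sketch of the libxml route, assuming XML::LibXML (one libxml-based module) is installed 
and reasonably recent. The sample markup is made up for illustration; recover mode is what lets 
the parser accept sloppy HTML such as the unclosed paragraph below.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up sample: not well-formed XML (unclosed <input> and <p>).
my $html = '<form><input type="text" name="user"><p>unclosed paragraph</form>';

# Parse with libxml's HTML parser; recover keeps going past markup errors,
# suppress_errors keeps the parser from printing them.
my $doc = XML::LibXML->load_html(
    string          => $html,
    recover         => 1,
    suppress_errors => 1,
);

# Collect the name attribute of every input element via XPath.
my @names = map { $_->getAttribute('name') } $doc->findnodes('//input');
print "input: $_\n" for @names;

# Serialize the parsed tree back out as HTML.
print $doc->toStringHTML;
```

Note that the serialized output is libxml's cleaned-up version of the page, not a 
byte-for-byte copy of the input.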

-- 
Michael Peters
Plus Three, LP


