[RAS] utf8 reference strings

Thomas Krichel krichel at openlib.org
Sat Feb 21 15:37:22 EST 2009


  Jose Manuel Barrueco writes

> I've managed to see correct utf8 characters in:
>
> CitEc database -> AMF files
>
> but now the problem is in the ACIS database.

  A bit of background here. JMBC and I have been working on the issue
  of citations lost between RAS and CitEc. It appears that there are
  issues in the character sets of the reference string
  (refstring). CitEc produced latin-1 refstrings and stuck them into
  the AMF files. We changed the column of the reference to utf-8.
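
  To illustrate the kind of mismatch described above, here is a
  minimal Python sketch. It is not taken from the CitEc or ACIS
  code, and the sample string is made up for the example. It shows
  what happens when latin-1 bytes are read as utf-8 and the reverse.

```python
# Hypothetical sample string; not an actual refstring from CitEc or ACIS.
refstring = "Études économiques, Université de Genève"

# The same text stored in the two encodings under discussion.
latin1_bytes = refstring.encode("latin-1")
utf8_bytes = refstring.encode("utf-8")

# latin-1 bytes read as utf-8: the accented letters are invalid
# sequences and turn into replacement characters (U+FFFD).
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# utf-8 bytes read as latin-1: no error is raised, but each accented
# letter becomes two unrelated characters.
as_latin1 = utf8_bytes.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

  The second case is the harder one to spot, since nothing fails
  and the string just gets longer and looks wrong.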

> There, the character set used in the citations table is still
> latin1. I've re-processed a document with character problems
> (RePEc:mar:volksw:200425) to test the changes. Before, the
> characters were ok in ACIS but wrong in CitEc. Now we have the
> problem on the other side. Try for instance:
>
> mysql> select clid,cnid,ostring from citations where ostring like 
> "SALMON, P. (2003), As%";
>
> Should we change the character set for ACIS too?

  I think this will have to be done. I am not sure how best to do
  it and hope that Ivan can advise. We can change the columns to
  utf-8 and reload all the citations. Maybe at this stage we will
  remove the link to the citations screen temporarily so that we
  have a chance to test things.
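
  If the reload has to cope with a mix of latin-1 and utf-8 input,
  one possible approach is sketched below in Python. This is only
  an assumption about how it could be done, not how ACIS or CitEc
  does it. The function name decode_refstring is hypothetical. It
  tries utf-8 first and falls back to latin-1, which can misread
  the rare latin-1 string that also happens to be valid utf-8.

```python
def decode_refstring(raw: bytes) -> str:
    """Decode bytes as utf-8 if they are valid utf-8, else as latin-1.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


# Made-up samples: the same word in both encodings, plus plain ASCII.
samples = [
    "Société".encode("latin-1"),
    "Société".encode("utf-8"),
    b"SALMON, P. (2003)",
]

decoded = [decode_refstring(s) for s in samples]
for text in decoded:
    print(text)
```

  Each decoded string can then be written back out as utf-8 before
  the citations are reloaded.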

  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
                                RePEc:per:1965-06-05:thomas_krichel
                                               skype: thomaskrichel


