[RAS] Migration Help
Thomas Krichel
krichel at openlib.org
Fri Aug 26 08:44:32 CDT 2011
Ivan Kurmanov writes
> I've been working on nebka for the last week or more with the RAS database.
>
> The first thing I did: I modified the ACIS code to use Storable's
> nfreeze() function when storing the data into the db. Also, at that
> time I found that part of the ACIS code on nebka was already using
> nfreeze().
>
> Then I've written a test script which runs through the specified
> database tables (MySQL) and checks that their values were written with
> Storable correctly.
>
> Then for the last 5 days or so I was working on the update daemon
> database (Berkeley DB) of RAS on nebka. It also contains
> Storable-encoded strings, and it is important. First, I've written a
> script (actually, a version of the same script mentioned above) to
> check the values in it (by fully scanning it, checking each value).
> That found some issues.
>
> In fact, that checking found some serious data corruption: some
> keys were mixed with values, and some values were hopelessly
> broken.
I did suspect things were not OK, because when I run fewer
updates (with a timeout longer than the default of one week),
things are not updated properly. I got serious complaints from
CZ about it. I wanted to reduce the load on the box by setting
a higher TOO_OLD.
> Second, I've written a script to correct values which need correction
> (via nfreeze) and to remove the ones that cannot be corrected.
>
> That work is now done. BUT at least a part of the update daemon
> database is still corrupted there. The good news is that this part is
> not RePEc-part and it would not (should not) cause any trouble to
> rebuild it. The bad news is that it is corrupted on the internal
> Berkeley DB level and I do not know how to fix it. The corruption is
> reported by the db4.6_verify tool. Berkeley DB documentation is not
> clear on how to fix it, or I'm not looking hard enough.
When I am desperate, I do

  db_dump foo | db_load foo_clean

On PubMed's 20 million records, that takes several days.
I have been desperate many times.
>
> Which leads me to the whole other topic: in the longer run we should
> avoid Berkeley DB and use something else instead. Luckily, this
> shouldn't be too hard to do, since all the BerkeleyDB-related code is
> concentrated in a couple of modules or so. And there are alternatives.
> Kyoto Cabinet http://fallabs.com/kyotocabinet/ being one of them.
or mongo or couch....
I have had *tons* of trouble with BDB. I hate it.
But still I don't think we should work on this now.
CZ will kick our butts.
You need to look at the ACIS code to find where the problem
is. Blaming it on BDB is not the way forward, I think.
> Now, I've moved out the corrupted BDB file ~/acis/RI/data/ACIS/records
> (renamed it).
>
> I think the data is now ready to be migrated to the new server
> (again). The changes in ~/acis/ need to be copied over too, as well as
> the (partially) fixed ~/acis/RI/data contents. Dan, I suggest you do
> that.
>
I think he should not do this at all. Instead, he should start with
the text data and the most recent ACIS release, and build a clean
dataset. I have done this today and I will forward my notes on it.
> It may be a good idea to wait for the nearest run of the nightly
> script to create the RI database snapshots in ~/backup/2011/08/26 and
> to use those.
>
>
> With these data and code changes, it should all run on the new server,
> but we'll do our tests and checks as soon as it is copied.
>
> Any comments? Questions?
I bet that this will not work. Dan will still not be able
to read your storables, and he will not be able to read his
own storables when he updates perl. Get rid of Storable.
Keep BDB, for now.
Cheers,
Thomas Krichel http://openlib.org/home/krichel
http://authorprofile.org/pkr1
skype: thomaskrichel
More information about the RAS-run mailing list