[RAS] nebka problems

Thomas Krichel krichel at openlib.org
Sun Jan 20 10:43:48 EST 2008



  Christian Zimmermann writes

> - a reboot fixed it, the machine looked fine
> - nebka went down again, approx 24 hours after the first crash

  I just looked at this again. The crontab job from mutabor
  that I commented out, and which I think is responsible, is NOT
  an upload from mutabor to nebka but the reverse:

#!/bin/sh
rsync -t --log-format=%n aras@nebka.openlib.org:citec-export/* /home/adnetec/ras-exports/ | ~/Ivan/handle_ras_exports.pl /home/adnetec/ras-exports/


> - Thomas was doing a complete backup of the machine with rsync at the 
> time. He did not get to the original data of the machine, the aras 
> account.
> - We had a similar set of crashes in June 2006, that were diagnosed as an 
> issue with a directory in CitEc that had too many files. At the time, I 
> wrote:

  But this was an upload, and it was a number that was a lot 
  bigger than the numbers we have now.
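
  To compare the current numbers with the 2006 ones, something
  like the following could list the directories holding the most
  entries. It is only a sketch: it assumes a find that supports
  -mindepth and -maxdepth (GNU and BSD find do), the function name
  count_entries is made up for illustration, and ROOT is a
  placeholder because the actual CitEc path is not given here.

```shell
#!/bin/sh
# List the directories under a root that hold the most entries, largest first.
# ROOT is a placeholder; the real directory to inspect is an assumption.
count_entries() {
    # $1: directory to scan, $2: how many directories to report
    find "$1" -xdev -type d | while IFS= read -r dir; do
        # Count the direct children of this directory (files and subdirectories).
        # File names containing newlines would be miscounted.
        n=$(( $(find "$dir" -mindepth 1 -maxdepth 1 | wc -l) ))
        printf '%d\t%s\n' "$n" "$dir"
    done | sort -rn | head -n "$2"
}

ROOT="${ROOT:-.}"
count_entries "$ROOT" 10
```

  Run it with ROOT set to the directory in question. On a
  filesystem that is already in trouble it may be wiser to run it
  only after fsck, since it walks every directory.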


> Does this make sense? In the immediate, we would need to reboot the 
> machine Monday, comment out all crontab jobs, investigate the true origin 
> of the problem (we found it last year by looking at problematic inodes with 
> fsck), and then only try to back up (only the aras account, in particular 
> the userdata directory).

  OK. 

  In addition, I suggest you open an account at the ideas
  machine, to hold the most important data from acis and ras.
  This backup should be conducted every hour or so, in addition
  to backups to sahure (later to raneb) and fafner, done 
  on alternate days.
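
  The hourly backup could be a small rsync job along these lines.
  This is a sketch under assumptions, not a tested setup: the
  source path, the remote account, the host name
  ideas.example.org and the script location are all placeholders,
  since none of them are fixed yet.

```shell
#!/bin/sh
# Sketch of an hourly backup job. Every name here is a placeholder:
# the source path, the remote account and the host are assumptions.
SRC="${SRC:-/home/aras/userdata/}"
DEST="${DEST:-backup@ideas.example.org:ras-backup/}"

# Build the rsync command line as text so it can be inspected first.
backup_cmd() {
    printf 'rsync -a %s %s\n' "$SRC" "$DEST"
}

# Print the command unless RUN=1 is set, so the job can be checked
# before it touches the remote machine.
if [ "${RUN:-0}" = 1 ]; then
    rsync -a "$SRC" "$DEST"
else
    backup_cmd
fi

# Possible crontab entry (hourly, on the hour); the script path is hypothetical:
# 0 * * * * RUN=1 /home/aras/bin/hourly-backup.sh
```

  Note that this pulls nothing from the remote side and does not
  use --delete, so a damaged source directory would not wipe out
  files already copied.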

> I will be in a train back to Paris again while the machine probably gets 
> back up (Monday EST 10am-3pm), but I will check in as soon as 
> possible once back in Paris.

  I will be at home on Monday night. I am 5 hours ahead of
  you, 11 hours ahead of EST.

  If I can be of any help any time, please don't hesitate
  to call me on my home number below. I can call you right
  back.

  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
                                RePEc:per:1965-06-05:thomas_krichel
  phone: +7 383 330 6813                       skype: thomaskrichel



