[RAS] nebka up
Christian Zimmermann
christian.zimmermann at uconn.edu
Tue Jan 29 10:29:26 EST 2008
The machine survived overnight. It passed all tests on the ACIS side. I am
now progressively restoring services. The RI daemon is now running, and I
did a run of /home/aras/acis/bin/nightly >>/home/aras/nightly.log 2>&1,
which is scheduled in crontab to run at 23:54, i.e. just after the last
known instructions before the crashes. It worked well.
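For reference, the "last known instructions before the crashes" can be pulled out of syslog mechanically. The following is only a minimal sketch: the function name last_cron_before_restart is made up for illustration, and it assumes one log entry per line (unlike the wrapped excerpts quoted below) and that a "syslogd ... restart." line marks the reboot after each crash.

```shell
# Print the last CRON entry logged before each syslogd restart.
# Hypothetical helper; assumes unwrapped syslog lines, one entry per line.
last_cron_before_restart() {
  awk '
    /CRON\[[0-9]+\]/ { last = $0 }     # remember the most recent cron job
    /syslogd .*restart/ {              # a restart line follows each crash
      if (last != "") print last
      last = ""                        # start fresh for the next crash
    }
  ' "$@"
}
```

It reads the files given as arguments, or standard input if none are given, e.g. last_cron_before_restart /var/log/syslog (the log path is an assumption about this machine).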
I have not reestablished the following services:
#
# Report, backup, rotate, archive
#
54 23 * * * /home/aras/acis/bin/nightly >>/home/aras/nightly.log 2>&1
#
# Make RePEc:ras (RePEc:per) archive
#
*/9 * * * * cd /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh
#
# Clean up old ACIS user sessions
#
*/26 * * * * /home/aras/acis/bin/clean-up >> /home/aras/acis/clean-up.log
#
# Update daemon database checkpoint
#
27 * * * * cd /home/aras/lib/bdb/bin ; ./db_checkpoint -1 -h /home/aras/acis/RI/data && ./db_archive -d -h /home/aras/acis/RI/data
I am holding off the rest for the moment. Should we revert the DNS record
so that people can connect now?
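
The quoted message below says the crontab entries were commented out with '#CZ'. Bringing them back one at a time could be sketched as follows. This is not what was actually run: the helper name uncomment_cz is hypothetical, and it assumes the marker sits at the very start of the line, optionally followed by one space.

```shell
# Re-enable only those '#CZ'-disabled crontab lines that contain a given string.
# Hypothetical helper; reads a crontab on stdin, writes the result to stdout.
uncomment_cz() {
  awk -v pat="$1" '
    # strip the marker only from disabled lines mentioning the chosen job
    index($0, "#CZ") == 1 && index($0, pat) > 0 { sub(/^#CZ ?/, "") }
    { print }                          # every line is passed through
  '
}
```

For example, as user aras: crontab -l | uncomment_cz 'bin/nightly' | crontab -
All other '#CZ' lines stay disabled, so each job can be watched on its own before the next one is restored.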
On Mon, 28 Jan 2008, Ivan Kurmanov wrote:
> Sounds hopeful.
>
> There is also a job or two in crontab of user adrepec.
>
> in root, do "crontab -lu adrepec"
>
> ivan
>
>
> On 28 Jan 2008, at 22:27, Christian Zimmermann wrote:
>
>> I looked everywhere in the logs, I see nothing wrong. There are some
>> indications of corrupt mysql tables, but when I checked those used by RAS
>> after the first crash, they were fine. Maybe there are corrupt tables
>> elsewhere. I have not yet run the checks, I'll try this evening.
>>
>> I commented out crontab in the root and aras accounts with '#CZ'. Let's
>> see whether the machine survives the night. If so, and nobody else sees a
>> problem, we should gradually get the service back. The first thing would
>> be to get adrepec current. Then open the web server to users. Then get
>> CitEc data back. Does this make sense?
>>
>>
>>
>> On Mon, 28 Jan 2008, Christian Zimmermann wrote:
>>
>>> First things I see: both crashes happened exactly at the same time:
>>>
>>> Jan 17 23:09:01 nebka /USR/SBIN/CRON[14205]: (aras) CMD (cd
>>> /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
>>> Jan 17 23:10:01 nebka /USR/SBIN/CRON[14237]: (www-data) CMD ([ -x
>>> /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r
>>> /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl
>>> -config=awstats -update >/dev/null)
>>> Jan 17 23:10:01 nebka /USR/SBIN/CRON[14238]: (root) CMD (test -x
>>> /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
>>> Jan 17 23:15:01 nebka /USR/SBIN/CRON[14474]: (root) CMD ([ -x
>>> /usr/lib/sysstat/sa1 ] && { [ -r "$DEFAULT" ] && . "$DEFAULT" ; [
>>> "$ENABLED" = "true" ] && exec /usr/lib/sysstat/sa1 $SA1_OPTIONS 1 1 ; })
>>> Jan 17 23:16:01 nebka /USR/SBIN/CRON[14476]: (aras) CMD
>>> (/home/aras/acis/bin/apu 7 >>/home/aras/apu-job.log 2>&1)
>>> Jan 17 23:17:01 nebka /USR/SBIN/CRON[14489]: (root) CMD ( cd / &&
>>> run-parts --report /etc/cron.hourly)
>>> Jan 17 23:18:01 nebka /USR/SBIN/CRON[14492]: (aras) CMD (cd
>>> /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
>>> Jan 17 23:20:01 nebka /USR/SBIN/CRON[14547]: (www-data) CMD ([ -x
>>> /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r
>>> /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl
>>> -config=awstats -update >/dev/null)
>>> Jan 17 23:20:01 nebka /USR/SBIN/CRON[14548]: (root) CMD (test -x
>>> /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
>>> Jan 17 23:22:01 nebka /USR/SBIN/CRON[14703]: (root) CMD (du -cs /* >
>>> du_slash_`date -I`)
>>> Jan 18 14:04:08 nebka syslogd 1.4.1#18: restart.
>>>
>>> du /* seems to be the tripping point.
>>>
>>> Christian Zimmermann FIGUGEGL!
>>> Department of Economics
>>> University of Connecticut
>>> 341 Mansfield Road, Unit 1063
>>> Storrs, CT 06269-1063
>>> http://ideas.repec.org/zimm/ christian.zimmermann at uconn.edu
>>> http://ideas.repec.org/e/pzi1.html
>>>
>>> On Mon, 28 Jan 2008, Christian Zimmermann wrote:
>>>
>>>> Tim seems to have put nebka back online, and it seems to be spewing out
>>>> emails. I will comment everything in crontab and kill whatever is running
>>>> to let us investigate the problems.
>>>>
>>>> _______________________________________________
>>>> RAS-run mailing list
>>>> RAS-run at lists.openlib.org
>>>> http://lists.openlib.org/cgi-bin/mailman/listinfo/ras-run
>>>>
>>>
>>
>
> -ivan
>
>
>