[RAS] badblocks
Thomas Krichel
krichel at openlib.org
Wed Feb 27 13:47:26 EST 2008
----- Forwarded message from Bob Parks <bparks at artsci.wustl.edu> -----
From: Bob Parks <bparks at artsci.wustl.edu>
To: Thomas Krichel <krichel at openlib.org>
Subject: Re: [RAS] badblocks
Thomas Krichel wrote:
>
> Bob Parks writes
>
>
>> My memory is different about raneb.
>>
>
> We had a bad disk. When we replaced it, it was fine.
>
>
>> All of the errors seem to melt away when some of the crons are disabled -
>> such as Christian did with the ones involving du.
>>
>>
>
> So what do we conclude from this? A software problem?
>
Yes, IMHO. As Christian wrote earlier about nebka, there are limits to
directory sizes. He seemed to indicate that a cron job with du might have
been the entire problem. We have had similar problems in the past.
>
>> Get rid of THE disk does not compute. There are two disks, and in a
>> configuration that is as fault tolerant as it gets.
>>
>
> But it will break at some stage. The badblocks show it's broken.
>
There are bad blocks on every disk. Bad blocks, unless there are a large
number of them, do not show that the 'disk' is failing. And again, this is
a mirrored pair, two disks in RAID 1, with a hardware controller. Now that
I think about it, it is not clear which bad blocks on which disk are being
reported by the Adaptec controller.
Note that nearly identical hardware exists on Bill's RFE machine and it has
never had an error. You have had problems on nebka and snefru (identical
hardware) and raneb (very different hardware). That alone leads me to
suspect software.
>
>> Up to you and Christian but I believe this is not the solution. Bob
>>
>
> What is the solution?
>
>
As Christian has done, carefully bring the machine back to life without all
the crons, and add the crons back sparingly. I have not heard of any more
problems with nebka since he did that, and it is on the same two-disk
RAID 1 mirror.
If you do decide to make the 143 gig bootable, Christian should, after a
time, boot and enter the Adaptec controller. Then break the 'container'
which has the two 68 gig disks, and then you can have two 68 gig disks,
check them individually, and gain 68 gig of space.
In the end, it is your choice.
Bob
> Cheers,
>
> Thomas Krichel http://openlib.org/home/krichel
> RePEc:per:1965-06-05:thomas_krichel
> phone: +7 383 330 6813 skype: thomaskrichel
>
----- End forwarded message -----
--
Cheers,
Thomas Krichel http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
phone: +7 383 330 6813 skype: thomaskrichel