[RAS] badblocks
Thomas Krichel
krichel at openlib.org
Wed Feb 27 13:52:19 EST 2008
----- Forwarded message from Bob Parks <bparks at artsci.wustl.edu> -----
Envelope-to: krichel at localhost
Delivery-date: Thu, 28 Feb 2008 00:47:49 +0600
From: Bob Parks <bparks at artsci.wustl.edu>
To: Thomas Krichel <krichel at openlib.org>
Subject: Re: [RAS] badblocks
Thomas Krichel wrote:
> Bob Parks writes
>
>
>> Yes, IMHO. As Christian wrote earlier about nebka, there are limits to
>> directory sizes. He seemed to indicate that a cron job
>> with du might have been the entire problem. We have had similar problems
>> in the past.
>
> my theory: du puts stress on the disk, it hits the bad block, and bang!
>
Possible, very possible.
>> There are bad blocks on every disk. Bad blocks, unless there are a large
>> number of them, do not show that the 'disk' is failing. And again, this is
>> a mirrored disk, two disks, in RAID 1, with a hardware controller. Now that
>> I think on it, it is not clear which bad blocks on which disk are being
>> reported by the Adaptec controller -
>>
>
> my theory: the disk is one disk to the o/s.
Yes it is, but a bad block is a physical disk concept - but who knows what
evil lurks in the depths.
>
>> Note that nearly identical hardware exists on Bill's RFE machine and it
>> has never had an error. You have had problems on nebka and snefru
>> (identical hardware) and raneb (very different hardware). That alone
>> leads me to suspect software.
>>
>
> I don't remember a problem on snefru. The common file sets are
> the adrepec files (common to raneb, sahure, fafner, nebka, mutabor) and
> the citec files (common to mutabor, raneb, snefru, sahure, fafner).
> (Yes, I back up!)
> What I think is what's written in section 27.2.4, "badblocks and e2fsck", of
> http://eduunix.ccut.edu.cn/index/html/linux/OReilly.LPI.Linux.Certification.in.a.Nutshell.2nd.Edition.Jul.2006/0596005288/lpicertnut2-CHP-27-SECT-2.html
>
> They say:
> When a disk is failing, it will usually get an exponential increase in
> bad blocks, and after a short while it will run out of spare blocks,
> whereupon you will get into trouble with your filesystems on that
> disk.
>
> It has already run out of spare blocks, that's why some
> bad blocks show up to the o/s.
>
Could very well be - the eduunix.ccut.edu.cn reference is very good and I will
go with your theory. I will be interested to know just how you rsync to the
143 GB disk and then make it bootable.
Bob
> Cheers,
>
> Thomas Krichel http://openlib.org/home/krichel
> RePEc:per:1965-06-05:thomas_krichel
> phone: +7 383 330 6813 skype: thomaskrichel
>
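The quoted passage about a failing disk showing an exponential increase in bad blocks can be turned into a rough check. What follows is only a minimal sketch, assuming you have kept the text output of several successive badblocks scans (one block number per line) and want to see whether the counts keep multiplying. The function names and the thresholds `min_ratio` and `min_steps` are hypothetical choices for illustration, not values from the thread or from any tool. Note also that behind a hardware RAID controller the o/s sees one logical disk, so these counts only reflect what surfaces to the o/s.

```python
from typing import List, Sequence


def count_bad_blocks(report_text: str) -> int:
    """Count entries in a badblocks-style report: one block number per line."""
    return sum(1 for line in report_text.splitlines() if line.strip().isdigit())


def growth_ratios(counts: Sequence[int]) -> List[float]:
    """Ratio of each scan's count to the previous scan's count.

    Steps whose previous count is zero are skipped to avoid dividing by zero.
    """
    return [later / earlier
            for earlier, later in zip(counts, counts[1:])
            if earlier > 0]


def looks_exponential(counts: Sequence[int],
                      min_ratio: float = 1.5,   # hypothetical threshold
                      min_steps: int = 3) -> bool:  # hypothetical threshold
    """True if the last `min_steps` ratios all show growth of at least `min_ratio`."""
    ratios = growth_ratios(counts)
    if len(ratios) < min_steps:
        return False
    return all(r >= min_ratio for r in ratios[-min_steps:])


if __name__ == "__main__":
    # Made-up counts from four successive scans, for illustration only.
    print(looks_exponential([2, 4, 9, 20]))
    print(looks_exponential([10, 10, 11, 11]))
```

A steady handful of bad blocks would fail this check, which matches the point made earlier in the thread that a few bad blocks alone do not show that a disk is failing.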