03-26-2011, 01:46 AM
Lindsay Haisley

System problems - some progress

On Fri, 2011-03-25 at 22:59 +0000, Duncan wrote:
> Simply my experience-educated opinion. YMMV, as they say. And of course,
> it applies to new installations more than your current situation, but as
> you mentioned that you are planning such a new installation...

Duncan, thanks for your very thorough discussion of current
disk/RAID/filesystem/etc. technologies. Wow! I'm going to have to read
it through several times to absorb it. I've gotten to the point at
which I'm more involved with what I can _do_ with the Linux boxes I set
up than what I can do that's cool and cutting edge with Linux in setting
them up, but playing with bleeding-edge stuff has always been tempting.
Some of the stuff you've mentioned, such as btrfs, are totally new to me
since I haven't kept up with the state of the art. Some years ago we
had EVMS, which was developed by IBM here in Austin. I was a member of
the Capital Area Central Texas UNIX Society (CACTUS) and we had the EVMS
developers come and talk about it. EVMS was great. It was a layered
technology with an API for a management client, so you could have a CLI,
a GUI, a web-based management client, whatever, and all of them using
the same API to the disk management layer. It was an umbrella
technology which covered several levels of Linux MD Raid plus LVM. You
could put fundamental storage elements together like tinker-toys and
slice and dice them any way you wanted to.

EVMS was started from an initrd, which set up the EVMS platform and then
did a pivot_root to the EVMS-supported result. I have our SOHO
firewall/gateway and file server set up with it. The root fs is on a
Linux MD RAID-1 array, and what's on top of that I've rather forgotten
but the result is a drive and partition layout that makes sense for the
purpose of the box. I set this up as a kind of proof-of-concept
exercise because I was taken with EVMS and figured it would be useful,
which it was. The down side of this was that some time after that, IBM
dropped support for the EVMS project and pulled their developers off of
it. I was impressed with the fact that IBM was actually paying people
to develop open source stuff, but when they pulled the plug on it, EVMS
became an orphaned project. The firewall/gateway box runs Gentoo, so I
proceeded with regular updates until one day the box stopped booting.
The libraries, notably glibc, embedded in the initrd system got out of
sync, version wise, with the rest of the system, and I was getting some
severely strange errors early in the boot process followed by a kernel
panic. It took a bit of work to even _see_ the errors, since they were
emitted in the boot process earlier than CLI scroll-back comes into
play, and then there was further research to determine what I needed to
do to fix the problem. I ended up having to mount and then manually
repair the initrd internal filesystem, manually reconstituting library
symlinks as required.
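
For anyone facing the same kind of repair, here is a minimal sketch (not
the procedure actually used on that box) of how the broken library
symlinks inside an unpacked or loop-mounted initrd tree might be listed
before fixing them by hand. The function name and the idea of resolving
absolute link targets against the tree's own root are assumptions made
for illustration.

```python
import os


def find_dangling_symlinks(root):
    """Return sorted paths (relative to root) of symlinks whose targets are missing.

    Absolute link targets are resolved against `root`, because at boot the
    tree under `root` is what the initrd sees as /.
    Limitation: a chain of links that passes through an absolute link is
    followed on the host filesystem by os.path.exists(), not inside `root`.
    """
    dangling = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists symlinks to directories in dirnames, others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if os.path.isabs(target):
                resolved = os.path.join(root, target.lstrip("/"))
            else:
                resolved = os.path.join(dirpath, target)
            if not os.path.exists(resolved):
                dangling.append(os.path.relpath(path, root))
    return sorted(dangling)
```

Something like find_dangling_symlinks("/mnt/initrd") (the mount point is
made up) would then give the list of links to recreate against the
library versions actually present in the image.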

I've built some Linux boxes for one of my clients - 1U servers and the
like. These folks are pretty straight-forward in their requirements,
mainly asking that the boxes just work. The really creative work goes
into the medical research PHP application that lives on the boxes, and
I've learned beaucoup stuff about OOP in PHP, AJAX, etc. from the main
programming fellow on the project. We've standardized on Ubuntu server
edition on SuperMicro 4 drive 1U boxes. These boxes generally come with
RAID supported by a proprietary chipset or two, which never works quite
right with Linux, so the first job I always do with these is to rip out
the SATA cabling from the back plane and replace the on-board RAID with
an LSI 3ware card. These cards don't mess around - they JUST WORK.
LSI/3ware has been very good about supporting Linux for their products.
We generally set these up as RAID-5 boxes. There's a web-based
monitoring daemon for Linux that comes with the card, and it just works,
too, although it takes a bit of dickering. The RAID has no component in
user-space (except for the monitor daemon) and shows up as a single SCSI
drive, which can be partitioned and formatted just as if it were a
single drive. The 3ware cards are nice! If you're using a redundant
array such as RAID-1 or RAID-5, you can designate a drive as a hot-spare
and if one of the drives in an array fails, the card will fail over to
the hot-spare, rebuild the array, and the monitor daemon will send you
an email telling you that it happened. Slick!

The LSI 3ware cards aren't cheap, but they're not unreasonable either,
and I've never had one fail. I'm thinking that my drive setup on my
new desktop box will probably use RAID-1 supported by a 3ware card.
I'll probably use an ext3 filesystem on the partitions. I know ext4 is
under development, but I'm not sure if it would offer me any advantages.
I used reiserfs on some of the partitions on my servers, and on some
partitions on my desktop box too. Big mistake! There was a bug in
reiserfs support in the then-current kernel when I built the first server and
the kernel crapped all over the hard drive one night and the box
crashed! I was able to fix it and salvage customer data, but it was
pretty scary. Hans Reiser is in prison for life for murder, and there's
like one person on the Linux kernel development group who maintains
reiserfs. Ext3/4, on the other hand, is solid, maybe not quite as fast,
but supported by a dedicated group of developers.

So I'm thinking that this new box will have a couple of
professional-grade (the 5-year warranty type) 1.5 or 2 TB drives and a
3ware card. I still haven't settled on the mainboard, which will have
to support the 3ware card, a couple of sound cards and a legacy Adaptec
SCSI card for our ancient but extremely well-built HP scanner. The
chipset will have to be well supported in Linux. I'll probably build
the box myself once I decide on the hardware.

I'm gonna apply the KISS principle to the OS design for this, and stay
away from bleeding edge software technologies, although, especially
after reading your essay, it's very tempting to try some of this stuff
out to see what the _really_ smart people are coming up with! I'm
getting off of the Linux state-of-the-art train for a while to go
walking in the woods. The kernel will have to be low-latency since I
may use the box for recording work with Jack and Ardour2, and almost
certainly for audio editing, and maybe video editing at some point.
That's where my energy is going to go for this one.

--
Lindsay Haisley |"Windows .....
FMP Computer Services | life's too short!"
512-259-1190 |
http://www.fmp.com | - Brad Johnston
 
03-26-2011, 07:40 AM
Duncan

Lindsay Haisley posted on Fri, 25 Mar 2011 21:46:32 -0500 as excerpted:

> On Fri, 2011-03-25 at 22:59 +0000, Duncan wrote:
>> Simply my experience-educated opinion. YMMV, as they say. And of
>> course,
>> it applies to new installations more than your current situation, but
>> as you mentioned that you are planning such a new installation...
>
> Duncan, thanks for your very thorough discussion of current
> disk/RAID/filesystem/etc. technologies. Wow! I'm going to have to read
> it through several times to absorb it. I've gotten to the point at
> which I'm more involved with what I can _do_ with the Linux boxes I set
> up than what I can do that's cool and cutting edge with Linux in setting
> them up, but playing with bleeding-edge stuff has always been tempting.

By contrast, Linux is still my hobby, tho really, a full time one in that
I spend hours a day at it, pretty much 7 days a week. I'm thinking I
might switch to Linux as a job at some point, perhaps soon, but it's not a
switch I'll make lightly, and it's not something I'll even consider
"selling my soul for" to take -- it'll be on my terms or I might as well
stay with Linux as a hobby -- an arrangement that works and that suits me
fine.

Because Linux is a hobby, I go where my interest leads me. Even tho I'm
not a developer, I test, bisect and file bugs on the latest git kernels
and am proud to be able to say that a number of bugs were fixed before
release (and one, a Radeon AGP graphics bug, after; it hit stable and took two
kernel releases before it was ultimately fixed, as reasonably new graphics
cards on AGP busses aren't as common as they once were...) because of my
work.

Slowly, one at a time, I've tackled Bind DNS, NTPD, md/RAID, direct
netfilter/iptables (which interestingly enough were *SIGNIFICANTLY* easier
for me to wrap my mind around than the various so-called "easier" firewall
tools that ultimately use netfilter/iptables at the back end anyway,
perhaps because I already understood network basics and all the "simple"
ones simply obscured the situation for me) and other generally considered
"enterprise" tools. But while having my head around these normally
enterprise technologies well enough to troubleshoot them may well help me
with a job in the field in the future, that's not why I learned them. As
a Linux hobbyist, I learned them for much the same reason mountain climber
hobbyists climb mountains, because they were there, and for the personal
challenge.

Meanwhile, as I alluded to earlier, I tried LVM2 (successor to both EVMS
and the original LVM, as you likely already know) on top of md/RAID, but
found that for me, the layering of technologies obscured my
understanding, to the point where I was no longer comfortable with my
ability to recover in a disaster situation in which both the RAID and LVM2
levels were damaged.

Couple that with an experience where I had a broken LVM2 that needed
rebuilt, but with the portage tree on LVM2, and I realized that for what I
was doing, especially since md/raid's partitioned-raid support was now
quite robust, the LVM2 layer just added complexity for very little real
added flexibility or value, particularly since I couldn't put / on it
anyway, without an initr* (one of the technologies I've never taken time
to understand to the point I'm comfortable using it).

That's why I recommended that you pick a storage layer technology that
fits your needs as best you can, get comfortable with it, and avoid if
possible multi-layering. The keep-it-simple rule really /does/ help avoid
admin-level fat-fingering, which really /is/ a threat to data and system
integrity. Sure, there's a bit of additional flexibility by layering, but
it's worth the hard look at whether the benefit really does justify the
additional complexity. In triple-digit or even higher double-digit
storage device situations, basically enterprise level, there's certainly
many scenarios where the multi-level layering adds significant value, but
part of being a good admin, ultimately, is recognizing where that's not
the case, and with a bit of experience under my belt, I realized it wasn't
a good tradeoff for my usage.

Here, I picked md/raid over lvm2 (and over hardware RAID) for a number of
reasons. First, md/raid for / can be directly configured on the kernel
command line. No initr* needed. That allowed me to put off learning
initr* tech for another day, as well as reducing complexity. As for
md/raid over hardware RAID, there's certainly a place for both,
particularly when md/raid may be layered on hardware raid, but for low-
budget emphasis of the R(edundant) in Redundant Array of Independent
Devices (RAID), there's nothing like being able to plug in not just
another drive, but any standard (SATA in my case) controller, and/or any
mobo, and with at worst a kernel rebuild with the new drivers (since I
choose to build-in what I need and not build what I don't, so a kernel
rebuild would be necessary if it were different controllers), be back up
and running. No having to find a compatible RAID card... Plus, the Linux
kernel md/RAID stack has FAR FAR more testing under all sorts of corner-
case situations than any hardware RAID is **EVER** going to get.
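
To make the "directly configured on the kernel command line" point
concrete, here is a small sketch that builds such a command line. It
assumes the md= form for arrays with persistent superblocks described in
the kernel's md documentation, and that md and the RAID level are built
into the kernel rather than modules. The device names and the function
name are hypothetical; check the md documentation for the form your
superblock version needs.

```python
def md_root_cmdline(md_number, members, extra=()):
    """Build kernel arguments that assemble /dev/md<N> at boot and use it as /.

    md_number: the N in /dev/mdN
    members:   component partitions, e.g. ["/dev/sda1", "/dev/sdb1"] (made up)
    extra:     any further arguments to append unchanged
    """
    if not members:
        raise ValueError("at least one member device is required")
    parts = [
        "root=/dev/md%d" % md_number,
        "md=%d,%s" % (md_number, ",".join(members)),
    ]
    parts.extend(extra)
    return " ".join(parts)


# The result would go on the kernel line of the boot loader config.
print(md_root_cmdline(0, ["/dev/sda1", "/dev/sdb1"]))
```

No initr* is involved here: the kernel itself assembles the array before
mounting root, which is the whole attraction.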

But as I mentioned, the times they are a changin'. The latest desktop
environments (kde4.6, gnome3 and I believe the latest gnome2, and xfce?,
at minimum) are leaving the deprecated hal behind in favor of
udev/udisks/upower and etc. udisks in particular depends on
device-mapper, now part of lvm2, and the usual removable device
auto-detect/auto-mount functionality of the desktops depends in turn on
udisks. That probably won't affect non-X server based installations, and
it arguably doesn't /heavily/ (other than non-optional dependencies)
affect *ix traditionalist desktop users who aren't comfortable with
auto-mount in the first place (I'm one, there's known security tradeoffs
involved, see recent stories on auto-triggered vulns due to gnome
scanning for icons on auto-mounted thumbdrives and/or cds, for
instance). Still, it's a fact that within the year, most new releases
will be requiring that device-mapper and thus lvm2 be installed and
device-mapper enabled in the kernel, to support their automount
functionality.

As such, and because lvm2 has at least basic raid-0 and raid-1 support
of its own (tho not the more advanced stuff, raid5/6/10/50/60 etc, last
I checked, but I may well be behind), lvm2 is likely to be a better
choice for many than md/raid. That's particularly so for distributions
relying on prebuilt kernels, therefore modules, therefore initr*s,
already, so lvm2's initr* requirement isn't a factor.
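
For those who build their own kernels, a quick way to see whether
device-mapper is enabled is to look at the kernel .config. The sketch
below parses .config-style text and reports two symbols: CONFIG_BLK_DEV_DM
(device-mapper) and CONFIG_BLK_DEV_MD (md/raid). The sample text and the
function names are made up for illustration, and which symbols a given
desktop stack really needs is something to verify against its own
documentation.

```python
import re

# Symbols of interest and what they correspond to in the discussion above.
REQUIRED = {
    "CONFIG_BLK_DEV_DM": "device-mapper (used by lvm2)",
    "CONFIG_BLK_DEV_MD": "md/raid",
}


def parse_kernel_config(text):
    """Map CONFIG_* symbols to their value ('y', 'm', ...); unset ones map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def report(text, required=REQUIRED):
    """Return {symbol: value} for the required symbols; missing counts as 'n'."""
    options = parse_kernel_config(text)
    return {name: options.get(name, "n") for name in required}


sample = """\
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
# CONFIG_BLK_DEV_DM is not set
"""
print(report(sample))
```

On a real system the text would come from /usr/src/linux/.config or, if
the kernel exposes it, from /proc/config.gz.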

Meanwhile, while btrfs isn't /yet/ a major choice unless you want still
clearly experimental, by first distro releases next year, it's very likely
to be. And because it's the clear successor to ext*, and has built-in
multi-device and volume management flexibility of its own, come next year,
both lvm2 and md/raid will lose their place in the spotlight to a large
degree. Yet still, btrfs is not yet mature and while tantalizing in its
closeness, remains still an impractical immediate choice. Plus, there's
likely to be some limitations to its device management abilities that
aren't clear yet, at least not to those not intimately following its
development, and significant questions remain on what filesystem-supported
features will be boot-time-supported as well.

> Some of the stuff you've mentioned, such as btrfs, are totally new to me
> since I haven't kept up with the state of the art. Some years ago we
> had EVMS, which was developed by IBM here in Austin. I was a member of
> the Capital Area Central Texas UNIX Society (CACTUS) and we had the EVMS
> developers come and talk about it. EVMS was great. It was a layered
> technology with an API for a management client, so you could have a CLI,
> a GUI, a web-based management client, whatever, and all of them using
> the same API to the disk management layer. It was an umbrella
> technology which covered several levels of Linux MD Raid plus LVM. You
> could put fundamental storage elements together like tinker-toys and
> slice and dice them any way you wanted to.

The technology is truly wonderful. Unfortunately, the fact that running /
on it requires an initr* means it's significantly less useful than it
might be. Were that one requirement removed, the whole equation would be
altered and I may still be running LVM instead of md/raid. Not that it's
much of a problem for those running a binary-based distribution already
dependent on an initr*, but even in my early days on Linux, back on
Mandrake, I was one of the outliers who learned kernel config
customization and building within months of switching to Linux, and never
looked back. And once you're doing that, why have an initr* unless you're
absolutely forced into it, which in turn puts a pretty strong negative on
any technology that's going to force you into it...

> EVMS was started from an initrd, which set up the EVMS platform and then
> did a pivot_root to the EVMS-supported result. I have our SOHO
> firewall/gateway and file server set up with it. The root fs is on a
> Linux MD RAID-1 array, and what's on top of that I've rather forgotten
> but the result is a drive and partition layout that makes sense for the
> purpose of the box. I set this up as a kind of proof-of-concept
> exercise because I was taken with EVMS and figured it would be useful,
> which it was. The down side of this was that some time after that, IBM
> dropped support for the EVMS project and pulled their developers off of
> it. I was impressed with the fact that IBM was actually paying people
> to develop open source stuff, but when they pulled the plug on it, EVMS
> became an orphaned project. The firewall/gateway box runs Gentoo, so I
> proceeded with regular updates until one day the box stopped booting.
> The libraries, notably glibc, embedded in the initrd system got out of
> sync, version wise, with the rest of the system, and I was getting some
> severely strange errors early in the boot process followed by a kernel
> panic. It took a bit of work to even _see_ the errors, since they were
> emitted in the boot process earlier than CLI scroll-back comes into
> play, and then there was further research to determine what I needed to
> do to fix the problem. I ended up having to mount and then manually
> repair the initrd internal filesystem, manually reconstituting library
> symlinks as required.

That's interesting. I thought most distributions used or recommended use
of an alternative libc in their initr*, one that was either fully
static-linked, so it didn't need to be included if the binaries were
static-linked, or at least was smaller and more fit for the purpose of a
dedicated limited-space-ram-disk early-boot environment.

But if you're basing the initr* on glibc, which would certainly be easier
and is, now that I think of it, probably the way gentoo handles it, yeah,
I could see the glibc getting stale in the initrd.

... Because if it was an initramfs, it'd be presumably rebuilt and thus
synced with the live system when the kernel was updated, since an
initramfs is appended to the kernel binary file itself. That seems to me
to be a big benefit to the initramfs system, easier syncing with the built
kernel and the main system, tho it would certainly take longer, and I
expect for that reason that the initramfs rebuild could be short-
circuited, thus allowed to get stale, if desired. But it should at least
be easier to keep updated if desired, because it /is/ dynamically attached
to the kernel binary at each kernel build.

But as I said I haven't really gotten into initr*s. In fact, I don't even
build busybox, etc, here, sticking it in package.provided. If my working
system gets so screwed up that I'd end up with busybox, I simply boot to
the backup / partition instead. That backup is actually a fully
operational system snapshot taken at the point the backup was made, so it
includes everything the system did at that point. As such, unlike most
people's limited recovery environments, I have a full-featured X, KDE,
etc, all fully functional and working just as they were on my main system
at the time of the backup. So if it comes to it, I can simply switch to
it as my main root, and go back to work, updating or building from a new
stage-3 as necessary, at my leisure, not because I have to before I can
get a working system again, because the backup /is/ a working system, as
fully functional (if outdated) as it was on the day I took the backup.

> I've built some Linux boxes for one of my clients - 1U servers and the
> like. These folks are pretty straight-forward in their requirements,
> mainly asking that the boxes just work. The really creative work goes
> into the medical research PHP application that lives on the boxes, and
> I've learned beaucoup stuff about OOP in PHP, AJAX, etc. from the main
> programming fellow on the project. We've standardized on Ubuntu server
> edition on SuperMicro 4 drive 1U boxes. These boxes generally come with
> RAID supported by a proprietary chipset or two, which never works quite
> right with Linux, so the first job I always do with these is to rip out
> the SATA cabling from the back plane and replace the on-board RAID with
> an LSI 3ware card. These cards don't mess around - they JUST WORK.
> LSI/3ware has been very good about supporting Linux for their products.
> We generally set these up as RAID-5 boxes. There's a web-based
> monitoring daemon for Linux that comes with the card, and it just works,
> too, although it takes a bit of dickering. The RAID has no component in
> user-space (except for the monitor daemon) and shows up as a single SCSI
> drive, which can be partitioned and formatted just as if it were a
> single drive. The 3ware cards are nice! If you're using a redundant
> array such as RAID-1 or RAID-5, you can designate a drive as a hot-spare
> and if one of the drives in an array fails, the card will fail over to
> the hot-spare, rebuild the array, and the monitor daemon will send you
> an email telling you that it happened. Slick!

> The LSI 3ware cards aren't cheap, but they're not unreasonable either,
> and I've never had one fail. I'm thinking that my drive setup on my
> new desktop box will probably use RAID-1 supported by a 3ware card.

That sounds about as standard as a hardware RAID card can be. I like my
md/raid because I can literally use any standard SATA controller, but
there are certainly tradeoffs. I'll have to keep this in mind in case I
ever do need to scale an installation to where hardware RAID is needed
(say if I were layering md/kernel RAID-0 on top of hardware RAID-1 or
RAID-5/6).

FWIW, experience again. I don't believe software RAID-5/6 to be worth
it. md/raid 0 or 1 or 10, yes, but if I'm doing RAID-5 or 6, it'll be
hardware, likely underneath a kernel level RAID-0 or 10, for a final
RAID-50/60/500/600/510/610, with the left-most digit as the hardware
implementation. This because software RAID-5/6 is simply slow.
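
The nested-level naming is easier to follow with the capacity arithmetic
written out. This is a rough sketch only: it assumes equal-size drives,
ignores metadata overhead, and says nothing about speed, which is the
real argument above. The function names and drive counts are
illustrative.

```python
def usable_capacity(level, drives, size):
    """Usable capacity of a simple array of `drives` equal devices of `size` each."""
    if level == 0:
        minimum, usable = 2, drives * size        # pure striping, no redundancy
    elif level == 1:
        minimum, usable = 2, size                 # every drive holds a full copy
    elif level == 5:
        minimum, usable = 3, (drives - 1) * size  # one drive's worth of parity
    elif level == 6:
        minimum, usable = 4, (drives - 2) * size  # two drives' worth of parity
    else:
        raise ValueError("unsupported RAID level: %r" % (level,))
    if drives < minimum:
        raise ValueError("RAID-%d needs at least %d drives" % (level, minimum))
    return usable


def raid50_capacity(groups, drives_per_group, size):
    """RAID-0 striped across `groups` RAID-5 sets (a 'RAID-50' layout)."""
    per_group = usable_capacity(5, drives_per_group, size)
    return usable_capacity(0, groups, per_group)


# Four 300 GB drives mirrored four ways leave 300 GB usable.
print(usable_capacity(1, 4, 300))
# Two hardware RAID-5 sets of three 1000 GB drives, striped in the kernel.
print(raid50_capacity(2, 3, 1000))
```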

Similar experience, Linux md/RAID-1 is /surprisingly/ efficient, MUCH more
so than one might imagine, as the kernel I/O scheduler makes *VERY* good
use of parallel scheduling. In fact, in many cases it beats RAID-0
performance, unless of course you need the additional space of RAID-0.
Certainly, that has been my experience, at least.

> I'll
> probably use an ext3 filesystem on the partitions. I know ext4 is under
> development, but I'm not sure if it would offer me any advantages.

FWIW, ext4 is no longer "under development", it's officially mature and
ready for use, and has been for a number of kernels, now. In fact,
they're actually planning on killing separate ext2/3 driver support as the
ext4 driver implements it anyway.

As such, I'd definitely recommend considering ext4, noting of course that
you can specifically enable/disable various ext4 features at mkfs and/or
mount time, if desired. Thus there's really no reason to stick with ext3
now. Go with ext4, and if your situation warrants, disable one or more of
the ext4 features, making it more ext3-like.

OTOH, I'd specifically recommend evaluating the journal options,
regardless of ext3/ext4 choice. ext3 defaulted to "ordered" for years,
then for a few kernels, switched to "writeback" by default, then just
recently (2.6.38?) switched back to "ordered". AFAIK, ext4 has always
defaulted to the faster but less corner-case crash safe "writeback". The
third and most conservative option is of course "journal" (journal the
data too, as opposed to metadata only, with the other two).
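
To see which of the three modes a mounted filesystem explicitly asks
for, one option is to read the data= option out of /proc/mounts. The
sketch below works on text in that format; the sample entries are
invented. Note the assumption: when no data= option is listed, the code
reports None, meaning the filesystem or kernel default applies, and it
does not try to guess what that default is.

```python
JOURNAL_MODES = ("journal", "ordered", "writeback")


def data_modes(mounts_text):
    """Return {mount_point: mode} for ext3/ext4 entries in /proc/mounts-style text.

    mode is the explicit data= value if one is listed, else None.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("ext3", "ext4"):
            continue
        mode = None
        for option in fields[3].split(","):
            if option.startswith("data="):
                value = option[len("data="):]
                if value in JOURNAL_MODES:
                    mode = value
        result[fields[1]] = mode
    return result


sample_mounts = """\
/dev/md0 / ext3 rw,noatime,data=ordered 0 0
/dev/md1 /home ext4 rw,noatime 0 0
/dev/md2 /scratch ext4 rw,data=writeback 0 0
proc /proc proc rw 0 0
"""
for mount_point, mode in data_modes(sample_mounts).items():
    print(mount_point, mode or "not stated (default applies)")
```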

Having lived thru the reiserfs "writeback" era and been OH so glad when
they implemented and defaulted to "ordered" for it, I don't believe I'll
/ever/ trust anything beyond what I'd trust on a RAID-0 without backups,
to "writeback" again, regardless of /what/ the filesystem designers say or
what the default is.

And, I know of at least one person that experienced data integrity issues
with writeback on ext3 when the kernel was defaulting to that, that
immediately disappeared when he switched back to ordered.

Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
really do /not/ care about your data integrity or are going to the extreme
and already have data=journal, DEFINITELY specify data=ordered, both in
your mount options, and by setting the defaults via tune2fs.
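
As a sketch of what "both in your mount options, and via tune2fs" could
look like, the helper below generates an fstab line and the matching
tune2fs invocation. The tune2fs -o names (journal_data,
journal_data_ordered, journal_data_writeback) are the default mount
option names from the tune2fs man page; the device, the mount point and
the noatime option are placeholders, and nothing here is run against a
real device.

```python
# Map data= mount option values to tune2fs default-mount-option names.
TUNE2FS_DEFAULTS = {
    "journal": "journal_data",
    "ordered": "journal_data_ordered",
    "writeback": "journal_data_writeback",
}


def fstab_line(device, mount_point, fstype, mode="ordered", options=("noatime",)):
    """Return an fstab entry that states the journal data mode explicitly."""
    if fstype not in ("ext3", "ext4"):
        raise ValueError("data= modes here apply to ext3/ext4 only")
    if mode not in TUNE2FS_DEFAULTS:
        raise ValueError("unknown data mode: %r" % (mode,))
    opts = ",".join(list(options) + ["data=" + mode])
    passno = 1 if mount_point == "/" else 2   # fsck order: root first
    return "%s %s %s %s 0 %d" % (device, mount_point, fstype, opts, passno)


def tune2fs_command(device, mode="ordered"):
    """Return the argument list that stores the mode as the filesystem default."""
    return ["tune2fs", "-o", TUNE2FS_DEFAULTS[mode], device]


print(fstab_line("/dev/md1", "/home", "ext4"))
print(" ".join(tune2fs_command("/dev/md1")))
```

Storing the default in the superblock as well means the mode still
applies when the filesystem is mounted without the fstab entry, from a
rescue environment for instance.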

If there's one bit of advice in all these posts that I'd have you take
away, it's that. It's NOT worth the integrity of your data! Use
data=ordered unless you really do NOT care, to the same degree that you
don't put data you care about on RAID-0, without at least ensuring that
it's backed up elsewhere. I've seen people lose data needlessly over
this; I've lost it on reiserfs myself before they implemented data=ordered
by default, and truly, just as with RAID-0, data=writeback is NOT worth
whatever performance increase it might bring, unless you really do /not/
care about the data integrity on that filesystem!

> I used reiserfs on some of the partitions on my servers, and on some
> partitions on my desktop box too. Big mistake! There was a bug in
> reiserfs support in the then-current kernel when I built the first server and
> the kernel crapped all over the hard drive one night and the box
> crashed!

IDR the kernel version but there's one that's specifically warned about.
That must have been the one...

But FWIW, I've had no problems, even thru a period of bad-ram resulting in
kernel crashes and the like, since the introduction of data=ordered.
Given the time mentioned above when ext3 defaulted to data=writeback, I'd
even venture to say that for that period and on those kernels, reiserfs
may well have been safer than ext3!

> I was able to fix it and salvage customer data, but it was
> pretty scary. Hans Reiser is in prison for life for murder, and there's
> like one person on the Linux kernel development group who maintains
> reiserfs. Ext3/4, on the other hand, is solid, maybe not quite as fast,
> but supported by a dedicated group of developers.

For many years, the kernel person doing most of the reiserfs maintenance
and the one who introduced the previously mentioned data=ordered mode by
default and data=journal mode as an option, was Chris Mason. I believe he
was employed by SuSE, for years the biggest distribution to default to
reiserfs, even before it was in mainline, I believe. I'm not sure if he's
still the official reiserfs maintainer due to his current duties, but he
DEFINITELY groks the filesystem. Those current duties? He's employed by
Oracle now, and is the lead developer of btrfs.

Now reiserfs does have its warts. It's not particularly fast any more,
and has performance issues on multi-core systems due to its design around
the BKL (big kernel lock, deprecated for years with users converting to
other lock methods, no current in-tree users with 2.6.38 and set to be
fully removed with 2.6.39). Tho reiserfs was converted to other
locking a few kernels ago, the single-access-at-a-time assumption and
bottleneck live on. However, it is and has been quite stable for many
years now, since the intro of data=ordered, to the point that as mentioned
above, I believe it was safer than ext3 during the time ext3 defaulted to
writeback, because reiserfs still had the saner default of data=ordered.

But to each his own. I'd still argue that a data=writeback default is
needlessly risking data, however, and far more dangerous regardless of
whether it's ext3, ext4, or reiserfs, than any of the three themselves
are, otherwise.

> So I'm thinking that this new box will have a couple of
> professional-grade (the 5-year warranty type) 1.5 or 2 TB drives and a
> 3ware card. I still haven't settled on the mainboard, which will have
> to support the 3ware card, a couple of sound cards and a legacy Adaptec
> SCSI card for our ancient but extremely well-built HP scanner. The
> chipset will have to be well supported in Linux. I'll probably build
> the box myself once I decide on the hardware.

FWIW, my RAID is 4x SATA 300 gig Seagates, 5 year warranty I expect now
either expired or soon to. Most of the system is RAID-1 across all four,
however, and I'm backed up to external as well altho I'll admit that
backup's a bit dated, now. I bought them after having a string of bad luck
with ~1 year failures on both Maxtor (which had previously been quite
dependable for me) and Western Digital (which I had read bad things about
but thought I'd try after Maxtor, only to have the same ~1 year issues).
Obviously, they've long outlasted those, so I've been satisfied.

As I said, I'll keep the 3ware RAID cards in mind.

Mainboard: If a server board fits your budget, I'd highly recommend
getting a Tyan board that's Linux certified. The one I'm running in my
main machine is now 8 years old, /long/ out of warranty and beyond further
BIOS updates, but still running solid. It was a $400 board back then,
reasonable for a dual-socket Opteron. Not only did it come with Linux
certifications for various distributions, but they had Linux specific
support. Further, how many boards do /you/ know of that have a pre-
customized sensors.conf file available for the download? =:^) And when
the dual-cores came out, a BIOS update was made available that supported
them. As a result, while it was $400, that then leading edge dual socket
Opteron board from eight years ago, while it's no longer leading edge by
any means, eight years later still forms the core for an acceptably decent
system, dual-dual-core Opteron 290s @ 2.8 GHz (topped out the sockets),
currently 6 gig RAM, 3x2-gig as I had one stick die that I've not
replaced, but 8 sockets so I could run 16 gig if I wanted, 4xSATA drives,
only SATA-150, but they're RAIDED, Radeon hd4650 AGP (no PCI-E, tho it
does have PCI-X), etc. No PCI-E, limited to SATA-150, and no hardware
virtualization instruction support on the CPUs, so it's definitely dated,
but after all, it's an 8-year-old system!

It's likely to be a decade old by the time I actually upgrade it. Yes,
it's definitely a server-class board and the $400 I paid reflected that,
but 8 years and shooting for 10! And with the official Linux support
including a custom sensors.conf, I'm satisfied that I got my money's
worth.

But I don't believe all Tyan's boards are as completely Linux supported as
that one was, so do your research.

> I'm gonna apply the KISS principle to the OS design for this, and stay
> away from bleeding edge software technologies, although, especially
> after reading your essay, it's very tempting to try some of this stuff
> out to see what the _really_ smart people are coming up with! I'm
> getting off of the Linux state-of-the-art train for a while to go
> walking in the woods. The kernel will have to be low-latency since I
> may use the box for recording work with Jack and Ardour2, and almost
> certainly for audio editing, and maybe video editing at some point.
> That's where my energy is going to go for this one.

Well, save btrfs for a project a couple years down the line, then. But
certainly, investigate md/raid vs lvm2 and make your choice, keeping in
mind that while nowadays they overlap features, md/raid doesn't require an
initr* to run / on it, while lvm2 will likely be pulled in as a dependency
for your X desktop, at least kde/gnome/xfce, by later this year, whether
you actually use its lvm features or not.

And do consider ext4, but regardless of ext3/4, be /sure/ you either
choose data=ordered or can give a good reason why you didn't. (Low-
latency writing just might be a reasonable excuse for data=writeback, but
be sure you keep backed up if you do!) Because /that/ one may well save
your data, someday!

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
Old 03-26-2011, 02:57 PM
Lindsay Haisley
 
Default System problems - some progress

On Sat, 2011-03-26 at 08:40 +0000, Duncan wrote:
> By contrast, Linux is still my hobby, tho really, a full time one in that
> I spend hours a day at it, pretty much 7 days a week. I'm thinking I
> might switch to Linux as a job at some point, perhaps soon, but it's not a
> switch I'll make lightly, and it's not something I'll even consider
> "selling my soul for" to take -- it'll be on my terms or I might as well
> stay with Linux as a hobby -- an arrangement that works and that suits me
> fine.

What professional work I've gotten with Linux has been a real lesson in
synergy. It seems as if every time I've gone out and experimented with
some facet of Linux technology - setting up iptables, learning routing
fundamentals, setting up and using OpenVPN, etc., I've been called upon
to use it, and get paid for using it, for a client. My main client, in
return, has increased my understanding of higher level programming stuff
tremendously!

> Slowly, one at a time, I've tackled Bind DNS, NTPD, md/RAID, direct
> netfilter/iptables (which interestingly enough were *SIGNIFICANTLY* easier
> for me to wrap my mind around than the various so-called "easier" firewall
> tools that ultimately use netfilter/iptables at the back end anyway,
> perhaps because I already understood network basics and all the "simple"
> ones simply obscured the situation for me) and other generally considered
> "enterprise" tools.

Yep, I know where you're coming from there. Iptables isn't all that
hard to understand, and I've become pretty conversant with it in the
process of using it for my own and others' systems. I'd always rather deal
with the "under the hood" CLI tools than with some GUI tool that does
little more than obfuscate the real issue. That way lies Windows!

> Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
> really do /not/ care about your data integrity or are going to the extreme
> and already have data=journal, DEFINITELY specify data=ordered, both in
> your mount options, and by setting the defaults via tune2fs.

So does this turn off journaling? What's a good reference on the
advantages of ext4 over ext3, or can you just summarize them for me?

> But if you're basing the initr* on glibc, which would certainly be easier
> and is, now that I think of it, probably the way gentoo handles it, yeah,
> I could see the glibc getting stale in the initrd.

The problem with Gentoo was that because EVMS was an orphaned project, I
believe the ebuild wasn't updated. The initrd file was specific for
EVMS.

> If there's one bit of advice in all these posts that I'd have you take
> away, it's that. It's NOT worth the integrity of your data! Use
> data=ordered unless you really do NOT care, to the same degree that you
> don't put data you care about on RAID-0, without at least ensuring that
> it's backed up elsewhere.

I've never used, or had much use for RAID-0. LVM provides the same
capabilities. For me, RAID is a way of ensuring data integrity, and
large drives are getting cheaper and cheaper. I've only used RAID-1 and
RAID-5.

I'm not a speed-freak on disk I/O, and am generally quite willing to
sacrifice a bit of speed for reliability. data=writeback has been a
tweak, and I believe I've read up on it previously and decided against
it for probably the same reasons you cite. data=ordered has been the
default, but apparently upgrading to 2.6.36 I'm going to have to spec
this explicitly in /etc/fstab unless I upgrade to 2.6.38.

> FWIW, my RAID is 4x SATA 300 gig Seagates, 5 year warranty I expect now
> either expired or soon to. Most of the system is RAID-1 across all four,
> however, and I'm backed up to external as well altho I'll admit that
> backup's a bit dated, now. I bought them after having a string of bad luck
> with ~1 year failures on both Maxtor (which had previously been quite
> dependable for me)

I had a Maxtor drive actually *smoke* on me once, years ago. There was
a "pop", and smoke, and a big burned spot on the circuit board on the
drive! I never bought another Maxtor! It's the smoke inside the little
colored thingies on printed circuit boards that makes them work! When
they break, and the smoke gets away, the thingies are useless.

I generally go with Seagates these days too, although the quality of
drives, and which brand is best, seems to change over time. I used to
swear by IBM drives until they had a bad run of them with a high failure
rate, and before this got sorted out they sold their drive biz to
Fujitsu.

> and Western Digital (which I had read bad things about
> but thought I'd try after Maxtor, only to have the same ~1 year issues).
> Obviously, they've long outlasted those, so I've been satisfied.
>
> As I said, I'll keep the 3ware RAID cards in mind.

After having had all kinds of trouble trying to get hardware RAID
working on one of my servers, I discovered the 3ware cards after asking
the advice of the hardware fellow here who works with one of my favorite
tech outfits in Austin, Outernet Connection Strategies. He builds a lot
of servers and doesn't even _try_ to get the native RAID chipsets to
work. He just slaps a 3ware card in them and moves on. It's _real_
RAID, all the useful levels, not "fakeraid".

> Mainboard: If a server board fits your budget, I'd highly recommend
> getting a Tyan board that's Linux certified. The one I'm running in my
> main machine is now 8 years old, /long/ out of warranty and beyond further
> BIOS updates, but still running solid.

Hmmm. I'll look into Tyan. I hadn't heard of them, but it sounds as if
they bend over backwards to work with Linux. That's always a plus.

> It's likely to be a decade old by the time I actually upgrade it. Yes,
> it's definitely a server-class board and the $400 I paid reflected that,
> but 8 years and shooting for 10! And with the official Linux support
> including a custom sensors.conf. I'm satisfied that I got my money's
> worth.
>
> But I don't believe all Tyan's boards are as completely Linux supported as
> that one was, so do your research.

Of course. I like technology that _lasts_! We have a clock in our
house that's about 190 years old, and came to me through my family. The
works are made of wood, and it keeps impeccable time - loses or gains
maybe 30 seconds a week if I wind it every day, which I need to. Some
years ago one of the wooden gears gave out from over a century of
stress. There's a label in the clock that says "warranted if well
used", and since I'd used it very well, I called up the Seth Thomas
company and told them that I had one of their clocks and it was broken,
and since I'd used it very well, I figured that it was still under
warranty. The gal with whom I talked was amused and intrigued, and
turned me on to the Connecticut Clock and Watch museum, run by one
George Bruno. It seems that Mr. Bruno also makes working replicas of
exactly the model of clock I have and was able to send me an exact
replacement part! Try _THAT_ with your 1990's era computer ;-) Every
time this nice old clock strikes the hour it reminds me that although I
work with computers where hardware is out of date in 5 years or so,
there are some things that were built to last!

But this is OT for this forum. Sorry, folks. I couldn't resist telling
a good story.

> Well, save btrfs for a project a couple years down the line, then. But
> certainly, investigate md/raid vs lvm2 and make your choice, keeping in
> mind that while nowadays they overlap features, md/raid doesn't require an
> initr* to run / on it, while lvm2 will likely be pulled in as a dependency
> for your X desktop, at least kde/gnome/xfce, by later this year, whether
> you actually use its lvm features or not.

Thanks, Duncan. Good advice, that.

> And do consider ext4, but regardless of ext3/4, be /sure/ you either
> choose data=ordered or can give a good reason why you didn't. (Low-
> latency writing just might be a reasonable excuse for data=writeback, but
> be sure you keep backed up if you do!) Because /that/ one may well save
> your data, someday!

I'm going to read up on btrfs and ext4, whether or not I use them.

--
Lindsay Haisley |"Windows .....
FMP Computer Services | life's too short!"
512-259-1190 |
http://www.fmp.com | - Brad Johnston
 
Old 04-01-2011, 03:22 AM
Duncan
 
Default System problems - some progress

Lindsay Haisley posted on Sat, 26 Mar 2011 10:57:33 -0500 as excerpted:

> Yep, I know where you're coming from there. Iptables isn't all that
> hard to understand, and I've become pretty conversant with it in the
> process of using it for my own and others' systems. I'd always rather deal
> with the "under the hood" CLI tools than with some GUI tool that does
> little more than obfuscate the real issue. That way lies Windows!

Indeed, the MSWindows way is the GUI way. But I wasn't even thinking
about that. I was thinking about the so-called "easier" firewalling CLI/
text-editing tools that have you initially answer a number of questions to
setup the basics, then have you edit files to do any "advanced" tweaking
the questions didn't have the foresight to cover.

But my (first) problem was that while I could answer the questions easily
enough, I lacked sufficient understanding of the real implementation to
properly do the advanced editing. And if I were to properly dig into
that, I might as well have mastered the IPTables/Netfilter stuff on which
it was ultimately based in the first place.

The other problem, when building your own kernel, was that the so-called
simpler tools apparently expect all the necessary Netfilter/IPTable kernel
options to be available as pre-built modules (or built-in) -- IOW, they're
designed for the binary distributions where that's the case. Neither the
questions nor the underlying config file comments mentioned their kernel
module dependencies. One either had to pre-build them all and hope they
either got auto-loaded as needed, or delve into the scripts to figure out
the dependencies and build/load the required modules.

Now keep in mind that I first tried this on Mandrake, where I was building
my own kernel within 90 days of first undertaking the switch, while I was
still booting to MS to do mail and news in MSOE, because I hadn't yet had
time to look at user level apps well enough to make my choices and set
them up. So it's certainly NOT just a Gentoo thing. It's a build-your-
own-kernel thing, regardless of the distro.

The problem ultimately boiled down to having to understand IPTables itself
well enough to know what kernel options to enable, either built-in or as
modules which would then need to be loaded. But if I were to do that, why
would I need the so-called "easier" tool, that only complicated things.
Honestly, the tools made me feel like I was trying to remote-operate some
NASA probe from half-way-across-the-solar-system, latency and all, instead
of using the direct-drive, since what I was operating on was actually
right there next to me!
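
For anyone hitting the same wall, one way to see which Netfilter pieces a
self-built kernel actually provides is to scan its config for the relevant
symbols. The following is only a rough sketch: the symbol list is a small
assumed sample rather than the full set any particular firewall tool needs,
and the function name is made up for illustration.

```python
import re

# Assumed sample of Netfilter-related kernel config symbols; a real firewall
# script may need more (or different) ones than these.
WANTED = [
    "CONFIG_NETFILTER",
    "CONFIG_NF_CONNTRACK",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_IP_NF_FILTER",
]

def netfilter_status(config_text, wanted=WANTED):
    """Map each wanted symbol to 'built-in', 'module' or 'missing'."""
    found = {}
    for line in config_text.splitlines():
        # Enabled options look like CONFIG_FOO=y or CONFIG_FOO=m;
        # disabled ones appear as "# CONFIG_FOO is not set" and are skipped.
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    labels = {"y": "built-in", "m": "module"}
    return {name: labels.get(found.get(name), "missing") for name in wanted}

# Hypothetical config excerpt, just to exercise the function.
sample = """\
CONFIG_NETFILTER=y
CONFIG_NF_CONNTRACK=m
# CONFIG_IP_NF_IPTABLES is not set
"""
for name, state in netfilter_status(sample).items():
    print(name, state)
```

On a real box the text would typically come from the kernel source tree's
.config, or from /proc/config.gz if the kernel was built to expose it.
Anything reported as "module" still has to be loaded before the firewall
rules that depend on it will work.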

At that time I simply punted. I had (or could have and did have, by
(wise) choice on MS) a NAPT based router between me and the net anyway,
and already knew how to configure /it/. So I just kept it and ran the
computer itself without a firewall for a number of years. Several years
later, after switching to Gentoo, when I was quite comfortable on Linux in
general, I /did/ actually learn netfilter/iptables, configure my computer
firewall accordingly, and direct-connect for a year or two -- until my
local config changed and I actually had the need for a NAPT device as I
had multiple local devices to connect to the net.

Which brings up a nice point about Gentoo. With Mandrake (and most other
distributions of the era, from what I read), there were enough ports open
by default that having a firewall of /some/ sort, either on-lan NAPT
device or well configured on-computer IPChains/IPTables based, was wise.
IOW, keeping that NAPT device was a good choice, even if it /was/ an MS-
based view of things, because the Linux distros of the time still ran with
various open ports (whether they still do or not I don't know, I suspect
most do, tho they probably do it with an IPTables firewall up now too).

Gentoo's policy by contrast has always (well, since before early 2004,
when I switched to it) been:

1) Just because it's installed does NOT mean it should have its initscript
activated so it runs automatically in the default runlevel -- Gentoo ships
by default with the initscripts for net-active services in /etc/init.d,
but does NOT automatically add them to the default runlevel.

2) Even when a net-active service IS activated, Gentoo's default
configuration normally has it active on the loopback localhost address
only.

3) Gentoo ships X itself with TCP listening disabled, only the local Unix
domain socket active.

As such, by the time I actually got around to learning IPTables/netfilter
and setting it up on my Gentoo box, it really wasn't as necessary as it
would be on other distributions, anyway, because firewall or no firewall,
the only open ports were ports I had deliberately opened myself and thus
already knew about.
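
The "only ports I opened myself" claim is easy to check. As a minimal
sketch, assuming an IPv4-only view and a little-endian machine such as x86,
the listening sockets in /proc/net/tcp can be filtered down to those bound
to something other than loopback. The function name and sample text are
illustrative only.

```python
import socket
import struct

TCP_LISTEN = "0A"  # state code the kernel uses for listening sockets

def non_loopback_listeners(proc_net_tcp_text):
    """Return (address, port) pairs listening on anything but 127.x.x.x."""
    listeners = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_LISTEN:
            continue
        hex_addr, hex_port = fields[1].split(":")
        # Assumption: little-endian host, so the address is a little-endian
        # 32-bit hex number (127.0.0.1 appears as 0100007F).
        addr = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
        if not addr.startswith("127."):
            listeners.append((addr, int(hex_port, 16)))
    return listeners

# Hypothetical, shortened excerpt in the /proc/net/tcp column layout.
sample = """\
  sl  local_address rem_address   st
   0: 0100007F:0019 00000000:0000 0A
   1: 00000000:0016 00000000:0000 0A
   2: 0A00000A:0016 0B00000A:C350 01
"""
print(non_loopback_listeners(sample))
```

Feeding it the contents of the real /proc/net/tcp gives the live picture.
IPv6 sockets live in /proc/net/tcp6 and are not covered by this sketch.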

But of course defense in depth is a VERY important security principle,
correlating as it does with the parallel "never trust yourself not to
fat-finger SOMETHING!" (Now, if the so-called security services HBGary, et
al., only practiced it! ... I think that's what galled most of the world
most, not that they screwed up a couple things so badly, but that they so
blatantly violated the basic defense-in-depth, or we'd have never read
about the screw-ups in the first place as they'd have not amounted to
anything if the proper layers of defense had been there... and for a
SECURITY firm, no less, to so utterly and completely miss it!) So
regardless of the fact that in theory I didn't actually need the firewall
by then since the only open ports were the ones I intended to be open, I
wasn't going to run direct-connected without /some/ sort of firewall, and
I learned and activated IPTables/netfilter before I did direct-connect.
And now that I have NAPT again, I still keep it running, as that's simply
another layer of that defense in depth, and I can use the NAPT router for
multiplexing several devices on a single IP, not its originally accidental
side-effect of inbound firewalling, tho again, I keep that too as it's
another layer of that defense in depth, I just don't /count/ on it.

>> Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
>> really do /not/ care about your data integrity or are going to the
>> extreme and already have data=journal, DEFINITELY specify data=ordered,
>> both in your mount options, and by setting the defaults via tune2fs.
>
> So does this turn off journaling? What's a good reference on the
> advantages of ext4 over ext3, or can you just summarize them for me?

No, this doesn't turn off journaling.

Briefly...

There's the actual data, the stuff in the files we care about, and
metadata, the stuff the filesystem tracks behind the scenes so we don't
have to worry about it. Metadata includes stuff like the filename, the
dates (create/modify/access, the latter of which isn't used that much any
more and is often disabled), permissions (both traditional *ix set*/user/
group/world and if active SELinux perms, etc), INODE AND DIRECTORY TABLES
(most important in this context, thus the CAPS, as without them, your data
is effectively reduced to semi-random binary sequences), etc.

It's the metadata, in particular, the inode and directory tables, that fsck
concerns itself with, that's potentially damaged in the event of a chaotic
shutdown, that fsck checks and tries to restore on remount after such a
shutdown, etc.

Because the original purpose of journaling was to shortcut the long fscks
after a chaotic shutdown, traditionally it concerns itself only with
metadata. In practice, however, due to reordered disk operations at both
the OS and disk hardware/firmware level, the result of a recovery with
strict meta-data-only journaling on a filesystem can be perfectly restored
filesystem metadata, but with incorrect real DATA in those files, because
the metadata was already written to disk but the data itself hadn't been,
at the time of the chaotic shutdown.

Due to important security implications (it's possible that the previous
contents of that inode was an unlinked but not secure-erased file
belonging to another user, UNAUTHORIZED DATA LEAK!!!), such restored
metadata-only files where the data itself is questionable, are normally
truncated to zero-length, thus the post-restore zero-length "empty" file
phenomenon common with early journaled filesystems and still occasionally
seen today.

The data= journaling option controls data/metadata handling.

data=writeback is "bare" metadata journaling. It's the fastest but
riskiest in terms of real data integrity for the reasons explained above.
As such, it's often used where performance matters more than strict data
integrity in the event of chaotic shutdown -- where data is backed up and
changes since the backup tend to be trivial and/or easy to recover, where
the data's easily redownloaded from the net (think the gentoo packages
tree, source tarballs, etc), and/or where the filesystem is wiped at boot
anyway (as /tmp is in many installations). Zeroed out files on recovery
can and do happen in writeback mode.

data=ordered is the middle ground, "good enough" for most people, both in
performance and in data integrity. The system ensures that the commit of
the real data itself is "ordered" before the metadata that indexes it,
telling the filesystem where it's located. This comes at a slight
performance cost as some write-order-optimization must be skipped, but it
GREATLY enhances the integrity of the data in the event of a chaotic
shutdown and subsequent recovery. There are corner-cases where it's still
possible at least in theory to get the wrong behavior, but in practice,
these don't happen very often, and when they do, the loss tends to be that
of reverting to the pre-update version of the file, losing only the
current edit, rather than zeroing out of the file (or worse yet, data
leakage) entirely.

data=journal is the paranoid option. With this you'll want a much larger
journal, because not only the metadata, but the data itself, is
journaled. (And here most people thought that's what journaling did /all/
the time!) Because ALL data is ultimately written TWICE in this mode,
first to the journal and then from there to its ultimate location, by
definition it's a factor of two slower, but provided the hardware is
working correctly, the worst-case in a chaotic shutdown is loss of the
current edit, reverting to the previous edition of the file.
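
The difference between ordered and writeback comes down to which of the two
writes hits the disk first. The toy model below is a deliberately
oversimplified sketch, not how the kernel implements anything: one file, two
blocks, and a crash that always lands between the two writes. All names in
it are invented for the illustration.

```python
def visible_after_crash(mode):
    """Toy model: what the file shows if power fails between the two writes."""
    # Before the update the file's metadata points at block 0 (old contents).
    # Block 1 is free but still holds bytes from some previously deleted file.
    blocks = {0: "old contents", 1: "stale bytes from a deleted file"}
    metadata = {"file": 0}

    def write_data():
        blocks[1] = "new contents"

    def write_metadata():
        metadata["file"] = 1

    if mode == "ordered":
        steps = [write_data, write_metadata]   # data always lands first
    elif mode == "writeback":
        steps = [write_metadata, write_data]   # one order writeback permits
    else:
        raise ValueError("unknown mode: " + mode)

    steps[0]()   # the first write reaches the disk ...
    # ... and the crash happens here, so steps[1] never runs.
    return blocks[metadata["file"]]

for mode in ("ordered", "writeback"):
    print(mode, "->", visible_after_crash(mode))
```

In the ordered case the file still shows its previous version. In the
writeback case the metadata already points at a block the new data never
reached, which is exactly the situation where recovery truncates the file
to zero length rather than expose someone else's stale bytes. data=journal
is not modeled here.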

FWIW and rather ironically, my original understanding of all this came
from a series of IBM DeveloperWorks articles written in the early kernel
2.4 series era, explaining the main filesystem choices, many of them then
new, available in kernel 2.4. While the performance data and some
filesystem implementation detail (plus lack of mention of ext4 and btrfs
as this was before their time) is now somewhat dated, the theory and
general filesystem descriptions remain solid, and as such, the series
remains a reasonably good intro to Linux filesystems to this day. As
such, parts of it are still available as linked from the Gentoo
Documentation archived copy of those IBM DeveloperWorks articles. In
particular, two parts covering ext3 and the data= options remain available:

http://www.gentoo.org/doc/en/articles/afig-ct-ext3-intro.xml
http://www.gentoo.org/doc/en/articles/l-afig-p8.xml

The ironic bit is who the author was, one Daniel Robbins, the same DRobbins
who founded the then Enoch Linux, now Gentoo. But I read them long before
I ever considered Gentoo, when I was first switching to Linux and using
Mandrake. It was thus with quite some amazement a number of years later,
after I'd been on Gentoo for awhile, that I discovered that the *SAME*
DRobbins who founded Gentoo (and was still active tho on his way out in
early 2004 when I started on Gentoo), was the guy who wrote the Advanced
Filesystem Implementor's Guide in IBM DeveloperWorks, the guide I'd found
so *INCREDIBLY* helpful years before, when I hadn't a /clue/ who he was or
what distribution I'd choose years later, as I was just starting with Mandrake
and trying to figure out what filesystems to choose.

As to the ext3/ext4 differences... AFAIK the (second) biggest one is that
ext4 uses extents by default, thus fragmenting files somewhat less over
time. (Extents are a subject worth their own post, which I won't attempt
as while I understand the basics I don't understand all the implications
thereof myself. But one effect is better efficiency in filesystem layout,
when the filesystem was created with them anyway... it won't help old
files on upgraded-to-ext4-from ext2/3 that much. Google's available for
more. =:^)

There are a lot of smaller improvements as well. ext4 is native
large-filesystem by default. A number of optimizations discovered since ext3
are implemented in ext4 that can't be in ext3 for stability and/or old-
kernel backward compatibility reasons. ext4 has a no-journal option
that's far better on flash-based thumb-drives, etc. There are a number of
options that can make it better on SSDs and flash in general than ext3.

And the biggest advantage is that ext4 is actively supported in the kernel
and supports ext2/3 as well, while ext2/3, as separate buildable kernel
options, are definitely considered legacy, with talk, as I believe I
mentioned, of removing them as separate implementations entirely, relying
on ext4's backward compatibility for ext2/3 support. In that regard, ext3
as a separate option is in worse shape than reiserfs, since it's clearly
legacy and targeted for removal. As part of ext4, support will
*DEFINITELY* continue for YEARS, more likely DECADES, so is in no danger
in that regard (more so than reiserfs support, which will continue to be
supported as well for at least years), but the focus is definitely on ext4
now, and as ext3 becomes more and more legacy, the chances of corner-case
bugs appearing in ext3-only code in the ext4 driver do logically
increase. In that regard, reiserfs could actually be argued to be in
better shape, since it's not implemented as a now out-of-focus older-
brother to a current filesystem, so while it has less focus in general, it
also has less chances of being accidentally affected by a change to the
current-focus code.

Which can be argued to have already happened with the default ext3
switching to data=writeback for a number of kernels, before being switched
back to the data=ordered it always had before. A number of kernels ago
(2.6.29 IIRC), ext4 was either officially just out of or being discussed
for bringing out of experimental. I believe it was Ubuntu that first made
it a rootfs system install option, in that same time period. Shortly
thereafter, a whole slew of Ubuntu on ext4 users, most of whom it turned
out later were using the closed nVidia driver, which was unstable in that
version against that Ubuntu version and kernel, thus provoking many cases
of "chaotic shutdown", a classic worst-case trial-by-fire test for the
then still coming out of experimental ext4, began experiencing the classic
"zeroed out file" problems on reboot after their chaotic shutdowns.

*Greatly* compounding the problem were some seriously ill-advised Gnome
config-file behaviors. Apparently, they were opening config-files for
read-write simply to READ them and get the config in the process of
initializing GNOME. Of course, the unstable nVidia driver was
initializing in parallel to all this, with the predictable-in-hindsight
results... As gnome was only READING the config values, it SHOULD have
opened those files READ-ONLY, if necessary later opening them read-write
to write new values to them. As with the security defense-in-depth
mentioned in the HBGary parenthetical above, this is pretty basic
filesystem principles, but the gnome folks had it wrong. They were opening
the files read/write when they only needed read, and the system was
crashing with them in that state. As a result, these files were open for
writing in the crash, and as is standard security practice as explained
above, the ext4 journaling system, defaulting to write-back mode, restored
them as zeroed out files to prevent any possibility of data leak.
Actually, there were a few other technicalities involved as well (file
renaming on write, failure to call fsync, due in part to ext3's historic
bad behavior on fsync, which it treated as whole-filesystem-sync, etc),
but that's the gist of things.
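
For what it's worth, the pattern an application is generally expected to
follow looks something like the sketch below: open read-only when only
reading, and when writing, put the new contents in a temporary file, fsync
it, and rename it over the original. This is a generic illustration with
made-up function names, not the actual GNOME code or its eventual fix.

```python
import os
import tempfile

def read_config(path):
    """Read a config file without ever opening it for writing."""
    with open(path, "r") as handle:
        return handle.read()

def write_config(path, text):
    """Replace a config file so a crash leaves the old or the new version."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())   # push the data to disk first
        os.replace(tmp_path, path)      # then atomically swap it into place
    except BaseException:
        os.unlink(tmp_path)             # clean up the temp file on failure
        raise
```

Calling write_config("app.conf", "key=value\n") and then
read_config("app.conf") round-trips the text. The sketch skips details a
careful implementation would add, such as preserving the original file's
permissions and syncing the containing directory after the rename.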

So due to ext4's data=writeback and the immaturity of the filesystem such
that it didn't take additional precautions, these folks were getting
critical parts of their gnome config zeroed out every time they crashed,
and due to the unstable nVidia drivers, they were crashing frequently!!

*NOT* a good situation, and that's a classic understatement!!

The resulting investigation discovered not only the obvious gnome problem,
but several code tweaks that could be done to ext4 to reduce the
likelihood of this sort of situation in the future.

All fine and good, so far. But they quickly realized that the same sort
of code tweak issues existed with ext3, except that because ext3 defaulted
to data=ordered, only those specifically setting data=writeback were
having problems, and because those using data=writeback were expected to
have /some/ problems anyway, the issues had been attributed to that and
thus hadn't been fully investigated and fixed, all these years.

So they fixed the problems in ext3 as well. Again, all fine and good --
the problems NEEDED fixing. *BUT*, and here's where the controversy comes
in, they decided that data=writeback was now dependable enough for BOTH
ext3 and ext4, thus changing the default for ext3.

To say that was hugely controversial is an understatement (multiple
threads on LKML, LWN, elsewhere where the issue was covered at the time,
often several hundreds of posts long each), and my feelings on
data=writeback should be transparent by now so where I stand on the issue
should be equally transparent, but Linus nevertheless merged the commit
that switched ext3 to data=writeback by default, AFAIK in 2.6.31. (AFAIK,
they discovered the problem in 2.6.29, 2.6.30 contained temporary work-
around-fixes, 2.6.31 contained the permanent fixes and switched ext3 to
data=writeback.)

Here's the critical point. Because reiserfs isn't so closely related to
the ext* family, it retained the data=ordered default it had gotten years
earlier, in the same kernel in which Chris Mason committed the code for reiserfs to do
data=ordered at all. ext3 got the change due to its relationship with
ext4, despite the fact that it's officially an old and stable filesystem
where arguably such major policy changes should not occur. If the separate
kernel option for ext3 is removed in order to remove the duplicate
functionality already included in ext4 for backward compatibility reasons,
by definition, this sort of change to ext4 *WILL* change the ext3 it also
supports, unless deliberate action is taken to avoid it. That makes such
issues far more likely to occur again in ext3, than in the relatively
obscure reiserfs.

Meanwhile, as mentioned, with newer kernels (2.6.36, 37, or 38, IDR which,
tho it won't matter for those specifying the data= option either via
filesystem defaults using tune2fs, or via specific mount option), ext3
reverted again to the older and safer default, data=ordered.

And as I said, it's my firm opinion that the data= option has a stronger
effect on filesystem stability than any possibly remaining issues with
ext4, which is really quite stable by now. Thus, ext3, ext4, or reiserfs,
I'd **STRONGLY** recommend data=ordered, regardless of whether it's the
default as it is with old and new (but with a gap) ext3 and reiserfs as it
has been for years, or not, as I believe ext4 still defaults to
data=writeback. If you value your data, "just do it!"
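
If you want to see which of your filesystems are riding on whatever the
kernel default happens to be, a quick audit of /etc/fstab does the job. The
sketch below flags ext3/ext4/reiserfs entries with no explicit data= option.
The function name and the sample fstab are hypothetical, and it only looks
at fstab text, so defaults stored in the superblock via tune2fs are
invisible to it.

```python
def missing_data_option(fstab_text, fs_types=("ext3", "ext4", "reiserfs")):
    """Return mount points whose fstab entry has no explicit data= option."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry; ignored in this sketch
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type not in fs_types:
            continue
        if not any(opt.startswith("data=") for opt in options.split(",")):
            flagged.append(mount_point)
    return flagged

# Hypothetical fstab contents, just to exercise the function.
sample = """\
# <fs>      <mountpoint>  <type>  <opts>                 <dump/pass>
/dev/md0    /             ext3    noatime,data=ordered   0 1
/dev/md1    /home         ext4    noatime                0 2
/dev/md2    /var/tmp      ext4    noatime,data=writeback 0 2
/dev/sdb1   /mnt/usb      vfat    noauto,user            0 0
"""
print(missing_data_option(sample))
```

A flagged entry isn't necessarily wrong. It just means the journaling mode
for that filesystem depends on the kernel or superblock default rather than
on anything you wrote down yourself.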

Meanwhile, I believe the default on the definitely still experimental
btrfs is data=writeback too. While I plan on switching to it eventually,
you can be quite sure I'll be examining that default and as of this point,
have no intentions of letting it be data=writeback, when I do.

....

> The problem with Gentoo was that because EVMS was an orphaned project, I
> believe the ebuild wasn't updated. The initrd file was specific for
> EVMS.

That's quite likely, indeed.

> Of course. I like technology that _lasts_! We have a clock in our
> house that's about 190 years old [...] turned me on to the Connecticut
> Clock and Watch museum, run by one George Bruno [who] also makes working
> replicas [and] was able to send me an exact replacement part! Try
> _THAT_ with your 1990's era computer ;-)

That reminds me... I skipped it as irrelevant to the topic at hand, but
due to kernel sensors and ACPI changes, I decided to try the last BIOS
upgrade available for this Tyan, after having run an earlier BIOS for some
years. Along about 2.6.27, I had to start using a special boot parameter
to keep the sensors working, as apparently the sensor address regions
overlap ACPI address regions (not an uncommon issue in boards of that era,
the kernel folks say). The comments on the kernel bug I filed suggested
that a BIOS update might straighten that out (it didn't, BIOS still too
old and board EOLed, even if it is still working), so I decided to try it.

The problem was that I had a bad memory stick. Now the kernel has
detectors for that and I had them active, but the kernel drivers for that
were introduced long after I got the hardware, and while it was logging an
issue with the memory, since it had been doing that since I activated the
kernel drivers for it, I misinterpreted that as simply how it worked, so
wasn't aware of the bad memory it was trying to tell me about.

So I booted to the FreeDOS floppy I used for BIOS upgrades (I've used
FreeDOS for BIOS upgrades for years, without incident before this) and
began the process.

It crashed half-way thru the flash-burn, apparently when it hit that bad
memory!!

Bad situation, but there's supposed to be a failsafe direct-read-recover
mode built-in, that probably would have worked had I known about it.
Unfortunately I didn't, and by the time I figured it out, I'd screwed that
up as well.

But luckily I have a netbook, that I had intended to put Gentoo on but had
never gotten around to at that point (tho it's running Gentoo now, 2.6.38
kernel, kde 4.6.1, fully updated as of mid-March). It was still running
the Linpus Linux it shipped with (first full system I've bought since my
original 486SX25 w/ 2MB memory and 130 MB hard drive in 1993, or so, and
I'd have sooner done without the netbook than pay the MS tax, I DID have
to order it from Canada and have it shipped to the US). I was able to get
online with that, grab a yahoo webmail account since my mail logins were
stuck on the main system without a BIOS, and use that to order a new BIOS
chip shipped to me, the target BIOS pre-installed.

That new BIOS chip rescued my system!

I suspect my feelings after that BIOS chip did the trick rather mirror
yours after that gear did the trick for your clock. The computer might
not be 190 years old, but 2003 is old enough in computer years, and I
suspect I have rather more of my life wound up in that computer than you
do in that clock, 190 years old or not.

Regardless, tho, you'll surely agree,

WHAT A RELIEF TO SEE IT RUNNING AGAIN! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
