On 10 October 2010 21:43, Jochen Schulz <ml@well-adjusted.de> wrote:
> Looks like a kernel panic or a kernel oops. Seeing the start of it would
> be helpful.
Any tips on catching it? Will there be useful info in a log somewhere?
> That looks like your filesystem is damaged beyond what one could expect
> from a single crash. I would try to (in that order)
I wonder if the flash card is more sensitive to crashes like this than
the usual spinning platters...?
> - make sure the CF card is ok (badblocks scan in another system)
Just finished one from the USB installer (rescue mode). Didn't seem to do much.
> - make sure the mainboard/CPU/RAM is ok (memtest86, if it runs on your
> system)
That will have to wait until a reinstall - I can only boot in recovery
mode, and I can't install anything because dpkg hits unrecoverable
errors (this time, "syntax error: unknown group 'crontab' in
statoverride file").
I did run memtest86+ a couple of weeks ago (no particular reason) and
it was fine, but a hardware problem is not out of the question. Not
sure how to catch it if memtest can't see it though.
> My guess would be that you have had a hardware problem for some time now. I
> only find it curious that it can be reproduced when installing a
> particular package. Is the segfault reproducible after a reinstall as
> well?
Yep, every time. I just wonder if somehow the avahi-daemon package is
the only thing I'm trying to install that, say, requires that much
more memory than any other package for configuration.
> Where do your packages come from (official mirror, local mirror,
> self-burnt disc etc.)?
They're from the iiNet mirror (http://ftp.iinet.net.au/debian/debian)
Thanks,
Jason
--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/AANLkTimgd_sEr-GzXgWPzZsJvPn+rL+uWL=EsVr3qmHO@mail.gmail.com
On 11 October 2010 03:03, Jochen Schulz <ml@well-adjusted.de> wrote:
>> Any tips on catching it? Will there be useful info in a log somewhere?
>
> It *could* have made it to /var/log/syslog, but I am not particularly
> optimistic about that.
No, I had checked that already. :/
> May be. Do you have write caching enabled?
How would I know?
>>> - make sure the CF card is ok (badblocks scan in another system)
>>
>> Just finished one from the USB installer (rescue mode). Didn't seem to do much.
>
> What do you mean by that?
This is from memory, but I ran "fsck.ext2 -c -C 0 -p -f /dev/sdb1" and
after running for a while it said something like:
So... if there were bad blocks, it didn't make a fuss about it.
I'm going to commence a destructive badblocks scan ("badblocks -b 4096
-s -w /dev/sdb1") - can badblocks be given an entire device instead of
a partition? Or does that not make sense?
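badblocks works on any block device, so a whole card such as /dev/sdb is a valid target as well as a single partition. Below is a minimal sketch of a guard to run before a destructive scan, assuming the card shows up as /dev/sdb. device_in_use is a hypothetical helper written for this example, not part of badblocks.

```shell
# Hypothetical helper: succeeds if the device, or any partition on it,
# appears in the mounts table.
# $1 = device (e.g. /dev/sdb), $2 = mounts table (defaults to /proc/mounts)
device_in_use() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Intended use (left as a comment because -w destroys all data on the device):
#   device_in_use /dev/sdb || badblocks -b 4096 -s -w /dev/sdb
```

Scanning the whole device also covers the partition table and any space outside /dev/sdb1, at the cost of having to repartition afterwards.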
>> Yep, every time. I just wonder if somehow the avahi-daemon package is
>> the only thing I'm trying to install that, say, requires that much
>> more memory than any other package for configuration.
>
> Maybe you should try to install and run KDE and Firefox. ;-)
I don't want to start a fire, there's expensive equipment nearby...
- Jason
--
Archive: http://lists.debian.org/AANLkTimerxdPXb6Tt2w816ZRfw=4gckYD1=u-Sb62WuV@mail.gmail.com
On Sun, 10 Oct 2010, Jason Heeris wrote:
> My system is a Helios single-board computer, with specs:
>
> CPU: Vortex86 SoC (800MHz) - I *think* this is pretty much a 486, I
> could be wrong
Yikes. You really need to track this one down, and find out whether it is
any different from a regular 486. The devil IS in the details, and these
older boxes are NOT regularly tested anymore.
In fact, we can't really attest that Debian does work perfectly even on
straight Intel i486 anymore, let alone on 486-alike chips: We depend on
sheer dumb luck that the kernel and gcc have not regressed (very little/no
testing is done on 486s anymore), and some packages might not be
486-compatible at all.
But your issue is NOT an avahi bug (package, compiler or otherwise). It is
either a kernel bug *or* the kernel is simply incompatible with your
vortex86 processor in a way that went undetected by the sanity checks at
kernel startup.
> My problem started when I installed avahi-daemon - the system crashed,
> and I could not recover to a useable state. So I reinstalled the
> system and approached it a little more carefully.
1. Set up a serial console (to capture crash data);
2. Make sure you're in condition to lose data;
3. Reproduce the crash, log *everything* since boot.
4. File a bug on bugzilla.kernel.org with all relevant information. This
does include the kernel config at the very least.
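Step 1 above could look roughly like the following. It assumes the board's first serial port (ttyS0) at 115200 baud, with a second machine on the other end of a null-modem cable. add_serial_console is a hypothetical helper that only builds the kernel command line; the port, speed and root device are assumptions.

```shell
# Hypothetical helper: append serial-console parameters to a kernel command line.
# $1 = existing command line, $2 = serial port (default ttyS0), $3 = baud (default 115200)
# Kernel messages go to every console= listed; the last one becomes /dev/console.
add_serial_console() {
    printf '%s console=tty0 console=%s,%sn8\n' "$1" "${2:-ttyS0}" "${3:-115200}"
}

# Print the line to put on the bootloader's kernel entry.
add_serial_console 'root=/dev/sda1 ro'

# On the capturing machine, something like this logs everything to screenlog.0:
#   screen -L /dev/ttyS0 115200
```

Starting the capture before the board boots gives the "everything since boot" log asked for in step 3.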
> http://heeris.id.au/stuff/SBCError.jpg
The box objected VERY HEAVILY to the ipv6 multicast operations triggered by
avahi. Without fixing the underlying bug, your box will be unusable (lots
of other stuff is likely to trigger the same problem).
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
--
Archive: http://lists.debian.org/20101012023640.GB29465@khazad-dum.debian.net
On 11 October 2010 18:36, Henrique de Moraes Holschuh <hmh@debian.org> wrote:
> On Sun, 10 Oct 2010, Jason Heeris wrote:
>> My system is a Helios single-board computer, with specs:
>>
>> CPU: Vortex86 SoC (800MHz) - I *think* this is pretty much a 486, I
>> could be wrong
>
> Yikes. You really need to track this one down, and find out whether it is
> any different from a regular 486. The devil IS in the details, and these
> older boxes are NOT regularly tested anymore.
According to http://www.vortex86.com/index2.html
Vortex86 family integrates a high-performance processor that
supports x86 instruction set with 3 integer units, 3-way
superscalar architecture, and a fully pipelined floating point unit.
Hmm. According to /proc/cpuinfo on the installer,
> 1. Set up a serial console (to capture crash data);
> 2. Make sure you're in condition to lose data;
> 3. Reproduce the crash, log *everything* since boot.
What do you mean by "everything"? Every command? That's not hard. All
the kernel messages? That's harder... The logs in /var/log do not seem
to contain *anything* about the session during which the crash occurs.
> 4. File a bug on bugzilla.kernel.org with all relevant information. This
> does include the kernel config at the very least.
It's just the Debian stock kernel config.
> The box objected VERY HEAVILY to the ipv6 multicast operations triggered by
> avahi.
Given this, can you think of another way I might be able to trigger
the bug? If so, I might be able to do it without needing to reinstall
the system every time.
> Without fixing the underlying bug, your box will be unusable (lots
> of other stuff are likely to trigger the same problem).
That's what I was afraid of. Thanks for the info, I'll see what I can do.
Cheers,
Jason
--
Archive: http://lists.debian.org/AANLkTimJN1p1iYnVwH4DTW=NQwa_SGxY7rAqP8x8w80T@mail.gmail.com
On 11 October 2010 18:36, Henrique de Moraes Holschuh <hmh@debian.org> wrote:
[snip]
>> The box objected VERY HEAVILY to the ipv6 multicast operations triggered by
>> avahi.
>
> Given this, can you think of another way I might be able to trigger
> the bug? If so, I might be able to do it without needing to reinstall
> the system every time.
>
>> Without fixing the underlying bug, your box will be unusable (lots
>> of other stuff are likely to trigger the same problem).
>
> That's what I was afraid of. Thanks for the info, I'll see what I can do.
My 1st thought was whether you need IPv6...
--
Seek truth from facts.
2010/10/12 Ron Johnson <ron.l.johnson@cox.net>:
> My 1st thought was whether you need IPv6...
Well, no, and if I can't sort this out then I'll recompile without it
and see if the crash goes away (or... can I black list it, or is IPv6
compiled right in?). But this might be a good opportunity to find a
bug before I go down that path.
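Whether IPv6 can be kept out without a recompile depends on how the running kernel was built. This is a minimal sketch for checking that, assuming the usual Debian config file at /boot/config-$(uname -r). ipv6_build_mode is a hypothetical helper for this example.

```shell
# Hypothetical helper: read kernel config text on stdin and report how
# IPv6 support was built.
ipv6_build_mode() {
    case $(sed -n 's/^CONFIG_IPV6=\([ym]\)$/\1/p') in
        m) echo module ;;    # loadable: can be kept from loading via modprobe config
        y) echo built-in ;;  # compiled in: needs a boot parameter or a rebuild
        *) echo absent ;;    # option not set in this config
    esac
}

# Intended use on the affected box:
#   ipv6_build_mode < "/boot/config-$(uname -r)"
```

On kernels that support it, booting with ipv6.disable=1 is another way to switch the stack off without rebuilding.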
Cheers,
Jason
--
Archive: http://lists.debian.org/AANLkTinKLQArnKsH_-_bRf-EsENOoZot6dgMb0A8Zn3L@mail.gmail.com
On 12 October 2010 10:59, Jason Heeris <jason.heeris@gmail.com> wrote:
>> 4. File a bug on bugzilla.kernel.org with all relevant information. This
>> does include the kernel config at the very least.
>
> It's just the Debian stock kernel config.
Should I recompile it with any kind of debugging information enabled,
or does the Debian kernel already contain it?
— Jason
--
Archive: http://lists.debian.org/AANLkTinh05FUzXjJZf13ACinuY9cGTrnoaoWtkbxRZKE@mail.gmail.com
Jason Heeris <jason.heeris@gmail.com> writes:
> Should I recompile it with any kind of debugging information enabled,
> or does the Debian kernel already contain it?
It depends on the architecture and Debian version. Please post a
proper bug report with reportbug that shows all the relevant version
information. Can you make this crash happen under an emulator like
qemu? (Hint: you can run "sudo chmod a+r /dev/hda && qemu -snapshot -hda
/dev/hda" to temporarily test the current installation under qemu
without writing anything to disk.)
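A variation on the hint above, as a sketch only: qemu can also be asked to emulate an older CPU model, which may get a little closer to the real board. The "486" model name and the qemu_snapshot_cmd helper are assumptions for illustration. An emulated 486 is not a Vortex86, so a clean run under qemu would not rule out a CPU-specific problem.

```shell
# Hypothetical helper: build (but do not run) a qemu command line that boots
# a disk read-only-in-effect (-snapshot) with a chosen CPU model.
# $1 = disk device, $2 = CPU model (default 486)
qemu_snapshot_cmd() {
    printf 'qemu -snapshot -cpu %s -hda %s\n' "${2:-486}" "$1"
}

# Print the command for inspection before running it by hand.
qemu_snapshot_cmd /dev/hda
```

Depending on the qemu packaging, the binary may be called qemu-system-i386 rather than qemu.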
--
Archive: http://lists.debian.org/84tykrapli.fsf@sauna.l.org
On 11 October 2010 23:44, Jochen Schulz <ml@well-adjusted.de> wrote:
> [ I guess Henrique's interpretation of the problem is better than mine,
> I just wanted to follow up on this specific question. ]
>> How would I know?
>
> $ dmesg | grep "Write cache"
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> I don't know whether disabling this would stress the CF card more or less,
> but you should experience less filesystem corruption after a crash.
> AFAICT you need hdparm to disable it, though.
I just sorted this out not ten seconds ago:
hdparm -W 0 /dev/sda
...did the trick (and yes, it was on). I'm hoping to find the call
that makes avahi trigger the crash, so I can do it a little more
simply.
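The check and the fix above can be combined. Here is a minimal sketch, where write_cache_state is a hypothetical helper that only parses the kernel log format shown in the dmesg line quoted above.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<disk> <state>" for every "Write cache" line found.
write_cache_state() {
    sed -n 's/.*\[\(sd[a-z]*\)\] Write cache: \([a-z]*\).*/\1 \2/p'
}

# Intended use (needs root for hdparm, so left as comments):
#   dmesg | write_cache_state      # e.g. reports the state for sda
#   hdparm -W 0 /dev/sda           # turn the drive's write cache off
#   hdparm -W /dev/sda             # query the current setting
```

The hdparm setting does not survive a power cycle, so it would need reapplying at boot for it to help after the next crash.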
Cheers,
Jason
--
Archive: http://lists.debian.org/AANLkTimiSAM7050vdmZpjUJc7y3OOcqC4gsO6ztugQ+s@mail.gmail.com