07-13-2010, 02:26 PM
"David C. Rankin"

kernel26-2.6.34.1 - won't boot - stuck at "Setting up UTF-8 mode" or craters with kernel NULL pointer

On 07/12/2010 02:57 AM, David C. Rankin wrote:

> I've tried rebuilding the initramfs or whatever you call it with:
>
> Normal Kernel:
>
> /sbin/mkinitcpio -k 2.6.34-ARCH -c /etc/mkinitcpio.conf -g /boot/kernel26.img
>
> Fallback Kernel:
>
> /sbin/mkinitcpio -k 2.6.34-ARCH -c /etc/mkinitcpio.conf -g /boot/kernel26-fallback.img -S autodetect
>
> and each time it completes successfully. But then it will either boot once OK,
> then fail to boot the very next time I try to boot the box -- or -- it will
> never boot. It looks like it is blowing up on the loading modules line or it
> just gets stuck on the setting up UTF-8 mode line. No doubt this bug is also
> what is causing compiz to white-screen when I do get one of these new kernels to
> boot, but with no logging on during the boot process when it blows up, I'm not
> sure what to do.
>
> What say the gurus?



Can anyone think of the possible mechanism that would cause a kernel to
boot once after rebuilding the initramfs, but then be corrupt for every boot
thereafter?? As mentioned in the title on the 2nd boot attempt (and all
subsequent attempts), the boot process either hard-locks when the "Setting up
UTF-8 mode" message is displayed --or-- a kernel NULL Pointer message is
displayed and then I get 3 screens of garbage before the box either locks or a
ctrl+c kills that part of the boot process and booting proceeds until it craters
4-10 steps later.
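
If a failing boot gets far enough for the syslog daemon to start, part of the NULL pointer trace may have been written to disk and can be read back after booting the LTS kernel. A minimal sketch of that check follows. The helper name find_oops is hypothetical, the log path in the example is an assumption that depends on the local syslog setup, and a crash this early in boot may well leave nothing in the logs at all.

```shell
#!/bin/sh
# find_oops FILE...: print every line that mentions a kernel NULL pointer
# dereference, followed by the 20 lines after each match.
# (find_oops is a hypothetical helper name, not an Arch tool.)
find_oops() {
    grep -i -A 20 'NULL pointer dereference' "$@"
}

# Example (the log path is an assumption; adjust it to the local setup):
#   find_oops /var/log/kernel.log
```

The function returns a non-zero status when nothing matches, so an empty result means only that no trace was saved, not that no crash happened.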


(Memory can be ruled OUT as a problem: it memtests fine, I'm working from the
same box right now, and it boots the LTS kernel and the openSUSE kernels
fine each and every time.)


Moreover, I have 8-10 Arch boxes running 2.6.34.1 happily, but this laptop
exhibits the "boot once then fail" behavior every time. I would like to help
find out what is causing this problem, but I have exhausted my shallow pool of
Arch boot sequence knowledge so I'm looking for some help. Even something as
simple as:


When the boot fails try this ....

What does file XYZ contain?, etc...

I have posted the dmidecode information for the box in case the problem is
related to some weird hardware or hardware where a regression has occurred between
2.6.33 and 2.6.34.


I don't know what else to do except wait until the next kernel release and
pray that one will work. All Arch and suse and gparted kernels have worked fine
on the box until the past two 2.6.34 kernels. Somehow just doing nothing and
waiting seems less than scientific and an approach that is unlikely to help Arch
or my present situation.


I don't know if you guys would rather I open a ticket on this one or just
sit tight and see if we can get some better information here before doing so.
Dunno -- that's why I'm asking...


I'll even take your best SWAG at this point. Let me know what the best
way to pursue this one is. Thanks.




--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
 
07-13-2010, 03:24 PM
"David C. Rankin"

On 07/13/2010 09:26 AM, David C. Rankin wrote:

> Can anyone think of the possible mechanism that would cause a kernel to
> boot once after rebuilding the initramfs, but then be corrupt for every boot
> thereafter?? As mentioned in the title on the 2nd boot attempt (and all
> subsequent attempts), the boot process either hard-locks when the "Setting up
> UTF-8 mode" message is displayed --or-- a kernel NULL Pointer message is
> displayed and then I get 3 screens of garbage before the box either locks or a
> ctrl+c kills that part of the boot process and booting proceeds until it craters
> 4-10 steps later.


Could the NULL Pointer blow-up be due to incorrect gpu handling by the Arch
kernel when the modules are loaded (about the same time the KMS magic is
taking place)?


I say this because I have one of ATI's less common GPUs in this Toshiba laptop.
The video card is:


Radeon X1250 Graphics(690G Chipset), RS690M, RV410 Graphics Core. This uses the
onboard PCIe bus interface and has API support for DirectX 9.0b and OpenGL 2.0.
For some reason the kernel crashes 'smell' like a mishandling of the gpu
subsystem in the 2.6.34 kernels (Note: this is just a 'gut feel', and I can't
point to anything in particular). Of all things that could have changed for the
past 2 kernels, the KMS magic and a possible bug slipping in for this card seems
like one of the likely areas to start looking.


The lspci -vv data for the card are as follows (I have opensuse running at the
moment - thus the fglrx driver is shown):


01:05.0 VGA compatible controller: ATI Technologies Inc RS690M [Radeon X1200 Series] (prog-if 00 [VGA controller])
        Subsystem: Toshiba America Info Systems Device ff00
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 18
        Region 0: Memory at f0000000 (64-bit, prefetchable) [size=128M]
        Region 2: Memory at f8100000 (64-bit, non-prefetchable) [size=64K]
        Region 4: I/O ports at 9000 [size=256]
        Region 5: Memory at f8000000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000 Data: 0000
        Kernel driver in use: fglrx_pci
        Kernel modules: fglrx

I don't know why the card is reporting as an X1200 in lspci. The Core Clock for
this gpu is 400 MHz and according to AMD, that means this is the 1250 and not
the 1200 because the Core Clock on the 1200 is 350 MHz.
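
The name lspci prints is looked up in its PCI ID database, so a "Series" label there does not necessarily pin down the exact model. The numeric vendor:device ID is the less ambiguous thing to quote in a bug report. A small sketch that pulls it out of `lspci -nn` output (pci_ids is a hypothetical helper name):

```shell
#!/bin/sh
# pci_ids: read `lspci -nn` output on stdin and print the bracketed
# [vendor:device] ID found on each "VGA compatible controller" line.
# (pci_ids is a hypothetical helper name.)
pci_ids() {
    grep 'VGA compatible controller' |
        grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]'
}

# Example:
#   lspci -nn | pci_ids
```

The pattern requires a colon between two four-digit hex fields, so the class code that `-nn` also prints in brackets is skipped.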


I don't know what, if any, changes took place in KMS or in gpu initialization
for the 2.6.34 kernel, but this card always sucked when using the ATI driver
which prevented me from moving to Arch sooner on this box. Then with the 2.6.32
& 2.6.33 kernels, it was like somebody turned on a light-switch in the kernel
and I was getting Blazing fast performance out of the xf86-video-ati driver on
Arch, compiz was working great, and the gpu subsystem was working better than
ever before in Arch with just the 'radeon' driver.


When I updated to 2.6.34-2 I ran into the problem with compiz "whitescreening",
and video performance 'tanked' on the one 'first boot' that did come up.


Then on every attempt to boot thereafter, the boot would fail and either hang
or blow up with the kernel NULL Pointer error.


That has me thinking that this problem has to be related to some module
rearrangement/updating that takes place after you boot the box for the first
time -- thus preventing the next boot from working.


I don't know how to verify or check this out, but this is what my gut tells me
is going on.


Arch gurus -- any way to test this hypothesis??
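
One rough way to test the "something on disk changes after the first boot" idea is to checksum the boot files right after rebuilding the initramfs, boot 2.6.34 once, and then compare from the LTS kernel. A minimal sketch, assuming md5sum is available; the helper name snapshot and the output file names are hypothetical:

```shell
#!/bin/sh
# snapshot DIR OUTFILE: write an md5 checksum for every regular file
# under DIR to OUTFILE, sorted by path so two snapshots diff cleanly.
# Keep OUTFILE outside DIR, or it will be included in its own listing.
# (snapshot is a hypothetical helper name.)
snapshot() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 ) > "$2"
}

# Usage (output paths are assumptions):
#   snapshot /boot /root/boot.before     # right after running mkinitcpio
#   ... boot 2.6.34 once, then boot the LTS kernel ...
#   snapshot /boot /root/boot.after
#   diff /root/boot.before /root/boot.after && echo "nothing changed"
```

The same function can be pointed at /lib/modules/2.6.34-ARCH to cover the modules. If neither directory changes between the good boot and the bad one, on-disk module rearrangement becomes an unlikely explanation.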

--
David C. Rankin, J.D.,P.E.
 
07-13-2010, 03:47 PM
Mauro Santos

On 07/13/2010 04:24 PM, David C. Rankin wrote:

> Could the Null Pointer blow up be due to incorrect gpu handling by the
> Arch kernel causing the blow-up when the modules are loaded (about the
> same time the KMS magic is taking place?
>
> I say this because I have one of ATI's less common gpu's in this Toshiba
> laptop. The video card is:
>
> Radeon X1250 Graphics(690G Chipset), RS690M, RV410 Graphics Core. This
> uses the onboard PCIe bus interface and has API support for DirectX 9.0b
> and OpenGL 2.0. For some reason the kernel crashes 'smell' like a
> mishandling of the gpu subsystem in the 2.6.34 kernels (Note: this is
> just a 'gut feel', and I can't point to anything in particular). Of all
> things that could have changed for the past 2 kernels, the KMS magic and
> a possible bug slipping in for this card seems like one of the likely
> areas to start looking.

That could be it. Try disabling KMS or using early KMS. I see this happen
once in a while when I boot my desktop PC from a USB drive with Arch.

Most of the time it boots just fine, but every once in a while it will
hang at that exact place, "Setting console font ....", though I've never
seen a kernel panic or had to clean spaghetti off my screen.

I could never figure out exactly what was wrong. Starting from a cold
boot, sometimes it hangs, but most of the time it works.
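
For reference, a sketch of the two options as they are usually set up for the radeon driver; treat the details as assumptions to check against the Arch documentation for this kernel. KMS is normally disabled by appending radeon.modeset=0 (or the generic nomodeset) to the kernel line in the bootloader configuration. Early KMS is normally enabled by adding radeon to the MODULES line in /etc/mkinitcpio.conf and regenerating the initramfs with the mkinitcpio commands already used in this thread. The helper below (kms_setting is a hypothetical name) only reports which case a given kernel command line falls into:

```shell
#!/bin/sh
# kms_setting CMDLINE: report whether a kernel command line turns KMS off,
# either with the generic "nomodeset" or with "radeon.modeset=0".
# (kms_setting is a hypothetical helper name.)
kms_setting() {
    case " $1 " in
        *" nomodeset "*|*" radeon.modeset=0 "*)
            echo "KMS disabled" ;;
        *)
            echo "KMS not disabled on the command line" ;;
    esac
}

# Example, run on the booted system:
#   kms_setting "$(cat /proc/cmdline)"
```

Checking /proc/cmdline after a boot confirms that the parameter actually reached the kernel, which is worth doing before drawing conclusions from a test boot.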

--
Mauro Santos
 
07-13-2010, 07:10 PM
Isaac Dupree

On 07/13/10 10:26, David C. Rankin wrote:

> Can anyone think of the possible mechanism that would cause a kernel to
> boot once after rebuilding the initramfs, but then be corrupt for every
> boot thereafter??


Do you rebuild the initramfs on 2.6.32?

Do you let the machine sit for a minute, shut down, between each boot?

Yes, I can think of a mechanism. I'll tell it by example: My machine
has a built-in webcam that the OS has to upload firmware to on every
boot. Sometimes when I boot, it ends up in a screwed-up state somehow
(so the webcam doesn't work), and sometimes rebooting doesn't help.
Shutting down and waiting a few minutes sometimes helps; booting into
MacOSX and then shutting down can also change things a bit, often for the
better (after all, this hardware and OSX were made for each other). I
believe it has some sort of volatile memory that decays randomly and
slowly when not powered (like RAM does). I guess that when it boots
with its memory containing partly corrupted firmware, that causes some
kind of trouble, depending on the exact state of the memory, which
interferes with just fixing it by uploading new firmware.


That's an example of how something could possibly persist across
reboots. Maybe if you build on 2.6.32, the actual effect is that you
were just booted into a good kernel that initialized some piece of
hardware into some reasonable state, and this state is likely to persist
across a reboot, but 2.6.34 screws up the state such that the next boot
of 2.6.34 doesn't like it but 2.6.32 is a good enough kernel to
nevertheless re-initialize it properly. (It could be non-volatile
memory too, and the randomness could be part of linux boot process being
nondeterministic as it is)


Or...maybe the explanation is entirely different.

-Isaac
 
