Kernel panic - not syncing: CPU context corrupt
Hi,
Is there anyone on this mailing list who could help me figure out this issue? We do not know where to look to solve it.

--- Description ---

This is a brand new server, which was tested for days with FreeBSD in our office, and for a few days with Windows at our hardware distributor's site. The customer now wants CentOS, which we installed, but after a few days we got a kernel panic. Last night at 2:08 it gave the same kernel panic.

Please tell me what information I should give you and, most importantly, how to get it from the system, because we have no experience with CentOS (only FreeBSD).

I would be very surprised if this is hardware related. We have used the same hardware for several years and run FreeBSD on it very successfully. It is a SuperMicro PDSMI+ motherboard with a 3ware RAID controller (8006-2LP). The CPU is a Xeon 3040 1.8 GHz EM64T 2MB 1066FSB (65W). Memory is DDR2 Transcend 2048MB ECC Unbuffered 800.

The error message on the console is under "Additional Information". I am hoping there is some setting in CentOS that I can switch off to fix this, but I cannot find much useful information about this issue on Google.

--- Additional Information ---

CentOS release 5 (Final)
Kernel 2.6.18-53.1.21.el5 on an i686

ws174 login: CPU 1: Machine Check Exception: 0000000000000005
CPU 0: Machine Check Exception: 0000000000000004
Bank 3: f62000020002010a at 0000000032c93500
Bank 5: f20000300c000e0f
Kernel panic - not syncing: CPU context corrupt
Bank 3: f62000020002010a

--- Attachments ---

19-06-2008 16-03-31.png (Screenshot of console)

With kind regards,
Alwin Roosen

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Kernel panic - not syncing: CPU context corrupt
On Fri, 2008-06-20 at 14:40 +0200, Alwin Roosen wrote:
> Hi,
>
> Is there someone on this mailing list who could/want help me figure out
> this issue? We do not know where to look to solve this.
...
> I would be very surprised if this is hardware related.

A Google search for "Machine Check Exception" "Kernel panic - not syncing: CPU context corrupt" turns up 50 results (including your CentOS BZ request referring you to this list), many of which point to hardware problems: CPU, MB (bad caps), and chipset are all listed as possible causes. I'd go back to the hardware vendor if the machine is still under warranty.

Phil
Kernel panic - not syncing: CPU context corrupt
2008/6/20 Alwin Roosen <alwin.roosen@webline.be>:
> Hi,
> Is there someone on this mailing list who could/want help me figure out
> this issue? We do not know where to look to solve this.

If your installation is standard CentOS with no third-party software or custom configuration, I would first run the vendor hardware checks several times, as they are usually not good at catching intermittent or hard-to-find problems. Also run an extensive memtest if possible.

regards

Walid
Kernel panic - not syncing: CPU context corrupt
On 6/20/08, Alwin Roosen <alwin.roosen@webline.be> wrote:
<snip>
> CentOS release 5 (Final)
> Kernel 2.6.18-53.1.21.el5 on an i686
>
> ws174 login: CPU 1: Machine Check Exception: 0000000000000005
> CPU 0: Machine Check Exception: 0000000000000004
> Bank 3: f62000020002010a at 0000000032c93500
> Bank 5: f20000300c000e0f
> Kernel panic - not syncing: CPU context corrupt
> Bank 3: f62000020002010a

Phil or someone else: Do the three (3) "Bank" lines above indicate RAM problems? If not, what do they refer to? Alwin wrote that this is brand new HW, so he suspects that it is OK, but it doesn't seem to be OK?

Lanny
Kernel panic - not syncing: CPU context corrupt
Lanny Marcus wrote:
> On 6/20/08, Alwin Roosen <alwin.roosen@webline.be> wrote:
> <snip>
>> CentOS release 5 (Final)
>> Kernel 2.6.18-53.1.21.el5 on an i686
>>
>> ws174 login: CPU 1: Machine Check Exception: 0000000000000005
>> CPU 0: Machine Check Exception: 0000000000000004
>> Bank 3: f62000020002010a at 0000000032c93500
>> Bank 5: f20000300c000e0f
>> Kernel panic - not syncing: CPU context corrupt
>> Bank 3: f62000020002010a
>
> Phil or someone else: Do the three (3) "Bank" lines above indicate RAM
> problems? If not, what do they refer to? Alwin wrote that this is
> brand new HW, so he suspects that it is OK, but it doesn't seem to be OK?
>
> Lanny

I have the same issue, unresolved. However, I am using old desktop hardware (a Compaq Presario, and an HP something or other). Maybe it is memory, or the CPU, or some kind of incompatibility with something.

I was just making a list of the hardware that should be purchased to run a low-end SME server using CentOS:

- Rack-mountable case, with power supply and fans included
- Motherboard, mid-range processor
- 2 GB RAM
- USB drive, 1 TB
- Two 500 GB or four 300 GB internal hard drives (HW RAID would be nice)
- CD/DVD R/W drive
- and so on...

But I don't want to get into the situation above, where I purchase NEW hardware, CentOS doesn't like it, and the resolution is elusive.

What is the best HW environment for CentOS? Brand, MFG, chipset rev, and so on...

--
Michael Anderson, J3k Solutions
Sr. Systems Programmer/Analyst
832.515.3868
Kernel panic - not syncing: CPU context corrupt
Michael wrote:
> But I don't want to get into the situation above, where I purchase NEW
> hardware, and CentOS doesn't like it, and furthermore the resolution is
> elusive.
>
> What is the best HW environment for CentOS?
> Brand, MFG, chipset rev, and so on....

The easiest approach is to buy from a vendor that can test on your OS of choice; there are lots of vendors out there that can do it.

Two such companies I have bought from include:

http://www.siliconmechanics.com/ (HQ in Seattle, WA area)
http://www.asaservers.com/ (HQ in San Francisco, CA area)

Both specialize in Supermicro/Tyan-based systems (as do most other "whitebox" vendors).

nate
Kernel panic - not syncing: CPU context corrupt
On 6/20/08, nate <centos@linuxpowered.net> wrote:
<snip>
> Easiest is to buy from a vendor that can test on your OS of choice,
> there are lots of vendors out there that can do it.
>
> Two such companies I have bought from that do this include
> http://www.siliconmechanics.com/ (HQ in Seattle, WA area)
> http://www.asaservers.com/ (HQ in San Francisco, CA area)
>
> Both specialize in Supermicro/Tyan-based systems (as do most other
> "whitebox" vendors).

That, IMHO, is the best way to go. Another way, if the HW is available, is to test it with a CentOS Live CD before purchasing, to see if CentOS will run properly on the HW.
Kernel panic - not syncing: CPU context corrupt
on 6-20-2008 8:23 AM Lanny Marcus spake the following:
> On 6/20/08, Alwin Roosen <alwin.roosen@webline.be> wrote:
> <snip>
>> CentOS release 5 (Final)
>> Kernel 2.6.18-53.1.21.el5 on an i686
>>
>> ws174 login: CPU 1: Machine Check Exception: 0000000000000005
>> CPU 0: Machine Check Exception: 0000000000000004
>> Bank 3: f62000020002010a at 0000000032c93500
>> Bank 5: f20000300c000e0f
>> Kernel panic - not syncing: CPU context corrupt
>> Bank 3: f62000020002010a
>
> Phil or someone else: Do the three (3) "Bank" lines above indicate RAM
> problems? If not, what do they refer to? Alwin wrote that this is
> brand new HW, so he suspects that it is OK, but it doesn't seem to be OK?
>
> Lanny

As most of us have found out at some time, brand new does not always equal OK. I have had plenty of hardware that was dead on arrival or dead within days. Start with the obvious: re-seat all removable parts such as memory and cards, and also any option cards for second processors if they are included. Shipping or moving equipment can loosen things.

Also check whether the memory is on the recommended list for the motherboard.

--
MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!
Kernel panic - not syncing: CPU context corrupt
On 6/20/08, Scott Silva <ssilva@sgvwater.com> wrote:
<snip>
> As most of us have found out at some time;
> brand new does not always equal OK.
> I have had plenty of hardware that was dead on arrival or dead in days.
> Check the obvious of re-seating all removable parts like memory and
> cards, and also any option cards for second processors if they are
> included. Shipping or moving equipment can loosen things.
>
> Also look at the memory to see if it is on the recommended list for the
> motherboard.

The HW is using Memory Banking? Three (3) Banks have problems? How many Banks are there?
Kernel panic - not syncing: CPU context corrupt
On 6/20/08, Alwin Roosen <alwin.roosen@webline.be> wrote:
> Hi,
>
> CentOS release 5 (Final)
> Kernel 2.6.18-53.1.21.el5 on an i686
>
> ws174 login: CPU 1: Machine Check Exception: 0000000000000005
> CPU 0: Machine Check Exception: 0000000000000004
> Bank 3: f62000020002010a at 0000000032c93500
> Bank 5: f20000300c000e0f
> Kernel panic - not syncing: CPU context corrupt
> Bank 3: f62000020002010a

Alwin --> I would be very, very "surprised" *IF* this wasn't hardware related.

Dave Jones wrote a nice little program to help decode this:

$ parsemce -b 3 -s f62000020002010a -e 5 -a 0000000032c93500
Status: (5) Machine Check in progress. Restart IP valid.
parsebank(3): f62000020002010a @ 32c93500
    External tag parity error
    CPU state corrupt. Restart not possible
    Address in addr register valid
    Error enabled in control register
    Error not corrected.
    Error overflow
    Memory hierarchy error
    Request: Generic error
    Transaction type : Generic
    Memory/IO : I/O

and:

$ parsemce -b 5 -s f20000300c000e0f -e 4 -a 0
Status: (4) Machine Check in progress. Restart IP invalid.
parsebank(5): f20000300c000e0f @ 0
    External tag parity error
    CPU state corrupt. Restart not possible
    Error enabled in control register
    Error not corrected.
    Error overflow
    Bus and interconnect error
    Participation: Generic
    Timeout: Request did not timeout
    Request: Generic error
    Transaction type : Invalid
    Memory/IO : Other

Dag's repo has the new memtest86+ 2.01 RPM. I'd pull it and let it run overnight. While memtest86+ is good, I've recently had cases where it didn't find (obvious) memory errors. I've also seen things like SATA disk drives cause MCEs.

This one looks like you're taking memory parity errors somewhere in the path to the CPU. In your BIOS, check the event log for any "interesting" entries, too.

Hope this helps ...

-rak-
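For anyone who wants to see what the "Bank" values encode without having parsemce at hand, here is a minimal sketch in Python. It assumes the architectural IA32_MCi_STATUS and IA32_MCG_STATUS bit layout from the Intel Software Developer's Manual; the function and table names are made up for illustration and are not part of parsemce or any other tool. It only splits out the generic flag bits and the two 16-bit error-code fields, and does not try to interpret the model-specific parts.

```python
# Sketch only: names below are hypothetical, not from parsemce.
# Flag bits of a per-bank status register (IA32_MCi_STATUS), assuming
# the architectural layout described in the Intel SDM.
STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # error overflow (an earlier error was overwritten)
    61: "UC",     # error not corrected
    60: "EN",     # error reporting enabled in the control register
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context corrupt, restart not possible
}

# Flag bits of the global status (the value printed after
# "Machine Check Exception:" on the console).
MCG_FLAGS = {
    2: "MCIP",  # machine check in progress
    1: "EIPV",  # error IP valid
    0: "RIPV",  # restart IP valid
}


def decode_flags(value, table):
    """Return the names of all flags in `table` that are set in `value`."""
    return [name
            for bit, name in sorted(table.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bank_status(status):
    """Split a 64-bit bank status into flags and the two error-code fields."""
    return {
        "flags": decode_flags(status, STATUS_FLAGS),
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


if __name__ == "__main__":
    for bank, status in ((3, 0xF62000020002010A), (5, 0xF20000300C000E0F)):
        info = decode_bank_status(status)
        print("Bank %d: %s, MCA code 0x%04x" % (
            bank, " ".join(info["flags"]), info["mca_error_code"]))
    for mcg in (0x5, 0x4):
        print("MCG status %d: %s" % (mcg, " ".join(decode_flags(mcg, MCG_FLAGS))))
```

Both banks come out with UC and PCC set, which lines up with the "Error not corrected" and "CPU state corrupt" lines in the parsemce output above, and only Bank 3 has ADDRV set, which is why only that line on the console carries an address. Note that "Bank" here means a machine-check register bank in the CPU, not a bank of RAM.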