Linux Archive


Marc Aymerich 12-17-2011 11:36 AM

Processor context corrupt.
 
Hi dear all,

We have a Dell server (PowerEdge 1950) that has been running in production
for over 3 years. For the last 4 months, however, the server has been
crashing randomly, sometimes once a month, sometimes twice a week.
This is the error log:
http://imageshack.us/f/714/screenshotat20111209210.png/

Any hint on this?
Thanks!

--
Marc


--
To UNSUBSCRIBE, email to debian-isp-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/CA+DCN_sr0CFcySvtezS=xHrXr7DmBVfBwid8dU_cASpwqN+aRQ@mail.gmail.com

Marcos Lorenzo de Santiago 12-17-2011 11:44 AM

Processor context corrupt.
 
It seems to be a hardware problem, as the message warns. You should run the hardware tests in the BIOS to check the health of the processor and the RAM.

Regards.



--
Archive: http://lists.debian.org/9683B1F7-DCC2-4CE8-A1AA-B9B5C7D49CC5@gmail.com

Marc Aymerich 12-19-2011 03:15 PM

Processor context corrupt.
 
Yes, it seems to be a hardware problem. We have already done a memory
check (it took 3 or 4 days to complete) but found nothing. Unfortunately
our BIOS doesn't have any CPU test. Is it possible to run a CPU test
the way memtest86 tests memory? Also, are these hardware tests 100%
reliable? I mean, can I rule out a memory problem after successfully
passing memtest86?





--
Marc


--
Archive: http://lists.debian.org/CA+DCN_tOAaLvvr9y14B9U8HzbhO5tTknx=tj9vCWn4+JYn6+UQ@mail.gmail.com

Sven Hartge 12-19-2011 06:03 PM

Processor context corrupt.
 
Marc Aymerich <glicerinu@gmail.com> wrote:

> Yes, it seems to be a hardware problem. We have already done a memory
> check (it took 3 or 4 days to complete) but found nothing. Unfortunately
> our BIOS doesn't have any CPU test. Is it possible to run a CPU test
> the way memtest86 tests memory? Also, are these hardware tests 100%
> reliable? I mean, can I rule out a memory problem after successfully
> passing memtest86?

If the CPU throws a Machine Check Exception (MCE), there is not much
you can do, as the hardware already told you "I'm dead, Jim!".

In all my cases of an MCE in the past, this was _always_ caused by faulty
hardware, split roughly 50:50 between a broken mainboard and a broken CPU.

If you have a valid support contract with your vendor, tell them the
system is MCEing and they will replace the hardware. If they refuse to
comply: never buy from that vendor again.

Regards,
Sven.

--
Sigmentation fault. Core dumped.


--
Archive: http://lists.debian.org/s8dqmfba5jv8@mids.svenhartge.de

Henrique de Moraes Holschuh 12-20-2011 05:17 PM

Processor context corrupt.
 
On Sat, 17 Dec 2011, Marc Aymerich wrote:
> We have a dell server (poweredge 1950) running in production over a 3
> years from now. But since the last 4 months the server begins to
> randomly crash, sometimes once a month, sometimes two times a week.
> This is the error log:
> http://imageshack.us/f/714/screenshotat20111209210.png/

Update the system firmware and all of the baseboard firmware. If the
problem persists, contact the vendor for system repair... and good luck.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh


--
Archive: http://lists.debian.org/20111220181706.GB2759@khazad-dum.debian.net

Robert L Mathews 12-22-2011 06:52 PM

Processor context corrupt.
 
Sven Hartge wrote:

> In all my cases of an MCE in the past, this was _always_ caused by faulty
> hardware, split roughly 50:50 between a broken mainboard and a broken CPU.

As another data point, I recently saw a repeatable MCE error on a
machine with faulty memory during a new hardware burn-in test. Replacing
the memory fixed it.


--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/


--
Archive: http://lists.debian.org/4EF38A6F.4020508@tigertech.com

Sven Hartge 12-22-2011 09:44 PM

Processor context corrupt.
 
Robert L Mathews <lists@tigertech.com> wrote:
> Sven Hartge wrote:

>> In all my cases of an MCE in the past, this was _always_ caused by
>> faulty hardware, split roughly 50:50 between a broken mainboard and a
>> broken CPU.

> As another data point, I recently saw a repeatable MCE error on a
> machine with faulty memory during a new hardware burn-in test.
> Replacing the memory fixed it.

Might happen with the newer CPUs with integrated memory controller.

Regards,
Sven.

--
Sigmentation fault. Core dumped.


--
Archive: http://lists.debian.org/08e30ok22fv8@mids.svenhartge.de

Marc Aymerich 12-23-2011 05:49 PM

Processor context corrupt.
 
On Thu, Dec 22, 2011 at 11:44 PM, Sven Hartge <sven@svenhartge.de> wrote:
> Robert L Mathews <lists@tigertech.com> wrote:
>> Sven Hartge wrote:
>
>>> In all my cases of an MCE in the past, this was _always_ caused by
>>> faulty hardware, split roughly 50:50 between a broken mainboard and a
>>> broken CPU.
>
>> As another data point, I recently saw a repeatable MCE error on a
>> machine with faulty memory during a new hardware burn-in test.
>> Replacing the memory fixed it.
>
> Might happen with the newer CPUs with integrated memory controller.
>

Thank you all for your help :)
Yesterday we decided to remove two memory modules, since these two
modules were installed in the server just a few months ago, during a
general memory upgrade. Hopefully there will be no more MCEs :)

--
Marc


--
Archive: http://lists.debian.org/CA+DCN_uLMm-T-UAHXaYtys5gzmrjS2NVBdriFARJckyRv1yupg@mail.gmail.com

Marc Aymerich 01-04-2012 08:18 PM

Processor context corrupt.
 
On Fri, Dec 23, 2011 at 7:49 PM, Marc Aymerich <glicerinu@gmail.com> wrote:
> On Thu, Dec 22, 2011 at 11:44 PM, Sven Hartge <sven@svenhartge.de> wrote:
>> Robert L Mathews <lists@tigertech.com> wrote:
>>> Sven Hartge wrote:
>>
>>>> In all my cases of an MCE in the past, this was _always_ caused by
>>>> faulty hardware, split roughly 50:50 between a broken mainboard and a
>>>> broken CPU.
>>
>>> As another data point, I recently saw a repeatable MCE error on a
>>> machine with faulty memory during a new hardware burn-in test.
>>> Replacing the memory fixed it.
>>
>> Might happen with the newer CPUs with integrated memory controller.
>>
>
> Thank you all for your help :)
> Yesterday we decided to remove two memory modules, since these two
> modules were installed in the server just a few months ago, during a
> general memory upgrade. Hopefully there will be no more MCEs :)
>

Two weeks have passed without seeing any "context corrupt" error. It
seems that it was a memory problem after all. Thank you guys!
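
For anyone who wants to keep watching for a recurrence, saved kernel log text can be scanned for machine check messages. The following is only a minimal sketch in Python: the function name find_mce_lines, the two match strings, and the sample log lines are assumptions made for illustration and are not taken from the server in this thread. Note also that a fatal machine check can panic the kernel before anything reaches the disk, so an empty result does not prove that the hardware is healthy.

```python
import re

# Substrings assumed (for this sketch) to mark a machine check report.
MCE_PATTERN = re.compile(r"machine check|processor context corrupt",
                         re.IGNORECASE)


def find_mce_lines(log_text):
    """Return the lines of log_text that look like machine check reports."""
    return [line for line in log_text.splitlines()
            if MCE_PATTERN.search(line)]


# Made-up sample log, purely for illustration.
sample_log = """\
Dec  9 21:00:01 host kernel: eth0: link up
Dec  9 21:02:10 host kernel: CPU 0: Machine Check Exception: 5 Bank 0
Dec  9 21:02:10 host kernel: Kernel panic - not syncing: Machine check: Processor context corrupt
Dec  9 21:05:00 host kernel: eth0: link down
"""

if __name__ == "__main__":
    for line in find_mce_lines(sample_log):
        print(line)
```

In practice the text passed to the function would come from wherever the system keeps its kernel messages; that location varies between setups and is left open here.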




--
Marc


--
Archive: http://lists.debian.org/CA+DCN_sUz18BAQ6Hv9vrMP5gKjoPRcKsHqaJ_yS3zar3uS1c0w@mail.gmail.com

