Alex Schuster 08-17-2012 07:50 AM

My PC died. What should I try?
 
Hi there!

Two days ago, my PC suddenly died, after working fine for half a year. I
used myrtcwake as usual to suspend to RAM, and it woke up in the
morning. But after two minutes, the screen went blank and nothing, not even
SysRq, got a reaction. I tried booting a couple of times again, and
sometimes it did not even reach KDM. Now, I cannot even run Grub (from
my USB stick) any more, I only see a "GRUB" string at the top right,
then nothing happens.


Booting with SystemRescueCD also freezes sometimes. If not, I can make
it freeze after seconds by running 'memtester'.


Good old memtest86 ran for an hour and only found one error,
then I aborted, removed three of my four memory modules (4GB each), and
tried different ones in the first bank. Memtest86 again did not find
many errors, but froze once. Running memtester after booting from
SystemRescueCD again makes the thing freeze in seconds. It once also
froze while in the BIOS setup.


What could be the problem? CPU, board, or even the PSU? I do not think
it has to do with bad memory. I removed most of the other stuff (hard
drives, PCI cards). I have no similar hardware so I cannot simply
exchange things, the question is what to buy and try. How would you proceed?


The fan is still working, the cooler does not become hot, and in the
BIOS there are no high temperatures being reported. But one thing was
strange: I updated Calligra from 2.4 to 2.5 (I think), and it took ages,
at least 8 hours. I thought there might be something strange with the build
process of this new version, forcing MAKEOPTS=-j1 and such, but still
this is very long. But when working with it, I did not notice anything
strange like sluggish reactions, and videos played fine. But I did not
use it as much as I normally do, and maybe even when overheated and
throttled down it would have been fast enough for me to not notice this.
I watch the syslog normally, but maybe I just did not look closely that
day, I was busy doing other stuff.


CPUs don't just die, do they? Even when overheating, I think these days
they throttle down, so no permanent harm should be done? So maybe it's the
board? It looks okay, no bent or leaking capacitors.


This is really annoying. Of course most of my passwords are in my KDE
wallet I cannot access. There's also Wiki, CVS and Git repositories, not
needed every day, but still important. And the timing is very bad, I
just started my new job the day the problem happened, and I do not have
much time for this now. Before, I was working at home, so I would have
had all day to diagnose and try things.


It's an AMD FX-4100 Quad-Core CPU, and an ASRock 880GMH/U3S3 board.

Wonko

08-17-2012 08:25 AM

Hi Alex,

...shot in the dark:
Remove as many of the cards, add-ons, connections etc. as possible
from the PC ... make it as "bare bone" as possible.

Check all coolers (the little ones too) for dust. Remove all
dust even if they are not completely covered with it.

Don't forget the internals of the power supply. Detach all cables.
Remove the power supply. Go outside ;) and blow the dust inside away.

Put the power supply back into the PC and attach the cables.

Remove all RAM, carefully clean the contacts, insert as little RAM as
possible.

Remove even the HD if it is possible to get into the BIOS
without any HD attached.

Remove the BIOS battery, wait at least a day and insert it again.

Start the PC and go directly into the BIOS. Check the date/time.
If it shows the current date/time, the battery wasn't removed
long enough. Check the battery voltage. Reinsert the battery.
If your board has a BIOS reset: Reset the BIOS.

Then: In the BIOS, enter a page which "does something"
(continuously reports temperatures, for example).

If this is possible, let the PC run for a
while on that BIOS page and see whether it
hangs again or not.

If all went fine, add ONE component and try again.
Add the HD last to separate hardware from software bugs...

Maybe one of the components, and not the CPU or motherboard,
is causing the problem, and you will be able to identify it with
this procedure...
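
The add-one-component-at-a-time procedure can be written down as a
small sketch. The names here (find_faulty_component, is_stable) are
made up for illustration; in practice is_stable is you, watching the
BIOS page for a while after each change.

```python
from typing import Callable, Iterable, List, Optional


def find_faulty_component(
    components: Iterable[str],
    is_stable: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add components one at a time and report the first one that breaks things.

    `is_stable` is a hypothetical check: it gets the list of currently
    installed extra components and returns True if the machine ran
    without freezing (for example, an hour on a BIOS monitoring page).
    """
    installed: List[str] = []

    # The bare-bone system has to be stable first. If it is not, the
    # fault is in what is left: board, CPU, PSU or the remaining RAM.
    if not is_stable(installed):
        return "base system"

    for part in components:
        installed.append(part)
        if not is_stable(installed):
            # The machine was fine before this part was added.
            return part

    # Everything is back in and the machine still runs.
    return None


if __name__ == "__main__":
    # Simulated machine in which the video card causes the freezes.
    culprit = find_faulty_component(
        ["second DIMM", "video card", "hard drive"],
        lambda parts: "video card" not in parts,
    )
    print(culprit)
```

A result of "base system" means the bare-bone setup already hangs, so
adding components back will not narrow anything down.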

HTH!

GOOD LUCK!

Best regards,
mcc

Volker Armin Hemmann 08-17-2012 09:07 AM

On Friday, 17 August 2012, 09:50:40, Alex Schuster wrote:

Sounds like a power problem.

Either the PSU has gone bad (get a new one)
or your mainboard's power circuitry has gone bad (if replacing the PSU does not
help, get a new board).

But first things first: disconnect your HDDs! No reason to risk them.
--
#163933

Alex Schuster 08-17-2012 09:39 AM

meino.cramer@gmx.de writes:

> ...shot in the dark:
> Remove as many of the cards, add-ons, connections etc. as possible
> from the PC ... make it as "bare bone" as possible.

Done already.

> Check all coolers (the little ones too) for dust. Remove all
> dust even if they are not completely covered with it.

They are clean.

> Don't forget the internals of the power supply. Detach all cables.
> Remove the power supply. Go outside ;) and blow the dust inside away.

I did not remove it yet... but if it's a temperature problem, it should
not happen within the first 30 seconds, when Grub already fails.
The voltages reported in the BIOS are okay, but I don't know if this
information is accurate and reliable.

> Put the power supply back into the PC and attach the cables.

If only I could find a spare one... I have one, but I don't know where.

> Remove all RAM, carefully clean the contacts, insert as little RAM as
> possible.

Did that, using only 4 of 16 GB, and I switched the modules.

> Remove even the HD if it is possible to get into the BIOS
> without any HD attached.

I also did that, only the CD-ROM is attached.

> Remove the BIOS battery, wait at least a day and insert it again.

That's worth a try. My old PC had a jumper which I could short
to instantly drain it, not sure if this was normal.

> Start the PC and go directly into the BIOS. Check the date/time.
> If it shows the current date/time, the battery wasn't removed
> long enough. Check the battery voltage. Reinsert the battery.
> If your board has a BIOS reset: Reset the BIOS.
>
> Then: In the BIOS, enter a page which "does something"
> (continuously reports temperatures, for example).
>
> If this is possible, let the PC run for a
> while on that BIOS page and see whether it
> hangs again or not.

Okay, I will do this.

> If all went fine, add ONE component and try again.
> Add the HD last to separate hardware from software bugs...

Nah, I cannot even boot from my USB stick any more. I don't have a boot
partition on my hard drive, so it is not involved there.

> Maybe one of the components, and not the CPU or motherboard,
> is causing the problem, and you will be able to identify it with
> this procedure...

I hope it's the power supply, this would mean the least effort. I'd
simply buy a new one, and I would not have to think about what board or
which CPU I would like to get.

> HTH!
>
> GOOD LUCK!

Thanks! I can use it.

Wonko

08-17-2012 09:40 AM

Hello!

On Fri, 17 Aug 2012 09:50:40 +0200
Alex Schuster <wonko@wonkology.org> wrote:

> Hi there!
>
> Two days ago, my PC suddenly died, after working fine for half a
> year. I used myrtcwake as usual to suspend to RAM, and it woke up in
> the morning. But after two minutes, the screen went blank and
> nothing, even SysRq, gave a reaction. I tried booting a couple of
> times again, and sometimes it did not even reach KDM. Now, I cannot
> even run Grub (from my USB stick) any more, I only see a "GRUB"
> string at the top right, then nothing happens.
>
> Booting with SystemRescueCD also freezes sometimes. If not, I can
> make it freeze after seconds by running 'memtester'.
>
> Good old memtest86 ran for an hour and only found one error,
> then I aborted, removed three of my four memory modules (4GB each),
> and tried different ones in the first bank. Memtest86 again did not
> find many errors, but froze once. Running memtester after booting
> from SystemRescueCD again makes the thing freeze in seconds. It once
> also froze while in the BIOS setup.
>
If the system behaves in such an unpredictable way (freezing at a
random point), I usually check the following things:
- RAM;
- bulging capacitors on the motherboard;
- bulging or dried-out capacitors in the power supply unit.

If your PC is only half a year old, it is unlikely that the
capacitors have dried out. But they could easily bulge, especially if they are
of bad quality or sit near a hot surface like a heat sink.
Testing the power supply takes more than visual inspection. It would be
good to attach an oscilloscope to the output and look at the voltage
level. It should not show large peaks (voltage jumps). Such peaks are,
however, mostly seen in old units with dried-out capacitors, as I said.
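
If voltage samples from a scope or a logging multimeter are available,
a check like the following can flag a suspicious rail. This is only a
sketch: the limits are the figures commonly quoted from the ATX12V
design guide (+/-5% on the positive rails, 120 mV peak-to-peak ripple
on +12V, 50 mV on +5V and +3.3V) and should be verified against the
documentation of the actual PSU. The names RAIL_LIMITS and check_rail
are made up for illustration.

```python
from typing import Dict, List, Sequence

# Assumed limits, as commonly quoted from the ATX12V design guide.
# Verify them against the specification of the actual power supply.
RAIL_LIMITS: Dict[str, Dict[str, float]] = {
    "+12V": {"nominal": 12.0, "tolerance": 0.05, "max_ripple": 0.120},
    "+5V": {"nominal": 5.0, "tolerance": 0.05, "max_ripple": 0.050},
    "+3.3V": {"nominal": 3.3, "tolerance": 0.05, "max_ripple": 0.050},
}


def check_rail(rail: str, samples: Sequence[float]) -> List[str]:
    """Return the problems found in voltage samples (in volts) of one rail."""
    limits = RAIL_LIMITS[rail]
    mean = sum(samples) / len(samples)
    ripple = max(samples) - min(samples)  # peak-to-peak over the samples

    problems: List[str] = []
    if abs(mean - limits["nominal"]) > limits["nominal"] * limits["tolerance"]:
        problems.append(f"{rail}: mean {mean:.2f} V is out of tolerance")
    if ripple > limits["max_ripple"]:
        problems.append(f"{rail}: ripple {ripple * 1000:.0f} mV is too high")
    return problems


if __name__ == "__main__":
    # Made-up samples: a +12V rail that sags under load.
    for problem in check_rail("+12V", [11.2, 11.3, 11.25]):
        print(problem)
```

A multimeter averages over time and will hide short spikes, so an empty
result from slow samples does not prove that the PSU is healthy.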

If I were you, I'd try to temporarily replace the memory with a 100%
working module, and if that does not help, replace the power supply
unit (if you do not have the necessary equipment to test it thoroughly).

And one more simple test: turn on the PC, enter the BIOS setup
utility and keep it running in this state. If it runs ok for some time
(like a couple of hours), I'd say the problem is in RAM.

Regards,
Vladimir


-----
<v_2e@ukr.net>

Dale 08-17-2012 12:33 PM

Just my two cents here. Problems like this are usually the power
supply. Could it be the mobo? Yes, it could, but the power supply is more
likely, and usually cheaper and easier to replace. I had a friend's puter
that was acting weird, random reboots and such, and it was the power
supply. A bad power supply can cause all sorts of weird problems.

If you can, unplug everything including the CD/DVD drive. No hard
drives either. Just play with the BIOS. Basically, don't try to boot
anything, just look at the BIOS itself. If it acts weird, start with
the power supply. If you have to, go to a local place and pick up a
cheap power supply. Put it in just long enough to see if that is the
problem. If it works, then order you a real good power supply. Just
keep the cheapy for testing purposes. If the cheapy power supply
presents the same problem, then it could be the mobo.

Random problems are hard to fix sometimes. You just have to swap things
until you find the bad part. I would put the odds at 80% that it is the
power supply tho.

While at it, do you know the brand and wattage of your power
supply? It could be that someone on here has experience with that
particular brand or even that exact model.

Dale

:-) :-)

--
I am only responsible for what I said ... Not for what you understood or how you interpreted my words!

Alex Schuster 08-17-2012 05:25 PM

v_2e@ukr.net writes:

> If the system behaves in such an unpredictable way (freezing at a
> random point), I usually check the following things:
> - RAM;
> - bulging capacitors on the motherboard;
> - bulging or dried-out capacitors in the power supply unit.
>
> If your PC is only half a year old, it is unlikely that the
> capacitors have dried out. But they could easily bulge, especially if they are
> of bad quality or sit near a hot surface like a heat sink.
> Testing the power supply takes more than visual inspection. It would be
> good to attach an oscilloscope to the output and look at the voltage
> level. It should not show large peaks (voltage jumps). Such peaks are,
> however, mostly seen in old units with dried-out capacitors, as I said.

The power supply is older, I re-used it from the PC I had before this
one. I hope it is what causes the trouble, and will try another one this
evening. Thanks for this information, it strengthens my confidence
that I do not have to buy a new board or CPU. Now I am driving home with
a bag of three PSUs I had lent to a friend (and already forgotten about).

> If I were you, I'd try to temporarily replace the memory with a 100%
> working module, and if that does not help, replace the power supply
> unit (if you do not have the necessary equipment to test it thoroughly).

I wish I had :) The RAM is okay, I think, I cannot imagine different
memory modules suddenly going bad all at once. And memtest86 found one
error only after an hour, while the crashes happen after a few minutes
already.

> And one more simple test: turn on the PC, enter the BIOS setup
> utility and keep it running in this state. If it runs ok for some time
> (like a couple of hours), I'd say the problem is in RAM.

It once crashed after ten minutes. That was not reproducible, but I did
not try that often.

Wonko

Mark Knecht 08-17-2012 05:54 PM

Hi Alex,
Sorry for the problems.

I've read most of the responses so it seems you're getting good
info. A few things:

1) You asked "CPUs don't just die, do they?". The answer is 'yes, they
do.' It can happen at any time:

http://en.wikipedia.org/wiki/Bathtub_curve

2) If I understand your post, along with the other discussions, it
seems that you can remove all cards and all memory except 1 DIMM and
boot the machine to BIOS. Is that correct? If so then your CPU isn't
completely dead.

3) As you are seeing some memory problems it might be that memory
died. (see bathtub curve again - it applies to everything.) However it
seems very unlikely that all memory died at the same time. More likely
it is the chipset. If you change DIMMs but keep plugging them into the
same memory channel then it might be that channel in the chipset
that's having trouble. If it's your chipset, you're sunk. Get a new
MB.
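
The reasoning in point 3 can be sketched in code: a fault only points
at a DIMM or at a slot/channel if the failures follow that DIMM or that
slot across several combinations. The function name suspects and the
DIMM and slot labels below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def suspects(
    results: Iterable[Tuple[str, str, bool]],
) -> Tuple[List[str], List[str]]:
    """Return (suspect DIMMs, suspect slots) from memory test runs.

    Each result is (dimm, slot, passed). A DIMM or slot is a suspect
    only if it never passed in any run it took part in.
    """
    by_dimm: Dict[str, List[bool]] = defaultdict(list)
    by_slot: Dict[str, List[bool]] = defaultdict(list)
    for dimm, slot, passed in results:
        by_dimm[dimm].append(passed)
        by_slot[slot].append(passed)

    bad_dimms = sorted(d for d, runs in by_dimm.items() if not any(runs))
    bad_slots = sorted(s for s, runs in by_slot.items() if not any(runs))
    return bad_dimms, bad_slots


if __name__ == "__main__":
    # Different DIMMs, always in the same slot, always failing:
    # every DIMM and the slot are suspects, so nothing is isolated.
    print(suspects([("A", "slot1", False), ("B", "slot1", False)]))

    # The same DIMMs pass in another slot: the failures follow slot1.
    print(suspects([
        ("A", "slot1", False), ("B", "slot1", False),
        ("A", "slot2", True), ("B", "slot2", True),
    ]))
```

If every DIMM and every slot comes back as a suspect, the test has not
isolated anything, and the board, CPU or PSU remain candidates.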

As others have suggested the PSU is a potential common problem.
With everything else out of the box, memory swapped but the same
problem occurring, and the ability to at least get into BIOS, it's
likely either the PSU or the MB.

Good luck,
Mark

Alex Schuster 08-17-2012 06:16 PM

Volker Armin Hemmann writes:

> Sounds like a power problem.
>
> Either the PSU has gone bad (get a new one)

Well, I got three old ones instead :)

> or your mainboard's power circuitry has gone bad (if replacing the PSU does not
> help, get a new board).

It did not help :( Too bad, I probably need a new mainboard. And I
cannot get one before Monday evening, I have to go to a wedding tomorrow
(not mine) and I doubt I will have time to find a hardware store there.

> But first things first: disconnect your HDDs! No reason to risk them.

I did that early on. I already had trouble with one two weeks ago, it had
bad blocks on the home partition. The replacement drive also had bad
blocks, I had to get yet another one. It's a good thing to have recent
backups :)

And there, it just crashed while in the BIOS setup.

Wonko

Paul Hartman 08-17-2012 07:12 PM

On Fri, Aug 17, 2012 at 1:16 PM, Alex Schuster <wonko@wonkology.org> wrote:
> And there, it just crashed while in the BIOS setup.

If you are using a video card (instead of built-in/on-board video) I
would try a different video card, if you have an old or spare one. I
have had lots of video cards die from overheating and power spikes.

I only had one motherboard ever die, a computer I gave to my father
died after a few months... it was ASRock brand but I'm sure that is a
coincidence. :) It had blown/cracked capacitors all over the
motherboard. It did not die completely at once. It would "kind of"
work, but started to crash randomly and became worse and worse until
finally it wouldn't boot at all. I replaced the MB, but kept the same
CPU, RAM everything else, and it has been working ever since. That was
after we bought a new power supply that didn't make any difference.

