Linux Archive


David Timms 05-23-2008 01:35 PM

testing hardware - use what software ?
 
I have a PC that is failing intermittently. What are the best choices for
attempting to pinpoint a hardware fault - in terms of either Fedora
packaged or other open source software ?


Perhaps there is something that can perform both one-off and 'to death'
tests of various PC components, and components in combination, and also
provide peak workload for all subsystems at once, in an attempt to
provoke a slow-to-occur failure.


I know about memtest, and read-only disk tests from fsck. Any others ?

DaveT.

--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list

Tim 05-23-2008 01:51 PM

testing hardware - use what software ?
 
On Fri, 2008-05-23 at 23:35 +1000, David Timms wrote:
> I have a PC that is failing intermittently. What are the best choice
> for attempting to pinpoint a hardware fault

By checking the hardware, directly, rather than playing with software.

Have you tried the obvious? Like unplugging and replugging everything
that's not soldered into place. Connectors are the number one hardware
fault in equipment. Failing power supplies are probably next down the
list, followed by hard drives. Cleaning out fluff and dust from
heatsinks, and making sure that fans spin, are other easy things to
check. Oh, and foreign objects in the computer - one loose screw
somewhere in the chassis can be a recipe for disaster. And you can test
particular cards by removing non-essential ones.

--
Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.


David Timms 05-23-2008 02:08 PM

testing hardware - use what software ?
 
Tim wrote:
> On Fri, 2008-05-23 at 23:35 +1000, David Timms wrote:
>> I have a PC that is failing intermittently. What are the best choice
>> for attempting to pinpoint a hardware fault
>
> By checking the hardware, directly, rather than playing with software.

Yes, and shoot it with the heat gun and so on. But is there some
software designed to do stress testing ?


I'm talking about a machine that has had its power supply, disk and RAM
swapped, the RAM moved between slots, has been cleaned, had its cards
reinserted etc, yet nothing definitive is showing up.



Alan Cox 05-23-2008 02:45 PM

testing hardware - use what software ?
 
On Fri, 23 May 2008 23:35:31 +1000
David Timms <dtimms@iinet.net.au> wrote:

> I have a PC that is failing intermittently. What are the best choice for
> attempting to pinpoint a hardware fault - in terms of either fedora
> packaged or other open sourcesoftware ?
>
> Perhaps there is something that can both perform one of and 'to death'
> tests of various PC components, and components in combination, and also
> provide peak workload for all subsystems at once, in an attempt to
> generate a slow to occur failure.
>
> I know about memtest, and read-only disk tests from fsck. Any others ?

You can certainly stress bits of the system - e.g. boot into text mode and
run dbench tests to thrash the disk subsystem.
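
As a rough illustration of what "exercising the disk" means (this is not
dbench, and nowhere near as thorough), here is a minimal Python sketch that
writes random blocks to a scratch file, reads them back and compares
checksums. The function name, block size and scratch location are all
assumptions made up for the example. Note that the read-back may be served
from the page cache rather than the platters, so a clean run proves little.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, block_size=1 << 20, blocks=16):
    """Write random blocks to `path`, read them back, and return the
    number of blocks whose checksum does not match what was written."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(blocks):
            data = os.urandom(block_size)
            digests.append(hashlib.sha256(data).digest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to commit the data to the device

    mismatches = 0
    with open(path, "rb") as f:
        for expected in digests:
            if hashlib.sha256(f.read(block_size)).digest() != expected:
                mismatches += 1
    return mismatches


if __name__ == "__main__":
    # Hypothetical scratch location; pass dir=... to TemporaryDirectory
    # to place the file on the disk you actually want to exercise.
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "stress.bin")
        for run in range(3):
            print("pass", run, "mismatches:", write_and_verify(target))
```

Any non-zero mismatch count would be worth following up with SMART data
and a proper tool.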

I actually usually start from the hardware end with a misbehaving PC:

- Reseat all the cards
- Remove and reinsert all the RAM (taking care to earth yourself first)
- Blow dust out of the fans
- Optionally unplug/replug all the cables although that is rarely a
problem

Run it with the lid off for a bit and see if anything seems oddly hot
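
If the kernel exposes thermal zones under /sys/class/thermal (an
assumption; many boards need lm_sensors instead and show nothing here),
the "is anything oddly hot" check can also be logged from software. A
small Python sketch, with the 70 C limit chosen arbitrarily for the
example:

```python
import glob
import os


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    temps = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                millidegrees = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone missing, unreadable, or reporting garbage
        temps[os.path.basename(zone)] = millidegrees / 1000.0
    return temps


def too_hot(temps, limit_c=70.0):
    """Return the zones at or above `limit_c` (an arbitrary example limit)."""
    return {name: t for name, t in temps.items() if t >= limit_c}


if __name__ == "__main__":
    current = read_zone_temps()
    print("temperatures:", current)
    print("over limit:", too_hot(current))
```

Running this in a loop while a stress test is going gives a crude
temperature log to compare against the times the machine fails.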

memtest86 and SMART reporting from the drive are good starting points, as
a definite bad result saves a lot of guessing. It's often hard, however,
to pin down a hardware fault that is more complex or subtle (or to tell
it from a software fault).
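
For the SMART part, a hedged Python sketch that runs `smartctl -H` from
smartmontools and looks for the overall health verdict. The helper names
are invented for the example, and the two verdict lines it searches for
are my understanding of what smartctl prints for ATA and SCSI drives, so
treat the parsing as an assumption and check it against real output on
your machine.

```python
import subprocess


def parse_health(output):
    """Return True (healthy), False (failing) or None (no verdict found)
    from the text printed by `smartctl -H`."""
    for line in output.splitlines():
        if "overall-health self-assessment" in line or "SMART Health Status" in line:
            verdict = line.split(":", 1)[1].strip()
            return verdict in ("PASSED", "OK")
    return None


def check_drive(device):
    """Run `smartctl -H` on `device` (needs root and smartmontools installed)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)


if __name__ == "__main__":
    # /dev/sda is only an example device name.
    print(check_drive("/dev/sda"))
```

A `None` result just means no verdict was found (SMART disabled, wrong
device, missing permissions); it says nothing about the drive's health.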

Alan


Tim 05-23-2008 03:47 PM

testing hardware - use what software ?
 
On Sat, 2008-05-24 at 00:08 +1000, David Timms wrote:
> Yes, and shoot it with the heat gun and so on. But is there some
> software designed to do stress testing ?

I've often asked something similar of PC shops, as their testing
seemed to consist of just seeing if it'll boot and stay running for
half an hour...

I think that only something from the board manufacturer could do that.
Only they'd know exactly how their combination of hardware should
perform, and I can't see them releasing something that probed their
hardware completely, as it'd just aid in reverse engineering.

> I'm talking about a machine that has had swaps to power supply, disk,
> ram, ram slots, is clean, reinsertion of cards etc, yet nothing
> definitive is showing up.

Tends to point the finger at motherboard or CPU...

I hope you always took anti-static precautions during handling. But
it's just as easy for it to have been zapped before you ever touched it,
and it took "this long" for the problem to appear. Static damage is
like that - mostly not instant, just weakening something for a later
catastrophe.

--
(This box runs Centos 5.0, my others still run FC 4, 5, 6, & 7, in case that's
important to the thread.)



"max bianco" 05-27-2008 03:00 PM

testing hardware - use what software ?
 
On Fri, May 23, 2008 at 11:47 AM, Tim <ignored_mailbox@yahoo.com.au> wrote:
> On Sat, 2008-05-24 at 00:08 +1000, David Timms wrote:
>> Yes, and shoot it with the heat gun and so on. But is there some
>> software designed to do stress testing ?
>
> I've often asked something similar from PC shops, as their testing
> seemed to comprise of just seeing if it'll boot and stay running for
> half an hour...
>
Yes, people do not realize how hard it is to pinpoint a hardware
problem. Many are under the impression there is some magic involved
and results should be instant and/or provide instant "Star Trek" style
solutions. We are not quite there yet, especially as far down the
totem pole as your average pc repair shop. I try to use the computer
as much as possible but time is money and you can easily run up a bill
that exceeds the cost of a cheap machine quite quickly. However if you
feel you have a genuine hardware problem then I would do the
following. The order will vary depending on where you think, based on
your observations, the problem lies.

0. Always observe proper static handling precautions!! I cannot stress
this enough. You will cook your board and never feel a thing.

1. Do not expect instant Star Trek style results; this isn't the
twenty-third century.

2. Alan's recipe is as good a way to start as any I have seen or heard.

3. Make sure to start as close to bare board as you can get. Unplug
everything you don't need to boot your OS, so everything but the
hard drive, video card, RAM and CPU. (If you have onboard video, use
that instead of a high-end 512 MB nvidia/ati card for testing. Video
cards these days draw a lot of power, and mine ruined a power supply
for me because I did not check the spec: the card needed more power
than my 500 watt supply could deliver on its 12V rail.)

4. I have seen many people say they ran memtest86 for one to three
hours. I personally like to leave it running all night, though of
course people will say they are not in a hurry and then call two hours
later to see if it's fixed yet. I have noted many of the more
experienced folks on this list recommending 24 hours or longer for
memtest86. It seems like a long time, but peace of mind is priceless,
so keep that in mind.

5. Reconnect devices one at a time and exercise each reattached device,
e.g. use the CD burner to read and burn CDs. A bad CD-ROM drive can
hold up the boot process or cause other weird behavior. Remember it
could easily be a failing IDE controller on the motherboard. Try, if
you have one, a separate controller card. Same goes for USB. These
boards where absolutely everything is integrated onto the board may
make for a slight speed advantage but can be a real pain in the ass
when something starts to fail.

6. Be careful if you find a lot of dust built up inside. Dust can cause
static damage and reckless removal of dust can kill your machine quite
quickly. Sometimes the dust is the only thing holding it together : )
Never hold the compressed air closer than 6 or 8 inches, and blow the
dust off at odd angles; you don't have to get all the dust out in one
go. Do not blow the air directly onto the board from 3 inches away,
and do not blow out a massive amount of dust, plug it all back in and
power on straight away. Let it settle; a little patience goes a long
way here.

7. Of course swap out RAM and CPU if possible. Most BIOSes have menus
where you can check temperature. Intermittent problems often show up
as the temperature increases. Maybe remove the heatsink, clean it, and
reapply thermal tape or paste. If you're using a lot of high-end gear,
like sound cards or especially video cards, do not stack them in
adjacent PCI slots if you can avoid it. If the air in the case isn't
moving, the temperature climbs quickly. Many low-end machines do not
include a separate case fan, and often have a 24-pin power connector on
the board while the power supply only has a 20-pin plug. I have seen
this in some eMachines models; it probably won't be an issue until you
start adding more RAM, upgrading processors, adding drives etc.

8. Remember that your power supply may say 500 watts, but that is peak
output; you don't run a race car wide open all the time.

9. Always observe proper static handling precautions!!
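
To make the 12V rail point in items 3 and 8 concrete, here is a small
worked calculation in Python. Power on a rail is volts times amps, so the
headroom is the rail's capacity minus what the components draw from it.
All the figures below (an 18 A rail, a 95 W CPU, a 150 W video card) are
invented for illustration; read the real ones off the power supply label
and the component specs.

```python
def rail_headroom(rail_amps, loads_watts, volts=12.0):
    """Return the spare watts on a rail; a negative result means the
    listed loads exceed what the rail is rated to deliver."""
    capacity_watts = rail_amps * volts
    return capacity_watts - sum(loads_watts)


if __name__ == "__main__":
    # Hypothetical "500 W" supply whose 12V rail is only rated for 18 A.
    spare = rail_headroom(rail_amps=18, loads_watts=[95, 150])
    print("12V rail headroom in watts:", spare)
```

With these made-up numbers the rail can supply 216 W but is asked for
245 W, so it is overloaded even though the label total looks generous.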


> I think that only something from the board manufacturer could do that.
> Only they'd know exactly how their combination of hardware should
> perform, and I can't seem them releasing something that probed their
> hardware completely, it'd just aid in reverse engineering.
>
They do make POST code readers. I like the one I have access to but it
is less useful than I originally thought it would be. Sometimes
finding the codes for your BIOS and board can be a real pain. Also they
are most useful when the board won't POST, which usually means they are
good only for confirming a suspicion, because if board components are
failing you're better off, financially anyway, buying a new board.

>> I'm talking about a machine that has had swaps to power supply, disk,
>> ram, ram slots, is clean, reinsertion of cards etc, yet nothing
>> definitive is showing up.
>
> Tends to point the finger at motherboard or CPU...
>
> I hope you always took anti-static precautions during handling. But
> it's just as easy for it have been zapped before you ever touched it,
> and it took "this long" for the problem to appear. Static damage is
> like that - mostly not instant, just weakening something for a later
> catastrophe.

Excellent point that many do not realize. There is a mistaken notion
that if it works then everything is tip top.

All comments, criticisms, questions, pointing out of incorrect info
welcome and appreciated.


Max


David Timms 05-31-2008 04:43 AM

testing hardware - use what software ?
 
max bianco wrote:
> Yes, people do not realize how hard it is to pinpoint a hardware
> problem. Many are under the impression there is some magic involved
> and results should be instant and/or provide instant "Star Trek" style
>
> ...
>
> Excellent point that many do not realize. There is a mistaken notion
> that if it works then everything is tip top.
>
> All comments, criticisms, questions, pointing out of incorrect info
> welcome and appreciated.

Thanks for the tips, Max.

DaveT.


Robin Laing 06-02-2008 07:17 PM

testing hardware - use what software ?
 
max bianco wrote:
> On Fri, May 23, 2008 at 11:47 AM, Tim <ignored_mailbox@yahoo.com.au> wrote:
>> On Sat, 2008-05-24 at 00:08 +1000, David Timms wrote:
>>> Yes, and shoot it with the heat gun and so on. But is there some
>>> software designed to do stress testing ?
>>
>> I've often asked something similar from PC shops, as their testing
>> seemed to comprise of just seeing if it'll boot and stay running for
>> half an hour...
>
> Yes, people do not realize how hard it is to pinpoint a hardware
> problem. Many are under the impression there is some magic involved
> and results should be instant and/or provide instant "Star Trek" style
> solutions. We are not quite there yet , especially as far down the
> totem pole as your average pc repair shop. I try to use the computer
> as much as possible but time is money and you can easily run up a bill
> that exceeds the cost of a cheap machine quite quickly. However if you
> feel you have a genuine hardware problem then I would do the
> following. The order will vary depending on where you think, based on
> your observations, the problem lies.

SNIP.

> All comments, criticisms, questions, pointing out of incorrect info
> welcome and appreciated.
>
> Max


Very good points.

Back in the 386/486 days, I had an ISA board that would run a bunch of
hardware and software tests to check the hardware. Not perfect but sure
helped.


A good digital voltmeter is useful for measuring the voltage rails. A
power supply that is close to being out of limits could drift enough to
cause the computer to freeze at strange times. The BIOS voltage readings
are not always that accurate.
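
As a sketch of what "out of limits" means for meter readings, the Python
below flags any positive rail that is more than 5% away from nominal. The
5% figure is the tolerance commonly quoted for the +3.3V, +5V and +12V
ATX rails; it is an assumption here, so check the spec for your supply.
The readings in the example are made up.

```python
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}


def out_of_tolerance(readings, tolerance=0.05):
    """Return {rail: measured} for readings further than `tolerance`
    (as a fraction of nominal) from the nominal rail voltage."""
    bad = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS[rail]
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad


if __name__ == "__main__":
    # Hypothetical voltmeter readings taken while the machine is under load.
    measured = {"+3.3V": 3.3, "+5V": 5.1, "+12V": 11.3}
    print("out of tolerance:", out_of_tolerance(measured))
```

Readings taken at idle can look fine while the same rail sags under load,
so it is worth measuring while a stress test is running.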


Also, when cleaning out the dust, make sure that you know where all the
jumper settings are on the motherboard. It cost me many hours when one of
the jumper shorting connectors came off on my computer.


Also confirm that the latest BIOS is installed, even on new
motherboards. This fixed a freezing issue on a new computer for me.
It worked okay with 4 GB of RAM but not 8 GB, even though memtest ran
fine.


--
Robin Laing


Roger Heflin 06-03-2008 07:41 PM

testing hardware - use what software ?
 
Robin Laing wrote:

> Very good points.
>
> Back in the 386/486 days, I had an ISA board that would run a bunch of
> hardware and software tests to check the hardware. Not perfect but sure
> helped.
>
> A good digital volt meter to measure the voltage rails. A power supply
> that is close to being out of limits could drift enough to cause the
> computer to freeze at strange times. The BIOS voltage readings are not
> always that accurate.
>
> Also, when cleaning out the dust. Make sure that you know where all the
> jumper settings are on the motherboard. Cost me many hours when one of
> the jumper shorting connectors came off on my computer.
>
> Also confirm that the latest BIOS is installed. Even on new
> motherboards. This fixed a freezing issue on a new computer for me.
> Worked okay with 4Gig of ram but not 8 gig. Memtest worked great.




Compiling and running HPL (the High-Performance Linpack benchmark, built
against an MPI library) does nicely at finding that you have a
memory/overheat/internal CPU issue. If the results come out corrupt or
the machine crashes, something is really wrong. Typically it won't tell
you what is wrong, but if it runs successfully for a long time then you
can expect most things to be correct. Generally it will at least crash
the machine several times faster than most other applications.


It won't find IO/PCI/Video issues unless they are really severe, though
generally most of the issues fall into what it does test.
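
The underlying idea, load the CPU and memory with work whose correct
answer is known and treat any wrong answer as a hardware symptom, can be
sketched without HPL. The Python below is not HPL and is far gentler on
the machine; it runs the same deterministic hashing job in several
processes and checks that every copy produces the same digest. The
function names, worker count and round count are arbitrary choices for
the example.

```python
import hashlib
from multiprocessing import Pool


def burn(seed, rounds=20000):
    """Hash a buffer repeatedly; the result depends only on `seed` and `rounds`."""
    data = str(seed).encode() * 1024
    for _ in range(rounds):
        data = hashlib.sha256(data).digest() * 32
    return hashlib.sha256(data).hexdigest()


def consistency_pass(workers=4, seed=1, rounds=20000):
    """Run the same job in several processes; return True if all results agree."""
    with Pool(workers) as pool:
        results = pool.starmap(burn, [(seed, rounds)] * workers)
    return len(set(results)) == 1


if __name__ == "__main__":
    for i in range(10):
        print("pass", i, "ok" if consistency_pass() else "MISMATCH")
```

On healthy hardware every pass agrees. A mismatch or a crash points at
CPU, memory or cooling, but as with HPL it will not say which.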


Roger


"max bianco" 06-03-2008 08:52 PM

testing hardware - use what software ?
 
On Tue, Jun 3, 2008 at 3:41 PM, Roger Heflin <rogerheflin@gmail.com> wrote:
> Compiling up something called HPL (with something called MPI) at least does
> nicely at finding that you have a memory/overheat/internal CPU issue. If
> the results corrupt or the machine crashes something is really wrong,
> typically it won't tell you what is wrong, but if it successfully runs for a
> long time then you can expect most things to be correct. Generally it
> will at least crash the machine several times faster than most other
> applications.
>
> It won't find IO/PCI/Video issues unless they are really severe, though
> generally most of the issues fall into what it does test.
>

Do you happen to know what the latest version is? I have turned up a
version 1.0a dated Jan 20, 2004. Is that the latest version available?

Max


