Jan 23, 2008, 01:43 PM
|
#1
|
|
DriverHeaven Junior Member
Join Date: Dec 2004
Location: Canadian West Coast
Posts: 76
Rep Power: 0
|
Need help diagnosing a hardware/random reboot problem (Long read)
Hi everyone,
First of all, I’d just like to express my appreciation for the helpful atmosphere of this community. Members are very supportive of newbish questions and don’t have any elitist attitudes that one might expect from a community with such a high level of technical expertise. I’ve learned a lot simply from lurking around here reading other users’ comments and links to articles. Kudos to all of you!
Now, as the subject of my post indicates, I’ve been trying to discover the cause of a system stability problem that I now highly suspect to be a hardware problem. My system specs are in my profile to the left. I purchased most of these in June 2003, so it is a fairly old system. I originally had a Radeon 9600pro and paid a pretty penny to upgrade to x800pro as soon as it came out.
My system also includes:
-a SoundBlaster Audigy 2 card
-2 P-ATA hard-drives
-1 floppy drive
-1 CD/DVD RW IDE drive
-1 external USB hard-drive
-1 USB game pad
-2 case fans
Nothing is overclocked.
As for the PSU, I originally had a Landmark 350W PSU, but it died so quickly I just rushed out to get a new one. I’m not quite sure of the PSU specs, as I can’t find the receipt, and I need to yank it out of my system to see the brand and wattage. I believe that it is an Enermax 350W, but I will need to pull it out tonight to be certain… as I now believe it is the root of my problem and needs to be replaced. The explanation of why will follow below.
I consider myself to be slightly more computer literate than the average Joe, and have done what I consider to be due diligence in eliminating possible causes of my problem. I’d like to present what I’ve done for critique and suggestions in case there are other things that my limited knowledge was not able to deduce. Now, it is a very long read (even up to this point), and I must apologize for the verbosity of this post. However, I feel it is necessary to mention all the steps I’ve gone through so that I can get confirmation that my rationales for the elimination of each factor are sound.
1. Background:
I’ve had ongoing system stability problems going back several months (at least back to October-November) whereby my system will spontaneously reset (no BSOD or anything). At first, this only occurred when my computer was doing something intensive (normally gaming, which is what I use my rig for, and sometimes when I’m watching movies). I find that a lot of times (but not all), this spontaneous reset coincides with disk access when I’m saving the progress of a game. This often (again, not always) is accompanied by a distortion or repeat of the last audio that played through the speakers.
2. Formatting and fresh installation of WinXP:
At first it was a minor annoyance that I lived with and attributed to some driver error (I had some problems with driver installation during my last fresh install of WinXP, where the drivers included with the hardware CD conflicted with the ones from the manufacturer website, for both my SoundBlaster and Radeon cards). It wasn’t occurring often enough for me to invest time into fixing it. In December, however, the resets got so bad that I started to suspect my computer had some sort of malware on it, even though I have a lot of security precautions (in my limited knowledge, anyway) in my setup: disabled potentially insecure and unnecessary Windows services, DLink router NAT firewall (fully stealthed, changed default admin password), latest version of ZoneAlarm, AVG anti-virus, SpywareBlaster (actively blocks installation of browser-based malware), and Spybot - Search & Destroy’s “TeaTimer” (actively kills all known bad processes).
I proceeded with a full system scan with both AVG and Spybot to see if there were any security breaches, but the spontaneous reset would always happen partway through the scan. I hastily concluded that this now might be some sort of malware defensive mechanism. I decided then, since I have no way of diagnosing whether this is a software problem due to the resets, that I would do a fresh system install.
3. Elimination of driver/software issues:
Unfortunately, the problem did not end with a fresh installation. I was now able to do a full system scan with AVG and Spybot without a problem, but the system resets were still occurring. I started then to investigate the possibility that these may be driver issues. I had the same resets whether with the default drivers from the CDs that came with my SoundBlaster and Radeon cards, or updated drivers from the manufacturer’s websites.
I also had a problem with recognition of my Radeon x800pro AGP upon installation of the newest drivers, and found out here on DH.net that there were problems with the latest drivers. I tried the Omegas, the pre-v7.7Catalysts, the v6.x versions, and now I’m running the v8.1hotfix – nothing prevented the spontaneous resets.
I also saw a thread here on DH.net about a squeal of death caused by ATI driver conflicts with sound drivers, and tried that hotfix, which again did not resolve the problem.
I have ASUS probe monitor my temperature and fans, and ATI tool to monitor my GPU temperature. None have ever set off alarms in these last couple of months. I even did an artifact scan in ATI Tool and it showed that my GPU was stable.
4. Minidump analysis:
I was now getting desperate and looked to analyzing my WinXP crash memory minidumps to determine if any single driver or program is causing this error. I downloaded Microsoft’s Windows debugging system and set it up to download proper symbols for analyzing memory dumps.
Unfortunately, the spontaneous reboots often do not give the system enough time to even write a minidump to the hard-drive. The reboots are actually so bad now that even doing something as mundane as starting Firefox might cause one.
Of the multitude of random reboots I’ve had since Dec 22, I only have 9 minidumps. I analyzed them using the debugging tools, and no one singular program or driver was cited as the cause. I’ve had 2 windows system DLLs come up once each, and drivers for AVG, ATI, and ZoneAlarm come up once each. TeaTimer was cited twice. 2 of the dump analyses indicated that the offending driver or process could not be read, and cited that this was likely due to a memory corruption.
Since no singular pattern of processes or drivers was implicated by the dump analyses as the offending factor of the crashes, I decided to investigate the possibility that my memory modules were indeed failing.
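For anyone who wants to see that tally laid out, here is a rough sketch in Python. The counts are the ones listed above; the label strings (including the two Windows DLL labels) are placeholders made up for illustration, since the actual file names are not given in this post.

```python
from collections import Counter

# One entry per minidump (9 total), labelled by what the debugger blamed.
# Label names are hypothetical placeholders; only the counts come from the post.
blamed = [
    "windows_dll_a",      # Windows system DLL, cited once
    "windows_dll_b",      # a second Windows system DLL, cited once
    "avg_driver",         # AVG driver, cited once
    "ati_driver",         # ATI driver, cited once
    "zonealarm_driver",   # ZoneAlarm driver, cited once
    "teatimer",           # TeaTimer, cited twice
    "teatimer",
    "unreadable",         # module could not be read (possible memory corruption)
    "unreadable",
]

tally = Counter(blamed)
total = sum(tally.values())
top_name, top_count = tally.most_common(1)[0]

for name, count in tally.most_common():
    print(f"{name:18s} {count}  ({count / total:.0%} of dumps)")

print(f"Most-cited entry accounts for {top_count} of {total} dumps")
```

With no entry cited more than twice out of nine, nothing stands out as a single software culprit, which is why memory looked like the next thing to check.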
Memory testing:
At first, I tried to use Memtest86+ (www.memtest.org). Unfortunately, all the floppies I had were dead, I didn’t have a USB key, and I didn’t want to waste a DVD (I had no writable CDs left) as a boot disk for such a small program.
I looked into a free Windows-based memory testing program, and found another program called Memtest (http://hcidesign.com/memtest/) which could run in WinXP, but obviously could not scan the memory occupied by the OS. I scanned it for malware and was reasonably sure it was clean (correct me if I’m wrong…). As soon as I ran it, it immediately reported that I have memory corruption errors.
The next day, I borrowed a floppy from an IT tech at work, and used it to boot up with Memtest86+. Astonishingly, 2 hours of scanning did not find any errors at all. However, when I booted into XP and used the windows-based Memtest, again I got errors. In fact, after running this memtest for more than 20 minutes or so, the system would spontaneously reboot…
I’ve read on the Memtest86+ FAQ that memory failures might only occur when your machine is under load. This leads me to think that I can only reproduce the errors when I am running XP.
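To illustrate the general idea behind a memory tester that runs inside the OS, here is a minimal Python sketch: fill a buffer with a known pattern, read it back, and report any byte that changed. This is only an assumption-laden illustration of the write/read-back technique; it is not how Memtest86+ or the HCI Memtest program is actually implemented, the function name and pattern list are made up, and like any program running under the OS it can only touch memory the OS hands to it.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each pattern across a buffer, read it back, and collect mismatches.

    Returns a list of (offset, expected_byte, actual_byte) tuples.
    An empty list means no mismatch was seen in this buffer.
    """
    buf = bytearray(size_bytes)          # memory allocated by the OS for this process
    errors = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected                # write pass
        if buf != expected:              # read-back pass (fast whole-buffer compare)
            # Slow path: locate the individual bytes that differ
            for offset, actual in enumerate(buf):
                if actual != pattern:
                    errors.append((offset, pattern, actual))
    return errors


if __name__ == "__main__":
    found = pattern_test(16 * 1024 * 1024)   # test a 16 MB buffer
    print(f"{len(found)} mismatched bytes found")
```

A real tester covers far more memory, uses many more patterns, and runs for hours, so a clean result from a sketch like this proves very little.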
Singling out the PSU:
I decided to take out two of my four 512mb modules and then boot up in XP to test them, and try to eliminate the ones that are failing. Unfortunately, after taking out the modules, my system would no longer boot up (blank screen on monitor). I could only boot up again once I removed power from a lot of components like a hard-disk, the CD/DVD drive and the floppy drive. I’ve had this problem before, and this alone had led me to believe in the past that my PSU was on its last legs. Usually, once I got it to boot up again, I could plug my other components back in and it would run smoothly until the next time I pulled the plug on the entire system (as opposed to the soft off that I usually leave it in, with power still going to the mobo).
What was odd is that every time the system recovered from this boot error, the BIOS would indicate that overclocking had failed and that it had to reboot right away, even though I don’t have anything overclocked (my mobo BIOS has ASUS’s AI overclocking utility, which I never use).
I then decided to boot up in XP with only the essential components connected (graphics card, hard-drive, only 1 case fan) and ran the memory test in XP. This time, there were no errors found after 3 hours of testing, and there was no spontaneous reboot. I concluded that the memory corruption was due to the PSU failing to supply adequate power to the modules to maintain data integrity. If you have read up to this point and would conclude the same, please let me know!
Future considerations:
Now, after all that testing (whew), I point my finger squarely at the PSU for my troubles (J’accuse!). Please outline your reasons if you disagree. I want to be absolutely sure before I shell out my cash for a new PSU. I suppose the only thing I haven’t tried that I can think of is to test my CPU with Prime95…
I am seriously looking to replace this system, but I want to be able to leave it to my sister since it is vastly superior to hers. I want to make sure it is stable beforehand as there is no way in hell she can debug a system to save her life.
There aren’t any games on the market in the foreseeable future that I really see myself wanting to purchase until Starcraft 2 is out, so I was hoping this system would last me until then, when I’d shell out probably 2000 bucks for a brand spanking new system.
If you do agree with my analysis above that the PSU is the culprit, well then, I need help figuring out what to replace this PSU with. I want to err on the side of caution, so I was thinking of either a 450W or 500W. I have absolutely no clue as to what to look for in a PSU, which is why I ended up with this crappy one in the first place.
I don’t really want to spend more than $60 to fix this aging system, but I don’t think that is possible. I don’t really want to buy anything off the web from somewhere like newegg either, and would much rather shop at http://www.anitec.ca/subcategory/78/power_supplies/ since they are only about a 10 minute drive away. If you have time to take a look and recommend one, please do.
And here’s where I thank anyone who has spent the time to read my post. If you’ve read up to here, you already have my gratitude. Thanks!
|
|
|
Jan 23, 2008, 03:02 PM
|
#2
|
|
Driverheaven's Freerunner
Join Date: Jan 2007
Location: United Kingdom
Posts: 3,703
|
This does sound like a bad memory module if you ask me.
Sorry if i didn't read the whole post but it was starting to give me a headache lol.
A while back a memory module went bad in my system and i got all kinds of problems from general instability, BSODs, corrupt files, all sorts.
Sometimes the system would randomly restart without a BSOD, Sometimes it would show a BSOD.
Try running the memory modules 1 at a time if you haven't already (and if you already have then ignore me, but i was seriously getting a headache from all the reading lol), this should help you diagnose any bad modules.
Edit - Read more of the post, if the PSU is going bad and not supplying enough power then i'd relatively safely conclude that it's dying out, get it RMA'd and possibly go for a more powerful one?
Unfortunately dude, that's the way it goes with computer hardware, you get what you pay for. So aim high and don't go budget like i did.
|
|
|
Jan 23, 2008, 03:07 PM
|
#3
|
|
DriverHeaven Granddaddy
Join Date: May 2002
Location: Georgia, USA
Posts: 12,336
|
A few things to consider:
1. Pulling the plug entirely from the system and then getting some BIOS warning may indicate that the CMOS battery on the motherboard is dead or weak. You may want to get a new one. Usually only a $2-3 item.
2. PSU 'may' be the culprit if you're getting reboots without any BSODs. That usually means the power has cut out and caused the system to shutdown. So, it sounds like you're on the right track there. But, if you replace the PSU, go for at least a 450W High Quality PSU. The el cheapos just don't cut it.
3. Have you tried booting with only one RAM module at a time? and then testing the RAM?
Good luck!
|
|
|
Jan 23, 2008, 03:25 PM
|
#4
|
|
DH's oldest Geek?
Join Date: May 2003
Location: Cincinnati, OH
Posts: 1,538
|
The closest thing to what you are looking for would be the Corsair 450VX from Newegg. Newegg.com - CORSAIR CMPSU-450VX ATX12V V2.2 450W Power Supply 90 - 264 V UL, CUL, CE, CB, FCC Class B, TUV, CCC, C-tick - Retail
However, that ignores your wish to get it locally. So, looking at what they have at the link you supplied, I'd take a good look at this one:
Anitec.ca - Antec EarthWatts 500W High Efficiency 80 Plus Power Supply
Like the Corsair, it's also made by Seasonic, and jonnyGURU rates it quite highly on his site.
You are almost definitely going to have to spend more than 60.00 for a good quality PSU, but if your sister is going to be using the system for quite some time in the future it will be a good investment.
|
|
|
Jan 25, 2008, 04:43 PM
|
#5
|
|
DriverHeaven Junior Member
Join Date: Dec 2004
Location: Canadian West Coast
Posts: 76
Rep Power: 0
|
Thanks for the responses.
Ok. I ran some more memory tests and now was sure that my PSU was the problem. I pulled out my PSU and it turned out to be a 460W - no brand could be found. I remember buying it quite hastily as it was the cheapest I could find at $50. I've now installed a new OCZ 500w Stealth Xstream, and ran some games for a while. The system seems to be stable now.
However, my asus probe hardware monitor shows that the vcore drops below 95% of the 1.4v requirement on load. My old PSU never had such a low vcore. Is this a big issue? At what voltage will my CPU become unstable? is within 10% acceptable?
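To put numbers on those percentages, here is a small Python sketch of the arithmetic. The 1.4 V nominal figure is the one quoted above; the 1.32 V sample reading is purely hypothetical, and the sketch does not say which tolerance is actually safe, since that depends on the CPU manufacturer's specification.

```python
NOMINAL_VCORE = 1.40  # volts, the requirement quoted in the post


def vcore_floor(nominal, tolerance_pct):
    """Lowest voltage that is still within tolerance_pct below nominal."""
    return nominal * (1 - tolerance_pct / 100)


def within_tolerance(measured, nominal, tolerance_pct):
    """True if the measured voltage has not dropped below the tolerance floor."""
    return measured >= vcore_floor(nominal, tolerance_pct)


for pct in (5, 10):
    floor = vcore_floor(NOMINAL_VCORE, pct)
    print(f"{pct}% below {NOMINAL_VCORE:.2f} V is {floor:.3f} V")

sample = 1.32  # hypothetical reading under load, not a value from the post
print("within 5%: ", within_tolerance(sample, NOMINAL_VCORE, 5))
print("within 10%:", within_tolerance(sample, NOMINAL_VCORE, 10))
```

So a reading below 95% means anything under 1.33 V, and the 10% line sits at 1.26 V. Software monitors like the probe utility can also misreport by a few hundredths of a volt, so treat the readings as approximate.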
|
|
|
Jun 16, 2008, 11:02 AM
|
#6
|
|
DriverHeaven Newbie
Join Date: Jun 2008
Posts: 1
Rep Power: 0
|
I've got an ASUS A8R-MVP mainboard and have also seen similar random reset issues. Very annoying for me as I'm a programmer and it would happen when doing a full rebuild of our software.
If you google for 'ASUS random crash' there are a lot of people seeing similar issues so I suspect it might be a common problem with these motherboards.
I have a 550W PSU and a meter which shows I'm only pulling about 250W so pretty sure it's nothing to do with PSU.
I *think* I've fixed my problems by doing the following:
1. tighten up heatsink - got rid of annoying 'cpu fan error' and kept CPU about 5'C cooler.
2. increase memory voltage to 3v in the bios - seems to have stopped the sudden cut off.
3. removed 2 gig of ram - actually before the voltage change and this seemed to help a lot but I'm going to experiment with putting this back in now the voltage is boosted. I thought it was faulty but now i'm convinced it was the mainboard not giving it enough power.
Based on this experience with ASUS I will be avoiding them like the plague in future.
|
|
|
Jun 16, 2008, 01:51 PM
|
#7
|
|
Driverheaven's Freerunner
Join Date: Jan 2007
Location: United Kingdom
Posts: 3,703
|
Quote:
Originally Posted by Xajin
I've got an ASUS A8R-MVP mainboard and have also seen similar random reset issues. Very annoying for me as I'm a programmer and it would happen when doing a full rebuild of our software.
If you google for 'ASUS random crash' there are a lot of people seeing similar issues so I suspect it might be a common problem with these motherboards.
I have a 550W PSU and a meter which shows I'm only pulling about 250W so pretty sure it's nothing to do with PSU.
I *think* I've fixed my problems by doing the following:
1. tighten up heatsink - got rid of annoying 'cpu fan error' and kept CPU about 5'C cooler.
2. increase memory voltage to 3v in the bios - seems to have stopped the sudden cut off.
3. removed 2 gig of ram - actually before the voltage change and this seemed to help a lot but I'm going to experiment with putting this back in now the voltage is boosted. I thought it was faulty but now i'm convinced it was the mainboard not giving it enough power.
Based on this experience with ASUS I will be avoiding them like the plague in future.
|
The ASUS boards now are really good, i've got a P5W DH Deluxe and it's been 100% stable as far as i can see. The only issue i had was when i tried booting with the FSB set at whatever to try pushing my E4500 to 3GHz and it kinda crapped out on me and went "Aww hell naw"
|
|
|
Jun 16, 2008, 04:08 PM
|
#8
|
|
ZooooM!
Join Date: Apr 2007
Location: USA, Missouri
Posts: 581
Rep Power: 12
|
Yeah, some of the older ASUS boards have voltage issues where the voltages need to be set manually. That's what i had to do with an old socket A setup i had previously. I had some crashing issues which i traced back to the voltages: when set on AUTO, the board would randomly undervolt some components, causing a crash. Increasing the voltage to the high side of acceptable for the components fixed the varying voltage.
|
|
|
Jun 17, 2008, 12:40 PM
|
#9
|
|
4870X2 Anyone??
Join Date: Nov 2006
Location: New York
Posts: 2,111
|
After reading, it really does sound like a PSU problem...I helped a friend of mine a couple months back with a similar problem and it was definitely a PSU problem.
I would suggest replacing it, and seeing if your problem persists.
Last edited by ChaosMinionX; Jun 21, 2008 at 12:38 PM.
|
|
|