Saturday, July 18, 2009

Still... Never Fold right after you finish playing L4D; always reboot.

Here's my latest log from Folding@Home:

[10:37:15] + Processing work unit
[10:37:15] Core required: FahCore_11.exe
[10:37:15] Core found.
[10:37:15] Working on queue slot 03 [July 17 10:37:15 UTC]
[10:37:15] + Working ...
[10:37:15]
[10:37:15] *------------------------------*
[10:37:15] Folding@Home GPU Core - Beta
[10:37:15] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:37:15]
[10:37:15] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:37:15] Build host: amoeba
[10:37:15] Board Type: Nvidia
[10:37:15] Core :
[10:37:15] Preparing to commence simulation
[10:37:15] - Looking at optimizations...
[10:37:15] - Files status OK
[10:37:15] - Expanded 98690 -> 492276 (decompressed 498.8 percent)
[10:37:15] Called DecompressByteArray: compressed_data_size=98690 data_size=492276, decompressed_data_size=492276 diff=0
[10:37:15] - Digital signature verified
[10:37:15]
[10:37:15] Project: 5750 (Run 4, Clone 97, Gen 338)
[10:37:15]
[10:37:15] Assembly optimizations on if available.
[10:37:15] Entering M.D.
[10:37:22] Will resume from checkpoint file
[10:37:22] Working on Protein
[10:37:27] Client config found, loading data.
[10:37:27] Starting GUI Server
[10:37:28] Resuming from checkpoint
[10:37:28] Verified work/wudata_03.log
[10:37:28] Verified work/wudata_03.edr
[10:37:28] Verified work/wudata_03.xtc
[10:37:28] Completed 84%
[10:38:42] Completed 85%
[10:39:55] Completed 86%
[10:41:09] Completed 87%
[10:42:23] Completed 88%
[10:43:37] Completed 89%
[10:44:51] Completed 90%
[10:46:04] Completed 91%
[10:47:18] Completed 92%
[10:48:32] Completed 93%
[10:49:46] Completed 94%
[10:51:00] Completed 95%
[10:52:14] Completed 96%
[10:53:27] Completed 97%
[10:54:41] Completed 98%
[10:55:55] Completed 99%
[10:57:09] Completed 100%
[10:57:10] Successful run
[10:57:10] DynamicWrapper: Finished Work Unit: sleep=10000
[10:57:20] Reserved 111976 bytes for xtc file; Cosm status=0
[10:57:20] Allocated 111976 bytes for xtc file
[10:57:20] - Reading up to 111976 from "work/wudata_03.xtc": Read 111976
[10:57:20] Read 111976 bytes from xtc file; available packet space=786318488
[10:57:20] xtc file hash check passed.
[10:57:20] Reserved 33528 33528 786318488 bytes for arc file= Cosm status=0
[10:57:20] Allocated 33528 bytes for arc file
[10:57:20] - Reading up to 33528 from "work/wudata_03.trr": Read 33528
[10:57:20] Read 33528 bytes from arc file; available packet space=786284960
[10:57:20] trr file hash check passed.
[10:57:20] Allocated 560 bytes for edr file
[10:57:20] Read bedfile
[10:57:20] edr file hash check passed.
[10:57:20] Allocated 24899 bytes for logfile
[10:57:20] Read logfile
[10:57:20] GuardedRun: success in DynamicWrapper
[10:57:20] GuardedRun: done
[10:57:20] Run: GuardedRun completed.
[10:57:25] - Writing 171475 bytes of core data to disk...
[10:57:25] Done: 170963 -> 153395 (compressed to 89.7 percent)
[10:57:25] ... Done.
[10:57:25] - Shutting down core
[10:57:25]
[10:57:25] Folding@home Core Shutdown: FINISHED_UNIT
[10:57:33] CoreStatus = 64 (100)
[10:57:33] Sending work to server
[10:57:33] Project: 5750 (Run 4, Clone 97, Gen 338)


[10:57:33] + Attempting to send results [July 17 10:57:33 UTC]
[10:57:35] + Results successfully sent
[10:57:35] Thank you for your contribution to Folding@Home.
[10:57:35] + Number of Units Completed: 1145

[10:57:39] - Preparing to get new work unit...
[10:57:39] + Attempting to get work packet
[10:57:39] - Connecting to assignment server
[10:57:40] - Successful: assigned to (171.67.108.11).
[10:57:40] + News From Folding@Home: Welcome to Folding@Home
[10:57:40] Loaded queue successfully.
[10:57:40] + Closed connections
[10:57:40]
[10:57:40] + Processing work unit
[10:57:40] Core required: FahCore_11.exe
[10:57:40] Core found.
[10:57:40] Working on queue slot 04 [July 17 10:57:40 UTC]
[10:57:40] + Working ...
[10:57:40]
[10:57:40] *------------------------------*
[10:57:40] Folding@Home GPU Core - Beta
[10:57:40] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:57:40]
[10:57:40] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:57:40] Build host: amoeba
[10:57:40] Board Type: Nvidia
[10:57:40] Core :
[10:57:40] Preparing to commence simulation
[10:57:40] - Looking at optimizations...
[10:57:40] - Created dyn
[10:57:40] - Files status OK
[10:57:40] - Expanded 45430 -> 251112 (decompressed 552.7 percent)
[10:57:40] Called DecompressByteArray: compressed_data_size=45430 data_size=251112, decompressed_data_size=251112 diff=0
[10:57:40] - Digital signature verified
[10:57:40]
[10:57:40] Project: 5771 (Run 4, Clone 227, Gen 226)
[10:57:40]
[10:57:40] Assembly optimizations on if available.
[10:57:40] Entering M.D.
[10:57:47] Working on Protein
[10:57:49] Client config found, loading data.
[10:57:49] mdrun_gpu returned
[10:57:49] NANs detected on GPU
[10:57:49]
[10:57:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:58:01] CoreStatus = 7A (122)
[10:58:01] Sending work to server
[10:58:01] Project: 5771 (Run 4, Clone 227, Gen 226)
[10:58:01] - Error: Could not get length of results file work/wuresults_04.dat
[10:58:01] - Error: Could not read unit 04 file. Removing from queue.
[10:58:01] - Preparing to get new work unit...
[10:58:01] + Attempting to get work packet
[10:58:01] - Connecting to assignment server
[10:58:01] - Successful: assigned to (171.67.108.11).
[10:58:01] + News From Folding@Home: Welcome to Folding@Home
[10:58:01] Loaded queue successfully.
[10:58:01] + Closed connections
[10:58:06]
[10:58:06] + Processing work unit
[10:58:06] Core required: FahCore_11.exe
[10:58:06] Core found.
[10:58:06] Working on queue slot 05 [July 17 10:58:06 UTC]
[10:58:06] + Working ...
[10:58:06]
[10:58:06] *------------------------------*
[10:58:06] Folding@Home GPU Core - Beta
[10:58:06] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:58:06]
[10:58:06] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:58:06] Build host: amoeba
[10:58:06] Board Type: Nvidia
[10:58:06] Core :
[10:58:06] Preparing to commence simulation
[10:58:06] - Looking at optimizations...
[10:58:06] - Created dyn
[10:58:06] - Files status OK
[10:58:06] - Expanded 45427 -> 251112 (decompressed 552.7 percent)
[10:58:06] Called DecompressByteArray: compressed_data_size=45427 data_size=251112, decompressed_data_size=251112 diff=0
[10:58:06] - Digital signature verified
[10:58:06]
[10:58:06] Project: 5771 (Run 6, Clone 56, Gen 864)
[10:58:06]
[10:58:06] Assembly optimizations on if available.
[10:58:06] Entering M.D.
[10:58:13] Working on Protein
[10:58:14] Client config found, loading data.
[10:58:14] Starting GUI Server
[10:58:15] mdrun_gpu returned
[10:58:15] NANs detected on GPU
[10:58:15]
[10:58:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:58:27] CoreStatus = 7A (122)
[10:58:27] Sending work to server
[10:58:27] Project: 5771 (Run 6, Clone 56, Gen 864)
[10:58:27] - Error: Could not get length of results file work/wuresults_05.dat
[10:58:27] - Error: Could not read unit 05 file. Removing from queue.
[10:58:27] - Preparing to get new work unit...
[10:58:27] + Attempting to get work packet
[10:58:27] - Connecting to assignment server
[10:58:27] - Successful: assigned to (171.67.108.11).
[10:58:27] + News From Folding@Home: Welcome to Folding@Home
[10:58:27] Loaded queue successfully.
[10:58:27] + Closed connections
[10:58:32]
[10:58:32] + Processing work unit
[10:58:32] Core required: FahCore_11.exe
[10:58:32] Core found.
[10:58:32] Working on queue slot 06 [July 17 10:58:32 UTC]
[10:58:32] + Working ...
[10:58:33]
[10:58:33] *------------------------------*
[10:58:33] Folding@Home GPU Core - Beta
[10:58:33] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:58:33]
[10:58:33] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:58:33] Build host: amoeba
[10:58:33] Board Type: Nvidia
[10:58:33] Core :
[10:58:33] Preparing to commence simulation
[10:58:33] - Looking at optimizations...
[10:58:33] - Created dyn
[10:58:33] - Files status OK
[10:58:33] - Expanded 45468 -> 251112 (decompressed 552.2 percent)
[10:58:33] Called DecompressByteArray: compressed_data_size=45468 data_size=251112, decompressed_data_size=251112 diff=0
[10:58:33] - Digital signature verified
[10:58:33]
[10:58:33] Project: 5771 (Run 5, Clone 56, Gen 828)
[10:58:33]
[10:58:33] Assembly optimizations on if available.
[10:58:33] Entering M.D.
[10:58:39] Working on Protein
[10:58:41] Client config found, loading data.
[10:58:41] Starting GUI Server
[10:58:41] mdrun_gpu returned
[10:58:41] NANs detected on GPU
[10:58:41]
[10:58:41] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:58:47] CoreStatus = 7A (122)
[10:58:47] Sending work to server
[10:58:47] Project: 5771 (Run 5, Clone 56, Gen 828)
[10:58:47] - Error: Could not get length of results file work/wuresults_06.dat
[10:58:47] - Error: Could not read unit 06 file. Removing from queue.
[10:58:47] - Preparing to get new work unit...
[10:58:47] + Attempting to get work packet
[10:58:47] - Connecting to assignment server
[10:58:47] - Successful: assigned to (171.67.108.11).
[10:58:47] + News From Folding@Home: Welcome to Folding@Home
[10:58:47] Loaded queue successfully.
[10:58:48] + Closed connections
[10:58:53]
[10:58:53] + Processing work unit
[10:58:53] Core required: FahCore_11.exe
[10:58:53] Core found.
[10:58:53] Working on queue slot 07 [July 17 10:58:53 UTC]
[10:58:53] + Working ...
[10:58:53]
[10:58:53] *------------------------------*
[10:58:53] Folding@Home GPU Core - Beta
[10:58:53] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:58:53]
[10:58:53] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:58:53] Build host: amoeba
[10:58:53] Board Type: Nvidia
[10:58:53] Core :
[10:58:53] Preparing to commence simulation
[10:58:53] - Looking at optimizations...
[10:58:53] - Created dyn
[10:58:53] - Files status OK
[10:58:53] - Expanded 45426 -> 251112 (decompressed 552.7 percent)
[10:58:53] Called DecompressByteArray: compressed_data_size=45426 data_size=251112, decompressed_data_size=251112 diff=0
[10:58:53] - Digital signature verified
[10:58:53]
[10:58:53] Project: 5770 (Run 9, Clone 32, Gen 713)
[10:58:53]
[10:58:53] Assembly optimizations on if available.
[10:58:53] Entering M.D.
[10:59:00] Working on Protein
[10:59:01] Client config found, loading data.
[10:59:01] Starting GUI Server
[10:59:01] mdrun_gpu returned
[10:59:01] NANs detected on GPU
[10:59:01]
[10:59:01] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:59:13] CoreStatus = 7A (122)
[10:59:13] Sending work to server
[10:59:13] Project: 5770 (Run 9, Clone 32, Gen 713)
[10:59:13] - Error: Could not get length of results file work/wuresults_07.dat
[10:59:13] - Error: Could not read unit 07 file. Removing from queue.
[10:59:13] - Preparing to get new work unit...
[10:59:13] + Attempting to get work packet
[10:59:13] - Connecting to assignment server
[10:59:13] - Successful: assigned to (171.67.108.11).
[10:59:13] + News From Folding@Home: Welcome to Folding@Home
[10:59:13] Loaded queue successfully.
[10:59:14] + Closed connections
[10:59:19]
[10:59:19] + Processing work unit
[10:59:19] Core required: FahCore_11.exe
[10:59:19] Core found.
[10:59:19] Working on queue slot 08 [July 17 10:59:19 UTC]
[10:59:19] + Working ...
[10:59:19]
[10:59:19] *------------------------------*
[10:59:19] Folding@Home GPU Core - Beta
[10:59:19] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[10:59:19]
[10:59:19] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[10:59:19] Build host: amoeba
[10:59:19] Board Type: Nvidia
[10:59:19] Core :
[10:59:19] Preparing to commence simulation
[10:59:19] - Looking at optimizations...
[10:59:19] - Created dyn
[10:59:19] - Files status OK
[10:59:19] - Expanded 46743 -> 252912 (decompressed 541.0 percent)
[10:59:19] Called DecompressByteArray: compressed_data_size=46743 data_size=252912, decompressed_data_size=252912 diff=0
[10:59:19] - Digital signature verified
[10:59:19]
[10:59:19] Project: 5767 (Run 9, Clone 131, Gen 552)
[10:59:19]
[10:59:19] Assembly optimizations on if available.
[10:59:19] Entering M.D.
[10:59:26] Working on Protein
[10:59:27] Client config found, loading data.
[10:59:27] mdrun_gpu returned
[10:59:27] NANs detected on GPU
[10:59:27]
[10:59:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:59:41] CoreStatus = 7A (122)
[10:59:41] Sending work to server
[10:59:41] Project: 5767 (Run 9, Clone 131, Gen 552)
[10:59:41] - Error: Could not get length of results file work/wuresults_08.dat
[10:59:41] - Error: Could not read unit 08 file. Removing from queue.
[10:59:41] EUE limit exceeded. Pausing 24 hours.
[16:37:14] + Working...


As you can see, playing L4D can still cause problems if you don't reboot the system afterward. ALWAYS reboot after playing L4D before launching anything that's sensitive to GPU and GPU memory stability. I've never gotten so many UNSTABLE_MACHINE shutdowns in a row. L4D must be the cause of this, since I'm folding fine now.
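
For anyone who wants to check their own log for the same pattern, here is a minimal Python sketch that counts the longest run of consecutive UNSTABLE_MACHINE shutdowns. It only assumes the "Folding@home Core Shutdown:" line format seen in the log above; the function names and the SAMPLE string (the six shutdown lines copied from that log) are just for illustration and are not part of the Folding@Home client.

```python
import re

# Matches lines such as "[10:57:49] Folding@home Core Shutdown: UNSTABLE_MACHINE".
# The pattern is an assumption based only on the log format shown in this post.
SHUTDOWN_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\s+Folding@home Core Shutdown:\s+(\w+)"
)


def shutdown_statuses(log_text):
    """Return a list of (timestamp, status) pairs, one per core shutdown line."""
    results = []
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line.strip())
        if match:
            results.append((match.group(1), match.group(2)))
    return results


def longest_streak(statuses, target="UNSTABLE_MACHINE"):
    """Return the length of the longest run of consecutive `target` shutdowns."""
    best = current = 0
    for _, status in statuses:
        current = current + 1 if status == target else 0
        best = max(best, current)
    return best


# The shutdown lines from the log above, in order.
SAMPLE = """\
[10:57:25] Folding@home Core Shutdown: FINISHED_UNIT
[10:57:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:58:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:58:41] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:59:01] Folding@home Core Shutdown: UNSTABLE_MACHINE
[10:59:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

if __name__ == "__main__":
    statuses = shutdown_statuses(SAMPLE)
    print(longest_streak(statuses))  # prints 5
```

To run it against a real log, read the file into a string and pass that to shutdown_statuses instead of SAMPLE. In the log above, the client paused for 24 hours after the fifth failure in a row.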

Also, I'd like to add that I ran 50 iterations of MemtestG80 before all of this (right when I got the card and again very recently, after all the NANs occurred) and got absolutely no errors, which means the hardware is fine. And all these failures specifically start right after I finish a session of L4D.

Edit (July 20th): The reverse is true too. Never Fold intensively and then jump straight into a game. It might be due to the graphics overclock or something completely different.

3 Comments:

Blogger Unknown said...

Learn how to print-screen bro! Damn.

Sat Aug 01, 09:44:00 PM PDT  
Blogger Jack Zhang said...

I wanted to show the whole log up until it stopped.

Anyways, I think the issue was with the core/shader/memory clocks constantly shifting up and down. This caused the instability with every driver I tried. A simple registry edit to lock in the performance 3D clocks might have solved that, and I'm monitoring for problems.

Sun Aug 02, 07:54:00 AM PDT  
Blogger Jack Zhang said...

Nope, it always happens after the GPU is used intensely, no matter which application, on drivers 186.16 and above. 185.85 had no problems other than busted S-Video output.

Tue Nov 24, 02:09:00 AM PST  
