Thread 'Unusual Compute Errors'

Author	Message
Greg Project administrator Joined: 26 Jun 08 Posts: 651 Credit: 512,825,862 RAC: 0	Message 1091 - Posted: 6 Feb 2013, 1:21:27 UTC - in response to Message 1090. The skewness error was due to bad parameters on some of the workunits. Those have been corrected. The asm_modinv32 error is simply a sign that we are pushing the limits of the Windows sieve binary. I will investigate further in the coming weeks.

Neil Polson Joined: 16 Sep 09 Posts: 6 Credit: 618,986 RAC: 0	Message 1118 - Posted: 16 Apr 2013, 5:28:26 UTC Last modified: 16 Apr 2013, 5:32:30 UTC I've had an error "TD sieve on side 0: 1626047 does not divide" on this task. It looks like it's happening on Windows only, as it takes a Linux host to complete it, as here. I've seen quite a few of these around.

Dirk Broer Joined: 11 Dec 12 Posts: 19 Credit: 34,960,519 RAC: 6,590	Message 1219 - Posted: 27 Dec 2013, 21:57:15 UTC Client state: Compute error. Exit status: 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED

Greg Project administrator Joined: 26 Jun 08 Posts: 651 Credit: 512,825,862 RAC: 0	Message 1220 - Posted: 28 Dec 2013, 0:44:19 UTC - in response to Message 1219. Quote: "Client state: Compute error. Exit status: 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED" That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. This could perhaps happen if BOINC is running in a VM that's getting very little CPU time at the beginning of the workunit. If that's not the case and the error persists, try resetting the project.

Fritzr Joined: 5 Jun 10 Posts: 1 Credit: 1,051,446 RAC: 0	Message 1266 - Posted: 24 Jan 2014, 7:47:32 UTC Last modified: 24 Jan 2014, 7:49:31 UTC I'll try the reset on the project. However, I have just completed 8 "15e Lattice Sieve v1.08 (notphenomiix6)" tasks: 4 completed and validated, and 4 exited with error 197. The debugging info adds that an "unhandled exception" occurred as well. These all finished on Jan 23 2014 and all had the same deadline of Jan 27 2014.
Task      Work unit  Computer  Sent                       Time reported             Status                   Run time (sec)  CPU time (sec)  Credit  Application
31785400  28310074   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Error while computing    15,195.74       1,528.27        ---     15e Lattice Sieve v1.08 (notphenomiix6)
31785397  28310071   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Completed and validated  11,040.94       1,854.83        44.00   15e Lattice Sieve v1.08 (notphenomiix6)
31785396  28310070   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Error while computing    15,195.53       1,545.02        ---     15e Lattice Sieve v1.08 (notphenomiix6)
31785394  28310068   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Completed and validated  11,122.22       1,868.94        44.00   15e Lattice Sieve v1.08 (notphenomiix6)
31785391  28310065   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Error while computing    15,195.29       1,512.05        ---     15e Lattice Sieve v1.08 (notphenomiix6)
31785389  28310063   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Completed and validated  11,436.74       1,930.84        44.00   15e Lattice Sieve v1.08 (notphenomiix6)
31785385  28310059   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Completed and validated  11,315.50       1,909.84        44.00   15e Lattice Sieve v1.08 (notphenomiix6)
31785379  28310053   267598    23 Jan 2014, 22:28:08 UTC  24 Jan 2014, 7:27:03 UTC  Error while computing    15,195.96       1,556.77        ---     15e Lattice Sieve v1.08 (notphenomiix6)

Ananas Joined: 13 Aug 13 Posts: 9 Credit: 2,500,034 RAC: 0	Message 1267 - Posted: 26 Jan 2014, 21:50:34 UTC - in response to Message 1220. Last modified: 26 Jan 2014, 22:04:48 UTC Quote: "'Client state: Compute error. Exit status: 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED' That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. ..." No, that is not the reason for this error; it is not deadline related. There is a value RSC_FPOPS_BOUND configured for the work units. If the FPOps used exceed this limit, the core client aborts the workunit. AFAIK, the values are converted into CPU time, as the core client does not actually know how many FPOps have been used. That is, when the result starts, the RSC_FPOPS_BOUND value together with the benchmark results yields a maximum CPU time that the core client will allow for that workunit. So, if the computer runs the BOINC benchmark on a turbo core but the calculation does not run in turbo mode, or if part of the calculation runs in a clocked-down (power-saving) mode, there is a high risk of violating the FPOps limit. Increase RSC_FPOPS_BOUND if you want to eliminate those errors. It isn't actually meant to limit the execution time of a "healthy" result; it is only meant to catch applications stuck in endless loops. Set it much higher than RSC_FPOPS_EST: at least 5 times as high, or even 10 times.
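
The mechanism described in Message 1267 can be sketched in a few lines of Python. This is a simplified model that follows the post's description (time limit = FPOps bound divided by benchmarked speed), not the BOINC client's actual code. The function names and every number below are hypothetical and chosen only to illustrate the turbo-versus-sustained-clock effect.

```python
def max_runtime_seconds(rsc_fpops_bound: float, benchmark_flops: float) -> float:
    """Simplified model: allowed seconds = FPOps bound / benchmarked FLOPS."""
    if benchmark_flops <= 0:
        raise ValueError("benchmark_flops must be positive")
    return rsc_fpops_bound / benchmark_flops


def would_be_aborted(work_fpops: float, effective_flops: float,
                     rsc_fpops_bound: float, benchmark_flops: float) -> bool:
    """True if the task's real runtime exceeds the limit derived from the bound."""
    actual_runtime = work_fpops / effective_flops
    return actual_runtime > max_runtime_seconds(rsc_fpops_bound, benchmark_flops)


# Hypothetical values, for illustration only.
RSC_FPOPS_EST = 3.3e13     # estimated work in the task
BENCHMARK_FLOPS = 3.0e9    # speed measured while the core was in turbo mode
EFFECTIVE_FLOPS = 2.0e9    # sustained speed while the task actually ran

tight_bound = 4.5e13                 # bound only ~1.4x the estimate
generous_bound = 10 * RSC_FPOPS_EST  # bound at 10x the estimate, as suggested

print(max_runtime_seconds(tight_bound, BENCHMARK_FLOPS))
print(would_be_aborted(RSC_FPOPS_EST, EFFECTIVE_FLOPS, tight_bound, BENCHMARK_FLOPS))
print(would_be_aborted(RSC_FPOPS_EST, EFFECTIVE_FLOPS, generous_bound, BENCHMARK_FLOPS))
```

With these made-up numbers the tight bound allows 15,000 seconds while the slowed-down task needs 16,500 seconds, so it would be aborted; the 10x bound leaves ample headroom for the same task.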

Sabroe_SMC Joined: 13 Nov 09 Posts: 3 Credit: 10,039,368 RAC: 0	Message 1281 - Posted: 1 Mar 2014, 11:54:00 UTC - in response to Message 1118. Last modified: 1 Mar 2014, 11:54:57 UTC Quote: "I've had an error 'TD sieve on side 0: 1626047 does not divide' on this task. Looks like it's happening on windows only as it takes a Linux host to complete it as here. Seen quite a few of these around." I got this error too: http://escatter11.fullerton.edu/nfs/result.php?resultid=32583006 http://escatter11.fullerton.edu/nfs/result.php?resultid=32582970 http://escatter11.fullerton.edu/nfs/result.php?resultid=32582964 http://escatter11.fullerton.edu/nfs/result.php?resultid=32582963 Another error on another machine: http://escatter11.fullerton.edu/nfs/result.php?resultid=32583060

Ananas Joined: 13 Aug 13 Posts: 9 Credit: 2,500,034 RAC: 0	Message 1282 - Posted: 3 Mar 2014, 18:52:06 UTC - in response to Message 1281. This one (your 32582964) has been completed by a Win x64 machine, so it is not necessarily a Linux vs. Windows thing.

Carlos Pinho Volunteer moderator Joined: 26 Sep 09 Posts: 232 Credit: 27,645,063 RAC: 0	Message 1283 - Posted: 4 Mar 2014, 15:50:48 UTC Last modified: 4 Mar 2014, 15:53:21 UTC I suppose you don't have enough memory available to run the 16e WUs. You need at least 1.5 GB per thread. Although you are running a mix of 14e, 15e and 16e, imagine what happens if, for example, your 12-thread i7-4930K tries to run only 16e WUs: 16 GB is not enough. On this particular machine you should choose to run only the 14e and 15e WUs. Carlos
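
The memory arithmetic in Message 1283 can be checked with a short Python sketch. It assumes the figures quoted in the post (1.5 GB per 16e task, 12 concurrent threads, 16 GB of RAM) and ignores memory used by the operating system or other applications; the function names are hypothetical.

```python
import math


def required_ram_gb(threads: int, ram_per_task_gb: float) -> float:
    """RAM needed if every thread runs one task of the given size."""
    return threads * ram_per_task_gb


def max_concurrent_tasks(total_ram_gb: float, ram_per_task_gb: float) -> int:
    """Largest number of tasks that fit in RAM, ignoring OS overhead."""
    return math.floor(total_ram_gb / ram_per_task_gb)


# Figures taken from the post: 12 threads, 1.5 GB per 16e task, 16 GB installed.
print(required_ram_gb(12, 1.5))       # RAM needed for twelve 16e tasks
print(max_concurrent_tasks(16, 1.5))  # how many 16e tasks 16 GB can hold
```

Twelve 16e tasks would need 18 GB, which is more than the 16 GB installed, so at most 10 could run at once even before accounting for anything else on the machine.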

Dirk Broer Joined: 11 Dec 12 Posts: 19 Credit: 34,960,519 RAC: 6,590	Message 1380 - Posted: 2 May 2014, 14:16:39 UTC - in response to Message 1220. Quote: "'Client state: Compute error. Exit status: 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED' That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. This perhaps could happen if BOINC is running in a VM that's getting very little CPU time at the beginning of the workunit. If that's not the case and the error persists, try resetting the project." Could it have to do with my GPUs running OpenCL applications in the meantime? I see the computation times of other CPU projects increase dramatically when running e.g. Collatz on my GPUs.