Unusual Compute Errors

Message boards : Questions/Problems/Bugs : Unusual Compute Errors

ritterm
Joined: 9 Oct 09
Posts: 4
Credit: 538,265
RAC: 0
Message 1090 - Posted: 3 Feb 2013, 3:20:47 UTC

I've recently gotten a couple of compute errors that I've never seen before:

Task 24765613 / 14e Task / Stderr out includes "Please set a skewness" / One occurrence on host 27343

Task 24717868 / 16e Task / Stderr out includes "Bad args to asm_modinv32" / Several occurrences on host 29160
____________

Greg
Project administrator
Joined: 26 Jun 08
Posts: 582
Credit: 223,912,432
RAC: 23,750
Message 1091 - Posted: 6 Feb 2013, 1:21:27 UTC - in response to Message 1090.

The skewness error was due to bad parameters on some of the workunits. Those have been corrected. The asm_modinv32 error is simply a sign that we are pushing the limits of the Windows sieve binary. I will investigate further in the coming weeks.

ritterm
Joined: 9 Oct 09
Posts: 4
Credit: 538,265
RAC: 0
Message 1092 - Posted: 7 Feb 2013, 11:35:55 UTC - in response to Message 1091.

The skewness error was due to bad parameters on some of the workunits. Those have been corrected. The asm_modinv32 error is simply a sign that we are pushing the limits of the Windows sieve binary. I will investigate further in the coming weeks.

Thanks for the feedback, Greg... :-)
____________

Neil Polson
Joined: 16 Sep 09
Posts: 6
Credit: 618,986
RAC: 0
Message 1118 - Posted: 16 Apr 2013, 5:28:26 UTC
Last modified: 16 Apr 2013, 5:32:30 UTC

I've had an error "TD sieve on side 0: 1626047 does not divide" on this task. It looks like it's happening on Windows only, as it takes a Linux host to complete it, as here. I've seen quite a few of these around.
____________

Dirk Broer
Joined: 11 Dec 12
Posts: 9
Credit: 2,731,097
RAC: 746
Message 1219 - Posted: 27 Dec 2013, 21:57:15 UTC

Client state Compute error
Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED


____________

Greg
Project administrator
Joined: 26 Jun 08
Posts: 582
Credit: 223,912,432
RAC: 23,750
Message 1220 - Posted: 28 Dec 2013, 0:44:19 UTC - in response to Message 1219.

Client state Compute error
Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED


That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. This perhaps could happen if BOINC is running in a VM that's getting very little CPU time at the beginning of the workunit. If that's not the case and the error persists, try resetting the project.

Fritzr
Joined: 5 Jun 10
Posts: 1
Credit: 355,366
RAC: 259
Message 1266 - Posted: 24 Jan 2014, 7:47:32 UTC
Last modified: 24 Jan 2014, 7:49:31 UTC

I'll try the reset on the project.
However, I have just completed 8 "15e Lattice Sieve v1.08 (notphenomiix6)" tasks:
4 completed and validated;
4 exited with error 197. The debugging info adds that an "unhandled exception" occurred as well.
These all finished on Jan 23 2014, and all had the same deadline of Jan 27 2014.

Task | Work unit | Computer | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Credit | Application
31785400 | 28310074 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Error while computing | 15,195.74 | 1,528.27 | --- | 15e Lattice Sieve v1.08 (notphenomiix6)
31785397 | 28310071 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Completed and validated | 11,040.94 | 1,854.83 | 44.00 | 15e Lattice Sieve v1.08 (notphenomiix6)
31785396 | 28310070 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Error while computing | 15,195.53 | 1,545.02 | --- | 15e Lattice Sieve v1.08 (notphenomiix6)
31785394 | 28310068 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Completed and validated | 11,122.22 | 1,868.94 | 44.00 | 15e Lattice Sieve v1.08 (notphenomiix6)
31785391 | 28310065 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Error while computing | 15,195.29 | 1,512.05 | --- | 15e Lattice Sieve v1.08 (notphenomiix6)
31785389 | 28310063 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Completed and validated | 11,436.74 | 1,930.84 | 44.00 | 15e Lattice Sieve v1.08 (notphenomiix6)
31785385 | 28310059 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Completed and validated | 11,315.50 | 1,909.84 | 44.00 | 15e Lattice Sieve v1.08 (notphenomiix6)
31785379 | 28310053 | 267598 | 23 Jan 2014, 22:28:08 UTC | 24 Jan 2014, 7:27:03 UTC | Error while computing | 15,195.96 | 1,556.77 | --- | 15e Lattice Sieve v1.08 (notphenomiix6)
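
A quick way to see the pattern in the rows above is to compare the run times of the errored and the validated tasks. The snippet below is only a sketch: the numbers are copied from the list, but reading the two numeric columns as "run time" and "CPU time" in seconds is an assumption based on the usual BOINC task-list layout.

```python
# Rows copied from the task list above: (status, run time in s, CPU time in s).
# The column meaning is assumed from the usual BOINC task-list layout.
rows = [
    ("error", 15195.74, 1528.27),
    ("ok", 11040.94, 1854.83),
    ("error", 15195.53, 1545.02),
    ("ok", 11122.22, 1868.94),
    ("error", 15195.29, 1512.05),
    ("ok", 11436.74, 1930.84),
    ("ok", 11315.50, 1909.84),
    ("error", 15195.96, 1556.77),
]


def run_time_range(status):
    """Return (min, max) run time over all rows with the given status."""
    times = [run for s, run, _cpu in rows if s == status]
    return min(times), max(times)


error_min, error_max = run_time_range("error")
ok_min, ok_max = run_time_range("ok")
print(f"errored:   {error_min:.2f} .. {error_max:.2f} s")
print(f"validated: {ok_min:.2f} .. {ok_max:.2f} s")
```

The four errored tasks stop within about a second of each other, later than any validated task finished, which would be consistent with a fixed per-task time limit rather than a random crash.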

Ananas
Joined: 13 Aug 13
Posts: 9
Credit: 2,500,034
RAC: 0
Message 1267 - Posted: 26 Jan 2014, 21:50:34 UTC - in response to Message 1220.
Last modified: 26 Jan 2014, 22:04:48 UTC

Client state Compute error
Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED


That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. ...

No, this is not the reason for this error; it is not deadline-related.

There is a value RSC_FPOPS_BOUND configured for the work units.

If the FPOps used exceed this limit, the core client aborts the workunit.

AFAIK, the values are converted into CPU time, as the core client does not actually know how many FPOps have been used. I.e., when the result starts, the RSC_FPOPS_BOUND value together with the benchmark results leads to a maximum CPU time that the core client will allow for that workunit.

So, if the computer runs the BOINC benchmark on a turbo core but the calculation does not run in turbo mode, or if part of the calculation runs in a clocked-down (power-saving) mode, there is a high risk of violating the FPOps limit.

Increase RSC_FPOPS_BOUND if you want to eliminate those errors. It isn't actually meant to limit the execution time of a "healthy" result; it is just meant to catch applications stuck in endless loops. Set it much higher than RSC_FPOPS_EST, at least 5 times as high or even 10 times.
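
To make the mechanism described above concrete, here is a rough sketch. It assumes the client simply divides RSC_FPOPS_BOUND by the speed it measured for the host to get the time limit; the function names and all numbers are hypothetical and are not taken from the BOINC source or from any real workunit.

```python
def max_allowed_seconds(rsc_fpops_bound, benchmark_flops):
    """Time limit derived from the FLOP bound (assumed formula)."""
    return rsc_fpops_bound / benchmark_flops


def would_abort(rsc_fpops_est, bound_factor, benchmark_flops, actual_flops):
    """True if the task needs longer than the derived limit allows."""
    limit = max_allowed_seconds(bound_factor * rsc_fpops_est, benchmark_flops)
    needed = rsc_fpops_est / actual_flops
    return needed > limit


# Hypothetical values for illustration only.
RSC_FPOPS_EST = 3.0e13    # estimated work in the task
BENCHMARK_FLOPS = 3.0e9   # speed measured while the core was in turbo
ACTUAL_FLOPS = 1.5e9      # speed while the task ran clocked down

for factor in (1.5, 5, 10):
    aborted = would_abort(RSC_FPOPS_EST, factor, BENCHMARK_FLOPS, ACTUAL_FLOPS)
    verdict = "aborted (exit 197)" if aborted else "finishes"
    print(f"bound = {factor} x estimate: {verdict}")
```

With these made-up numbers the host runs at half its benchmarked speed, so a bound of only 1.5 times the estimate gets the task aborted, while 5 or 10 times leaves enough headroom.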

Sabroe_SMC
Joined: 13 Nov 09
Posts: 1
Credit: 3,705,488
RAC: 0
Message 1281 - Posted: 1 Mar 2014, 11:54:00 UTC - in response to Message 1118.
Last modified: 1 Mar 2014, 11:54:57 UTC

I've had an error "TD sieve on side 0: 1626047 does not divide" on this task. It looks like it's happening on Windows only, as it takes a Linux host to complete it, as here. I've seen quite a few of these around.

I got this error too:
http://escatter11.fullerton.edu/nfs/result.php?resultid=32583006
http://escatter11.fullerton.edu/nfs/result.php?resultid=32582970
http://escatter11.fullerton.edu/nfs/result.php?resultid=32582964
http://escatter11.fullerton.edu/nfs/result.php?resultid=32582963

Another error on another machine:
http://escatter11.fullerton.edu/nfs/result.php?resultid=32583060

Ananas
Joined: 13 Aug 13
Posts: 9
Credit: 2,500,034
RAC: 0
Message 1282 - Posted: 3 Mar 2014, 18:52:06 UTC - in response to Message 1281.

This one (your 32582964) has been completed by a Win x64 machine, so it is not necessarily a Linux vs. Windows thing.

Carlos Pinho [TSBTs Pirate]
Volunteer moderator
Joined: 26 Sep 09
Posts: 162
Credit: 7,723,521
RAC: 0
Message 1283 - Posted: 4 Mar 2014, 15:50:48 UTC
Last modified: 4 Mar 2014, 15:53:21 UTC

I suppose you don't have enough memory available to run the 16e WUs. You need at least 1.5 GB per thread. Although you are running a mix of 14e, 15e and 16e, imagine if, for example, your 12-thread i7-4930K tries to run all 16e WUs: that would need 18 GB, so 16 GB is not enough.
On this particular machine you should set it to run only the 14e and the 15e WUs.

Carlos
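
The memory arithmetic in the post above can be sketched as follows. The 1.5 GB per task figure comes from the post; the function name and the `reserved_gb` parameter are made up for illustration.

```python
GB_PER_16E_TASK = 1.5  # minimum per thread, as quoted in the post above


def max_16e_tasks(ram_gb, threads, reserved_gb=0.0):
    """Number of 16e tasks that fit in RAM, capped at the thread count.

    reserved_gb is a hypothetical allowance for the OS and other programs.
    """
    usable = max(ram_gb - reserved_gb, 0.0)
    by_memory = int(usable // GB_PER_16E_TASK)
    return min(threads, by_memory)


# 12 threads all running 16e tasks would need 12 * 1.5 GB.
needed_gb = 12 * GB_PER_16E_TASK
print(f"RAM needed for 12 tasks: {needed_gb} GB")
print(f"16e tasks that fit in 16 GB: {max_16e_tasks(16, 12)}")
```

In practice the operating system and other applications also need memory, so the real number of 16e tasks that fit will be lower than this upper bound.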

Dirk Broer
Joined: 11 Dec 12
Posts: 9
Credit: 2,731,097
RAC: 746
Message 1380 - Posted: 2 May 2014, 14:16:39 UTC - in response to Message 1220.

Client state Compute error
Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED


That's one I see only rarely. It indicates that BOINC estimated the completion time of the workunit to be much later than the deadline. This perhaps could happen if BOINC is running in a VM that's getting very little CPU time at the beginning of the workunit. If that's not the case and the error persists, try resetting the project.


Could it have to do with my GPUs running OpenCL applications in the meantime? I see the computation times of other CPU projects increase dramatically when running, e.g., Collatz on my GPUs.
____________
