Overcommitted?

Posted: Mon Jun 12, 2006 1:50 pm
by UBT - PaulC
What circumstances make BOINC go into this "overcommitted", stop-fetching-work mode? Is it to do with long/short-term debt? I just noticed this appearing in the messages, yet I haven't changed anything. About 4 or so hours later I see it has returned to round-robin style.

Any ideas?

Paul.

Re: Overcommitted?

Posted: Mon Jun 12, 2006 1:59 pm
by Temujin
UBT - PaulC wrote:What circumstances make BOINC go into this "overcommitted", stop-fetching-work mode? Is it to do with long/short-term debt? I just noticed this appearing in the messages, yet I haven't changed anything. About 4 or so hours later I see it has returned to round-robin style.

Any ideas?

Paul.
AFAIK it's not something to worry about.
All it means is that you have enough work to last for your "connect every xx" setting.
So, if your "connect every" setting is 2 days, you already have 2 days' worth of work and it doesn't need to download any more. No doubt in a couple of hours, when you've (maybe) processed a WU, it will download some more work.

Posted: Mon Jun 12, 2006 3:27 pm
by UBT - JohnR
I have noticed this with Rosetta. On a machine with no work it can download 3 days' worth of WUs and then, during the download, go into a strop, say the machine is overcommitted and process the WUs in date order. Also, if it downloads, for instance, ten 4-hour WUs and the first one takes 6 hours to process, it resets the rest to 6-hour WUs and goes into a strop again. It can do this even though the first WU has 6 days to be returned in and the work is completed in 3 days.
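The re-estimation JohnR describes can be sketched roughly like this. It is only an illustration, not BOINC's actual code: the helper name `rescale_estimates` and the simple "scale everything by the first result's ratio" rule are assumptions made for the example.

```python
def rescale_estimates(estimates_hours, first_actual_hours):
    """Scale the remaining estimates by actual/estimated time of the first task.

    Hypothetical helper: it mimics the behaviour described in the post, where
    one slow result makes every queued estimate grow by the same factor.
    """
    if not estimates_hours:
        return []
    ratio = first_actual_hours / estimates_hours[0]
    # The first task is finished, so only the rest of the queue is rescaled.
    return [estimate * ratio for estimate in estimates_hours[1:]]


queue = [4.0] * 10                          # ten WUs, each estimated at 4 hours
remaining = rescale_estimates(queue, 6.0)   # the first one actually took 6 hours
total_remaining = sum(remaining)            # nine WUs, now estimated at 6 hours each

print(remaining[0], total_remaining)
```

With these numbers the nine queued WUs jump from 36 to 54 estimated hours, which is the kind of sudden change that could tip the client into thinking it is overcommitted.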

Posted: Mon Jun 12, 2006 4:45 pm
by UBT - Halifax-lad
It is to do with LTD & STD, as you have already stated, and also with deadlines. If BOINC has too much work and thinks it won't make the cut-off date, it will stop work fetch and start on the WUs that have the shortest deadlines.
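As a rough sketch of the deadline check described above, assuming a single CPU crunching 24/7 and one WU at a time (the real client weighs more factors, and the function name `misses_deadline` is made up for this example):

```python
def misses_deadline(tasks):
    """Return True if any task would finish after its deadline.

    tasks: list of (hours_of_work_left, hours_until_deadline) tuples.
    Assumption: tasks run one after another in earliest-deadline-first order
    on a single CPU that is always on.
    """
    elapsed = 0.0
    for work_left, deadline in sorted(tasks, key=lambda task: task[1]):
        elapsed += work_left
        if elapsed > deadline:
            return True
    return False


comfortable = [(6, 24), (6, 10), (6, 14)]   # finishes at 6, 12 and 18 hours
too_tight = [(6, 10), (6, 11)]              # second WU would finish at 12 hours

# In this sketch "overcommitted" simply means a deadline would be missed.
print(misses_deadline(comfortable), misses_deadline(too_tight))
```

In this simplified picture the client would suspend work fetch for the second queue and crunch in deadline order until the check passes again.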

Posted: Tue Jun 13, 2006 5:16 pm
by UBT - PaulC
It's still going mad
12/06/06 00:27:55|SETI@home|Finished download of file 24mr99ab.21115.28448.897174.3.99
12/06/06 00:27:55|SETI@home|Throughput 113623 bytes/sec
12/06/06 00:27:56||Rescheduling CPU: files downloaded
12/06/06 00:27:59||Suspending work fetch because computer is overcommitted.
12/06/06 02:36:14||Allowing work fetch again.
12/06/06 02:39:59||Suspending work fetch because computer is overcommitted.
12/06/06 02:42:39||Allowing work fetch again.
12/06/06 03:27:56||Resuming round-robin CPU scheduling.
It appears to think it was overcommitted at 00:27, then allows work 9 mins later. Then it thinks it's overcommitted again.

This is most bizarre

Paul.

Posted: Tue Jun 13, 2006 6:09 pm
by UBT - Timbo
UBT - PaulC wrote:It's still going mad......This is most bizarre

Paul.

Can I ask the following:
What other projects are you crunching for on this PC?
What is your WU cache set to? (1 day?, 5 days?, etc)
What is the normal "To completion time" you usually have?
Are you crunching 24/7?

regards,

Tim

Posted: Tue Jun 13, 2006 6:25 pm
by Temujin
UBT - PaulC wrote:It appears to think it was overcommitted at 00:27, then allows work 9 mins later. Then it thinks it's overcommitted again.
2 hours and 9 minutes later
This is most bizarre
Don't worry about it mate, it happens all the time.
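For anyone who wants to check the gaps in the log rather than eyeball them, here is a small sketch. The event list is copied from the messages posted above; the `dd/mm/yy` date format and the helper name `suspended_periods` are assumptions.

```python
from datetime import datetime

FMT = "%d/%m/%y %H:%M:%S"   # assumed day/month/year, as in the posted log

# (timestamp, event) pairs taken from the work-fetch lines in the log
log = [
    ("12/06/06 00:27:59", "suspend"),
    ("12/06/06 02:36:14", "allow"),
    ("12/06/06 02:39:59", "suspend"),
    ("12/06/06 02:42:39", "allow"),
]


def suspended_periods(events):
    """Return how long each 'suspend' lasted before the next 'allow'."""
    periods = []
    for (start, kind0), (end, kind1) in zip(events, events[1:]):
        if kind0 == "suspend" and kind1 == "allow":
            periods.append(datetime.strptime(end, FMT) - datetime.strptime(start, FMT))
    return periods


for period in suspended_periods(log):
    print(period)
```

The first suspension lasts a little over two hours and the second only a few minutes.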

Posted: Tue Jun 13, 2006 6:52 pm
by UBT - PaulC
UBT - Timbo wrote:
Can I ask the following:
What other projects are you crunching for on this PC?
What is your WU cache set to? (1 day?, 5 days?, etc)
What is the normal "To completion time" you usually have?
Are you crunching 24/7?

regards,

Tim
Projects = Seti, Einstein, QMC & uFluids
Cache = 0.1 days
Completion time = variable depends on project
Crunching 24/7 = yes


Paul

Posted: Tue Jun 13, 2006 8:53 pm
by UBT - PaulC
After looking at the BOINC wiki I found some good pages which explain it all. Worth a read if you have a similar problem.

http://boinc-wiki.ath.cx/index.php?title=Work_Scheduler
http://boinc-wiki.ath.cx/index.php?titl ... rcommitted

Paul.

Posted: Wed Jun 14, 2006 12:43 am
by UBT - Timbo
UBT - PaulC wrote:Projects = Seti, Einstein, QMC & uFluids
Cache = 0.1 days
Completion time = variable depends on project
Crunching 24/7 = yes

Paul
OK Paul.

Some things I've realised about BOINC.

BOINC doesn't always understand that you are crunching for multiple projects. So, with at least 3 of these projects now creating WUs that take many hours to complete, I've found on some of my boxes that BOINC doesn't d/l "just enough" WUs in order NOT to create problems.

I "expected" BOINC to auto-manage the projects, such that if I decided that each project is to have a particular share of my resource, that each project would receive precisely enough WU's to enable it to complete what it already has to do, and have enough WU's to keep it going so the cache is not drained.

Seems this doesn't happen and BOINC has to "play around" with the WU's in order to manage the different "to completion" times for each project as well as taking into account the "CPU crunching power" you have.

So, as long as you don't actually run out of work, then obviously this issue is all about why BOINC keeps getting its knickers in a twist.

For instance, I have my SETI cache set at 1.5 days. And I have 72 hours of WU's (based on the "to completion" times), so as you see, I receive 2x 36 hours = 72 hours.....!

As you have 0.1 as the cache, then you should receive 2x 0.1 = 0.2 or 4.8 hours worth of work. But it might do this for each of the four projects. And depending on your CPU speed, you may appear to have too much work to finish in time. Which is why it thinks you are over-committed.
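The arithmetic above can be written out as a quick sketch. The 2x multiplier is just what Tim observed on his own boxes, not a documented rule, and the function name `buffered_hours` is invented for the example.

```python
HOURS_PER_DAY = 24


def buffered_hours(cache_days, projects=1, multiplier=2):
    """Hours of work the client might hold.

    Assumptions: the client fetches `multiplier` times the cache setting, and
    does so separately for each attached project.
    """
    return cache_days * multiplier * HOURS_PER_DAY * projects


seti_only = buffered_hours(1.5)                  # Tim's SETI box: 72 hours
one_project = buffered_hours(0.1)                # roughly 4.8 hours
four_projects = buffered_hours(0.1, projects=4)  # roughly 19.2 hours

print(round(seti_only, 1), round(one_project, 1), round(four_projects, 1))
```

So even a 0.1 day cache could mean the best part of a day's crunching queued up across four projects, before any long-running WUs or tight deadlines are taken into account.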

For now, I'd let things carry on, and leave BOINC to settle down. Don't suspend or resume any projects for now and let it resolve itself, depending on your chosen resource shares and whether any WU's are actually available from the 4 projects. It is a pain, and sometimes BOINC does really stupid things.

When it does VERY stupid things, I just set all but one project to "no new tasks" and then let the cache drain and give BOINC a chance to sort itself out. Then I switch a 2nd project back on and wait a few more days, etc.

regards,

Tim

Posted: Wed Jun 14, 2006 10:48 am
by UBT - Halifax-lad
BOINC works perfectly fine here with 24 projects; it manages them correctly.