Very long post processing

[AF>Dell>LesDelliens]La ...

Joined: 7 Sep 09
Posts: 5
Credit: 37,812
RAC: 0
Message 74 - Posted: 17 Sep 2009, 11:09:20 UTC

Hi all!

I noticed on the front page that post-processing might take nearly 20 days for each number! That is more than the time needed to process it (with BOINC). If we continue like this, won't we reach a situation where too many numbers have been processed by us but still need to be post-processed by your team? Or will the processing time rise to "balance" this?

Thanks! Sorry if the question is not quite clear (French :D)

La frite
ID: 74
Greg
Project administrator

Joined: 26 Jun 08
Posts: 640
Credit: 433,350,798
RAC: 332,873
Message 75 - Posted: 17 Sep 2009, 17:14:04 UTC - in response to Message 74.  

The postprocessing time will always be much greater than the time required for the community to sieve the number. This will get worse, not better, as we move to larger numbers. However, I have access to or can readily recruit the resources to do the postprocessing on 20-25 numbers at once. So as long as I can keep the (postprocessing time)/(sieving time) < 20, there will be no problem. With the current targets and parameters, we are at about 19/3.5 = 5.4. As the project grows and that ratio approaches 20, I can tweak the parameters to make sieving a bit harder but the postprocessing easier.
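
If you want to check the arithmetic, here it is as a quick Python sketch (the 19-day and 3.5-day figures are the ones quoted above; the slot count of 20 is the low end of my 20-25 estimate):

```python
# Steady-state check for the queue argument above. All figures are
# quoted from this post, not measured values.
postproc_days = 19.0   # post-processing time per number
sieve_days = 3.5       # community sieving time per number
slots = 20             # numbers I can post-process at once

# While one number is in post-processing, postproc_days / sieve_days
# more numbers finish sieving, so with `slots` concurrent jobs the
# backlog stays bounded as long as the ratio is below the slot count.
ratio = postproc_days / sieve_days
print(f"ratio = {ratio:.1f} (fine while < {slots})")
```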
ID: 75
verstapp

Joined: 23 Sep 09
Posts: 3
Credit: 1,734,906
RAC: 0
Message 84 - Posted: 24 Sep 2009, 8:42:03 UTC - in response to Message 75.  

Or you could try subcontracting [boinc] the postprocessing too.
ID: 84
Greg
Project administrator

Joined: 26 Jun 08
Posts: 640
Credit: 433,350,798
RAC: 332,873
Message 87 - Posted: 24 Sep 2009, 18:19:02 UTC - in response to Message 84.  

I wish I could. Unfortunately, the post-processing involves solving a large sparse matrix, which requires very high-bandwidth, low-latency communication between nodes. In fact, on a single computer this part of the computation scales more with memory speed than CPU speed. (Core i7s with DDR3 are great at it!) This is exactly the type of problem that BOINC can't do.
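
If you're curious, here's a toy Python sketch of the kind of operation the solver repeats hundreds of thousands of times. It's only an illustration of the memory access pattern, not msieve's actual block Lanczos code: a sparse GF(2) matrix times 64 vectors packed into 64-bit words, where nearly all the "work" is scattered memory loads.

```python
import numpy as np

def spmv_gf2(rows, v):
    """Sparse GF(2) matrix times 64 packed vectors (one uint64 per column)."""
    out = np.zeros(len(rows), dtype=np.uint64)
    for i, cols in enumerate(rows):
        w = np.uint64(0)
        for j in cols:
            w ^= v[j]   # one irregular memory load per nonzero entry
        out[i] = w      # XOR accumulates all 64 GF(2) dot products at once
    return out

# Toy 4x4 matrix, stored as the column indices of each row's nonzeros.
rows = [np.array([0, 2]), np.array([1]), np.array([0, 3]), np.array([2, 3])]
v = np.random.randint(0, 2**63, size=4, dtype=np.uint64)
print(spmv_gf2(rows, v))
```

On a matrix with tens of millions of rows, those irregular loads are why memory speed, not CPU speed, sets the pace.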
ID: 87
Bigred

Joined: 14 Sep 09
Posts: 1
Credit: 1,001,793
RAC: 0
Message 88 - Posted: 25 Sep 2009, 10:55:17 UTC

What about doing the post processing using a GPU application? It could be either Cuda(Nvidia) or Cal(ATI). Even better would be an application for both.
ID: 88
Greg
Project administrator

Joined: 26 Jun 08
Posts: 640
Credit: 433,350,798
RAC: 332,873
Message 90 - Posted: 25 Sep 2009, 19:53:56 UTC - in response to Message 88.  

Memory requirements are too high: the linear algebra needs at least 5 gigabytes of memory. No GPU has that much at the moment, and transfers between the host and the GPU would kill the speed of the application. Nor is this step feasible for BOINC, because the calculation requires the complete matrix, which is a few hundred megabytes in size.
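
As a back-of-envelope illustration of the transfer problem (only the 5 GB figure comes from above; the bus rate and iteration count are assumptions made for the sake of the example):

```python
# Rough estimate of host<->GPU traffic if the working set had to be
# streamed across the bus on every solver iteration. Only the 5 GB
# figure is from this post; the other two numbers are assumptions.
working_set = 5 * 2**30    # bytes, from the post above
bus_rate = 6 * 2**30       # bytes/s, assumed effective PCIe bandwidth
iterations = 300_000       # assumed iteration count for a large matrix

days = working_set / bus_rate * iterations / 86400
print(f"~{days:.0f} days spent just moving data")
```

That's days of pure data movement added on top of the computation itself, before even counting per-transfer latency.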
ID: 90
Incognito II

Joined: 19 Nov 09
Posts: 4
Credit: 4,687,684
RAC: 0
Message 248 - Posted: 26 Nov 2009, 16:55:19 UTC

Greg, who is doing most of the post-processing? What types of machines/clusters are being used? Just curious.
ID: 248
bdodson*

Joined: 2 Oct 09
Posts: 50
Credit: 111,128,218
RAC: 0
Message 249 - Posted: 26 Nov 2009, 17:21:32 UTC - in response to Message 248.  

Greg, who is doing most of the post-processing? What types of machines/clusters are being used? Just curious.


Greg's always done most of the post-processing. Only the intensive sparse matrix calculation has been farmed out on recent numbers. One of my other friends reports having done one of the recent November matrices, and is waiting for R269 (a number of larger "difficulty", with a matrix that will take longer). I'd be interested to hear more as well, but am not sure how soon Greg will get back online with the local holiday(s). -Bruce
ID: 249
Incognito II

Joined: 19 Nov 09
Posts: 4
Credit: 4,687,684
RAC: 0
Message 250 - Posted: 26 Nov 2009, 17:50:23 UTC

Thanks for the reply, Bruce.

I was just curious; I ran the old NFSNet project on a number of machines years ago. IIRC, the project leaders at the time were a bit more informative with the details of what was happening in the background/post-processing.

Actually, given how long you have been working on the Cunningham tables and the hardware you have available, I'm surprised you are not helping with the post-processing. Then again, maybe you are doing some of your own?
ID: 250
bdodson*

Joined: 2 Oct 09
Posts: 50
Credit: 111,128,218
RAC: 0
Message 251 - Posted: 26 Nov 2009, 18:32:29 UTC - in response to Message 250.  

..., I ran the old NFSNet project on a number of machines years ago. IIRC, the project leaders at the time were a bit more informative with the details of what was happening in the background/post-processing.


Good to hear; was that back when they had stats and automated task distribution? Once the stats went down, most of the sieving was either me here or Greg. In either case, a much smaller group, with a different interest/tolerance in hearing the details. Also, despite the huge progress on Wanted numbers, NFS@Home is still quite new. I'm not sure that Greg has set a firm protocol for who's available for that one intensive step, the matrix computation. Still a work in progress.


Actually, given how long you have been working on the Cunningham tables and the hardware you have available, I'm surprised you are not helping with the post-processing. Then again, maybe you are doing some of your own?


I'm usually only able to run matrices on our newest clusters, often with the best results before they're quite open to our users. I ran a bunch on our old compute server with Greg (the one still listed with 32 cores). I'm not sure how long the new Xeons will stay useful for matrix work; I've been running smaller projects with Batalov.

Almost all of our hardware runs exclusively under a UWisc scheduler called Condor; no user logins or job submission. Something in the range of 200+ Linux x86-64s, which I use for NFS sieving projects (most recently M941, about half of that computation). Then a grid of mostly Windows PCs, plus some 32-bit Linux machines, on which I run ECM.

The volume and quality of the NFS@Home factorizations seem to me to represent a new era for Cunningham numbers, for all but the most exclusive projects using .com or .gov (or both) resources. Those would include the two record computations, M1039 for SNFS and RSA-200 for GNFS; still somewhat past our present range. -Bruce
ID: 251
Incognito II

Joined: 19 Nov 09
Posts: 4
Credit: 4,687,684
RAC: 0
Message 252 - Posted: 26 Nov 2009, 19:34:07 UTC

Yes, it was back when NFSNet had the automated task system set up. Back when Intel P3s and Athlon T-Birds were top of the line, with maybe a few P4s in the mix?

Honestly, I'm surprised at the number of people running NFS now, but I guess that is BOINC's appeal: set it up and they will come! :)

Anyway, I'm glad to see it doing so well. I just had to give it a shot, at least for a while.
ID: 252
Greg
Project administrator

Joined: 26 Jun 08
Posts: 640
Credit: 433,350,798
RAC: 332,873
Message 253 - Posted: 26 Nov 2009, 20:47:42 UTC

Sure, it's no secret. :-)

Once most of the workunit results have come in (I typically don't wait for the 0.2% of relations at the end of the "long tail"), I transfer them from the BOINC server to our large-memory (32-core, 64 GB) computer for filtering. I then use msieve to do an in-memory filtering run. This usually produces a somewhat better matrix than disk-based passes on a smaller-memory computer would. The data is then ready for linear algebra (LA).

LA requires at least 8 GB of memory. The speed of msieve's linear algebra is bound by main-memory bandwidth, so Intel Core 2 and especially Core i7 processors are perfect. I currently have access to five Core 2 Quads with sufficient memory (this should be 11, but I'm still waiting on a memory upgrade). I run as many locally as I can just to avoid the off-campus transfers, but I also keep a list of kind people who have volunteered to run a 3-5 week LA calculation for no BOINC credit.

If I don't have room for a number locally, I contact someone from the list a few days in advance to see if their computer is free. Depending on their wishes and transfer bandwidth, I then transfer the entire data set (10-25 GB typically) or just the matrix (3-4 GB typically) to them. If they have the entire data set, they can then perform both the LA and square roots, and report the factors back to me. If they have only the matrix, they send the solutions (100-200 MB typically) back to me, and I run the square roots locally.
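
For anyone curious how those stages map onto msieve's command line, here is a hypothetical driver. The file names are invented for the example, and the -nc1/-nc2/-nc3 flags are from the builds I've used (filtering, linear algebra, and square root respectively), so check your own build's usage text:

```python
import subprocess

# Hypothetical driver for the three msieve post-processing stages
# described above. "relations.dat" and "msieve.log" are example names;
# -nc1/-nc2/-nc3 select filtering, linear algebra, and square root
# in the msieve builds I've used.
for flag, stage in [("-nc1", "filtering"),
                    ("-nc2", "linear algebra"),
                    ("-nc3", "square root")]:
    print(f"starting {stage} ...")
    subprocess.run(["msieve", "-s", "relations.dat",
                    "-l", "msieve.log", flag], check=True)
```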

Finally, I report the factorization both here and at MersenneForum (I'm frmky there), and send an email to Sam Wagstaff.

Not too complicated, really. It just involves transferring a lot of data around. I'm planning to get a student involved this spring, but for now I'm doing it all myself and I've found that it doesn't take much time. And I'm enjoying it, which helps! :-)
ID: 253
Gigacruncher [TSBTs Pirate]
Volunteer moderator

Joined: 26 Sep 09
Posts: 212
Credit: 22,038,363
RAC: 13,173
Message 254 - Posted: 26 Nov 2009, 23:53:37 UTC
Last modified: 26 Nov 2009, 23:55:50 UTC

Incognito,

I ran the post-processing of 12,233- for Greg on a quad-core with 6 GB. It was very iterative until we managed to get a decent matrix to run on it. I downloaded something like 40 GB of data before the final files!!!

As Greg said, you must have at least 8 GB of free memory, or even more if your machine isn't dedicated to crunching. I ran into that problem: you can feel the machine slow down when a client is using 5 GB on a 6 GB machine, even with 20 GB of virtual memory.

Until I get a new machine or add more memory, I can't help Greg at the current size of the numbers, so if you have a fast machine with lots of free memory, please consider helping him. He deserves it.

Carlos
ID: 254
Incognito II

Joined: 19 Nov 09
Posts: 4
Credit: 4,687,684
RAC: 0
Message 255 - Posted: 27 Nov 2009, 1:13:56 UTC

Greg, thanks for the reply... very informative and interesting!

Carlos, I do have a couple of i7s with 6 GB RAM and a couple of Phenom IIs with 8 GB RAM. The problem for me would be the huge file transfers; 40 GB at one time and my ISP would shut me off. :(
ID: 255
bdodson*

Joined: 2 Oct 09
Posts: 50
Credit: 111,128,218
RAC: 0
Message 257 - Posted: 3 Dec 2009, 16:48:21 UTC - in response to Message 251.  



Actually, given how long you have been working on the Cunningham tables and the hardware you have available, I'm surprised you are not helping with the post-processing. Then again, maybe you are doing some of your own?


...Almost all of our hardware runs exclusively under a UWisc scheduler called Condor; no user logins or job submission. Something in the range of 200+ Linux x86-64s, which I use for NFS sieving projects (most recently M941, about half of that computation). ... -Bruce


The large number sieved here (entirely) before M941 has just been completed by Greg as c274 = p62*p100*p113. This is a new "Champion" Cunningham factorization, taking second place in the list of special number field sieve factorizations ranked by SNFS difficulty:

5501   c307   2,1039-   K.Aoki+J.Franke+T.Kleinjung+A.K.Lenstra+D.A.Osvik
5787   c274   5,398+    G.Childers+B.Dodson
5739   c228   12,256+   T.Womack+B.Dodson

At 280 digits, M941 will take over second place when the matrix step finishes, about six weeks from now. -Bruce
ID: 257
Chris S

Joined: 1 Dec 09
Posts: 2
Credit: 4,064
RAC: 0
Message 258 - Posted: 3 Dec 2009, 17:46:09 UTC

Hi! Any possibility of optimised apps here in the future?
ID: 258
Greg
Project administrator

Joined: 26 Jun 08
Posts: 640
Credit: 433,350,798
RAC: 332,873
Message 259 - Posted: 3 Dec 2009, 20:00:04 UTC - in response to Message 258.  
Last modified: 3 Dec 2009, 20:03:36 UTC

The current apps are already optimized. They are based on the mature gnfs-lasieve code (32-bit and 64-bit) written by Jens Franke and Thorsten Kleinjung, and all include assembly optimizations.
ID: 259
Chris S

Joined: 1 Dec 09
Posts: 2
Credit: 4,064
RAC: 0
Message 260 - Posted: 3 Dec 2009, 20:14:17 UTC

Thanks for the reply, Greg. I hadn't realised that. :-)
ID: 260
