Very long post processing
Author | Message |
---|---|
Send message Joined: 7 Sep 09 Posts: 5 Credit: 37,812 RAC: 0 |
Hi all! I noticed on the front page that post-processing might take nearly 20 days for each number! That is more than the time needed to process it (with BOINC). If we continue like this, won't we reach a situation where too many numbers have been processed by us but still need to be post-processed by your team? Or will the processing time rise to "balance" this? Thanks, and sorry if the question is not quite clear (French :D) La frite |
Send message Joined: 26 Jun 08 Posts: 644 Credit: 462,223,808 RAC: 135,625 |
The postprocessing time will always be much greater than the time required for the community to sieve the number. This will get worse, not better, as we move to larger numbers. However, I have access to or can readily recruit the resources to do the postprocessing on 20-25 numbers at once. So as long as I can keep the (postprocessing time)/(sieving time) < 20, there will be no problem. With the current targets and parameters, we are at about 19/3.5 = 5.4. As the project grows and that ratio approaches 20, I can tweak the parameters to make sieving a bit harder but the postprocessing easier. |
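To make the arithmetic in the post above concrete, here is a minimal sketch. It assumes a steady state in which one number finishes sieving every 3.5 days and each takes 19 days to postprocess, so about 19/3.5 numbers are in postprocessing at any moment. The function name and the `capacity` variable are illustrative only; 20 is the low end of the 20-25 figure quoted above.

```python
import math

def postprocessing_slots_needed(postprocess_days, sieve_days):
    """Average count of numbers in postprocessing at once, in steady state."""
    return postprocess_days / sieve_days

# Figures quoted in the post: about 19 days postprocessing, 3.5 days sieving.
ratio = postprocessing_slots_needed(19, 3.5)
capacity = 20  # low end of the "20-25 numbers at once" mentioned in the post

print(f"ratio = {ratio:.1f}, slots in use = about {math.ceil(ratio)} of {capacity}")
print("keeping up" if ratio < capacity else "backlog grows")
```

As long as the ratio stays below the number of machines available for postprocessing, sieved numbers do not pile up.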
Send message Joined: 23 Sep 09 Posts: 3 Credit: 1,734,906 RAC: 0 |
Or you could try subcontracting [boinc] the postprocessing too. |
Send message Joined: 26 Jun 08 Posts: 644 Credit: 462,223,808 RAC: 135,625 |
I wish I could. Unfortunately, the post-processing involves solving a large sparse matrix, which requires very high-bandwidth, low-latency communication between nodes. In fact, on a single computer this part of the computation scales more with memory speed than CPU speed. (Core i7's with DDR3 are great at it!) This is exactly the type of problem that BOINC can't do. |
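The memory-bound behaviour described above can be illustrated with a toy sparse matrix-vector product over GF(2), the kind of operation that dominates this step. This is only a sketch under assumptions, not msieve's code: the row-list layout and the function name are made up for illustration, and each vector entry is a Python int standing in for a block of bits.

```python
def gf2_sparse_matvec(rows, x):
    """Compute y = A*x over GF(2).

    rows[i] lists the column indices of the 1 entries in row i.
    Each x[j] is an int holding a block of bits, so one XOR
    processes a whole block of vectors at once.
    """
    y = []
    for cols in rows:
        acc = 0
        for j in cols:
            # One scattered memory read per nonzero, then a single XOR:
            # almost no arithmetic, so memory speed sets the pace.
            acc ^= x[j]
        y.append(acc)
    return y

# Tiny 3x3 example with 4-bit blocks.
rows = [[0, 2], [1], [0, 1, 2]]
x = [0b1010, 0b0110, 0b0011]
print([format(v, "04b") for v in gf2_sparse_matvec(rows, x)])
```

Every iteration of the solver needs the whole vector and the whole matrix, which is why the work cannot be split into small independent workunits.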
Send message Joined: 14 Sep 09 Posts: 1 Credit: 1,001,793 RAC: 0 |
What about doing the post-processing using a GPU application? It could be either CUDA (Nvidia) or CAL (ATI). Even better would be an application for both. |
Send message Joined: 26 Jun 08 Posts: 644 Credit: 462,223,808 RAC: 135,625 |
Memory requirements are too high. It requires at least 5 gigabytes of memory. No GPUs have that much memory at the moment, and transfers to/from the host to the GPU kill the speed of the application. Post-processing isn't feasible for BOINC, though, because the calculation requires the complete matrix, which is a few hundred megabytes in size. |
Send message Joined: 19 Nov 09 Posts: 4 Credit: 4,687,684 RAC: 0 |
Greg, who is doing most of the post processing? What type of machines/cluster are being used? Just curious. |
Send message Joined: 2 Oct 09 Posts: 50 Credit: 111,128,218 RAC: 0 |
"Greg, who is doing most of the post processing? What type of machines/cluster are being used? Just curious." Greg's always done most of the post processing. Only the intensive sparse matrix calculation has been farmed out on recent numbers. One of my other friends reports having done one of the recent November matrices, and is waiting for R269 (a number of larger "difficulty", with a matrix that will take longer). I'd be interested to hear more as well, but am not sure how soon Greg will get back online with the local holiday(s). -Bruce |
Send message Joined: 19 Nov 09 Posts: 4 Credit: 4,687,684 RAC: 0 |
Thanks for the reply, Bruce. I was just curious; I ran the old NFSNet project on a number of machines years ago. IIRC, the project leaders at the time were a bit more informative with the details of what was happening in the background/post-processing. Actually, as long as you have been working on the Cunningham Tables, and with the hardware you have available, I'm surprised you are not helping with the post-processing. Then again, maybe you are doing some of your own? |
Send message Joined: 2 Oct 09 Posts: 50 Credit: 111,128,218 RAC: 0 |
"..., I ran the old NFSNet project on a number of machines way back years ago. IIRC, the project leaders at the time were a bit more informative with the details of what was happening in the background/post processing." Good to hear; was that back when they had stats and automated task distribution? Once the stats went down, most of the sieving was done either by me here or by Greg. In either case, it was a much smaller group, with a different interest in, and tolerance for, hearing the details. Also, despite the huge progress in Wanted numbers, NFS@Home is still quite new. I'm not sure that Greg's set a firm protocol for who's available for that one intensive step, the matrix computation. Still a work in progress.
I'm usually only able to run matrices on our newest clusters, often with best results before they're quite open to our users. I ran a bunch on our old compute server with Greg (the one still listed with 32 cores). Not sure how long the new Xeons will stay useful for matrix work; I've been running smaller projects with Batalov. Almost all of our hardware is run exclusively under a UWisc scheduler called Condor; no user logins or job submission. That is something in the range of 200+ Linux x86-64 machines, which I use for NFS sieving projects (most recently M941, about half of that computation). Then there is a PC grid of mostly Windows machines and some 32-bit Linux ones, on which I run ECM. The volume and quality of the NFS@Home factorizations seems to me to represent a new era for Cunningham numbers, for all but the most exclusive projects using .com or .gov (or both) resources. Those would include the two record computations, M1039 for SNFS and RSA200 for GNFS; still somewhat past our present range. -Bruce |
Send message Joined: 19 Nov 09 Posts: 4 Credit: 4,687,684 RAC: 0 |
Yes, it was back when NFSNet had the automated task system set up. Back when Intel P3s and Athlon T-Birds were top of the line, with maybe a few P4s in the mix? Honestly, I'm surprised at the number of people running NFS now, but I guess that is BOINC's appeal: you set it up and they will come! :) Anyway, I'm glad to see it doing so well; I just had to give it a shot, at least for a while. |
Send message Joined: 26 Jun 08 Posts: 644 Credit: 462,223,808 RAC: 135,625 |
Sure, it's not secret. :-) Once most of the workunit results have come in (I typically don't wait for the 0.2% of relations at the end of the "long tail"), I transfer them from the BOINC server to our large memory (32-core 64 GB) computer for filtering. I then use msieve to do an in-memory filtering run. This usually produces a somewhat better matrix than using disk-based passes on a smaller memory computer. It is then ready for linear algebra (LA). LA requires at least 8GB of memory. The speed of msieve's linear algebra is bound by main memory bandwidth, so Intel Core 2 and especially Core i7 processors are perfect. I currently have access to five Core 2 Quad's with sufficient memory (this should be 11, but I'm still waiting on a memory upgrade). I run as many locally as I can just to avoid the off-campus transfers, but I also keep a list of kind people who have volunteered to run a 3-5 week LA calculation for no BOINC credit. If I don't have room for a number locally, I contact someone from the list a few days in advance to see if their computer is free. Depending on their wishes and transfer bandwidth, I then transfer the entire data set (10-25 GB typically) or just the matrix (3-4 GB typically) to them. If they have the entire data set, they can then perform both the LA and square roots, and report the factors back to me. If they have only the matrix, they send the solutions (100-200 MB typically) back to me, and I run the square roots locally. Finally, I report the factorization both here and at MersenneForum (I'm frmky there), and send an email to Sam Wagstaff. Not too complicated, really. It just involves transferring a lot of data around. I'm planning to get a student involved this spring, but for now I'm doing it all myself and I've found that it doesn't take much time. And I'm enjoying it, which helps! :-) |
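Since the workflow above is mostly about moving data around, a rough estimate of transfer times may help a volunteer decide between taking the full data set or just the matrix. This is a minimal sketch: the sizes are the upper ends of the ranges quoted in the post, while the 10 Mbit/s link speed and the function name are purely hypothetical.

```python
def transfer_hours(size_gb, mbit_per_s):
    """Hours needed to move size_gb (decimal gigabytes) over a link of mbit_per_s."""
    bits = size_gb * 8e9
    return bits / (mbit_per_s * 1e6) / 3600

LINK_MBIT = 10  # hypothetical home connection speed, not from the post

# Sizes taken from the upper ends of the ranges quoted in the post.
for label, size_gb in [("full data set", 25), ("matrix only", 4), ("solutions back", 0.2)]:
    print(f"{label:15s} {size_gb:5.1f} GB  ~{transfer_hours(size_gb, LINK_MBIT):.1f} h")
```

Swap in your own link speed to see whether the full data set is practical for you.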
Send message Joined: 26 Sep 09 Posts: 218 Credit: 22,841,893 RAC: 1,806 |
Incognito, I ran the post-processing of 12,233- for Greg on a quad-core with 6 GB. It was very iterative until we managed to get a decent matrix to run on it. I downloaded something like 40 GB of data before we had the final files!!! As Greg said, you must have at least 8 GB of free memory, or even more if your machine isn't dedicated to crunching. I had that problem: you can feel the machine getting slower when a client is using 5 GB on a 6 GB machine, even with 20 GB of virtual memory. Until I get a new machine or add more memory, I can't help Greg with numbers of the current size, so if you have a fast machine with lots of free memory, please consider helping Greg; he deserves it. Carlos |
Send message Joined: 19 Nov 09 Posts: 4 Credit: 4,687,684 RAC: 0 |
Greg, thanks for the reply... very informative and interesting! Carlos, I do have a couple of i7s with 6 GB RAM and a couple of Phenom IIs with 8 GB RAM. The problem for me would be the huge file transfers: 40 GB at one time and my ISP would shut me off. :( |
Send message Joined: 2 Oct 09 Posts: 50 Credit: 111,128,218 RAC: 0 |
The large number sieved here (entirely) before M941 has just been completed by Greg as c274 = p62*p100*p113. This is a new "Champion" Cunningham factorization, second place:
Special number field sieve by SNFS difficulty:
5501 c307 2,1039- K.Aoki+J.Franke+T.Kleinjung+A.K.Lenstra+D.A.Osvik
5787 c274 5,398+ G.Childers+B.Dodson
5739 c228 12,256+ T.Womack+B.Dodson
At 280 digits, M941 will take over second place when the matrix step finishes, about six weeks from now. -Bruce |
Send message Joined: 1 Dec 09 Posts: 2 Credit: 4,064 RAC: 0 |
Hi! Any possibility of optimised apps here in the future? |
Send message Joined: 26 Jun 08 Posts: 644 Credit: 462,223,808 RAC: 135,625 |
|
Send message Joined: 1 Dec 09 Posts: 2 Credit: 4,064 RAC: 0 |
Thanks for the reply Greg, I hadn't realised that. :-) |