Silverfrost Forums

Dual Processor Optimisation

28 Feb 2008 11:42 #2837

Ian,

I'm showing my age with these answers! The solver I am using is based on a skyline solver, which I think I first got from a paper by Graeme Powell of UCB in about 1976. The bandwidth optimiser is based on the similar methods of Hoit and Sloan from about 1982. Neither method works very well for all problems, and I find that my 'Campbell' algorithm, which also sorts the nodes in the x, y or z direction with a quick sort, returns the smallest profile in most cases.

The problems I solve are probably best described as mid-sized finite element problems, as I still generate the models with my own primitive techniques, using a Fortran model generator. The latest problem has 150,000 equations, an average profile of 1,800 and a peak profile of 9,900 equations. Back when I developed most of my FE code these problem sizes would have been considered huge; however, modern commercial FE packages now produce much larger problems. I still find a select few problems where my approaches and understanding of the methods still apply.
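For anyone not familiar with the term, the idea behind skyline (profile) storage is simply to keep, for each column of the symmetric stiffness matrix, only the entries from the first non-zero row down to the diagonal, packed end to end with a pointer to each diagonal term. A little illustration (made-up column heights, not my actual code):

    program skyline_demo
      implicit none
      integer, parameter :: n = 4
      integer :: jfirst(n), idiag(n), j
      double precision, allocatable :: a(:)

      ! assumed first non-zero row of each column (this defines the profile)
      jfirst = (/ 1, 1, 2, 1 /)

      ! diagonal pointers: column j occupies a(idiag(j-1)+1 : idiag(j))
      idiag(1) = 1
      do j = 2, n
        idiag(j) = idiag(j-1) + (j - jfirst(j) + 1)
      end do

      allocate (a(idiag(n)))        ! total profile storage
      a = 0.0d0

      print *, 'profile storage:', idiag(n), ' full triangle:', n*(n+1)/2
    end program skyline_demo

The idiag array is then all the reduction and substitution routines need to walk each column, which is why a good node ordering (a small profile) matters so much.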

So the answers to your questions are:

IanLambley wrote: A couple of questions:

  1. Are you talking about Choleski decomposition and forward/backward substitution? YES (a sketch of the substitution step follows the list)

  2. Does your solver implement a profile method to minimise storage? YES for both storage and re-ordering

  3. What is the bandwidth of the stiffness matrix? Typically as above; the largest problem I've solved is about twice as large, and most now are 1 GB plus. All use 64-bit precision. I tried 80-bit, but that gave no significant improvement for a model where round-off looked like a problem.
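As a plain illustration of question 1 (a dense version for clarity, not the profile-storage routine I actually use), the forward and backward substitution after a Choleski factorisation K = L*transpose(L) looks like this:

    ! solve L*y = b (forward), then transpose(L)*x = y (backward)
    subroutine chol_solve(n, l, b, x)
      implicit none
      integer, intent(in)           :: n
      double precision, intent(in)  :: l(n,n)   ! lower triangular factor
      double precision, intent(in)  :: b(n)
      double precision, intent(out) :: x(n)
      double precision :: y(n)
      integer :: i

      do i = 1, n                               ! forward substitution
        y(i) = (b(i) - dot_product(l(i,1:i-1), y(1:i-1))) / l(i,i)
      end do
      do i = n, 1, -1                           ! backward substitution
        x(i) = (y(i) - dot_product(l(i+1:n,i), x(i+1:n))) / l(i,i)
      end do
    end subroutine chol_solve

The profile version does exactly the same arithmetic, but each dot product only runs over the stored part of the column.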

Regards

John

29 Feb 2008 3:01 #2842

John,

Now I understand what the problem is. It isn't an odd disk access from time to time, but lots.

If the mainboard in your PC has spare IDE connectors and has RAID support, you can run a RAID 0 array with two old hard disks. You won't get much of a performance boost, but there will be some. Cost nearly nil. If you have spare SATA connectors, again with mainboard RAID support, you will get a performance boost by running a RAID 0 array. This applies with both SATA I and SATA II hard disks. Cost c. GBP 100, double or treble if you need a RAID controller card.

Whatever SATA support you have, a fast SATA drive on its own will help. Most modern drives are 7,200 rpm. A 10,000 rpm WD Raptor hard drive on its own would give you some improvement: two in a RAID 0 array, a lot. They are correspondingly more expensive than 7,200 rpm drives: GBP 70 for 36 GB, GBP 90 for 74 GB (cf. GBP 30 for a 160 GB standard drive). Two or more WD Raptor drives in a RAID array will be even faster. My guess here is about half your present run time, assuming your PC only has a standard PATA or SATA drive.

If you don't have SATA and RAID support, then you can buy a RAID controller card.

To go to 15,000 rpm, you need SCSI. Then you need a SCSI RAID card as well as the SCSI drives. This could get expensive - it's already out of my league. Drives could be GBP500+ each!

I checked prices today with UK mail order hardware suppliers: www.scan.co.uk and www.aria.co.uk - I imagine prices are broadly equivalent globally.

Regards

Eddie

1 Mar 2008 4:16 #2861

John,

Have you tried a wavefront/frontal solver?

Regards

Ian

2 Mar 2008 12:00 #2864

Ian,

I did most of this development work a long time ago. I never liked frontal solvers, as I saw them as more complex and with little computational benefit. There was certainly more benefit to be gained from a good bandwidth or profile optimiser. For iterative solutions, especially shifted subspace eigen-solvers where there are many load cases to solve repeatedly, I don't think the frontal solver is as suitable, whereas a reduced-profile stiffness matrix is.

The other big drawback of frontal solvers was that once the 'triangle' of active equations could not be stored in memory, it all got a bit slow, whereas with the profile solver, having 2 blocks or 10 blocks to store the active triangle makes little difference; you just cycle through the earlier blocks. Back in the 70s and 80s, when overlays were used and there was only space for limited amounts of code in memory, it was good to have a compact reduction code for a PC.

I think some packages persist with frontal solvers, but to me any of their benefits were very limited; it was more of an advertising gimmick to say you used a frontal solver. There are probably people out there who may disagree, but for the range of beam/shell problems I solved, frontal solvers appeared to provide limited benefit. I must admit that seeing what modern packages can do with iterative solvers and large numbers of equations makes my direct skyline solver show its age.

Regards

John

2 Mar 2008 12:24 #2867

I thought that these days the major FE systems use a multi-frontal solver when confronted with large problems to solve, as here with Lusas,

see:-

http://www.lusas.org/products/options/fast_solvers.html

It was my understanding that both Abaqus and MSC/Nastran also use this solver technology (though I may be wrong); the output from these packages makes no mention of iterations when solving large tet meshes linearly.

2 Mar 2008 10:26 #2868

Hi John H,

There's no doubt that the right algorithm helps enormously. Depending on how much of a software developer you are, and what resources you can call on, it may be worth reprogramming to make old code work better. Me, I'm an academic, and I am on a 2-year mission to turn my programs (which work fine in DOS!) into a Windows application. Speed isn't an issue for me: I was getting 3-hour run times on a mainframe in 1973; now I can't even time it with a stopwatch!

Sometimes, one is stuck with the Fortran code one has (and the compiler one uses, and, I suppose, the machine architecture). Then the problem is what can be done that takes little time and little money, and produces the most benefit for the least of both. I'm not much of a fan of getting a faster CPU. Since I build my own computers, it almost always means a new mainboard and RAM, sometimes a video card and other odds and ends. If there is a faster CPU for the rig one has, then it is usually a bit disappointing how little improvement you get.

Dual or multiple core CPUs only speed up multithreaded applications.

The cheapest options seemed to me to be in the arena of speeding up a simple routine which is called zillions of times, or getting improved hard disk performance - given what John C told us about his problem. On reflection, the RAID array of fast disks seems the best and cheapest option here .... so much so that I think I'll do it for myself!

Regards

Eddie

2 Mar 2008 11:15 #2869

Hi Eddie and John C,

I know that this thread is really about solvers, but it may be of interest that a post-processor which I originally wrote in the early 80s on a VAX with Tektronix displays is still used today by myself and work colleagues, compiled with FTN95 and running on 32-bit Windows machines these days. We find that, having access to a 64-bit Linux machine with eight processors and 32 GB of RAM for solving, the limiting factor is the size of model that the PCs can handle. We have two very well known FE solvers on the Linux machine, both of which come with their own pre- and post-processors running on PCs. We don't use their pre-processors to generate the models. Similarly, we don't use their post-processors either! The simple fact is that for the larger models the commercial PC-based programs both crash with memory exhaustion problems, yet my post-processor compiled with FTN95 runs just fine!

This to me more than justifies the requirement for writing your own code and confirms that FTN95 produces very capable programs.

6 Mar 2008 1:09 #2888

John H,

Like you, I've always pursued writing my own code for FE and logistics simulation. There are always niches where this approach works better.

With regard to the multi-front solvers, I would expect this relates more to a multi-front reordering, which suits the spoked-wheel style of problem. I once worked with a boundary element package which had 4 iterative solvers, and for this, and I expect for most 3D solid problems, an iterative solver appears more suited. I did most of my development work on solvers in the 70s and 80s, so there are a lot of new developments I don't know about, although I'm not aware of any significant new direction in solving large sets of linear equations. I've never forgotten the post-grad who tried to sell everyone iterative solvers and blew the department's mainframe budget on a problem that never stopped and had the time limit turned off.

We were recently using ANSYS for the analysis of a rail track support / acoustic isolator made of steel, rubber and HDPE. The vastly different stiffness moduli do not appear to suit an iterative solver.

I see that CSI-SAP2000 now has a 'sapfire' solver. With the increasing commercialization of these solvers, it's difficult to know what works best.

Given the large range of projects we have to do, it is hard to find the time to pursue any one to the forefront of today's technology.

Certainly the direct-solution Choleski profile solver relies on two main vector operations: the dot product, A(i) = A.B, and vector subtraction, A = A - factor x B, both of which are well suited to parallel or multi-processor optimization. I can't see an easy way of achieving that with Fortran. The dot_product intrinsic, if limited to vectors of the same kind, would be a clearly defined procedure that could use the new multiple cores. There should be an API for this!
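Something along these lines is what I have in mind. It is only a sketch, and it assumes an OpenMP-capable compiler (I am not sure whether FTN95 supports the directives), but it shows how the two kernels could be shared between cores:

    ! the two kernels of a profile Choleski reduction, parallelised with OpenMP
    subroutine kernels(n, a, b, factor, s)
      implicit none
      integer, intent(in)             :: n
      double precision, intent(in)    :: b(n), factor
      double precision, intent(inout) :: a(n)
      double precision, intent(out)   :: s
      integer :: i

      s = 0.0d0
    !$omp parallel do reduction(+:s)            ! dot product  s = A.B
      do i = 1, n
        s = s + a(i)*b(i)
      end do

    !$omp parallel do                           ! vector update  A = A - factor*B
      do i = 1, n
        a(i) = a(i) - factor*b(i)
      end do
    end subroutine kernels

Whether this pays off depends on the vector lengths; for short columns the threading overhead could easily swamp any gain.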

Regards

John C
