forums.silverfrost.com

jjgermis

A while ago I was looking for a matrix spasity viewer outside Matlab. Furthermore, the result in eps format will be a help.

The following example code write the spasity pattern to an eps file - some usefull tipps on eps/ps can be found at: PS-FORMAT

jjgermis · Posted: Mon Aug 13, 2012 12:17 pm Post subject:

Bandwidth and profile reduction algorithms are indeed a very interessing topic. Often the memory management is explained. If I compare different algorithms, how can I check the memory required.

Question: Are there some ways that FTN offers which make it possible to "look" inside the memory - to be honest, I have no idea where to start.

Some useful websites which simplifies the study:

1.) Test vectors by Everstine;
2.) published results in his paper; and
3.) bandwidth and profile reduction algorithm

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

If you are interested in profile reduction algorithms, then I would suggest that for tree based approaches you should look at the method proposed by Scott Sloan. It was proposed in the mid 80's, while I think RCM is a simpler algorithm, from 70's.
If you are looking for a simple approach, then RCM is very easy to code, once you have a tree definition of the conectivity.
The Sloan algorithm is a two stage approach, which first searches for extreme nodes then develops a node order to minimise the profile.

The key to efficiency is a suitable data structure for defining the connectivity. Stick to simple indexed tables of joints (nodes) and elements (links) and avoid complex tree TYPE definitions. The joint table and element table are two complimentary structures.

I also use a very simple non-tree based alternative approach which works well for space structures: simply sort the nodes in the X, Y or Z direction and test that node order ( X is X + .01 Y + .01 Z; Y is Y + .01X + .01 Z...) It is a good (robust) backup that covers a lot of the failings of tree based approaches, especially when highly connected nodes are involved or large variation in element sizes (local mesh refinement).

The test of optimisation is based on either:
1) minimising the storage size of the matrix, or
2) minimising the operations count for reducing the matrix.
Both are similar.

The hard part is generating a possible node order, then it is very easy to apply the size test. That is why the sort approach is so good.

John

jjgermis · Posted: Tue Aug 14, 2012 8:37 am Post subject:

John, thanks for the advice. I also have a version by Sloan - he actually uses the same matrices (see Table II) as mentioned in my previous post. This makes it very effective to compare the different approaches.

I think to compare the time will be quite easy. However, how do I compare the actual memory required by the processor?

Do you have a way to visualise the matrices? I use the post script method mentioned above - is very fast and ready to insert in my documents. Matlab offers the spy function. However, this requires an extra script (though easy) and Matlab takes much longer to display the graphics than Ghostview for example.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

You could draw the matrix, using a %gr window, drawing a vertical line for each equation for the profile. This would scale easier.
While the matrix can initially be sparse between the profile and the diagonal, it will usually fill up during the reduction. My field is structural / geotechnical engineering for 3d beam, shell and solid element combinations, with typically about 100k to 200k equations. Other field models might be different.
Drawing will show the profile variation. You could draw the profile and the initial non-zeros to see what it looks like.
I hope this answers your question, although I am confused by your need to look at memory or "actual memory required by the processor".
Isn't memory required = the count of numbers inside the profile x bytes per number ?
If you are looking to problems that exceed the memory available, then the out of core profile crout solver is fairly robust. See "Large Capacity Equation Solver for Structural Analysis" by Mondkar and Powell, Computers and Structures 1974 or more recent versions by RL Taylor. This is now all dated but very robust. As you can see I did most of this in late 70's and 80's.

John

jjgermis · Posted: Tue Aug 14, 2012 12:34 pm Post subject:

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

I'm not sure if I agree with where you are going with this reduced bits approach.
For the profile calculation: If you have more than 32768 nodes or elements then you need integer*4 for your tables to be able to record the tree structure.
If you have a matrix with more than 32768 elements then you need integer*4 to be able to calculate the addresses.
Further, If you have a matrix with more than 2^31 elements then you need integer*8 to be able to calculate the addresses. (You need to be able to calculate a bad profile which can be worse than 2^31 elements)
I use integer*4 arrays for the bandwidth minimiser and integer*8 for the matrix size calculation.
For the matrix storage: the size of each number in the matrix depends on the precision required. It would have to be a very well behaved set of equations to be able to use real*4. I use real*8 for all calculations. (I tried real*10 once but did not get significantly better results, as I was not able to use real *10 for all element matrices.)
I'd be surprised if you could use less than integer*4 for the matrix profile definition or real*8 for the matrix calculation.
The only difference in what memory means now is that the limits are a bit bigger. There is still a ceiling with 32-bit if you don't use out of core, as 2^28 is the maximum size real*8 array you can define with ftn95.
I currently have a limit at 8gb, as I still use integer*4 for the matrix addressing.

John

jjgermis · Posted: Tue Aug 14, 2012 4:15 pm Post subject:

We only use 2D field problems (eletrical). Since we have rotating objects and because of our air gap element, we get quite a large initial bandwidth and profile. In a previous thread Ihave uploaded a typical example: profiles

At present I try to break the program up in logical parts, i.e. at present I "isolate" the profile reduction and pack it into a module.

I am looking forward to finilise the comparison between Gibbs et al and Sloan.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

You can always find problem styles that are not suited to a particular ordering approach, but in general the Sloan algorithm is more reliable than either RCM or Gibbs, as it includes a weighting approach for adding nodes to the �front�.
I would recommend that you also try the sorting approach, as it is very simple to implement. It provides an ordering that is usually reasonable, for very little effort. Simply calculate the weighting for each node, which is X + .01* Y or Y + .01 * X and use an indexed quick sort, such as DSORT@. Then apply your profile test and use this as a reference to be improved.
How does the air gap element affect the connectivity? If it is a general element with broad connectivity, you might need to delay the equations associated with this interface to the end, creating a V shaped profile. With profile solvers, it is the average and not the maximum profile which is important. It all depends on how many equations define the air gap linkage between the 2 parts of the model. This is like a substructure approach for the two sections of the model.
Some ideas that might work ?

John

jjgermis · Posted: Thu Aug 16, 2012 11:03 am Post subject:

Hi John,

thanks for your advice. You have addressed some aspects which I will consider - especially to use the sorting approach as a reference.

I do this as a sort of side track. For the moment everything is working fine. For me to make improvements I first have to understand the process. For this reason I identified some parts - the renumbering of the nodes is one topic. We have little documentation of the code.

Many online examples are based on structural engineering problems - little is available for electrical machines. Even though the solving process is the same, I find it difficult to work through the examples.

We use for example the direct solver from FEAPpv

jjgermis · Posted: Fri Aug 17, 2012 9:22 am Post subject:

The FEAPpv is based on a well written book on basic FEM.

Working through the papers of Evertine, Gibbs and Sloan one needs to take care of the (exact) definitions of bandwidth, profile and wavefront. Especially the definition of wavefront is still not clear to me. All my references define the same thing in diffrent ways (this is also a comment in the paper by Evertine).

One can define a lower and upper bandwidth for the lower and upper triangle respectively. The wavefront seems to me like the "row bandwidth"
of the upper triangle.

jjgermis · Posted: Sun Aug 19, 2012 10:55 am Post subject:

The little eps-format "viewer" works fine and saves a lot of time. Using LaTeX I can write the necessary files to include the graphics (including the eps files) without any effort - no additional scripts of programs.

The figure below is automatically generated. This makes it a little bit easier to visualize the results Very Happy

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

Thanks for the book pdf. It is one of the main texts for FE. I have the 3rd edition.

The reference to �bandwidth, profile and wavefront� were the three main forms of equation storage for direct equation solution in 70's and 80's. My preference is for profile, which is always better than bandwidth (fixed band) and simpler than wavefront. It is well suited to vector instructions but not easy to program for multi-processor approach. RCM is best suited to fixed bandwidth, while Sloan is not, as the approach can provide a larger maximum bandwidth, but smaller average profile.

I noticed that you also posted "*Invalid* dimension with assumed size array" which uses a 2D array which implies fixed bandwidth. You would be much better to store the upper half of the matrix as a 1D vector with a index that points to the diagonal. eg:
integer*4 matrix_pointer(0:neq)
real*8 matrix_values(*)
where matrix_pointer(ieq) is locational of diagonal in matrix_pointer and
matrix_pointer(ieq)-matrix_pointer(ieq-1) is the profile size for equation ieq.
You can define these as allocable arrays in a module, defining matrix_pointer when you know the number of equations and matrix_values at the end of the re-ordering.
(this assumes that the matrix is symmetric, although could be modified for a non-symmetric matrix that doesn't require re-ordering (partial pivoting) to improve the solution stability.

Your use of the LaTeX graphics implies to me you have a small problem to solve (20 nodes ?) If it is that small, I'd use a sort and not worry about all the code for Sloan. Your previous mesh plots were much bigger problems ??
The graphics of before and after for each approach does help to understand their relative strengths and weaknesses.

John

jjgermis · Posted: Mon Aug 20, 2012 7:35 am Post subject:

Hi John,

thanks for the 1D storage advice. Sloan uses the same storage method in is publication. I was actually considering to implement it - need it anyway to test the algorithm of Sloan. However, Gibbs et al requires a two-dimensional (NDSTK) as explained in their code:

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

Last year (10-11), I experimented with transfering most of my large arrays to allocatable in a module and found this to work very well.
Now, with FTN95 Ver 6.3 and Win 7_64, I can compile with /debug and get 3.7gb of available memory, which works very well.
I had to map out where the arrays could be allocated and then convert to either using the arrays via USE module or transfering them via subroutine calls.
I could send some examples if you require, as I'd recommend the change.
It all depends on how you can control the restructure of the program approach.

John