Quoted from DanRRight
Moorthy, even dense matrices of size 1000x1000 are not considered large nowadays. Such a matrix may even fit into the L2/L3 cache of your processor. What type of sparsity does your matrix have?
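The cache claim is easy to check with a back-of-the-envelope calculation (sketched in Python for brevity; the figure of 8 bytes assumes double precision, i.e. Fortran REAL*8):

```python
# Memory footprint of a dense 1000x1000 matrix in double precision (REAL*8):
n = 1000
bytes_needed = n * n * 8  # 8 bytes per element
print(bytes_needed)       # 8,000,000 bytes, i.e. about 7.6 MiB --
                          # small enough for a large modern L3 cache
```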
As to inverting the matrix, BLAS is probably the best choice, but it all depends on the type of sparse matrix. For dense matrices, BLAS gives a speedup over, for example, the Crout method by a factor of about 1.3. I have a Fortran example, with LAPACK and BLAS sources, that compares different methods of matrix inversion; it was made by DaveGemini and is called testfpu. Try searching for that name and you may find it; if not, let me know.
If the matrix is larger than that, then parallel methods are recommended. Why not use the extra cores? Even cellphones are multi-processor now. You can see an example of a parallel library's speedup versus some other methods in my post today in another thread in 'General'.
Hi Dan
Thanks for the elaborate reply. Please send me testfpu, as I could not download testfpu.f90; DaveGemini's FTP is not reachable. I will try it out.
By sparse I mean that the non-zero elements are only about 5% of the whole matrix, including the non-zero diagonal. Hence, exploiting the sparse structure in both storage and processing is required. In such a situation, finding the inverse of the matrix to solve for the vector x in
Ax=B
is not the right approach. LU decomposition by triangular factorization, optimal ordering, and direct solution methods make it easy to compute the solution without losing the advantage of sparsity. Otherwise, inverting the matrix in the conventional way produces a coefficient matrix with essentially 100% of its elements non-zero, causing heavy performance deterioration and time delays. Hence, faster methods based on the above techniques are helpful to speed up the computation without compromising the accuracy of the results.
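The point about factorizing instead of inverting can be illustrated with a banded case, where LU factorization preserves the sparsity pattern while the explicit inverse is fully dense. Below is a minimal sketch (in Python for brevity, though the thread's codes are in Fortran) of the Thomas algorithm, which is exactly LU factorization plus forward/back substitution specialized to a tridiagonal system; the 3x3 system at the end is a hypothetical example, not taken from the discussion:

```python
def solve_tridiagonal(lower, diag, upper, b):
    """Solve Ax = b for tridiagonal A via LU factorization (Thomas algorithm).

    lower: sub-diagonal (n-1 entries), diag: main diagonal (n entries),
    upper: super-diagonal (n-1 entries), b: right-hand side (n entries).
    Runs in O(n) with no fill-in -- the dense inverse is never formed.
    """
    n = len(diag)
    d = diag[:]
    rhs = b[:]
    # Forward elimination: compute the L factor and apply it to b in place
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Back substitution through the U factor
    x = [0.0] * n
    x[-1] = rhs[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - upper[i] * x[i + 1]) / d[i]
    return x

# A = [[2,1,0],[1,2,1],[0,1,2]], b = [1,2,3]
x = solve_tridiagonal([1.0, 1.0], [2.0, 2.0, 2.0], [1.0, 1.0], [1.0, 2.0, 3.0])
print(x)  # approximately [0.5, 0.0, 1.5]
```

For general sparsity patterns the same idea applies, but a fill-reducing ordering is computed first so the L and U factors stay sparse; this is what the "optimal ordering" step above refers to.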
This is what I am looking at. I believe testfpu follows the conventional dense approach. Do you have any suggestions along the above lines, i.e., Fortran codes or methods that could help?
As you rightly said, parallel computation methods are very much needed for cases like the one I stated above; exploiting the cores and caches would be very useful. Going by your statistics, LAIPE is the better approach for exploiting parallel processing in these computations, but it is not in usable condition for the matrix inversion here 😦
Your discussion is very useful. Thanks.