mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Wed Mar 22, 2017 7:06 pm Post subject: Re: |
|
|
DanRRight wrote: |
2) Also, how about your previous assessment that LAIPE2 is always slower than LAIPE1, while the test shows the opposite?
|
That assessment remains more useful. The results in the previous two posts, while falling into the same category as your "Gaussmark", pertain to dense random matrices. My previous statement was based on sparse symmetric matrix runs.
Quote: |
3) How about using 64bit Laipe with 64bit FTN95 ?
|
Works fine, if you build a DLL first with Gfortran/GCC.
Quote: | 4) Do you know if MKL has a block matrix solver for a symmetric matrix like the one below? The arrows show the current width of the block. In Laipe this is Decompose_VAG_8
|
MKL/Pardiso covers that type. It classifies matrices by symmetric/unsymmetric and positive definite/indefinite; it does not care whether a matrix is block-sparse or ragged-sparse. Laipe will probably be at a disadvantage with matrices such as https://www.cise.ufl.edu/research/sparse/matrices/Cannizzo/sts4098.html , where there are large empty regions between the diagonal and the opposite corners (N.E. and S.W.). Laipe input is almost the same as for a full, dense matrix, whereas with other sparse packages you do not have to fill in the zeros. |
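To put rough numbers on the storage argument, here is a minimal Python sketch that counts stored values for the lower triangle of a symmetric matrix under three schemes: a full dense array, an envelope (each row stored from its first nonzero up to the diagonal), and one value per actual nonzero. The function name `storage_counts` and the test pattern are made up for illustration; the pattern only imitates the "far from the diagonal" structure mentioned above and is not the sts4098 matrix, and the envelope scheme is a generic one, not Laipe's documented input format.

```python
def storage_counts(n, nonzeros):
    """Count stored values for the lower triangle of a symmetric n-by-n matrix.

    nonzeros: set of (row, col) pairs with row >= col (lower triangle only).
    Returns (dense, envelope, coordinate) counts.
    """
    dense = n * n                      # full square array, zeros included
    # Envelope: each row keeps everything from its first nonzero to the diagonal.
    first_col = list(range(n))         # default: only the diagonal is stored
    for r, c in nonzeros:
        first_col[r] = min(first_col[r], c)
    envelope = sum(r - first_col[r] + 1 for r in range(n))
    coordinate = len(nonzeros)         # one value per actual nonzero
    return dense, envelope, coordinate


# Hypothetical pattern: the diagonal plus one entry n/2 columns to the left
# of the diagonal in every row of the lower half.
n = 1000
half = n // 2
pattern = {(i, i) for i in range(n)}
pattern |= {(i, i - half) for i in range(half, n)}

dense, envelope, coordinate = storage_counts(n, pattern)
print(dense, envelope, coordinate)
```

For this pattern the envelope has to hold 251,000 values to represent 1,500 nonzeros, which is the kind of zero fill-in a general sparse format avoids.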
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Wed Mar 22, 2017 7:54 pm Post subject: |
|
|
I do not know about very sparse cases, but for a block matrix like the one above Laipe stores data only for the blocks, omitting the empty places. It packs the data contiguously into a 1D array. What about the licensing conditions of MKL/Pardiso? I suppose this sparse part of MKL is also parallel.
Also, I noticed a strange slowness at matrix size 500 for MKL and Laipe2. Is that first-time DLL loading overhead? |
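As an illustration of what "packing into a 1D array contiguously" can look like, here is a small Python sketch of generic variable-bandwidth (skyline-style) storage for the lower triangle of a symmetric matrix. It is an assumption-laden sketch, not Laipe's actual layout: the names `pack_variable_band`, `get` and `diag_pos` are invented here, and the real Decompose_VAG_8 storage convention should be taken from the Laipe manual.

```python
def pack_variable_band(rows):
    """Pack the lower-triangle rows of a symmetric matrix into one 1D list.

    rows[i] holds the entries of row i from its first stored column up to
    and including the diagonal, so len(rows[i]) is that row's bandwidth.
    Returns (values, diag_pos), where diag_pos[i] is the index in `values`
    of the diagonal entry A[i][i].
    """
    values, diag_pos = [], []
    for row in rows:
        values.extend(row)
        diag_pos.append(len(values) - 1)
    return values, diag_pos


def get(values, diag_pos, i, j):
    """Return A[i][j] for j <= i, or 0.0 when (i, j) lies outside the stored band."""
    start = diag_pos[i - 1] + 1 if i > 0 else 0
    width = diag_pos[i] - start + 1
    first_col = i - width + 1
    if j < first_col:
        return 0.0
    return values[start + (j - first_col)]


# Made-up 4x4 example: row 2 stores only its diagonal, row 3 reaches column 0.
rows = [
    [4.0],
    [1.0, 5.0],
    [6.0],
    [2.0, 0.0, 3.0, 7.0],
]
values, diag_pos = pack_variable_band(rows)
print(values, diag_pos)
```

Only the entries inside each row's band are stored, so the empty places to the left of the band cost nothing, while zeros inside the band (such as the `0.0` in row 3) still have to be stored.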
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Wed Mar 22, 2017 8:25 pm Post subject: Re: |
|
|
DanRRight wrote: | What about the licensing conditions of MKL/Pardiso? I suppose this sparse part of MKL is also parallel.
|
https://software.intel.com/en-us/performance-libraries
Quote: |
Also, I noticed a strange slowness at matrix size 500 for MKL and Laipe2. Is that first-time DLL loading overhead? |
Possible. You could circumvent the timing error caused by DLL load by making a dummy call to some DLL routine before starting the timer.
There are plenty of strange things with Laipe. I have noticed cases where run time increases as the number of threads is increased. |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Wed Mar 22, 2017 10:18 pm Post subject: |
|
|
I tried your 64-bit Laipe DLL with the dense matrix code and it works without a single hiccup.
I also installed MKL and want to try it with your Lapack/MKL program above first. How does it have to be compiled with FTN95?
The work you have done in adopting external libraries has great benefits for all FTN95 users. Everyone can now use the parallel MKL libraries, current and future updates of LAIPE, and basically a lot of other software via DLLs. The prize I promised for that goes to you, with many thanks. |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Thu Mar 23, 2017 1:07 am Post subject: Re: |
|
|
DanRRight wrote: | I also installed MKL and want to try it with your Lapack/MKL program above first. How does it have to be compiled with FTN95?
|
As follows:
Code: | ftn95 tlapack.f90 /64
slink64 tlapack.obj <path to MKL directory>\compilers_and_libraries_2017.2.187\windows\redist\intel64\mkl\mkl_rt.dll /file:tlapack
path %path%;<path to directory containing 64-bit MKL_RT.DLL>
tlapack
|
|
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Thu Mar 23, 2017 2:07 am Post subject: |
|
|
Can you please send the exact BAT file? I have a syntax error somewhere.
I tried simply copying all the MKL DLLs into the same directory and linking them with SLINK64, but at run time it still fails to load MKL_Intel_thread.dll or something else. |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Thu Mar 23, 2017 2:40 am Post subject: |
|
|
I typed the commands in a command window -- no batch file. I do not install MKL in the default location, since I keep multiple versions for use as needed.
If you have Intel Parallel Studio installed, open a compiler command window for x64, and you will find all the MKL DLLs in %root%\redist\intel64\mkl.
I have never installed MKL by itself, so I don't know how the installer sets up the MKL environment in that case. Nevertheless, there is probably a setup file called mklvars.bat or something similar, which should work for you.
If all that fails, post the error message here or in the MKL forum. |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2559 Location: Sydney
|
Posted: Thu Mar 23, 2017 4:00 am Post subject: |
|
|
mecej4,
Thanks for demonstrating the possibilities. This gives me some useful pointers as to how to link into FTN95 /64.
My review of the published Laipe2 tests suggests that they do not use SSE or AVX instructions. I don't know why this would be the case, although my suspicion is that the added speed of AVX instructions would expose the memory-bandwidth bottleneck, which must be a significant issue when many threads are used. |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Thu Mar 23, 2017 11:25 am Post subject: |
|
|
The 64-bit Laipe2 library that is included with Equation.com's recent GFortran distributions (6.2, 6.3) contains SSE2 instructions. Which version did you test? |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Thu Mar 23, 2017 2:01 pm Post subject: |
|
|
Mecej4, from the Intel link you mentioned above, out of the 4-5 different software packages I installed only Intel MKL 2017 update 2. I was really surprised that Intel offers all of them for free. I may try a different package version; I saw that this specific update had linking problems for some people.
I do not see the MKL directory \Program Files (x86)\IntelSWTools in System/Environment Variables/path. Maybe it needs a reboot, but I cannot reboot now... |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Thu Mar 23, 2017 2:10 pm Post subject: |
|
|
That MKL package is fine. Intel's releasing MKL in a "community" edition is a recent development.
If you do not have some version of Visual Studio 2015 installed, you will not have the support libraries and DLLs needed to make MKL work. My usual suggestion is to
(i) install VS 2015 community edition, if needed
(ii) test that you can build some C programs using VC and only then
(iii) install MKL or Parallel Studio. |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Fri Mar 24, 2017 12:15 am Post subject: |
|
|
I tried reinstalling MKL, installing DevStudio, uninstalling, taking a different version of MKL that was not free and required a license key, etc. etc., until I realized that none of that was needed and I just set the path manually... hell knows where my damn error was... Devilry. Anyway, I returned to your initial bat file.
Also, this Intel stuff is a huge mess: different versions handle environment variables differently, and their own tests do not work, complaining of a missing LIB here and a missing DLL there...
Code: |
ftn95 tlapack.f90 /64 /debug /check /free /err /set_error_level error 298 /no_truncate /zeroise >a_FTN95___
slink64 tlapack.obj "c:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.1.143\windows\redist\intel64\mkl\mkl_rt.dll" /file:tlapack.exe >a_link___
|
Same DIR was added to the path in the System/Environment Variables
Last edited by DanRRight on Fri Mar 24, 2017 2:19 am; edited 6 times in total |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2559 Location: Sydney
|
Posted: Fri Mar 24, 2017 12:15 am Post subject: |
|
|
mecej4 wrote: | Which version did you test? |
I am comparing equation.com's reported single-thread performance on Intel Xeon and AMD processors with what I can achieve on my Intel i5 and i7 processors using basic DO loop code or the MATMUL intrinsic. The only way I could get that performance would be to exclude vector instructions. Laipe2 appears to be too slow. |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Fri Mar 24, 2017 11:57 am Post subject: Re: |
|
|
JohnCampbell wrote: |
I am comparing equation.com's reported single-thread performance on Intel Xeon and AMD processors with what I can achieve on my Intel i5 and i7 processors using basic DO loop code or the MATMUL intrinsic. The only way I could get that performance would be to exclude vector instructions. |
Would it be correct to conclude that you did not actually run programs using Laipe1 or Laipe2, but are estimating what times Laipe might yield on your computer(s)?
Laipe2-64 bit definitely uses SSE2 for floating point operations. Here is proof: The EXE produced by Gfortran 6.2-64-bit for Dan's dense random square matrix problem (Laipe2 static library linked) yields this:
Code: |
s:\sparse\LAIPE>objdump -d a.exe | findstr /i fadd
40fadd: 48 8b 84 24 a8 01 00 mov 0x1a8(%rsp),%rax
42f5b0: d8 05 a2 dd 01 00 fadds 0x1dda2(%rip) # 0x44d358
4305c0: d8 05 92 cd 01 00 fadds 0x1cd92(%rip) # 0x44d358 |
Surely, one cannot implement Gaussian elimination with just two FADD instructions? Furthermore, the benchmark is for double precision matrices, and what you see here are single precision FADDS instructions, probably from some RTL routine.
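The same check can be made a little more systematic by counting mnemonics instead of grepping for a substring, which also avoids false hits such as the address `40fadd` in the listing above. The following Python sketch assumes the usual tab-separated layout of GNU `objdump -d` output (address, raw bytes, instruction); the names `classify`, `X87` and `SSE_DOUBLE` are invented for illustration, and the last two sample lines are made up rather than taken from the Laipe2 executable.

```python
import re
from collections import Counter

# x87 arithmetic mnemonics such as fadd, fadds, faddp, fmul, fsubr, fdiv.
X87 = re.compile(r"^f(add|sub|mul|div)")
# Scalar/packed double-precision SSE2 (and AVX 'v'-prefixed) arithmetic.
SSE_DOUBLE = re.compile(r"^v?(add|sub|mul|div)[sp]d$")


def classify(disassembly):
    """Count x87 and SSE/AVX double-precision arithmetic instructions."""
    counts = Counter()
    for line in disassembly.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or not parts[2].strip():
            continue                      # label, blank, or byte-continuation line
        mnemonic = parts[2].split()[0]
        if X87.match(mnemonic):
            counts["x87"] += 1
        elif SSE_DOUBLE.match(mnemonic):
            counts["sse_double"] += 1
    return counts


# First two lines are from the post; the last two are illustrative only.
sample = (
    "  40fadd:\t48 8b 84 24 a8 01 00 \tmov    0x1a8(%rsp),%rax\n"
    "  42f5b0:\td8 05 a2 dd 01 00    \tfadds  0x1dda2(%rip)\n"
    "  401a10:\tf2 0f 58 c1          \taddsd  %xmm1,%xmm0\n"
    "  401a14:\tf2 0f 59 c2          \tmulsd  %xmm2,%xmm0\n"
)
counts = classify(sample)
print(dict(counts))
```

Feeding it the full `objdump -d a.exe` output would give a count of x87 versus SSE2 double-precision arithmetic; a handful of x87 instructions next to many `addsd`/`mulsd` would be consistent with the argument made here.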
Here are more timing results, with Dan's timings added for comparison:
Code: |
       ------- Two-core i5-4200U ------- i7-4770K
       64-bit   32-bit   32-bit   64-bit   32-bit
   N      MKL      MKL   Laipe1   Laipe2   Laipe1 (DanR)
----    -----    -----   ------   ------   ------
 500    0.110    0.000    0.062    0.031
 750    0.000    0.016    0.141    0.110
1000    0.031    0.047    0.328    0.219     0.09
1250    0.047    0.078    0.610    0.407
1500    0.078    0.093    1.000    0.656
1750    0.078    0.125    1.594    1.015
2000    0.110    0.203    2.285    1.469     0.75
2250    0.172    0.250    3.401    2.125
2500    0.234    0.375    4.509    2.875
2750    0.344    0.484    6.547    3.735
3000    0.375    0.641    7.625    4.796     2.44
3250    0.468    0.828   10.283    6.328
3500    0.578    1.054   12.580    7.734
3750    0.703    1.223   15.933    9.609
4000    0.892    1.422   19.148   11.422     5.90
|
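To compare the columns on a common scale, the timings can be converted to an approximate floating-point rate. The Python sketch below uses the standard operation count of about (2/3)N^3 for Gaussian elimination (LU factorisation) of a dense N x N matrix and assumes that the reported numbers are wall-clock seconds dominated by the factorisation; neither assumption is stated in the thread, and the helper name `lu_gflops` is made up.

```python
def lu_gflops(n, seconds):
    """Approximate GFLOP/s for dense LU, using the (2/3) * n**3 flop count."""
    return (2.0 / 3.0) * n**3 / seconds / 1e9


# N = 4000 row of the table above (i5-4200U columns).
timings_n4000 = {
    "MKL 64-bit": 0.892,
    "MKL 32-bit": 1.422,
    "Laipe1 32-bit": 19.148,
    "Laipe2 64-bit": 11.422,
}
for name, t in timings_n4000.items():
    print(f"{name:14s} {lu_gflops(4000, t):6.1f} GFLOP/s")
```

The same function also gives a quick sanity check on scaling: doubling N should multiply the time by roughly 8 if the rate stays constant.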
Last edited by mecej4 on Sun Aug 11, 2019 9:04 am; edited 2 times in total |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1892
|
Posted: Fri Mar 24, 2017 12:07 pm Post subject: Re: |
|
|
DanRRight wrote: |
Also, this Intel stuff is a huge mess: different versions handle environment variables differently, and their own tests do not work, complaining of a missing LIB here and a missing DLL there...
Code: |
ftn95 tlapack.f90 /64 /debug /check /free /err /set_error_level error 298 /no_truncate /zeroise >a_FTN95___
slink64 tlapack.obj "c:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.1.143\windows\redist\intel64\mkl\mkl_rt.dll" /file:tlapack.exe >a_link___
|
Same DIR was added to the path in the System/Environment Variables |
Dan, did you finally get the program built and did you run it? You redirected the error messages to files, and forgot to post the contents of those files!
For the purposes of this test, you do not need any of the compiler options that you used, especially /check and /zeroise. |
|
|
|
|