forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Advice wanted by Newbie considering purchase
Goto page 1, 2  Next
 
forums.silverfrost.com Forum Index -> General
TonyRichards



Joined: 19 Aug 2016
Posts: 3
Location: Oxford

Posted: Fri Aug 19, 2016 6:10 pm    Post subject: Advice wanted by Newbie considering purchase

I want to continue developing Fortran applications in my retirement, but I have lost access to the Intel Composer + Visual Studio IDE that I used at work.
I am considering purchasing FTN95. Before I do, I would like to know what problems I might meet converting Fortran programs with GUIs that were developed in the Visual Studio IDE for the Intel compiler. Also, what support is there for developing mixed-language applications (Fortran and C++, Fortran and Visual Basic)?

All advice appreciated, thanks.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1683
Location: Yateley, Hants, UK

Posted: Fri Aug 19, 2016 11:11 pm

There is no better way to find out whether one of FTN95's modes is right for you than to download the Personal Edition and give it a go. Polyhedron's benchmarks may favour other compilers, but FTN95 is my compiler of choice.

Compiling and linking vanilla Fortran is simple enough on your own, but ClearWin+ has a fairly long, steep learning curve. If that were your way forward and you felt that some face-to-face tuition would help, then I'm game. Like you, I am retired, and I live about 35 miles from Oxford. PM me if you want.

Going the Visual Studio route with FTN95 isn't something I could help with (no experience).

Eddie
mecej4



Joined: 31 Oct 2006
Posts: 739

Posted: Fri Aug 19, 2016 11:25 pm

FTN95 comes with the Salford/Silverfrost C compiler SCC, and you can build mixed-language programs using the two compilers. However, there is no support yet for the Fortran 2003 C interoperability features (ISO_C_BINDING), and support for features newer than Fortran 95 is limited.

On the other hand, for Fortran 95 (or earlier) programs, FTN95 provides excellent capabilities for catching errors by compiling error-checking code into the EXE, and it comes with its own symbolic debugger and an IDE/editor (Plato) that you can use as an alternative to Visual Studio.

FTN95 programs can be linked with DLLs that were produced by the MS C compiler. The SLINK linker does not need import libraries.
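
To make the mixed-language point concrete, here is a minimal sketch of an FTN95 + SCC build. The routine names are made up, and the C_EXTERNAL declaration and the build commands are written from memory of the FTN95 documentation, so check them against the "Mixed language programming" section of the FTN95 help before relying on them.

Code:

! main.f95 -- FTN95 side. C_EXTERNAL tells FTN95 that add_one is a
! C routine; the exact form of the declaration (name aliasing and the
! VAL/REF argument attributes) is an assumption to be verified in the
! FTN95 help.
PROGRAM MAIN
  C_EXTERNAL ADD_ONE '_add_one' (VAL): INTEGER*4
  INTEGER :: n
  n = ADD_ONE(41)
  PRINT *, 'result =', n
END PROGRAM MAIN

/* add_one.c -- SCC side */
int add_one(int i) { return i + 1; }

Indicative build commands (again, to be checked against the documentation):
  ftn95 main.f95
  scc add_one.c
  slink main.obj add_one.obj -file:main.exe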
TonyRichards



Joined: 19 Aug 2016
Posts: 3
Location: Oxford

Posted: Sat Aug 20, 2016 4:36 pm

Thanks both for your replies.
Hi mecej4, I recognise you from the Intel Fortran forums!

My Intel projects with dialogs come with C++ .h and .res resource files containing the dialog details. Can these still be used, or will I have to rebuild the dialogs completely using a different resource editor (always assuming one is available that works with FTN95)?
DanRRight



Joined: 10 Mar 2008
Posts: 1647
Location: South Pole, Antarctica

Posted: Sat Aug 20, 2016 7:37 pm

Tony,

Can you take a screenshot of your most complex GUI and post it (I typically use postimage.org)? We can then tell you straight away whether it is worth keeping or better rebuilt as something new.

As to which compiler to choose, I have just a sentence or two: no other compiler in the world will save your valuable debugging time like FTN95 will. If your code, "debugged" with other compilers, is large, I bet you will find so many hidden bugs in it that you will not believe your eyes. FTN95 is a monster bug eater :)

If the code is large, or other people use it, or you want it to look nice, simple and self-explanatory, or you want it to produce stunning run-time graphics, then ClearWin+ will give you a lot of fun.

Mecej4,
The DLLs produced with Intel or any other Fortran will probably also work with FTN95. Can you check that if you have IVF?
mecej4



Joined: 31 Oct 2006
Posts: 739

Posted: Sat Aug 20, 2016 10:59 pm

DanRRight wrote:

Mecej4,
The DLLs produced with Intel or any other Fortran will probably also work with FTN95. Can you check that if you have IVF?

Sometimes that will work, but most of the time the problem is that a DLL produced with IFort will have dependencies on the Intel Fortran runtime DLLs. It may even happen that the EXE you produce and the compiler runtime depend on more than one version of the Microsoft VC DLLs.

Such problems can sometimes be overcome by carefully reordering the directories within %PATH%, but that is probably best avoided.

Tony: You may try to obtain a "hobbyist license" for Intel Fortran to help you during the transition. Please see

<https://groups.google.com/forum/#!topic/comp.lang.fortran/MkocNfmKP2M>
DanRRight



Joined: 10 Mar 2008
Posts: 1647
Location: South Pole, Antarctica

Posted: Sun Aug 21, 2016 11:40 pm

Mecej4,
I know for sure that even LIB files compiled by Intel Fortran sometimes work with FTN95 as if they were its own, although FTN95 may or may not complain during linking.

Remember the comparison of different linear algebra solvers we did a couple of years back? The sources were published here. There was a parallel library in the form of a LIB file called LAIPE, compiled with IFort, and its dense matrix solver worked fine. It successfully employed all the cores of the processor in parallel and achieved a speedup almost proportional to the core count. I bet $30 that John Campbell still cannot beat its speedup with his own OpenMP tricks :). The sparse matrix solver, though, did not link for some reason: it was missing some strange system function that no one could identify. I would appreciate it if someone here with an interest in this solver, and who has Intel Fortran, would convert this LIB into a DLL; I got a response from Intel years back that this might solve the problems with using it from FTN95 (or, I would add, from any other compiler). The author of the library, for whom this would be a one-second job, has been totally unresponsive on this matter despite my numerous requests. Here is how it could be done:

Code:

>... can IVF convert existing lib files into dll itself or I have to ask library developer to create dll from the sources?

Yes, you can, with some limitations. If the library consists of routines only,
and there is no shared data such as COMMON blocks that are expected to be used
by the "end user", then you can do this.  You will need to create a ".DEF"
file which lists all the routines to be exported from the DLL.  It is a text
file that has a line for each routine to be exported like this:

EXPORT routine1
EXPORT routine2
EXPORT routine3

You may have to experiment with the names - the case (upper/lower/mixed) must
match and there may be prefixes or suffixes that you have to consider.

Name this file with a .DEF (or .def) file type.  This is an input to the
linker, so you can, from the command line, say:

link /dll /def:mydef.def mylib.lib

This will require that the run-time libraries of the other compiler be
available and you may have to add them to the link command.

--
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH
JohnCampbell



Joined: 16 Feb 2006
Posts: 1835
Location: Sydney

Posted: Mon Aug 22, 2016 2:21 am

Quote:
Beat $30 John Campbell still can not beat its speedup


Dan, only $30!! That is not a confident bet!

The bet would be difficult to test, as it is difficult to compare like with like. (I would like to see the Laipe test using real*8, as FTN95 using SAXPY8@ can certainly run a lot faster than 3493.63 seconds for 1 thread. Actually the test has been updated to real*4, so I shall find my test results and update this post.)

I now have an !$OMP skyline solver working that performs well on my i7-4790K for large matrices (tested up to 23 GB). It uses a cache-based strategy and also attempts load balancing between the threads. It all depends on the characteristics of the skyline profile.

For a single-thread solution, I would recommend looking at the new SSE/AVX instructions in FTN95 /64 (SAXPY8@ and DOT_PRODUCT8@). They provide a significant speed-up and compare well with other 64-bit compilers, especially on realistic tests.
DanRRight



Joined: 10 Mar 2008
Posts: 1647
Location: South Pole, Antarctica

Posted: Mon Aug 22, 2016 2:59 am

John,

The test we did here was for real*8 dense matrices.
That is more or less a general-purpose choice, and of interest to everyone.

As to the skyline solver: FTN95 needs LAIPE.LIB converted to LAIPE.DLL, as I wrote above. On top of that, preliminary results give LAIPE an additional factor of 1.5-2 advantage on AMD processors. So with just $30 I'm saving you from a harder beating :)

You can download real*8 and 64-bit tests of skyline solvers from equation.com and compare with yours. Or ask the developer for the original source code of the test.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5044
Location: Salford, UK

Posted: Mon Aug 22, 2016 7:33 am

Tony

If you do try FTN95 and begin to get into ClearWin+, then it would be worth looking at winio@ with %di. This potentially allows you to use an existing dialog resource script, created via Visual Studio, in ClearWin+ code.

After installing FTN95, you will find sample code for this approach in C:\Users\xxx\Documents\FTN95 Examples\clearwin\di.
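
For anyone curious what that route looks like, here is a rough sketch. The dialog name and control are made up, and the precise %di argument convention should be taken from the di example directory rather than from this post.

Code:

! A dialog defined in a resource script (e.g. exported from the Visual
! Studio dialog editor) might contain something like:
!     my_dialog DIALOG 0, 0, 180, 90
!     BEGIN
!       DEFPUSHBUTTON "OK", IDOK, 60, 60, 50, 14
!     END
! The winio@ call below is a sketch only; exactly how %di names the
! resource and wires up control callbacks is shown in the di example.
INTEGER :: i
INTEGER, EXTERNAL :: winio@
i = winio@('%di[my_dialog]')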
JohnCampbell



Joined: 16 Feb 2006
Posts: 1835
Location: Sydney

Posted: Mon Aug 22, 2016 1:47 pm

Dan,

I did a run of real*8 matrix multiply for A(15000,10998) and B(10998,12000). I used two algorithms, both based on a DAXPY-style subroutine approach, with the second adding a cached strategy. This is a good approach for large arrays.
I ran the test on my i7-4790K with 1 to 8 threads selected. It has 4 cores and uses hyper-threading to provide two threads per core.

The results are elapsed time in seconds for the 3 tests x 1:8 threads selected:

Code:
Threads   Laipe (s)   gFortran (s)   gFortran + cache (s)
   1       3493.63      1054.73          589.03
   2       1730.18       560.52          308.41
   3       1151.58       383.76          207.92
   4        865.85       369.88          161.75
   5        691.38       380.11          200.16
   6        580.70       399.97          192.53
   7        497.99       365.84          191.86
   8        434.81       429.91          192.43

The four columns are:
1 : number of threads
2 : elapsed time for the real*4 calculation, from the Laipe website
3 : elapsed time using gFortran on the i7-4790K
4 : elapsed time using gFortran on the i7-4790K with the cache strategy
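
For readers following along, the DAXPY-style product can be sketched as below. This is not John's code (his cache strategy is more elaborate); it just shows the shape of the algorithm, with each column of C accumulated as a sum of scaled columns of A.

Code:

subroutine daxpy_matmul (a, b, c, m, n, p)
  ! Computes C = A*B by columns: C(:,j) = sum over k of A(:,k)*B(k,j).
  ! A cache strategy would additionally block the j and k loops so the
  ! active columns of A stay resident in cache.
  implicit none
  integer, intent(in)  :: m, n, p
  real*8,  intent(in)  :: a(m,n), b(n,p)
  real*8,  intent(out) :: c(m,p)
  integer :: j, k
!$OMP PARALLEL DO PRIVATE(j,k)
  do j = 1, p
     c(:,j) = 0.0d0
     do k = 1, n
        c(:,j) = c(:,j) + a(:,k)*b(k,j)   ! the daxpy: y = y + alpha*x
     end do
  end do
!$OMP END PARALLEL DO
end subroutine daxpy_matmul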

What I would like to point out is that:
1) the results I have presented are much faster than the Laipe solution, and
2) my results do not show a uniform efficiency.

Calculations using large arrays and OpenMP suffer from a cache-to-memory bottleneck, which is a significant problem. Managing the cache for multiple threads becomes an even larger problem.
The i7-4790K is claimed to support 8 threads on 4 cores using hyper-threading. These results hopefully show how difficult that can be.
Note that 4 threads perform very well with my algorithm, but performance does not continue to improve with more threads. I expect the problem is in the way L3 cache is shared among the threads. The effect varies depending on the type of calculation. Here, all calls to DAXPY are for the same value of n=15000, which may be why the performance tails off.

A single-thread run using FTN95 and SAXPY8@ would give a similar result, which matches the performance Laipe reports with 6 threads.

I hope you find these interesting. I will collect my bet the next time I am at the "North Pole".

John
DanRRight



Joined: 10 Mar 2008
Posts: 1647
Location: South Pole, Antarctica

Posted: Mon Aug 22, 2016 11:43 pm

Not so fast, John! First, you are comparing apples to oranges. Please do all the tests on the same computer.

Also, I am not sure matrix multiply is the test we should run. It is usually never the slowest operation in linear algebra, and it is only relatively slow in your particular case because of the large matrix size. Run the matrix equation Ax=b of common interest, with dense, block or skyline shapes. These are what consume the CPU time of 99.99% of people for days, weeks and months of continuous calculation, because the cost of the simulations grows super-linearly with the number of nodes.
JohnCampbell



Joined: 16 Feb 2006
Posts: 1835
Location: Sydney

Posted: Tue Aug 23, 2016 12:09 am

Dan,

Why all the new rules!
I don't have access to the type of computer Laipe used; it actually sounds much better than an i7-4790K.
I compared against the Laipe calculation, although they used real*4 and I used real*8, which again should be a disadvantage to me.

The basic comparison is:
for a single thread, Laipe takes 3494 seconds and the cached version 589 seconds, which is nearly 6x faster;
at 4 threads, the cached version takes 162 seconds, while Laipe requires 23 threads to achieve the same performance.

My results for this test show that above 4 threads hyper-threading is not working. I am not sure why. It could also relate to my assumptions about sharing the L3 cache, which may need improving. My approach is still giving a good result.

While Laipe shows very good efficiency in using multiple threads, it does not appear to be getting the vectorising efficiency that can easily be obtained from other compilers, including FTN95. It is not about thread efficiency but overall performance.

My skyline solver also uses OpenMP and a cache strategy. It too works very well.

I am considering coming to the "North Pole" to collect.

John

PS: I understand L1:L2:L3 cache sharing and hyper-threading about as well as dark matter! Any explanation I have read is incomprehensible so if someone has a clear understanding I would like to read it. It appears to be a mix of technical and marketing, and there is a lot of that around.
DanRRight



Joined: 10 Mar 2008
Posts: 1647
Location: South Pole, Antarctica

Posted: Tue Aug 23, 2016 4:18 am

Because we are comparing LAIPE and your methods, John, not different computers. So take the same tests with the same precision, run both on the same computer, and then ... when you come, we will collect at your almost-South Pole. Is beer OK there? Or 5-10 year old cognacs?

Take a real linear algebra AX=B test, not one of its tiny building blocks (auxiliary subroutines, general-purpose utilities, system-level utilities, supplemental material, etc.). What you are comparing against now is not even in the LAIPE library (!). Not only are you not comparing apples to apples, you are comparing apples to vacuum :D

There is little sense in running matmul for comparison: if the matrix fits in the cache, matmul goes very fast and the overhead of parallelisation makes the parallelisation useless; if the matrix is too large, then we are dealing with the bottleneck of the memory subsystem and the tricks of better cache utilisation, not the speed of the compiler, the processor, or the method of solution.

The test on the equation.com website was a demonstration of how well LAIPE2 compiled with gFortran (which also means it is much slower than the original LAIPE compiled with IFort or Lahey) handles multithreading on a very slow server, not of how fast it is. I did the same in the past to show that FTN95 is the best in the world with respect to scaling with the number of threads doing FP calculations: we have only 4 FP cores, but the compiled code runs as if there were 8. There I used a = log(exp(a)) as an arbitrary workload:

http://forums.silverfrost.com/viewtopic.php?t=2534&postdays=0&postorder=asc&highlight=net+paralell+parallel&start=0
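
A minimal probe in the spirit of that test might look like the sketch below (written with OpenMP for portability, as in John's runs; it is not the code from the linked thread). The reduction keeps the compiler from optimising the loop away.

Code:

program fp_scaling
  use omp_lib
  implicit none
  integer :: i
  real*8  :: s, t0, t1
  s  = 0.0d0
  t0 = omp_get_wtime()
!$OMP PARALLEL DO REDUCTION(+:s)
  do i = 1, 50000000
     s = s + log(exp(1.0d0 + mod(i,7)))   ! the a = log(exp(a)) workload
  end do
!$OMP END PARALLEL DO
  t1 = omp_get_wtime()
  print *, 'threads:', omp_get_max_threads(), ' elapsed (s):', t1 - t0
end program fp_scaling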

Here is the LAIPE content. You can play with matmul (though I do not see it in the documentation), but then take good old LAIPE compiled with Intel Fortran and choose something serious from it to run:

"- Constant-Bandwidth, Symmetric, and Positive Definite Systems.
- Variable-Bandwidth, Symmetric, and Positive Definite Systems.
- Dense, Symmetric, and Positive Definite Systems.
- Constant-Bandwidth and Symmetric Systems.
- Variable-Bandwidth and Symmetric Systems.
- Dense and Symmetric Systems.
- Constant-Bandwidth and Asymmetric Systems.
- Variable-Bandwidth and Asymmetric Systems.
- Dense and Asymmetric systems.
- Constant-Bandwidth and Asymmetric Solvers with Partial Pivoting.
- Constant-Bandwidth, Symmetric, and Positive Definite Solvers with Partial Pivoting.
- Constant-Bandwidth and Symmetric Solvers with Partial Pivoting.
- Dense Solvers with Partial Pivoting.
- Dense Solvers with full pivoting."

"This manual covers parallel direct solvers, i.e., Cholesky decomposition, skyline solver, Crout decomposition, multiple entry solvers, and other popular and useful techniques. Solvers for dense and sparse systems are included. More than 90% of scientific and engineering problems are formulated into a system of equations. Solution of system equations is required in many scientific and engineering computing. LAIPE has the most useful and highly efficient solvers for scientific and engineering computing"

Of all that, I am interested only in the solvers for block matrices on the main diagonal, VAG_S and VAG_D. If anyone would extract them from the LIB and put them into a DLL, and that worked with FTN95, you could collect from my North Pole immediately :)


Last edited by DanRRight on Fri Aug 26, 2016 9:44 am; edited 2 times in total
TonyRichards



Joined: 19 Aug 2016
Posts: 3
Location: Oxford

Posted: Tue Aug 23, 2016 11:40 pm

Um, things seem to be veering away from the original post, you two!
Dan and John: get a room, guys!
Thanks otherwise for the comments that were on point.
Page 1 of 2

 


Powered by phpBB © 2001, 2005 phpBB Group