Hi David,
Thanks for the update and thanks to both you and Paul for the work you have put into FTN95 /64. I have found the change from /32 to /64 to be very easy and surprisingly reliable very early on.
I am especially interested in the capabilities that /OPT will provide. I assume that features of the /32 /OPT version will be replicated where appropriate. Has there been a review of what optimisation features are provided in FTN95 vs those in other compilers? I am hoping that there may be some other optimisation features that are more suited to /64 that could be included.
I reviewed a number of the Polyhedron benchmark tests to see what was causing FTN95 to perform poorly.
In some cases it was just poor coding, especially in the use of array sections, where FTN95 was making temporary copies more often than other compilers. (I must admit that ifort's use of strides to avoid temporary arrays for array sections scares me a lot, as it changes the old Fortran concept that a subroutine argument provides the memory address of a contiguous array.)
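To illustrate the array section point, here is a minimal sketch. The names (section_demo, scale_vec) are made up for this post and are not taken from the benchmarks. Whether a temporary copy is actually made for the row section depends on the compiler.

```fortran
program section_demo
   implicit none
   integer, parameter :: n = 4
   real    :: a(n,n)
   integer :: i, j

   do j = 1, n
      do i = 1, n
         a(i,j) = real(i + 10*j)
      end do
   end do

   ! Row section: elements are n apart in memory, so a compiler will
   ! typically copy it to a contiguous temporary and copy it back.
   call scale_vec (a(2,:), n, 2.0)

   ! Column section: already contiguous, so its address can be passed
   ! directly with no temporary.
   call scale_vec (a(:,2), n, 2.0)

   print *, a(2,:)
   print *, a(:,2)

contains

   subroutine scale_vec (v, m, f)
      integer, intent(in)    :: m
      real,    intent(inout) :: v(m)   ! explicit-shape dummy: expects contiguous storage
      real,    intent(in)    :: f
      v = v * f
   end subroutine scale_vec

end program section_demo
```

Reordering the array so that the section being passed is a column removes the need for the copy without relying on strides.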
Other common ones were X**real, e.g. X**2.0, which can be fixed.
Identifying repeated groups of calculations was another.
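By this I mean a real-valued exponent used where an integer one would do. A small sketch of the fix follows; the variable names are only for illustration, and how a given compiler evaluates each form is an assumption on my part.

```fortran
program power_demo
   implicit none
   real :: x, y_real, y_int, y_mul

   x = 1.5
   y_real = x**2.0   ! real exponent: may go through a general power routine
   y_int  = x**2     ! integer exponent: can be reduced to a single multiply
   y_mul  = x*x      ! explicit multiply, for comparison

   print *, y_real, y_int, y_mul
end program power_demo
```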
I think an area where FTN95 does not compare well is long calculations that can require many registers, e.g. repeated lines in chemistry calculations. This may come down to how well repeated calculations involving unchanged values are identified, although I learnt to avoid this coding approach many years ago. The alternative approach of writing code that documents the formula and letting a smart compiler optimise it is good for auditing code, as long as the compiler gets it right.
Another clean-up I found was that large local or automatic arrays should not be placed on the stack. They should be handled via a virtual ALLOCATE. Stack overflows should not happen due to local arrays.
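A sketch of the two coding styles, using an invented rate-type formula (the expression and the names are hypothetical, not from any of the benchmarks):

```fortran
program cse_demo
   implicit none
   integer, parameter :: n = 1000
   real    :: t(n), r1(n), r2(n), s1(n), s2(n)
   real    :: a, b, e
   integer :: i

   a = 2.0
   b = 0.5
   do i = 1, n
      t(i) = 300.0 + real(i)
   end do

   ! Formula style: easy to audit against the documentation, but the
   ! exponential term is written (and possibly evaluated) twice.
   do i = 1, n
      r1(i) = a * exp(-b/t(i))
      r2(i) = a * exp(-b/t(i)) * t(i)
   end do

   ! Hand-optimised style: the repeated term is computed once.
   do i = 1, n
      e     = a * exp(-b/t(i))
      s1(i) = e
      s2(i) = e * t(i)
   end do

   ! Largest difference between the two versions
   print *, maxval(abs(r1 - s1)), maxval(abs(r2 - s2))
end program cse_demo
```

An optimising compiler that recognises the common subexpression should make the first loop perform like the second.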
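As a minimal sketch of the difference, here are an automatic array and an allocatable array side by side. The routine names and sizes are made up, and where a particular compiler places the automatic array is an assumption.

```fortran
program alloc_demo
   implicit none

   call work_auto  (1000)
   call work_alloc (1000000)

contains

   subroutine work_auto (n)
      integer, intent(in) :: n
      real :: w(n)                  ! automatic array: often placed on the stack
      w = 1.0
      print *, sum(w)
   end subroutine work_auto

   subroutine work_alloc (n)
      integer, intent(in) :: n
      real, allocatable :: w(:)     ! allocatable array: typically taken from the heap
      integer :: stat
      allocate (w(n), stat=stat)
      if (stat /= 0) stop 'allocation failed'
      w = 1.0
      print *, sum(w)
      ! w is deallocated automatically on return
   end subroutine work_alloc

end program alloc_demo
```

The allocatable version also gives a status code to test, rather than a stack overflow at run time.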
I finally gave up with the review, as about a third I could clean up with better array structures, another third were identifying repeated or unnecessary calculations, but a significant proportion were just complex code that other smart compilers could pull apart. I was left thinking that re-writing this type of legacy code is a very bad approach and optimising compilers have a definite place for this style of code.
Vector instructions via /SSE and /AVX are a very good example of where a significant performance improvement can be achieved on modern hardware. Other smart compilers make this easy, either from array syntax or by identifying suitable inner loop calculations. FTN95 should develop this capability where possible.
Unfortunately, optimisation is an area that generates a lot of reported compiler bugs, be it the fault of the compiler or of non-conforming code that used to work.
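The kind of calculation I have in mind is sketched below, in both array syntax and as an explicit inner loop. The names are illustrative only, and whether either form is actually vectorised depends on the compiler and its options.

```fortran
program vec_demo
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n  = 1024
   real(dp) :: x(n), y(n), z(n), a
   integer  :: i

   a = 2.5_dp
   do i = 1, n
      x(i) = real(i, dp)
   end do
   y = 1.0_dp
   z = 1.0_dp

   ! Array syntax: no dependence between elements, so it maps
   ! naturally onto SSE/AVX vector instructions.
   y = y + a*x

   ! Equivalent inner loop: a vectorising compiler has to recognise
   ! that the iterations are independent.
   do i = 1, n
      z(i) = z(i) + a*x(i)
   end do

   ! Largest difference between the two forms
   print *, maxval(abs(y - z))
end program vec_demo
```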
I would be interested if there could be more discussion of the FTN95 /64 /OPT features and if there are possibilities of other enhancements that the /64 instruction set may readily provide.
John
PS: Could an option /32 be provided? While it is the default, it could be a good form of documentation of the compile statement. Also, /net (or .net) could be another option.