I think it may be worth having the compiler automatically optimize source code of the form real**real to real**integer when the exponent is in fact a whole number.
I have been looking at the Polyhedron benchmark mp_prop_design, which is not only the slowest but also the one where FTN95 does worst. Run times improve by around 10% if you do this optimization by hand. There are lots of instances of this, to powers 2, 3, 4 and 5, all of which are coded in the following form (simplified by me):
beta = alpha**2.0D0
It still doesn't bring the executable anywhere near as fast as the leaders, but it makes a comparatively large difference. On my machine and using the timer in the program, it reduces run times from 10.8 to 9.8 minutes.
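To make the hand edit concrete, here is a minimal sketch of the before and after forms. The program name, the kind parameter dp and the test value 1.5 are made up for illustration and are not taken from the benchmark source.

```fortran
program power_rewrite
  implicit none
  ! dp and the value of alpha are illustrative assumptions, not from mp_prop_design
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: alpha, beta_real, beta_int

  alpha = 1.5_dp

  beta_real = alpha**2.0D0   ! original form: real base, real exponent
  beta_int  = alpha**2       ! hand-edited form: real base, integer exponent

  print *, beta_real, beta_int

  ! Both forms should agree to within rounding for a positive base
  if (abs(beta_real - beta_int) > 1.0e-12_dp) stop 'results differ'
end program power_rewrite
```

One caveat for anyone doing this automatically: the two forms are only interchangeable where the real-exponent form was legal in the first place. The Fortran standard does not allow a negative real base to be raised to a real power, whereas a negative base with an integer exponent is fine.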
I am aware that repeated multiplication can also be quicker than raising to a small integer power, but I do not know where the cutoff lies.
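As a rough sketch of what repeated multiplication looks like for the powers that occur in this benchmark (2 to 5), something like the following would do. The variable names are hypothetical, and whether this actually beats alpha**n with an integer n depends on the compiler, so it would need timing rather than assuming.

```fortran
program power_by_multiplication
  implicit none
  ! All names and the test value are illustrative assumptions
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: alpha, a2, a3, a4, a5

  alpha = 1.5_dp

  a2 = alpha*alpha   ! alpha**2: one multiplication
  a3 = a2*alpha      ! alpha**3: two multiplications in total
  a4 = a2*a2         ! alpha**4: two multiplications in total
  a5 = a4*alpha      ! alpha**5: three multiplications in total

  print *, a2, a3, a4, a5

  ! Sanity check against the integer-exponent operator
  if (abs(a5 - alpha**5) > 1.0e-12_dp) stop 'mismatch'
end program power_by_multiplication
```

Reusing the squared value keeps the multiplication count down: the fourth power costs two multiplications rather than three.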
Eddie