forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

New Kinds
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Mon Oct 26, 2009 6:35 am    Post subject: New Kinds

Paul,

Following on from the discussion of KIND, is it an option to provide REAL*6 or INTEGER*6?
There was a time when all reals were calculated in the co-processor, and I thought that real*4 (and real*8) was just a truncated 80-bit real*10. Is this the case? If so, would REAL*6 be a simple extension of managing REAL*4? There is certainly a big gap in precision between R*4 and R*8, and R*6 would provide about 11 significant digits.

I'm not sure of the basis of INTEGER*8 from INTEGER*4, but INTEGER*6 could be a useful alternative?

Just a thought!

John
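
A rough check of the "about 11 significant digits" estimate, as a minimal sketch. It applies the formula that Fortran's PRECISION intrinsic uses for binary reals, int((p-1)*log10(2)), where p is the number of significand bits including the leading bit. The two REAL*6 layouts are hypothetical (no such hardware format exists): they simply keep the exponent width of REAL*4 or of REAL*8 and give the remaining bits of a 48-bit word to the significand.

```fortran
program real6_digits
  implicit none
  ! Existing x87 formats: p = 24 (real*4), 53 (real*8), 64 (real*10).
  print '(a,i3)', 'real*4  (p=24): ', decimal_digits(24)
  print '(a,i3)', 'real*8  (p=53): ', decimal_digits(53)
  print '(a,i3)', 'real*10 (p=64): ', decimal_digits(64)
  ! Hypothetical 48-bit layouts, 1 sign bit plus exponent plus fraction:
  !   8-bit exponent  -> 39 stored fraction bits -> p = 40
  !   11-bit exponent -> 36 stored fraction bits -> p = 37
  print '(a,i3)', 'real*6, 8-bit exponent  (p=40): ', decimal_digits(40)
  print '(a,i3)', 'real*6, 11-bit exponent (p=37): ', decimal_digits(37)
contains
  ! Decimal precision of a binary real with p significand bits.
  pure integer function decimal_digits(p)
    integer, intent(in) :: p
    decimal_digits = int(real(p - 1) * log10(2.0))
  end function decimal_digits
end program real6_digits
```

Under these assumptions a 48-bit real would land at 10 or 11 decimal digits, between the 6 of REAL*4 and the 15 of REAL*8.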
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Mon Oct 26, 2009 3:11 pm

John,

I'm a real believer (no pun intended) in REAL*6 and INTEGER*6. The problem is that they aren't native to (x87) coprocessors, and all the operations would need to be coded from scratch (i.e. done in software).

When I used MS Fortran, they had 2 libraries one could link with: one where the math was done largely in software, and one where it was done largely in hardware. They didn't always give the same result! In part, this was because REAL*8 math was done in 64 bits, whereas the coprocessor operations loaded things into 80-bit registers, so that the round-off was potentially different.

Eddie
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Mon Oct 26, 2009 8:37 pm

selected_integer_kind and selected_real_kind allow you to select the precision etc. (within certain hardware limits), but these are mapped to the kinds provided by the processor and co-processor. In other words, if you asked for the equivalent of *6 then you would get *8 anyway. Providing *6 via software would be slower than the *8 provided by the hardware.
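
A minimal sketch of that mapping, using the standard intrinsic names selected_real_kind and selected_int_kind. The requests of 11 decimal digits (real) and 14 decimal digits (integer) are assumptions standing in for what a 6-byte type might offer; the kind numbers printed are compiler-specific, so only the delivered precision and size are meaningful.

```fortran
program kind_mapping
  implicit none
  ! Ask for the precision a hypothetical REAL*6 / INTEGER*6 would offer.
  integer, parameter :: r11 = selected_real_kind(11)  ! about 11 decimal digits
  integer, parameter :: r15 = selected_real_kind(15)  ! usual REAL*8 request
  integer, parameter :: i14 = selected_int_kind(14)   ! 14 digits fit in 48 bits
  real(r11)    :: x
  integer(i14) :: j

  ! x and j are only passed to inquiry functions, so they need no value.
  print *, 'kind for 11 digits:', r11, '  kind for 15 digits:', r15
  print *, 'decimal digits actually provided:', precision(x)
  print *, 'bits in the integer kind provided:', bit_size(j)
end program kind_mapping
```

On typical x86 compilers the 11-digit request should come back as the same kind as the 15-digit one, which is the "you would get *8 anyway" behaviour described above.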
JohnCampbell




Posted: Tue Oct 27, 2009 3:54 am

Paul,

I was under the impression that real*4 and real*8 were done in the 80-bit math co-processor, with results stored at the word address and the accuracy truncated.
So my assumption for real*6 would be that the calcs would be in the co-processor, but the truncation would be different.
This is not consistent with the statement "providing real*6 via software".
I have also seen past references to 64-bit rather than 80-bit arithmetic instructions (SSE?), which would change this assumption.
Is the co-processor no longer used, and are real*4 and real*8 calculations now done differently?

John
Sebastian



Joined: 20 Feb 2008
Posts: 177

Posted: Tue Oct 27, 2009 8:04 am

Quote:
So my assumption for real*6 would be that the calcs would be in the coprocessor, but the truncation would be different.

The FPU has no support for that. It handles 32-bit (single precision), 64-bit (double precision) and 80-bit (extended precision) operations. If you need more information, just post or read through some hardware docs like http://sandpile.org/ia32/opc_fpu.htm or the Intel (AMD) instruction set references.
JohnCampbell




Posted: Tue Oct 27, 2009 8:26 am

Is 80-bit extended precision the same as real*10, or is real*10 software implemented?
PaulLaidler

Posted: Tue Oct 27, 2009 9:10 am

Yes, extended precision is the same as real*10.
Sebastian




Posted: Tue Oct 27, 2009 9:24 am

The *x usually specifies the number of bytes required for the data type (this may be awfully wrong for non-x86/non-PC Fortran implementations), so real*10 is the 10-byte = 80-bit floating point type, as Paul said.
JohnCampbell




Posted: Wed Oct 28, 2009 1:24 am

Paul,

I am trying to understand how real*6 could be done and how real*10 is done.
My question re real*10 is: is it hardware implemented, with all calculations done in the 80-bit math co-processor, or is that an obsolete technology?

To test this, I wrote a program that repeated a vector dot product on 1000-element arrays as real*8 or real*10, using the dot_product intrinsic or a simple function which has a loop:
Code:
      REAL*10 FUNCTION VECSUM_10 (A, B, N)
!
!     Performs a vector dot product  VECSUM =  [A] . [B]
!     account is taken of the leading zero terms in the vectors
!
      integer*4,                 intent (in)    :: n
      real*10,   dimension(n),   intent (in)    :: a
      real*10,   dimension(n),   intent (in)    :: b
!
      real*10   c
      integer*4 i
!
      c = 0
      do i = 1,n
         if (a(i) /= 0) exit
      end do
      do i = i,n
         c = c + a(i)*b(i)
      end do
!
      vecsum_10 = c
      return
!
      end
 


Compiling without /opt, the results are:

Code:
 Test Type      Routine      Seconds      Ratio
real*8 test     vecsum_8       4.28        1.00
real*8 test     dot_product    4.276       1.00
real*10 test    vecsum_10      5.515       1.29
real*10 test    dot_product    7.432       1.74
real*4 test     vecsum_4       2.923       0.68

Real*10 takes about 30% longer than real*8, but 74% longer using the dot_product intrinsic. Real*4 takes only 68% of the real*8 computation time.

This indicates to me that real*10 is not simply taking the 80-bit result from the math co-processor while real*8 and real*4 truncate the output. Either this or the instructions to move 4, 8 or 10 bytes take a lot of time.

Any advice?

John
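
The driver program for these timings was not posted. A minimal sketch of the kind of harness described, under stated assumptions: the names wp, n and reps are illustrative, the repeat count is a guess, and the kind is chosen with selected_real_kind rather than the REAL*10 syntax so that it compiles elsewhere. It times only the dot_product intrinsic.

```fortran
program time_dot
  implicit none
  ! Hypothetical harness: wp, n and reps are illustrative choices.
  integer, parameter :: wp = selected_real_kind(15)   ! try 6 or 18 as well
  integer, parameter :: n = 1000, reps = 200000
  real(wp) :: a(n), b(n), total
  real     :: t0, t1
  integer  :: k

  call random_number(a)
  call random_number(b)
  total = 0

  call cpu_time(t0)
  do k = 1, reps
     a(1) = real(k, wp) / reps        ! change the data on every pass
     total = total + dot_product(a, b)
  end do
  call cpu_time(t1)

  ! Printing the sum keeps an optimiser from discarding the loop.
  print *, 'seconds:', t1 - t0, '  checksum:', total
end program time_dot
```

Changing wp and rebuilding gives one row of the table per run; a request of 18 digits only compiles where the compiler offers an extended kind.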
Sebastian




Posted: Wed Oct 28, 2009 8:14 am

Quote:
This indicates to me that real*10 is not simply taking the 80-bit result from the math co-processor while real*8 and real*4 truncate the output.

How do you come to that conclusion? A lot of implementation details in the FPU make 80-bit usage the non-standard case. For example, there is no operation to "add an 80-bit value from memory to an FPU register" as there is for 32-bit and 64-bit values; 80-bit values always have to be loaded into a temporary FPU register first. Also keep in mind that reading 10 bytes from memory takes longer than reading only 4 or 8 bytes, especially since 10-byte values are usually laid out to occupy 16 bytes for better access speeds.
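
One way to see how a particular compiler lays out the three types, as a minimal sketch. It assumes the compiler provides an extended kind for an 18-digit request (otherwise the declaration of c will not compile) and supports the Fortran 2008 storage_size intrinsic; the kind names sp, dp and ext are illustrative. The padded size of the 80-bit type depends on the compiler and target, so no particular output is claimed here.

```fortran
program extended_storage
  implicit none
  integer, parameter :: sp  = selected_real_kind(6)
  integer, parameter :: dp  = selected_real_kind(15)
  integer, parameter :: ext = selected_real_kind(18)  ! assumption: extended kind exists
  real(sp)  :: a
  real(dp)  :: b
  real(ext) :: c

  ! digits() reports significand bits; storage_size() reports bits per
  ! array element, including any padding the compiler adds.
  print *, 'significand bits :', digits(a), digits(b), digits(c)
  print *, 'bytes per element:', storage_size(a)/8, storage_size(b)/8, &
                                 storage_size(c)/8
end program extended_storage
```

If the last figure is larger than 10, the compiler is padding each 80-bit value, and arrays of that type move correspondingly more memory per element.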
JohnCampbell




Posted: Wed Oct 28, 2009 8:30 am

Sebastian wrote, "How do you come to that conclusion?" I also said, "Either this or the instructions to move 4, 8 or 10 bytes take a lot of time." I just find that the ratios of 130% and 68% are big spreads for just moving bytes, as compared to floating point calculation times. Is an 80-bit FPU always used for real calculations?

Sebastian also wrote:
Quote:
Also keep in mind that of course reading 10 bytes from memory obviously takes longer than only reading 4 or 8 bytes, especially since 10 bytes usually are laid out to occupy 16 bytes due to better access speeds.

Again, I'm surprised how much longer it takes to read the values. And when is this 16-byte claim true?

John
PaulLaidler

Posted: Wed Oct 28, 2009 9:20 am

The answer to these questions can be researched by using /explist on the command line. This will show the assembly instructions generated by FTN95. You will then need to look up these instructions in an Intel manual.

There will be little or no software intervention except perhaps in the case of INTEGER*8. The native 32, 64 and 80 bit instructions will not be truncated unless your source code stipulates this. You will also be able to look up the timing of the native instructions.

Basically FTN95 will aim to give you the maximum precision that is available in any given situation, even to the point of sometimes using 80 bits internally when a 64 bit result is being generated.

With the speed of modern processors, the speed of a native 32 bit multiply (say) as against a 64 bit native multiply is rarely an issue.
LitusSaxonicum




Posted: Wed Oct 28, 2009 11:37 am

Speed may not be an issue, but storage is, and if (say) REAL*6 was good enough for (again, say) FE calculations, then one could hold about a third more array elements in the same memory for the matrix operations (each element being 25% smaller than a REAL*8), while sticking with a 32-bit OS and its limitations. That puts off the evil moment when the solution has to use the hard disk, which slows the process down hugely.

It's a very long time since I knew my way round the 8087 FPU book (8087 Applications and Programming), and my understanding is that first MMX and later SSE provided alternate ways to do certain math operations. I got lost at that point. None of the standard methods countenance REAL*6.

Eddie
JohnCampbell




Posted: Wed Oct 28, 2009 2:09 pm

Thanks, Eddie, for providing the names of the more recent MMX and later SSE instructions.
I apologise, but I am not sufficiently familiar with assembler to understand what is happening in /explist.
Can I get a clear answer to my question: is the real*x maths done in the co-processor, or with the more recent instructions?
I am surprised by the difference in gross computation time between real*4, *8 and *10. Is the only explanation the difference in moving the necessary bytes?
Any clear advice would be appreciated.
John
Sebastian




Posted: Wed Oct 28, 2009 4:26 pm

As far as I know MMX/SSE/SSE2 do not support 80bit registers.


Quote:
I am surprised by the difference in gross computation time between real*4, *8 and *10. Is the only explaination the different in moving the necessary bytes.

As I've already noted above, there are fundamental differences in how 80-bit data can be used in the FPU compared to 32-bit and 64-bit data. The differences between 32-bit and 64-bit access are data loading and the time required for the respective instruction, which depends on the CPU's implementation. So you'd have to ask Intel/AMD why 64-bit operations are slower than 32-bit ones.
All times are GMT + 1 Hour
Page 1 of 2

 