|
forums.silverfrost.com Welcome to the Silverfrost forums
|
DanRRight
Joined: 10 Mar 2008 Posts: 2819 Location: South Pole, Antarctica
|
Posted: Mon Sep 27, 2021 4:51 am Post subject: Can REAL*4 array store more than 4GB? |
|
|
In small test programs I can allocate real*4 arrays that store more than 4 GB with no problem. But in a large program they unpredictably crash the program with an access violation, sometimes even when smaller than 4 GB. Is this because the indexes of real*4 arrays internally use integer*4 numbers and somehow suffer integer overflow? |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Mon Sep 27, 2021 8:23 am Post subject: |
|
|
The addressing uses 64 bit integers for both 32 bit and 64 bit applications so I assume the fault lies elsewhere. |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2819 Location: South Pole, Antarctica
|
Posted: Mon Sep 27, 2021 9:13 pm Post subject: |
|
|
I do not know where to look for the error. Here is where the crash happens:
Code: | real*4, dimension(:,:), allocatable :: Partcl4
write(19,err=970) partcl4 |
It crashes for arrays larger than a few GB; smaller ones usually go OK.
Maybe FTN95 has a limit on the size of an array written this way? The first dimension is around 10, the second around 100-500 million.
If I replace the write with a DO loop, everything works OK but is 10x slower:
Code: | do iii=1,NumbOfPartInData(ion)
write(19,err=970) partcl4(:,iii)
enddo
|
Could denormal numbers somehow be involved? Such numbers could appear when loading a large data set (I read the values as real*8 and then assign them to real*4 to save space and time). |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Tue Sep 28, 2021 9:13 am Post subject: |
|
|
Dan
If you have a reasonably small sample program that demonstrates the failure then we could investigate. We would need a "working" program and details of the command line arguments (did you mention if it's 64 bits?). |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Wed Sep 29, 2021 9:13 am Post subject: |
|
|
Paul,
This is an example of the problem.
It appears that unformatted binary file records cannot exceed 2 GB in size.
I presume the record header/footer does not support it.
The test code is: Code: | real*4, parameter :: gb = 2.5 ! = 1.5 works, but = 2.5 fails
real*4 :: size_6gb = gb * 2.**30 ! target record size in bytes
integer*4 :: n,m,i,j, iostat
integer*1 :: header(17)
integer*4 :: H4(4)
equivalence ( header(2), H4(1) ) ! H4 overlays header bytes 2-17
integer*4, dimension(:,:), allocatable :: Partcl4
!
m = 2**20
N = size_6gb / (4.*m)
write ( *,*) 'target size =',gb,' GBytes'
write ( *,*) 'target dimensions =',n,m
write ( *,*) size_6gb / (n*m), ' should = 4.0 bytes per word'
!
! write the whole array as one unformatted sequential record
open (unit=19, file='large_binary.bin', form='unformatted', iostat=iostat)
write (*,*) ' file=large_binary.bin : opening unformatted file : iostat=',iostat
allocate ( Partcl4(N,M), stat=iostat )
write ( *,*) ' Partcl4(N,M) allocated, stat=',iostat
write ( *,*) ' size(Partcl4) =',size(Partcl4)
do j = 1,m
   do i = 1,n
      Partcl4(i,j) = i+j
   end do
end do
write ( *,*) ' Partcl4(N,M) initialised'
write (19,iostat=iostat) partcl4
write (*,*) ' large record written, iostat=',iostat
close (unit=19)
!
! reopen the file with stream access to inspect the raw record header bytes
!z open (unit=11, file='large_binary.bin', access='transparent', iostat=iostat)
open (unit=11, file='large_binary.bin', access='stream', iostat=iostat)
write (*,*) ' opening transparent file : iostat=',iostat
read (11) header
close (unit=11)
write (*,*) 'record header'
do i = 1,size(header)
   write (*,*) i, header(i)
end do
do i = 1,size(h4)
   write (*,*) i, h4(i)
end do
end |
My build test is Code: | del %1.exe >> %1.log
del large_binary_file.bin >> %1.log
ftn95 %1 /64 /debug /link >> %1.log
del large_binary.bin >> %1.log
%1 >> %1.log
dir *.bin >> %1.log
type %1.log |
My test results (for gb = 1.5) were: Code: | [FTN95/x64 Ver. 8.74.0 Copyright (c) Silverfrost Ltd 1993-2021]
[Current options] 64;DEBUG;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;
NO ERRORS [<main program> FTN95 v8.74.0]
[SLINK64 v3.02, Copyright (c) Silverfrost Ltd. 2015-2021]
Loading c:\temp\forum\danR\lgotemp@.obj
Creating executable file gb6_file.exe
target size = 1.50000 GBytes
target dimensions = 384 1048576
4.00000 should = 4.0 bytes per word
file=large_binary.bin : opening unformatted file : iostat= 0
Partcl4(N,M) allocated, stat= 0
size(Partcl4) = 402653184
Partcl4(N,M) initialised
large record written, iostat= 0
opening transparent file : iostat= 0
record header
1 -1
2 0
3 0
4 0
5 96
6 2
7 0
8 0
9 0
10 3
11 0
12 0
13 0
14 4
Volume in drive C is Acer
Volume Serial Number is 5CF1-CCA3
Directory of c:\temp\forum\danR
29/09/2021 05:54 PM 1,610,612,746 large_binary.bin
2 File(s) 1,610,612,746 bytes
0 Dir(s) 689,366,872,064 bytes free |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Wed Sep 29, 2021 9:56 am Post subject: |
|
|
/ctd
It would be good to:
1) support records greater than 2GBytes
2) have a compiler option to support gFortran and iFort record header formats, which may support larger records. I don't know the format.
It appears that gFortran writes records larger than ~2gb (or ~ 4gb) as multiple records.
My gFortran test code is Code: | real*4, parameter :: gb = 4.5 ! = 1.5 works, but = 2.5 fails
real*4 :: size_6gb = gb * 2.**30 ! target record size in bytes
integer*4 :: n,m,i,j, iostat
integer*1 :: header(16)
integer*4 :: H4(4)
equivalence ( header(1), H4(1) ) ! H4 overlays header bytes 1-16
integer*4, dimension(:,:), allocatable :: Partcl4
!
m = 2**20
N = size_6gb / (4.*m)
write ( *,*) 'target size =',gb,' GBytes'
write ( *,*) 'target dimensions =',n,m
write ( *,*) size_6gb / (n*m), ' should = 4.0 bytes per word'
!
! write the whole array as one unformatted sequential record
open (unit=19, file='large_binary.bin', form='unformatted', iostat=iostat)
write (*,*) ' file=large_binary.bin : opening unformatted file : iostat=',iostat
allocate ( Partcl4(N,M), stat=iostat )
write ( *,*) ' Partcl4(N,M) allocated, stat=',iostat
write ( *,*) ' size(Partcl4) =',size(Partcl4)
do j = 1,m
   do i = 1,n
      Partcl4(i,j) = i+j
   end do
end do
write ( *,*) ' Partcl4(N,M) initialised'
write (19,iostat=iostat) partcl4
write (*,*) ' large record written, iostat=',iostat
close (unit=19)
!
! reopen the file with stream access to inspect the raw record header bytes
!z open (unit=11, file='large_binary.bin', access='transparent', iostat=iostat)
open (unit=11, file='large_binary.bin', access='stream', iostat=iostat)
write (*,*) ' opening transparent file : iostat=',iostat
read (11) header
close (unit=11)
write (*,*) 'record header'
do i = 1,size(header)
   write (*,*) i, header(i)
end do
do i = 1,size(h4)
   write (*,*) i, h4(i)
end do
end |
the build is Code: | del %1.exe >> %1.log
del large_binary_file.bin >> %1.log
gfortran %1.f90 -fimplicit-none -O2 -o %1.exe >> %1.log 2>&1
del large_binary.bin >> %1.log
%1 >> %1.log
dir *.bin >> %1.log
type %1.log |
You can experiment with record sizes of 1.5, 2.5 and 4.5 gb to get the header format, but it appears to use a 4-byte header and reads/writes multiple records when oversized. |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Wed Sep 29, 2021 2:46 pm Post subject: |
|
|
Thanks John
I have made a note of the issue that you have raised. |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Thu Sep 30, 2021 9:27 am Post subject: |
|
|
FTN95 uses a 1-byte or 5-byte record header/trailer format.
The 1-byte form is for records smaller than 255 bytes (byte 1 /= -1),
while for records >= 255 bytes, byte 1 = -1 and bytes 2-5 hold the record size.
I assume the largest header-data-trailer is limited to 2^31-1 bytes ?
It appears that in FTN95 there is no support for records larger than 2^31-1 bytes in size. This is based on the example above.
gFortran, however, uses a 4-byte record header/trailer format.
By default, the data in a single (sub)record is limited to 2^31-9 bytes.
However, unformatted read/write supports larger "records" by writing a group record consisting of :
one or more subrecords of 2^31-9 bytes (the first has a -ve header and a +ve trailer; any intermediate ones should have both -ve),
followed by a final subrecord (where header is +ve and trailer is -ve).
The maximum record size can be changed via compiler options:
an option to change the maximum (component) record size (-fmax-subrecord-length), or
an option to change the header and trailer format to 8 bytes (-frecord-marker=8).
If FTN95 does not support records larger than 2^31-1 bytes in size, it has the option of either going to
i) a (1+4+8) 13-byte format or
ii) using a group record approach similar to the one gFortran uses.
The gFortran approach could both extend the supported record size, via a /gFortran_sequential_binary compiler option, and also support gFortran unformatted binary file interchange.
At present I use a Fortran direct-access fixed-length record interface file to share data between FTN95 and gFortran. This library also overlays a variable-length record format on the direct-access file by generating a table of record addresses and sizes, which must be stored in another database file.
Unformatted sequential access binary files are a very simple file interchange.
By combining these with stream/transparent access, they can provide a random access variable-length format, by creating a list of record start addresses. |
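The 4-byte marker layout described above can be checked by walking a small sequential file with stream access. This is only a sketch under stated assumptions: it assumes gFortran's default 4-byte little-endian markers, it does not handle the FTN95 1-byte/5-byte format, and the program name, file name and unit numbers are made up for illustration.

```fortran
program walk_records
  ! Sketch: list the (sub)records of an unformatted sequential file,
  ! ASSUMING 4-byte leading/trailing markers (gFortran default).
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)    ! 32-bit integer
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer
  integer(i4) :: head, tail
  integer(i8) :: pos, nbytes
  integer     :: ios
  real*4      :: a(10), b(3)

  ! create a small two-record test file (hypothetical name)
  a = 1.0
  b = 2.0
  open (unit=19, file='demo_seq.bin', form='unformatted', status='replace')
  write (19) a
  write (19) b
  close (19)

  ! reopen the same file as a byte stream and step from marker to marker
  open (unit=11, file='demo_seq.bin', access='stream', &
        form='unformatted', status='old')
  pos = 1
  do
     read (11, pos=pos, iostat=ios) head             ! leading marker
     if (ios /= 0) exit                              ! end of file reached
     nbytes = abs(int(head, i8))                     ! data bytes in this subrecord
     read (11, pos=pos + 4 + nbytes, iostat=ios) tail   ! trailing marker
     if (ios /= 0) exit
     write (*,*) 'data bytes =', nbytes, &
                 ' continued =', (head < 0), &
                 ' has predecessor =', (tail < 0)
     pos = pos + nbytes + 8                          ! skip header + data + trailer
  end do
  close (11)
end program walk_records
```

With gFortran defaults the two records should be reported as 40 and 12 data bytes, neither continued. Pointing the second OPEN at a file holding a record larger than 2^31-9 bytes should show the sign pattern of the group record.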
|
|
|
Robert
Joined: 29 Nov 2006 Posts: 445 Location: Manchester
|
Posted: Thu Sep 30, 2021 5:05 pm Post subject: |
|
|
When would you use a record length greater than 2GB? |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2819 Location: South Pole, Antarctica
|
Posted: Thu Sep 30, 2021 6:43 pm Post subject: |
|
|
Robert,
Typical PIC code files can be 50-100 GB, with the total size for a run up to 2-10 TB. In this case using real*4 instead of real*8 is critically important.
John,
Thanks for the insights |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Thu Sep 30, 2021 7:05 pm Post subject: |
|
|
Are there multiple very large records or just one per file?
If there is just one then there may be a case for changing to STREAM access which is now available with FTN95. |
|
|
|
DanRRight
Joined: 10 Mar 2008 Posts: 2819 Location: South Pole, Antarctica
|
Posted: Thu Sep 30, 2021 9:31 pm Post subject: |
|
|
Paul,
The first few records are a header that tells what version of the file it is,
what the dimensions idim1, idim2 of the array Partcl4(idim1,idim2) are, etc.
Then one large record follows. But because this currently does not work, I write hundreds of millions of records of 11 real*4 numbers each. Or, if it is a 4D array, something like 1000x1000x1000 records of 3 real*4 numbers each.
Is "stream" usable in this case? If yes, how exactly do I do that?
Despite the humongous size compared to Gates' "640k which will fit everyone", saving such an array as one record takes just a second or two, because the write speed may reach 3-7 GB/second, and with PCIe5 next year it will reach 15 GB/second on NVMe drives. But because this does not work for larger files and I write the records sequentially, it takes tens of seconds. Reading is similarly super fast if it is just one record, and takes tens of seconds if all records are read sequentially one by one (speed around 0.5 GB/second).
Such dimensions are not considered large. I use around 1000-2500 cores at most. Some colleagues have access to 100,000 cores, and even this is nothing now that supercomputers with millions of cores have become widespread and cheap. So my files could easily be 100x larger the day I get access to a monstrous supercomputer; it is a matter of changing two or three setup numbers.
Last edited by DanRRight on Fri Oct 01, 2021 4:12 am; edited 2 times in total |
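One possible middle ground between a single huge record and one record per particle is to write blocks of columns, each block sized to stay under the record limit. The following is a minimal sketch, not a tested FTN95 solution: the program name, file name, array sizes and the deliberately tiny max_rec_bytes are assumptions for illustration. For real data the limit would be set to something like 2**30 bytes.

```fortran
program chunked_write
  ! Sketch: write a 2-D real*4 array as a few large records, each kept
  ! below a chosen byte limit, instead of one record per column.
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer
  integer, parameter :: idim1 = 11, idim2 = 1000     ! demo sizes only
  ! demo limit; for real data something like 2_i8**30 would be used
  integer(i8), parameter :: max_rec_bytes = 16384_i8
  real*4, allocatable :: partcl4(:,:)
  integer :: cols_per_rec, j1, j2, nrec, ios

  allocate (partcl4(idim1,idim2))
  partcl4 = 1.0

  ! whole columns per record: limit / (bytes per column), at least 1
  cols_per_rec = max(1, int(max_rec_bytes / (4_i8 * idim1)))

  open (unit=19, file='chunked.bin', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  nrec = 0
  do j1 = 1, idim2, cols_per_rec
     j2 = min(j1 + cols_per_rec - 1, idim2)
     write (19, iostat=ios) partcl4(:, j1:j2)   ! one contiguous block of columns
     if (ios /= 0) stop 'write failed'
     nrec = nrec + 1
  end do
  close (19)

  write (*,*) 'columns per record =', cols_per_rec, ' records written =', nrec
end program chunked_write
```

With these demo numbers the array goes out as three records rather than 1000. The reader has to use the same cols_per_rec, so it would need to be stored in the file's header records alongside idim1 and idim2.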
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Fri Oct 01, 2021 3:07 am Post subject: |
|
|
A couple of points:
1) Large (virtual) records > 2 GB are used to save arrays larger than 2GBytes, so the capability is required.
2) The main benefit of unformatted I/O is the use of an I/O list, so more than a single array can be transferred. eg writing derived type components in F95.
3) The main problem of unformatted I/O is incompatible header/trailer formats between different Fortran compilers. (stream or direct access files can be better for transfers)
4) There can be multiple very large records in a file. For gFortran, unformatted read/write manages each very large record as a set of (2gb-9) size records + a final record , as a group of records.
5) stream I/O can do all this in a single record but you need to code the header/trailer. Yesterday I wrote a simple stream I/O program to test records and using an 8-byte file address and 8-byte header-data-trailer format. It works well. The library routines demonstrate a solution, but without the I/O flexibility + automatic header-data-trailer structure.
Below are links to :
stream_test.f90 : stream I/O record demonstration with 8-byte header-data-trailer. ( this is "in development" so some redundant or incomplete info, eg managing record data structure tables.)
gf6_file.f90 : demonstrates the record structure for unformatted records larger than 2GBytes. This also has routines to test the FTN95 header/trailer format, although I haven't tested them as yet.
test_xx.bat : a test batch file for gFortran.
Dan, as a short term test, the stream_test.f90 could be converted to FTN95, although it is structured as a single array per record.
You can use a more complex I/O list with stream I/O, but you would need to provide the header-data-trailer structure if you wanted to use this for stepping through the file and recognising the structure. All achievable with stream.
https://www.dropbox.com/s/itvo676lbxv60sd/stream_test.f90?dl=0
https://www.dropbox.com/s/63qddfo5bf36pbi/gf6_file.f90?dl=0
https://www.dropbox.com/s/ea0galabqsmpur7/test_xx.bat?dl=0 |
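For readers who cannot open the links, the idea in point 5 can be sketched roughly as follows. This is not the linked stream_test.f90. It is an independent minimal example that assumes a hand-rolled layout of two default integers for the dimensions followed by an 8-byte byte count, the array data, and the same 8-byte count as a trailer. The program name, file name and sizes are made up, and whether a given FTN95 version transfers more than 2 GB in one stream WRITE is not established in this thread.

```fortran
program stream_record_sketch
  ! Sketch: store one array with stream access and a self-made
  ! 8-byte header/trailer, then read it back and compare.
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer
  integer, parameter :: idim1 = 11, idim2 = 1000     ! demo sizes only
  real*4, allocatable :: partcl4(:,:), check(:,:)
  integer(i8) :: nbytes, head, tail
  integer     :: d1, d2, ios

  allocate (partcl4(idim1,idim2))
  call random_number (partcl4)
  nbytes = 4_i8 * idim1 * idim2          ! assumes 4 bytes per real*4 value

  open (unit=19, file='stream_demo.bin', access='stream', &
        form='unformatted', status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot open file for writing'
  write (19) idim1, idim2                ! small header: array dimensions
  write (19) nbytes                      ! 8-byte record header
  write (19) partcl4                     ! the data, no compiler markers added
  write (19) nbytes                      ! 8-byte record trailer
  close (19)

  open (unit=11, file='stream_demo.bin', access='stream', &
        form='unformatted', status='old', iostat=ios)
  if (ios /= 0) stop 'cannot open file for reading'
  read (11) d1, d2
  read (11) head
  allocate (check(d1,d2))
  read (11) check
  read (11) tail
  close (11)

  if (head == tail .and. all(check == partcl4)) then
     write (*,*) 'round trip OK, data bytes =', head
  else
     write (*,*) 'round trip FAILED'
  end if
end program stream_record_sketch
```

Because the header and trailer are written by the program itself, the file layout does not depend on any compiler's record-marker format, which is what makes this approach attractive for interchange.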
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Fri Oct 01, 2021 2:19 pm Post subject: |
|
|
Paul,
I have been trying to utilise stream I/O in FTN95 Ver 8.00.0 but I am not getting "pos=" to work for:
READ (...,pos=address, ...)
WRITE (...,pos=address, ...)
INQUIRE (...,pos=address, ...)
FTN95.chm does reference "pos=" in the OPEN ( ACCESS='STREAM' description.
I am trying to test a random access binary file format using stream access and an 8-byte header/data/trailer record format (which no compiler supports !!)
The syntax I am using is:
integer*8 :: address
integer*4 :: stream_unit, iostat
open ( unit=stream_unit, file=stream_file_name, access='STREAM', form='UNFORMATTED', status='UNKNOWN', iostat=iostat )
READ (unit=stream_unit, pos=address, iostat=iostat ) array(1:nw)
WRITE (unit=stream_unit, pos=address, iostat=iostat ) array(1:nw)
inquire ( unit=stream_unit, pos=stream_address, iostat=iostat )
Is this supported, or is there another way to get or set the file address ?
John |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Fri Oct 01, 2021 3:28 pm Post subject: |
|
|
Paul,
Could it be that "pos=" does not support an integer*8 address ?
I changed to integer*4 address ( which is not adequate ) and the program worked better, providing a more realistic address.
However "inquire ( unit=stream_unit, pos=stream_address, iostat=iostat )"
now returns the last address in use, rather than the next address to be used in the file,
except when at the start of the file, when it returns 1 : the next address to be used. (gFortran returns the next byte address to be used in all cases)
John |
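The behaviour in question can be probed with a few lines. This is a sketch with made-up program, file and variable names, assuming a compiler where the file storage unit is one byte. In the Fortran 2003 standard, INQUIRE with POS= reports the position of the next file storage unit to be read or written.

```fortran
program pos_check
  ! Sketch: print the stream position reported by INQUIRE(POS=)
  ! at three points, using a 64-bit position variable.
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer
  integer(i8) :: p
  integer     :: ios
  real*4      :: x(10)

  x = 0.0
  open (unit=11, file='pos_demo.bin', access='stream', &
        form='unformatted', status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  inquire (unit=11, pos=p)
  write (*,*) 'at start of file       : pos =', p

  write (11) x                           ! 10 values of 4 bytes each
  inquire (unit=11, pos=p)
  write (*,*) 'after writing 40 bytes : pos =', p

  write (11, pos=1_i8) x(1)              ! reposition, then write 4 bytes
  inquire (unit=11, pos=p)
  write (*,*) 'after rewrite at pos=1 : pos =', p

  close (11, status='delete')
end program pos_check
```

A standard-conforming compiler with byte-sized storage units should print 1, 41 and 5. Values of 40 and 4 in the last two lines would match the "last address in use" behaviour described above.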
|
|
|
|
|
|