Topic: Binary file in FORTRAN in General

christyleomin

Posts: 155

28 Jul 2011 8:36 #8641

I am trying (or rather, need) to do something I have never done in Fortran! I mainly code numerical stuff in Fortran.

Now I need to write a binary file, which is to be generated from another huge file (containing an enormous amount of data).

Can anyone guide me in the right direction? I know the info provided is a bit vague. However, if you can help or need more clarification, please let me know.

Thanks a lot in advance.

Christy

christyleomin

Posts: 155

28 Jul 2011 4:00 #8643

Please can anyone help urgently?

PaulLaidler

Posts: 7975 Salford, UK

29 Jul 2011 7:00 #8645

You may need to use unformatted WRITE statements using standard Fortran. Alternatively FTN95 has its own binary write via WRITEF@ etc.

christyleomin

Posts: 155

29 Jul 2011 7:30 #8646

Please can you suggest a reference/example concerning this (writing unformatted write statements)??

Please help

JohnCampbell

Posts: 2526 Sydney

29 Jul 2011 8:12 #8647

I'd recommend that you take the simpler approach and write using formatted write statements to a text file. There are a number of reasons I would recommend this.

It is easier to do text. Binary is for when you have sorted it all out and want to retain maximum real precision.
It is easier to check as you go. Open the text file with NOTEPAD and you can easily check what you have created.
If you are writing real numbers be sure that you retain sufficient precision in the format you use. This can also apply to large integers. You can fix this as you go.
Don't expect the first attempt will be the last version, so start with an easy part first, then build on it.

An example of writing information to (say) unit 12

OPEN (unit=12, file='Echo_output.txt', iostat=iostat)   ! open the file
if (iostat /= 0) write (*,*) 'error opening file', iostat

do i = 1, many_numbers
  write (12,2001) i, (iarray(j,i),j=1,a_few)   ! assume integer
  write (12,2002) i, (array(j,i),j=1,a_few)    ! assume real
end do
2001 format ('Row',i6,' Iarray Values are', 20i8)
2002 format ('Row',i6,' Rarray Values are', 20es16.8)   ! width must be at least d+7 for ESw.d

Finish with a CLOSE and exit the program before you look at the results, as the file buffers must be flushed.

Do not write too many values to a single line. Start with a simple approach and build on it.

You have not stated why you need the output and how you are going to use it. Importing into Excel is a common requirement, so give your file a simple structure. (Think about what file layout you want!)

The next stage would be to include key words in the output and then be able to search for these words.

Hope this helps.

John

christyleomin

Posts: 155

29 Jul 2011 12:07 #8649

Thanks a lot.

Yes, I know to create a text file.

The idea of binary was that I need to import this file into a piece of software (not Excel) which understands the binary format and carries out a finite element analysis (if you're not familiar with finite element analysis, it is a kind of engineering analysis).

I understand that a binary format will be much more useful, as the file will be much smaller than a regular text file opened in Notepad, as you (i.e. John) suggest.

Any ideas/examples of creating a binary file?

Christy

JohnCampbell

Posts: 2526 Sydney

31 Jul 2011 1:07 #8658

Binary file formats vary across compilers.

To create a binary file you need to:

! open
OPEN (unit=12, file='file_name', iostat=iostat, access='SEQUENTIAL', form='UNFORMATTED', status='UNKNOWN')

! then write
WRITE (12) ((array(i,j),i=1,n),j=1,m)

The write statement creates the 'record' on the file, which consists of: <header> <binary values> <trailer>

The problem is that the format of the <header> and <trailer> varies between compilers. FTN95 and IFORT are different. (The FTN95 header format also varies with the length of the record, so be careful. You need to read the FTN95 help.)

You can overcome this problem by opening with access='TRANSPARENT' and then writing the <header> and <trailer> in the format expected by the FE program. You can also do this with access='DIRECT'.

Assuming that you are not transferring the files between operating systems, you do not need to consider the binary format of the values themselves. (See big-endian and little-endian if you need to go there.)

An alternative could be to get FTN95 to provide a compiler option of compatible binary file formats, but this would need to consider which compiler file format to provide.

Years ago a goal of the Fortran standard was to improve compatibility between different compilers. There are a few areas, like this, where the standard has gone missing.

You need to read the FTN95 and the FE documentation !
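One way to see which <header> and <trailer> a given compiler actually writes is to write a single sequential unformatted record and then reopen the same file as raw bytes. The following is only a sketch: it assumes a compiler with Fortran 2003/2008 features (access='STREAM', newunit=, iso_fortran_env), and the program and file names are made up for illustration. Where stream access is not available, FTN95's access='TRANSPARENT' mentioned above plays the same role.

```fortran
program show_record_markers
  use, intrinsic :: iso_fortran_env, only: int8, real64
  implicit none
  real(real64) :: a(3)
  integer(int8), allocatable :: raw(:)
  integer :: u, nbytes, ios

  a = [1.0_real64, 2.0_real64, 3.0_real64]

  ! Write one sequential unformatted record holding 24 bytes of data.
  open (newunit=u, file='markers_demo.bin', form='unformatted', &
        access='sequential', status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot create file'
  write (u) a
  close (u)

  ! Ask how large the file is, then reopen it as a plain byte stream.
  inquire (file='markers_demo.bin', size=nbytes)
  if (nbytes <= 0) stop 'file size not available'
  allocate (raw(nbytes))
  open (newunit=u, file='markers_demo.bin', form='unformatted', &
        access='stream', status='old', iostat=ios)
  if (ios /= 0) stop 'cannot reopen file'
  read (u) raw
  close (u)

  ! Anything beyond the 24 data bytes is header/trailer added by the compiler.
  print '(a,i0)', 'file size in bytes: ', nbytes
  print '(16(z2.2,1x))', raw
end program show_record_markers
```

The number of extra bytes, and what they contain, depends on the compiler, which is exactly the compatibility problem described above.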

John

IanLambley

Posts: 501 Sunderland

31 Jul 2011 11:35 #8661

John et al., I wrote a bit of example code to convert FTN95 unformatted data to the 4-byte header method used by others, and back again. Here it is:

winapp
! convert an unformatted binary file from the FTN95 format
integer*1 input(65536),work(65536*2),ibyte_long(4)
INTEGER*2 HANDLE_in,handle_out, ERROR_CODE,itest_ffh
integer*4 i,iwork_pointer,iwork_end,irec_len, ilong, nline,imove, ilen_extra, itransfer_4bytes, nbytes_read 
equivalence (ibyte_long,ilong)
iwork_pointer  = 1
iwork_end      = 0
itest_ffh      = -1 !-1 as a single byte is ffh the code for a long record
irec_len       = 0
call openr@('ftn95unformattedlong.dat',handle_in,error_code)
call openw@('otherftnformattedlong.dat',handle_out,error_code)
nline = 0
do 
  call READF@(input, handle_in, 65536L, NBYTES_READ, ERROR_CODE) 
  if(NBYTES_READ .eq. 0)goto 9999
  nline = nline + 1
  print *,nline,nbytes_read,error_code
!move previously handled work to the start of the work buffer    
  imove = 0
  do i=iwork_pointer,iwork_end
    imove = imove + 1
    work(imove) = work(i)
  enddo
  iwork_pointer = 1
  iwork_end     = imove
!move the input line to the work  
  do i=1,nbytes_read
    imove = imove + 1
    work(imove) = input(i)
  enddo
  iwork_end     = imove
!
! now process the next chunk from iwork_pointer to iwork_end
  do while (iwork_pointer .lt. iwork_end)
    if(work(iwork_pointer) .eq. itest_ffh)then
    ! a 255 byte detected, so next four bytes are the record length
      iwork_pointer = iwork_pointer + 1
      print *,'iwork_pointer',iwork_pointer
      irec_len = itransfer_4bytes(work(iwork_pointer))
      iwork_pointer = iwork_pointer + 4
      ilen_extra = 5
    else
! not 255 (FFh or -1 as a single byte) therefore
! as a single byte value greater than 127 is treated as a negative number, 
! first clear a 4 byte word and then using equivalence, load up its first byte
! with the integer*1 length 
      ilong = 0
      ibyte_long(1) = work(iwork_pointer)
      irec_len = ilong
      iwork_pointer = iwork_pointer + 1
      ilen_extra = 1
    endif
    print *,'irec_len',irec_len,ilen_extra
  !process record to output data here
! prefix with record length as a 4 byte integer
    call WRITEF@(irec_len,handle_out, 4L, ERROR_CODE)
!output the data
    call WRITEF@(work(iwork_pointer),handle_out, irec_len, ERROR_CODE)
! postfix with record length as a 4 byte integer
    call WRITEF@(irec_len,handle_out, 4L, ERROR_CODE)

  ! now advance to the next record start
    iwork_pointer = iwork_pointer + irec_len + ilen_extra
  
    print*,'next iwork_pointer',iwork_pointer
  enddo
enddo
9999 continue
call CLOSEF@(handle_in, error_code) 
call CLOSEF@(handle_out, error_code) 
print *,nline
end
integer*4 function itransfer_4bytes(idata1)
integer*1 idata1(4),jdata1(4)
integer*4 jdata4,i
equivalence (jdata1,jdata4)
do i=1,4
  jdata1(i) = idata1(i)
  print '(z2.2)',jdata1(i)
enddo
itransfer_4bytes = jdata4
print '(z8.8)',jdata4
end

IanLambley

Posts: 501 Sunderland

31 Jul 2011 11:36 #8662

And in the other direction:

winapp
! convert an unformatted binary file to the FTN95 format
integer*1 input(65536),work(65536*2),ibyte_long(4)
INTEGER*2 handle_in,handle_out, error_code,itest_ffh
integer*4 i,iwork_pointer,iwork_end,irec_len, ilong, nline,imove, ilen_extra, itransfer_4bytes, nbytes_read 
equivalence (ibyte_long,ilong)
iwork_pointer  = 1
iwork_end      = 0
itest_ffh      = -1 !-1 as a single byte is ffh the code for a long record
irec_len       = 0
call openr@('otherftnformattedlong.dat',handle_in,error_code)
call openw@('ftn95formattedlong_new.dat',handle_out,error_code)
nline = 0
do 
  call READF@(input, handle_in, 65536L, nbytes_read, error_code) 
  if(nbytes_read .eq. 0)goto 9999
  nline = nline + 1
  print *,nline,nbytes_read,error_code
!move previously handled work to the start of the work buffer    
  imove = 0
  do i=iwork_pointer,iwork_end
    imove = imove + 1
    work(imove) = work(i)
  enddo
  iwork_pointer = 1
  iwork_end     = imove
!move the input line to the work  
  do i=1,nbytes_read
    imove = imove + 1
    work(imove) = input(i)
  enddo
  iwork_end     = imove
!
! now process the next chunk from iwork_pointer to iwork_end
  do while (iwork_pointer .lt. iwork_end)
!get the 4 byte record length
    irec_len = itransfer_4bytes(work(iwork_pointer))
    iwork_pointer = iwork_pointer + 4
    ilen_extra = 4
    print *,'irec_len',irec_len,ilen_extra
!process record to output data here
!if the record length is less than 255 long, then output a single byte
! record length
    if(irec_len .lt. 255)then
      ilong = irec_len  
      call WRITEF@(ibyte_long,handle_out, 1L, error_code)
      call WRITEF@(work(iwork_pointer),handle_out, irec_len, error_code)
      call WRITEF@(ibyte_long,handle_out, 1L, error_code)
    else
! prefix with record length as ffh + a 4 byte integer
      call WRITEF@(itest_ffh,handle_out, 1L, error_code)
      call WRITEF@(irec_len,handle_out, 4L, error_code)
!output the data
      call WRITEF@(work(iwork_pointer),handle_out, irec_len, error_code)
! postfix with record length as a 4 byte integer + ffH
      call WRITEF@(irec_len,handle_out, 4L, error_code)
      call WRITEF@(itest_ffh,handle_out, 1L, error_code)
    endif
  ! now advance to the next record start
    iwork_pointer = iwork_pointer + irec_len + ilen_extra
  
    print*,'next iwork_pointer',iwork_pointer
  enddo
enddo
9999 continue
call CLOSEF@(handle_in, error_code) 
call CLOSEF@(handle_out, error_code) 
print *,nline
end
integer*4 function itransfer_4bytes(idata1)
integer*1 idata1(4),jdata1(4)
integer*4 jdata4,i
equivalence (jdata1,jdata4)
do i=1,4
  jdata1(i) = idata1(i)
  print '(z2.2)',jdata1(i)
enddo
itransfer_4bytes = jdata4
print '(z8.8)',jdata4
end

They need tidying and generalising. Ian

JohnCampbell

Posts: 2526 Sydney

1 Aug 2011 12:02 #8666

Ian,

Since you have investigated the record structure of the FTN95 sequential binary format, I'd be interested in confirming the format. What is the <header> format for FTN95? Is it a 2-byte / 6-byte mixed-length count? What is the count: bytes or 4-byte words? Does FTN95 have a <trailer> to allow backspace?

I think the idea of a compiler option to standardise on a 4-byte header would be a good option. I think there still remains the problem of if the length is bytes or words, as I think Lahey and Intel do vary with this. I think Intel provide a compiler switch for changing between bytes and words. The standard refers to 'processor-dependent' or 'file storage' units, so it's no help in standardising the operation across compilers!

I do successfully transfer binary files between Lahey and FTN95, but use a fixed length direct access file, which does not include the record length header in the file record.

John

IanLambley

Posts: 501 Sunderland

1 Aug 2011 6:52 #8670

John, FTN95 header format seems to be

For records less than 255 bytes, a one byte length followed by the data terminated by the same one byte length. This allows reading forward and backspacing.
For records of length 255 and above, a one byte flag = ffh or -1 as a signed one byte integer, followed by a 4-byte length, the data and then followed by the 4-byte length and a one byte flag = ffh or -1.

So yes records have a trailer to allow backspace.

FTN95 uses bytes as the storage size. I seem to remember that VAX F77 used words for record lengths in direct access files. I don't remember whether that was 2-byte or 4-byte; the latter, I think. I would have to boot my VAXstation to find out!

Ian

christyleomin

Posts: 155

1 Aug 2011 2:26 #8677

Thanks a lot John and Ian,

John, as you suggested, I'm reading the documentation for the FE part. Can you please suggest any reference for Fortran documentation? I would be really obliged.

Christy

JohnCampbell

Posts: 2526 Sydney

2 Aug 2011 12:13 #8679

Why not look at ftn95.chm; especially OPEN and FORM='UNFORMATTED'

lahey.com also has a Fortran 95 reference .pdf which is set out well: http://www.lahey.com/docs/LangRefEXP72_revG04.pdf It is a good reference document for the Fortran 95 language.

Using ACCESS='TRANSPARENT' you should be able to write a binary file, plus write the header and trailer in a format compatible with what the FE program expects. For writing a real*8 array A(n,m), where HEADER is integer*4, the following should work.

HEADER = N*M*8   ! *8 for 1_byte units
WRITE (12) HEADER, ((A(i,j),i=1,n),j=1,m), HEADER

Remember to confirm what 'file storage units' are being expected by the FE program. FTN95 and Lahey use bytes, while Intel uses 1-byte or 4-byte units (default), depending on the compiler option.
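The same idea can be sketched with standard stream access, which writes only the bytes you give it. This is a hedged example, not the FE program's actual layout: it assumes a compiler supporting access='STREAM', newunit= and iso_fortran_env, a 4-byte length marker counted in bytes, and illustrative array sizes and file name.

```fortran
program write_framed_record
  use, intrinsic :: iso_fortran_env, only: int32, real64
  implicit none
  integer, parameter :: n = 3, m = 2      ! illustrative sizes only
  real(real64)   :: a(n,m)
  integer(int32) :: header
  integer :: u, ios, i, j

  do j = 1, m
    do i = 1, n
      a(i,j) = real(10*i + j, real64)
    end do
  end do

  ! Record length in bytes: n*m values of 8 bytes each (assumed convention).
  header = int(n*m*8, int32)

  open (newunit=u, file='framed_demo.bin', form='unformatted', &
        access='stream', status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot create file'

  ! Stream access adds no markers of its own, so the file holds exactly
  ! header, data (in column-major order), trailer.
  write (u) header, a, header
  close (u)
end program write_framed_record
```

With these sizes the file should be 56 bytes long: 4 for the header, 48 for the data and 4 for the trailer. If the FE program counts record lengths in 4-byte words instead, the header value would need to change accordingly.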

christyleomin

Posts: 155

2 Aug 2011 8:13 #8682

John, thanks a lot. Just a very basic question (sorry for that); you said:

*To create a binary file you need to:

! open
OPEN (unit=12, file='file_name', iostat=iostat, access='SEQUENTIAL', form='UNFORMATTED', status='UNKNOWN')

! then write
WRITE (12) ((array(i,j),i=1,n),j=1,m)

The write statement creates the 'record' on the file, which consists of: <header> <binary values> <trailer>*

I don't get what you mean by the last sentence: *The write statement creates the 'record' on the file, which consists of: <header> <binary values> <trailer>*

What do you mean by header, trailer and binary values?

JohnCampbell

Posts: 2526 Sydney

2 Aug 2011 9:21 #8683

When you open a file with access='SEQUENTIAL', form='UNFORMATTED', you must use a binary (unformatted) write:

write (12) ((array(i,j),i=1,n),j=1,m)

This creates a 'record' on the file, which can be read with:

read (12) ((array(i,j),i=1,n),j=1,m)

The record consists of the binary memory dump of 'array'. The Fortran I/O routines also put a header and a trailer on the record to let the read (and backspace) statements know how long the record is, so that they can step from record to record. You don't have to read the full record, although this would be unusual. It is recommended that you use a READ identical to the WRITE.

This is the structure of the record in the file.
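As a minimal sketch of that write/read pairing (assuming a compiler with newunit= and iso_fortran_env; the array sizes, values and file name are illustrative), the record can be written and read back with an identical I/O list and then compared:

```fortran
program unformatted_roundtrip
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 3, m = 2      ! illustrative sizes only
  real(real64) :: array(n,m), copy(n,m)
  integer :: u, ios, i, j

  do j = 1, m
    do i = 1, n
      array(i,j) = 1.0_real64 / real(i + j, real64)
    end do
  end do

  ! Write one record; the compiler adds its own header and trailer.
  open (newunit=u, file='roundtrip_demo.bin', form='unformatted', &
        access='sequential', status='replace', iostat=ios)
  if (ios /= 0) stop 'cannot create file'
  write (u) ((array(i,j), i=1,n), j=1,m)
  close (u)

  ! Read it back with the same I/O list as the WRITE.
  open (newunit=u, file='roundtrip_demo.bin', form='unformatted', &
        access='sequential', status='old', iostat=ios)
  if (ios /= 0) stop 'cannot reopen file'
  read (u) ((copy(i,j), i=1,n), j=1,m)
  close (u)

  if (all(copy == array)) then
    print *, 'read-back matches the original values exactly'
  else
    print *, 'mismatch between written and read values'
  end if
end program unformatted_roundtrip
```

Because both steps use the same compiler, the header and trailer never need to be handled by hand here; they only matter when a different program has to read the file.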

You should read the Elements of Fortran > Input/Output section of the Lahey Fortran language documentation. Binary files are easy to use. They are just difficult to look at.

John

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

2 Aug 2011 1:12 #8686

Christy,

The 'header' is the set of bytes at the beginning of something (in your case, the array), and the 'trailer' is a set of bytes after it. What these header and trailer sections consist of is system dependent, and they could consist of nothing at all. The header, trailer and the data in between make up one record. A file then consists of a set of records.

Imagine a Fortran program source code. Each line is a record. There isn't any header to each line, but the trailer consists of a CR, LF sequence. The Fortran program is made up of a number of these statements (records).

As to binary values, the computer doesn't record bytes with readable characters, it just keeps a whole list of 1 and 0 (on or off) values. That's binary. These are maddening to contemplate. Therefore people interpret them in different ways. One way is to take half a byte (4 bits) and represent it as:

0, 1, 2, .... 9, A, B, C, D, E or F. (there are 16 possible combinations).

Then, you can see what is in a byte as maybe 0E or CF (FF=255 or 11111111, 00 is 00000000). That is called hexadecimal. But binary is how the computer works.
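Fortran's B and Z edit descriptors print a value in binary and in hexadecimal, which makes this correspondence easy to check. A small sketch; the program name and sample values are chosen only for illustration:

```fortran
program show_bits
  implicit none
  integer :: values(4), k

  values = [0, 14, 207, 255]
  do k = 1, size(values)
    ! I3 = decimal, B8.8 = eight binary digits, Z2.2 = two hexadecimal digits
    print '(i3,2x,b8.8,2x,z2.2)', values(k), values(k), values(k)
  end do
end program show_bits
```

For 14 this should print 00001110 and 0E, for 207 it should print 11001111 and CF, and for 255 it should print 11111111 and FF.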

As there are 256 possible combinations in a whole byte, we could also take each byte as representing a single character. This can cover the whole alphabet in both upper and lower case, and lots of symbols and control codes (like CR and LF). Deep down, however, it's all binary 0 and 1 ...

Eddie

christyleomin

Posts: 155

3 Aug 2011 12:04 #8709

That means the binary file that is generated is not readable, right? It consists of just 0's and 1's. Correct?

Basically, I have an FE (finite element analysis) package (say software 'A') and want to write a binary file from my own program (say 'MY FE program') so that software A understands it.

I have the documentation of the binary file contents for software A. I also have some binary files which software A wrote while I was working with it.

Are the binary files of software A readable?

IanLambley

Posts: 501 Sunderland

3 Aug 2011 12:32 #8711

If the unformatted/binary files do have a header and trailer for each record which conforms to the FTN95 standard then yes, they can be read directly.

If they conform to the other standard, then it is harder to read them. The code that I posted above converts between the two unformatted methods. You could run the file from the non-FTN95 through the conversion routine and then open it in FTN95 to read it. If you write a file from FTN95, run it through the other converter for reading by your FE program.

Alternatively if they are fixed length records then the file will be readable with 'direct' access methods and the code I gave above is not useful as there is no header or trailer.

The beauty of binary files is that nothing is lost when writing and reading the data. The internal representation of the data is stored directly in the file, which can reduce file size compared to formatted output. For example, a real*8 value, which gives say 17 decimal digits of precision, takes 8 bytes to store in unformatted/binary form, but you might need a format as ridiculous as F35.17 to store it to full precision in formatted mode, i.e. 35 bytes. Anything less can result in loss of precision. Ian
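The trade-off can be tried out with internal writes, which format a value into a character string without touching a file. This is a sketch under assumptions: the two ES formats and the test value are arbitrary choices for illustration, and storage_size requires a Fortran 2008 compiler.

```fortran
program text_vs_binary
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  real(real64) :: x, y_short, y_long
  character(len=32) :: short_text, long_text

  x = 1.0_real64 / 3.0_real64

  ! Format the same value with few and with many significant digits.
  write (short_text, '(es14.6)')  x
  write (long_text,  '(es25.17)') x

  ! Read both strings back into real*8 variables.
  read (short_text, *) y_short
  read (long_text,  *) y_long

  print *, 'bytes used by the binary value : ', storage_size(x) / 8
  print *, 'characters used by ES14.6      : ', len_trim(short_text)
  print *, 'characters used by ES25.17     : ', len_trim(long_text)
  print *, 'ES14.6 round trip exact?       : ', y_short == x
  print *, 'ES25.17 round trip exact?      : ', y_long == x
end program text_vs_binary
```

The short format is expected to lose digits on the round trip, while the long one costs roughly three times the storage of the binary value.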

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

3 Aug 2011 12:38 #8712

Christy,

Not readable by you or me just using something like Notepad. Correct. Notepad takes each byte and interprets the 0 and 1 pattern as a character. Some of the resulting characters are invisible, which makes it even harder.

I'll start by considering that you want to READ a file written by the pre-existing program, and then go on to the difficulties in writing a file for it to read.

Your pre-existing Fortran program probably wrote a bunch of INTEGER values each time as 4 bytes, and a bunch of REAL values as 8 bytes at a time. Unlike a text file, there are no 'blank' characters between the values, you just have a sequence of bytes, which means a sequence of 0 and 1 values.

The first problem you have in **reading** a file is to know which bytes are an INTEGER, and which bytes are a REAL. OK, you get that from the FE program documentation.

The next problem is sorting out any header and trailer for each record. This may vary not only from computer system to system (if you are using a mainframe) but also from compiler to compiler, e.g. it is different for Intel Fortran from FTN95. OK, you get the answer to that from help by John, for example.

Finally, you have the problem of which order the bytes are written in. This is the 'big-endian' and 'little-endian' stuff. It's difficult in binary, so suppose we take a 4-digit decimal integer, such as 8632. Big-endian is just this way round, 8 - 6 - 3 - 2. Little-endian stores it (in effect) 2 - 3 - 6 - 8. If your file was written by a code compiled with a different compiler, you will only know if you read the relevant documentation. It would be so much easier if both codes were compiled with FTN95 on a PC! The PC CPU has a native way of doing it, but any compiler can do it however it sees fit.

The problem you have in **writing** a file for reading by the other programme contains the same difficulties, but in reverse.

Transferring data as unformatted files always has this difficulty. Formatted files are bad enough, as the trailer to each line could be CR LF or just CR, but that is minor compared to unformatted.

Eddie

PS. I simplified the 8632 case. It is more like 86 - 32 versus 32 - 86 !

PPS. Big- and little-endian refer to whether the most significant bytes come first or last. The name derives from 'Gulliver's Travels', where the hero travelled to various foreign lands; in one he witnessed a war between factions over whether to eat a boiled egg from the wide (big) end or the narrow (little) end! Whoever named this had a literary background, clearly. There used to be big arguments about which version of endian should be used.

The binary files that were written are readable, once you know the 3 tricks above to read them.
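If the byte order does turn out to differ, a 4-byte value can be converted by reversing its bytes. The following is only a sketch of that technique using the standard TRANSFER intrinsic; it assumes a compiler with iso_fortran_env, and the test value is arbitrary.

```fortran
program swap_endian
  use, intrinsic :: iso_fortran_env, only: int8, int32
  implicit none
  integer(int32) :: original, swapped
  integer(int8)  :: bytes(4)

  original = int(z'12345678', int32)

  bytes   = transfer(original, bytes)   ! view the integer as 4 raw bytes
  bytes   = bytes(4:1:-1)               ! reverse the byte order
  swapped = transfer(bytes, swapped)    ! reassemble the bytes as an integer

  print '(a,z8.8)', 'original: ', original
  print '(a,z8.8)', 'swapped : ', swapped
end program swap_endian
```

This should print 12345678 and 78563412. The same reversal applies to 8-byte REAL values with an 8-element byte array, but it is only needed when the documentation says the file uses the opposite byte order from your machine.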

E

christyleomin

Posts: 155

3 Aug 2011 11:52 #8726

I have downloaded a hex editor which reads a binary file. It comes out like this:

4c 53 2d 44 59 4e 41 20 75 73 65 72 20 69 6e 70 75 74 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 cb 03 00 00 10 95 2f 03 00 00 70 44 04 00 00 00 2d 24 00 00 06 00 00 00 09 00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 40 1f 00 00 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 92

However, as Eddie said, this is in 0's and 1's.

I'm really clueless about how to start. Any suggestions?
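The first bytes of the dump above are ordinary ASCII codes: 4c 53 2d 44 59 4e 41 spells "LS-DYNA", followed by "user input" and a run of spaces (20). After that come binary values; for instance cb 03 00 00 would be 971 if it is a little-endian 4-byte integer, which the FE documentation would have to confirm. A hedged sketch for peeking at a file in the same way from Fortran follows; it assumes stream access is available, and the file name and the number of bytes inspected are hypothetical.

```fortran
program peek_bytes
  use, intrinsic :: iso_fortran_env, only: int8
  implicit none
  character(len=*), parameter :: fname = 'fe_output.bin'  ! hypothetical file name
  integer, parameter :: npeek = 64                        ! bytes to inspect (arbitrary)
  integer(int8) :: raw(npeek)
  character(len=npeek) :: text
  integer :: u, ios, k, code

  open (newunit=u, file=fname, form='unformatted', access='stream', &
        status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open file'
  read (u, iostat=ios) raw
  close (u)
  if (ios /= 0) stop 'file is shorter than npeek bytes'

  do k = 1, npeek
    code = iand(int(raw(k)), 255)     ! treat the byte as unsigned, 0..255
    if (code >= 32 .and. code <= 126) then
      text(k:k) = achar(code)         ! printable ASCII character
    else
      text(k:k) = '.'                 ! placeholder for everything else
    end if
  end do

  print '(a)', text                   ! text view of the bytes
  print '(16(z2.2,1x))', raw          ! hex view, 16 bytes per line
end program peek_bytes
```

Once the text fields are located this way, the remaining bytes can be read as INTEGER and REAL values in the order the FE documentation lists them.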