Silverfrost Forums

Reading Linux/Unix format text files

24 Jun 2017 3:51 #19804

This is a simple follow-up to the thread at https://forums.silverfrost.com/Forum/Topic/1317. The first post in that thread suggested using CARRIAGECONTROL=LIST in the OPEN statement, after which all would be well.

Sadly not in FTN95. Compiler error 516: CARRIAGECONTROL is not a recognised keyword in OPEN.

Is there a simple way (such as some combination of OPEN keywords) to guarantee that Linux files are read correctly? If not, I see only two options: either use John Campbell's get_next_line, which would do the job well but would also require recoding a lot of READ statements, or convert the file beforehand using something like

TYPE input_filename | MORE /P > output_filename

Advice please !

25 Jun 2017 6:22 #19805

Doesn't a simple read work? i.e. READ (11,fmt='(a)',iostat=iostat) line

I tried the following program with some Fortran code from a Linux OS and it looked to run ok. Not sure if there is another problem ?

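! read_file.f90
! Echo the first 20 lines of a Unix-format text file with their I/O status
      character line*120
      integer*4 i, iostat

      open (unit=11, file='aaefdc.for',action='READ')

      do i = 1,20
!       iostat is 0 on success and non-zero at end of file or on error
        read (11,fmt='(a)', iostat=iostat) line
        write (*,1001) iostat, trim (line)
      end do
 1001 format (i0,1x,a)
      end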

I compiled with ftn95 read_file /link

NOTEPAD displays the file without line breaks, so I am assuming it is in the Unix file format you are asking about.

25 Jun 2017 7:51 #19807

Thanks, John - I hope you're right! Maybe I'm worrying unnecessarily.

I've asked the engineers to send me a sample Linux format CSV file, so will test when I have one. But if all else fails, your get_next_line code should do the trick. The real data that we'll have will be laser scan data from a submersible robot, many millions of {X,Y,Z,intensity} points which will all be managed on-board in ROS which is the Linux-based robot operating system, and stored on removable flash-memory packs.

We'll also have image data which needs processing pixel-by-pixel but that's a solved problem, as long as they store it in one of the GDI+ formats (ideally BMP or uncompressed JPEG). Coded and tested that!

Thanks again for the help!

1 Jul 2017 9:21 #19821

Another potential cause of mysterious problems is the presence of tab characters (ASCII 09H) in Fortran source and formatted data files.

In some program text editors, there is a setting to make tabs (or tabs and other 'whitespace' characters) visible.

Many collections of text utility programs (such as those in Cygwin) provide the utilities dos2unix and unix2dos to convert between MSDOS and Unix/Linux EOL formats.

1 Jul 2017 9:41 #19822

Received the Linux format test data, and no problem with it. Ordinary read statement handles it.

Thanks for the suggestion about tabs, mecej4. Worth remembering. However, at least on this project we have outlawed the tab character so it's not something we need to worry about. Any file that contains tab or any other non-printable character (apart from CR and LF) is treated as corrupted, and will be dealt with accordingly - sent back to the originator to fix!

2 Jul 2017 1:36 #19823

It is fairly easy to manage any set of characters in a routine READ_NEXT_LINE and define a parser to get the fields, separated by any character. Probably easier than sending the file back !
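As an illustration of the idea only, here is a minimal sketch of such a line reader. It is not the READ_NEXT_LINE or get_next_line code referred to in this thread: the routine name, its argument list and the file name 'data.csv' are all assumptions, and it relies on the compiler supporting Fortran 2003 stream access (ACCESS='STREAM'). It treats LF as the end of a line and drops any CR, so it should cope with both Unix and DOS files.

```fortran
! Sketch only: read_next_line is a hypothetical routine written for this
! example. Assumes the compiler supports Fortran 2003 stream access.
module line_reader
  implicit none
contains
  subroutine read_next_line (lu, line, length, iostat)
    integer, intent(in)           :: lu      ! unit opened with access='stream'
    character(len=*), intent(out) :: line    ! receives the text, blank padded
    integer, intent(out)          :: length  ! number of characters stored
    integer, intent(out)          :: iostat  ! 0 = ok, non-zero = end of file or error
    character(len=1) :: ch

    line   = ' '
    length = 0
    do
      read (lu, iostat=iostat) ch
      if (iostat /= 0) return          ! caller checks length for a partial last line
      if (ch == achar(10)) return      ! LF ends the line (Unix and DOS)
      if (ch == achar(13)) cycle       ! drop CR from DOS line endings
      if (length < len(line)) then     ! over-long lines are silently truncated
        length = length + 1
        line(length:length) = ch
      end if
    end do
  end subroutine read_next_line
end module line_reader

program demo
  use line_reader
  implicit none
  character(len=120) :: line
  integer :: length, iostat

  ! 'data.csv' is a placeholder file name
  open (unit=11, file='data.csv', access='stream', form='unformatted', &
        status='old', action='read')
  do
    call read_next_line (11, line, length, iostat)
    if (iostat /= 0 .and. length == 0) exit   ! nothing left to process
    write (*,'(a)') line(1:length)
    if (iostat /= 0) exit                     ! last line had no terminator
  end do
  close (11)
end program demo
```

Reading one character at a time keeps the sketch short but is likely to be slow on files with many millions of lines; reading larger blocks and scanning them for LF would be the obvious refinement.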

As a first pass I scan a new file to see what unusual characters are in the file. The following is a link to my FTN95 utility to scan any file. Make with ftn95 scan_file.f95 /link. Try 'scan_file file_name /stats'
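For readers who just want the general idea of such a first pass, the following is a rough sketch and not the scan_file utility linked below. The program name and the file name 'data.csv' are placeholders, and it again assumes Fortran 2003 stream access is available. It counts every character code in the file and reports those outside the printable ASCII range, so tabs, CR and LF all show up.

```fortran
! Sketch only: this is not the scan_file utility from this thread.
program count_chars
  implicit none
  integer :: counts(0:255), iostat, code
  character(len=1) :: ch

  counts = 0
  ! 'data.csv' is a placeholder file name
  open (unit=11, file='data.csv', access='stream', form='unformatted', &
        status='old', action='read')
  do
    read (11, iostat=iostat) ch
    if (iostat /= 0) exit
    ! ichar is processor dependent above 127, so mask to the range 0..255
    code = iand (ichar (ch), 255)
    counts(code) = counts(code) + 1
  end do
  close (11)

  ! Report control characters and anything outside printable ASCII (32..126)
  do code = 0, 255
    if (counts(code) > 0 .and. (code < 32 .or. code > 126)) &
      write (*,'(a,i3,a,i0)') 'char ', code, ' count ', counts(code)
  end do
end program count_chars
```

On a clean Unix file the only line reported should be code 10 (LF); a DOS file would also show code 13 (CR), and a line for code 9 means tabs are present.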

https://www.dropbox.com/s/trk4k3acmj4gz9e/scan_file.f95?dl=0

John

3 Jul 2017 5:05 #19825

Quoted from mecej4

Many collections of text utility programs (such as those in Cygwin) provide the utilities dos2unix and unix2dos to convert between MSDOS and Unix/Linux EOL formats.

Indeed. These can also be installed in the bash shell that Microsoft provides for Windows 10.

There is also a Windows port: https://waterlan.home.xs4all.nl/dos2unix.html

There is no standard-conforming way for Fortran to read files with both Unix/Linux and Windows 'end of record' marks (line endings). How records are terminated is left to the 'Fortran processor' or environment.
