forums.silverfrost.com

KennyT · Joined: 02 Aug 2005 Posts: 318

A normal PC format ASCII file uses CRLF (ASCII 13,10) as a line separator. Unix uses LF (ASCII 10). Both of these formats read quite happily using standard IO calls (if opened with CARRIAGECONTROL='LIST').

However, occasionally, we receive files that are separated by CR (ASCII 13) alone. These don't read in. The CRs are stripped out and the entire file gets read in by a single READ(LUN,'(A)') call, which then causes a crash because of a buffer overwrite (I think!).

Browsing the web seems to indicate that the files may have originally come from a Mac? I can convert them using Wordpad (which replaces CR with CRLF) but I'd rather my program didn't crash on reading the original file!

So, to cut to the chase, is there a setting on opening a FORMATTED ASCII file that will accept CR, CRLF or LF as a line separator?

TIA

K

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

You could use OPEN (... ACCESS='TRANSPARENT'...) then write your own routine for GET_NEXT_LINE that reads one character at a time to cope with all mixes of <CR> and <LF>.
I have had to do this in the past to also cope with any other non-printable character that can be found in files from other devices.
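
For reference, a minimal sketch of such a routine in standard Fortran 2003. It assumes the compiler accepts ACCESS='STREAM', which is roughly the standard counterpart of the ACCESS='TRANSPARENT' extension. The module name line_reader, the argument list and the skip_lf flag are illustrative choices, not a library routine.

```fortran
module line_reader
  implicit none
  character(len=1), parameter :: CR = achar(13), LF = achar(10)
contains
  ! Return the next line from a unit opened with
  ! ACCESS='STREAM', FORM='UNFORMATTED'.
  ! A line ends at CR, LF or CRLF. skip_lf carries the "last line ended
  ! in CR" state between calls so that a CRLF pair counts as one break.
  subroutine get_next_line(lun, line, nch, skip_lf, eof)
    integer, intent(in) :: lun
    character(len=*), intent(out) :: line
    integer, intent(out) :: nch          ! characters stored in line
    logical, intent(inout) :: skip_lf    ! set to .false. before first call
    logical, intent(out) :: eof          ! .true. when nothing was left
    character(len=1) :: ch
    integer :: ios

    line = ' '
    nch = 0
    eof = .false.
    do
      read (lun, iostat=ios) ch
      if (ios /= 0) then
        ! End of file (or a read error): a final line without a
        ! terminator is still returned; eof is set only if it is empty.
        eof = (nch == 0)
        skip_lf = .false.
        return
      end if
      if (skip_lf) then
        skip_lf = .false.
        if (ch == LF) cycle              ! second half of a CRLF pair
      end if
      if (ch == CR) then
        skip_lf = .true.
        return
      else if (ch == LF) then
        return
      end if
      ! Characters beyond len(line) are dropped instead of overrunning.
      if (nch < len(line)) then
        nch = nch + 1
        line(nch:nch) = ch
      end if
    end do
  end subroutine get_next_line
end module line_reader
```

To use it, open the file with ACCESS='STREAM', FORM='UNFORMATTED', set skip_lf to .false. once, then call get_next_line until eof comes back .true. Consecutive CR CR or LF LF come back as empty lines (nch = 0).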

KennyT · Joined: 02 Aug 2005 Posts: 318

Thanks, John,

I was kinda hoping for something that wouldn't need me to rewrite large chunks of code!

Perhaps I'll use your technique to at least flag that there is a problem and recommend the "Wordpad" solution.
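
A minimal sketch of such a check, assuming the compiler supports standard ACCESS='STREAM'. The module name eol_check, the function name count_bare_cr and the unit number 97 are made up for illustration.

```fortran
module eol_check
  implicit none
contains
  ! Count CR characters that are NOT followed by LF.
  ! Returns -1 if the file cannot be opened, 0 for a clean CRLF or LF file.
  function count_bare_cr(filename) result(n)
    character(len=*), intent(in) :: filename
    integer :: n
    character(len=1), parameter :: CR = achar(13), LF = achar(10)
    integer, parameter :: lun = 97       ! arbitrary free unit (assumption)
    character(len=1) :: ch
    integer :: ios
    logical :: prev_cr

    n = -1
    open (unit=lun, file=filename, access='stream', form='unformatted', &
          status='old', action='read', iostat=ios)
    if (ios /= 0) return

    n = 0
    prev_cr = .false.
    do
      read (lun, iostat=ios) ch
      if (ios /= 0) exit                 ! end of file or read error
      if (prev_cr .and. ch /= LF) n = n + 1
      prev_cr = (ch == CR)
    end do
    if (prev_cr) n = n + 1               ! CR was the very last byte
    close (lun)
  end function count_bare_cr
end module eol_check
```

If count_bare_cr returns a positive value, the program can refuse the file with a message instead of attempting the READ that crashes.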

But if anyone's got a neater solution, don't be shy!

Tks

K

JohnHorspool · Joined: 26 Sep 2005 Posts: 270 Location: Gloucestershire UK

Kenny,

Not necessary to rewrite large chunks of code.

You just need to write one routine, which first checks whether the file uses bare CR line endings and, if so, converts it into DOS format (using John's method).

Subsequently all your existing code will read it okay. I have also had to do this myself.

regards,
John

KennyT · Joined: 02 Aug 2005 Posts: 318

OK, is there a guide to what to do? I can't see anything in the manual.

Tks

K

JohnHorspool · Joined: 26 Sep 2005 Posts: 270 Location: Gloucestershire UK

Kenny,

John Campbell has given the basics of how to do this.

As an alternative to OPEN with ACCESS='TRANSPARENT' you could use OPENRW@ with READF@ to read one byte at a time.

Assuming you have scanned a file and found that it is in CR-only format, then perhaps something like this:-

1. rename the file using CISSUE@ or START_PROCESS@

2. open the file (after renaming) with OPENRW@

3. open a new file using the original file name with OPEN

4. read the original file one byte (one text character) at a time, appending every character that is not CR (ASCII 13) to a string whose length you increment with each character.

5. when you hit a CR, write the string contents as a single line to the new file, discard the CR, set the length of the string back to zero and repeat step 4.

6. when you get to the end of the file just write the string contents to the new file.

7. close the original file and delete it.

8. close the new file which has the same original name as the old file and is now in DOS ascii text format.

that's it !
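
Steps 4 to 6 can be sketched in standard Fortran as below. This is only an outline under some assumptions: it uses ACCESS='STREAM' in place of OPENRW@/READF@, it writes to a second file name instead of renaming and deleting (so steps 1, 7 and 8 are left to the caller), and the names cr_convert and convert_cr_file, the unit numbers and the 4096-character buffer are all made up for illustration.

```fortran
module cr_convert
  implicit none
contains
  ! Copy a CR-separated text file (src) to dst, one record per line.
  ! The formatted WRITE supplies the platform's own line terminator
  ! (CRLF on Windows). stat is 0 on success, non-zero if an OPEN failed.
  subroutine convert_cr_file(src, dst, stat)
    character(len=*), intent(in) :: src, dst
    integer, intent(out) :: stat
    character(len=1), parameter :: CR = achar(13)
    integer, parameter :: lin = 98, lout = 99   ! arbitrary units (assumption)
    character(len=4096) :: buf                  ! longer lines are truncated
    character(len=1) :: ch
    integer :: nch, ios

    open (unit=lin, file=src, access='stream', form='unformatted', &
          status='old', action='read', iostat=stat)
    if (stat /= 0) return
    open (unit=lout, file=dst, status='replace', action='write', &
          iostat=stat)
    if (stat /= 0) then
      close (lin)
      return
    end if

    nch = 0
    do
      read (lin, iostat=ios) ch
      if (ios /= 0) exit                 ! end of file or read error
      if (ch == CR) then
        ! Step 5: flush the line, drop the CR, start a new line.
        write (lout, '(A)') buf(1:nch)
        nch = 0
      else if (nch < len(buf)) then
        ! Step 4: accumulate every character that is not CR.
        nch = nch + 1
        buf(nch:nch) = ch
      end if
    end do
    ! Step 6: write whatever is left after the last CR.
    if (nch > 0) write (lout, '(A)') buf(1:nch)

    close (lin)
    close (lout)
  end subroutine convert_cr_file
end module cr_convert
```

Note that this follows the steps literally, so it is meant for CR-only files: an LF in the input is copied into the line like any other character.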

cheers,
John

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

The following "RE-"tested code should go part of the way. You may have to improve it to cope with:
- other non-printable characters, such as TAB, should not be ignored (see ASCII in wikipedia)
- multiple CR CR or LF LF may imply valid multiple blank lines
There is room to improve on the concept.
As John H suggested, you could use this at the start of the program to clean out the file before using generally, or simply replace the read statement(s) by the subroutine call.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

The rest of the updated test program:

KennyT · Joined: 02 Aug 2005 Posts: 318

Thanks guys!

I hadn't realised you'd written so much code yourself. I thought "GET_NEXT_LINE" was a system routine (in much the same way as TRAP_EXCEPTION@ works!)

K