forums.silverfrost.com :: View topic - BACKSPACE on wide files
simon



Joined: 05 Jul 2006
Posts: 299

Posted: Tue Feb 11, 2014 12:09 am    Post subject: BACKSPACE on wide files

I am trying to read an external formatted input file. The first column of data in the file contains a representation of the date using a combination of numbers, '-' and '/', followed by a set of columns with data in an unknown, and possibly variable, format. For example, a line in the file may look something like:

2010-10/12 12.3 45678.90

It is possible that the first field (the date) can change in width in the file.

Unfortunately, because of the '/' in the date (a slash terminates list-directed input), the line cannot easily be read using FMT=* in a READ statement. Therefore, the approach I have been taking is:
1. read each line using FMT='(A)'
2. work out how many characters (n) there are in the date
3. BACKSPACE
4. read only the date using FMT='(An)' and ADVANCE='no'
5. read the data in the rest of the line using another READ statement with FMT=* and ADVANCE='yes'

Assume that I have a subroutine width_date that takes a character input, determines how wide the date is, and outputs a format string. The code is then as follows:

Code:
READ (UNIT=iin,FMT='(A)',ERR=1,END=2) c
CALL width_date (c,cfmt)
BACKSPACE (UNIT=iin)
READ (UNIT=iin,FMT=cfmt,ADVANCE='no',ERR=1,END=2) cdate
READ (UNIT=iin,FMT=*,ADVANCE='yes',ERR=1)


The above works fine, except that when I have a very wide input file the BACKSPACE does not seem to take me back to the beginning of the line. I'm not sure how wide the file needs to be for the procedure to stop working, but I have a file that is more than 23000 columns wide that fails to backspace properly.

Of course, if I made LEN(c) in the first line large enough, I would not need to backspace in the file and could change the last two READ statements to use UNIT=c. But if I don't know how wide the file is, I may set LEN(c) too small; hence the rather complicated procedure above.

So, my question is: is there any reason why BACKSPACE may not be working as expected if the input file is exceedingly wide?

I can provide an example short program and problem input file to illustrate the problem if anyone needs, but you would have to let me know how best to make these available.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2615
Location: Sydney

Posted: Tue Feb 11, 2014 3:11 am

Simon,

If in step 1 you have read the record into a character string C, is there a reason that you can't skip step 3 and apply steps 4 and 5 to the character string C?

From your sample code, it appears that steps 4 and 5 operate on a single line, which is fully contained in string C.

You could even replace step 2 with "Parse string C and return 2 strings, being the date string and the rest of the data".
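
A minimal sketch of that parsing step, assuming the fields are separated by blanks (not tabs) and that the whole line fits in the string. The name split_line and the fixed length of 64 are hypothetical, chosen only for illustration; the sample line is the one from the first post.

```fortran
! Sketch only: split a line into the leading date field and the remainder,
! then read the numeric part with a list-directed internal READ.
PROGRAM split_demo
  IMPLICIT NONE
  CHARACTER(LEN=64) :: c, cdate, crest   ! 64 is an arbitrary assumption
  REAL :: a, b
  INTEGER :: ios
  c = '2010-10/12 12.3 45678.90'
  CALL split_line (c, cdate, crest)
  ! The remainder contains no '/', so list-directed input is safe here
  READ (UNIT=crest, FMT=*, IOSTAT=ios) a, b
  PRINT *, TRIM(cdate), a, b, ios
CONTAINS
  ! Hypothetical helper: date = first blank-delimited field, rest = what follows
  SUBROUTINE split_line (line, date, rest)
    CHARACTER(LEN=*), INTENT(IN)  :: line
    CHARACTER(LEN=*), INTENT(OUT) :: date, rest
    INTEGER :: i1, n
    i1 = VERIFY(line, ' ')           ! first non-blank character
    IF (i1 == 0) THEN                ! blank line: nothing to split
      date = ' '
      rest = ' '
      RETURN
    END IF
    n = INDEX(line(i1:), ' ')        ! first blank after the date
    IF (n == 0) THEN                 ! no blank: the whole line is the date
      date = line(i1:)
      rest = ' '
    ELSE
      date = line(i1:i1+n-2)
      rest = line(i1+n-1:)
    END IF
  END SUBROUTINE split_line
END PROGRAM split_demo
```

Because the date is cut out by position rather than by a run-time format, no BACKSPACE and no second pass over the file is needed.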

I must admit that I go back to when you could only backspace on a binary file. Even then, variable-length binary files were trouble to backspace, which led me to write a library for variable-length binary record files that was based on a fixed-length direct access file.
I'm always cautious of using approaches that were considered inefficient 30 years ago.

John
simon



Joined: 05 Jul 2006
Posts: 299

Posted: Tue Feb 11, 2014 1:44 pm

Hi John,

I cannot go back to reading the data from c because I don't know whether c is long enough to contain all the data.
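
One possible way around the unknown width, sketched under assumptions: read the record in fixed-size chunks with non-advancing input and append them to an allocatable string until the end-of-record condition is raised. This relies on Fortran 2003 features (deferred-length allocatable CHARACTER, IS_IOSTAT_EOR); whether a given FTN95 release supports them, and whether its internal record limit also affects non-advancing reads, is not confirmed here. The names read_whole_line and 'input.txt' and the chunk size of 1024 are hypothetical.

```fortran
MODULE line_reader
  IMPLICIT NONE
CONTAINS
  ! Read one complete record from a formatted sequential unit into an
  ! allocatable string. ios is 0 on success, otherwise the IOSTAT value
  ! of the failing READ (negative at end of file).
  SUBROUTINE read_whole_line (iin, line, ios)
    INTEGER, INTENT(IN) :: iin
    CHARACTER(LEN=:), ALLOCATABLE, INTENT(OUT) :: line
    INTEGER, INTENT(OUT) :: ios
    CHARACTER(LEN=1024) :: chunk     ! chunk size is an arbitrary assumption
    INTEGER :: nread
    line = ''
    DO
      READ (UNIT=iin, FMT='(A)', ADVANCE='no', SIZE=nread, IOSTAT=ios) chunk
      IF (IS_IOSTAT_EOR(ios)) THEN
        ! End of record: keep the final partial chunk and report success
        line = line // chunk(1:nread)
        ios = 0
        RETURN
      ELSE IF (ios /= 0) THEN
        RETURN                       ! end of file or a read error
      END IF
      line = line // chunk(1:nread)  ! full chunk, record continues
    END DO
  END SUBROUTINE read_whole_line
END MODULE line_reader

PROGRAM line_reader_demo
  USE line_reader
  IMPLICIT NONE
  INTEGER, PARAMETER :: iin=11
  CHARACTER(LEN=:), ALLOCATABLE :: line
  INTEGER :: ios
  ! 'input.txt' is a placeholder file name
  OPEN (UNIT=iin, FILE='input.txt', ACTION='read', FORM='formatted', &
        STATUS='old', IOSTAT=ios)
  IF (ios /= 0) STOP 'could not open input.txt'
  DO
    CALL read_whole_line (iin, line, ios)
    IF (ios /= 0) EXIT
    PRINT *, 'record width: ', LEN(line)
  END DO
  CLOSE (UNIT=iin)
END PROGRAM line_reader_demo
```

With the whole record held in line, the date and the remaining fields can be read from the string itself, so the question of whether c was long enough does not arise.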
JohnCampbell



Joined: 16 Feb 2006
Posts: 2615
Location: Sydney

Posted: Tue Feb 11, 2014 2:04 pm

There have been a few examples of stream input lately. Could you open the file as:
OPEN (unit=22, file='file_name', access='transparent', &
      form='unformatted', iostat=iostat)

With this you could write a few access routines.
subroutine get_next_character
subroutine get_next_n_characters
subroutine get_rest_of_line
subroutine get_next_line
subroutine get_start_date
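
As a rough sketch of what one of these routines might look like, the following get_next_line collects bytes up to the next line feed. It is written with the Fortran 2003 standard ACCESS='STREAM' instead of the FTN95 'transparent' extension, which is an assumption on top of the suggestion above; the module name, the file name 'file_name' as a literal, and the initial buffer size are also only illustrative.

```fortran
MODULE stream_lines
  IMPLICIT NONE
CONTAINS
  ! Sketch of get_next_line: read single characters from an unformatted
  ! stream unit until a line feed, growing the buffer as required.
  SUBROUTINE get_next_line (iu, line, ios)
    INTEGER, INTENT(IN) :: iu
    CHARACTER(LEN=:), ALLOCATABLE, INTENT(OUT) :: line
    INTEGER, INTENT(OUT) :: ios
    CHARACTER(LEN=:), ALLOCATABLE :: buf, tmp
    CHARACTER(LEN=1) :: ch
    INTEGER :: n
    ALLOCATE (CHARACTER(LEN=256) :: buf)   ! initial size is arbitrary
    n = 0
    DO
      READ (UNIT=iu, IOSTAT=ios) ch
      IF (ios /= 0) EXIT                   ! end of file or error
      IF (ch == ACHAR(10)) EXIT            ! LF ends the line
      IF (ch == ACHAR(13)) CYCLE           ! ignore CR of a CR/LF pair
      IF (n == LEN(buf)) THEN              ! buffer full: double it
        ALLOCATE (CHARACTER(LEN=2*n) :: tmp)
        tmp(1:n) = buf
        CALL MOVE_ALLOC (tmp, buf)
      END IF
      n = n + 1
      buf(n:n) = ch
    END DO
    line = buf(1:n)
    ! A final line without a terminator is still returned as a line
    IF (ios < 0 .AND. n > 0) ios = 0
  END SUBROUTINE get_next_line
END MODULE stream_lines

PROGRAM stream_demo
  USE stream_lines
  IMPLICIT NONE
  INTEGER, PARAMETER :: iu=22
  CHARACTER(LEN=:), ALLOCATABLE :: line
  INTEGER :: ios
  OPEN (UNIT=iu, FILE='file_name', ACCESS='stream', FORM='unformatted', &
        ACTION='read', STATUS='old', IOSTAT=ios)
  IF (ios /= 0) STOP 'could not open file'
  DO
    CALL get_next_line (iu, line, ios)
    IF (ios /= 0) EXIT
    PRINT *, 'line width: ', LEN(line)
  END DO
  CLOSE (UNIT=iu)
END PROGRAM stream_demo
```

Reading one character per READ is simple but may be slow on large files; reading larger blocks and scanning them for line feeds would be the obvious refinement.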

Alternatively, how big could a single record be ? 2k, 20k ..
I don't like data files with excessively long records. ( I don't think formatted I/O does either)
Redefine the record structure to allow for & continuation.

Some hopefully helpful ideas ?

John
IanLambley



Joined: 17 Dec 2006
Posts: 506
Location: Sunderland

Posted: Wed Feb 12, 2014 12:52 pm

John,
You are right about formatted I/O not being happy with long lines, but in the past, when lines got quite long, I have used RECL= with a large number, which seemed as though it might allocate a longer buffer than normal. Now, I know that is really only for direct access, but it seemed to work. I'm not sure which compiler it was; it might have been VAX-11 Fortran, but I think it worked on the FTN77/FTN95 series of compilers as well. Of course all that was a long time ago when I was younger!
Ian
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

Posted: Wed Feb 12, 2014 6:48 pm

RECL may be used legitimately with sequential files in Fortran 95 (Section 9.3.4.5 of the draft standard).

With sequential files, it specifies the maximum record size in characters.

The default maximum record length is processor/compiler dependent.

What is the default maximum record length with FTN95? (Paul?)

If that limit is being exceeded, it could be causing simon's problem with wide files.
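
A small sketch of how that could be checked, assuming a formatted sequential file: request a larger maximum record length with RECL= on the OPEN, then ask the processor with INQUIRE what limit is actually in effect. The file name 'wide.txt' and the value 65536 are hypothetical, and the value a particular compiler reports is not something this sketch can predict.

```fortran
PROGRAM recl_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: iin=11
  INTEGER :: lrecl, ios
  ! Request a larger maximum record length (65536 is an arbitrary choice)
  OPEN (UNIT=iin, FILE='wide.txt', ACTION='read', FORM='formatted', &
        ACCESS='sequential', STATUS='old', RECL=65536, IOSTAT=ios)
  IF (ios /= 0) STOP 'could not open wide.txt'
  ! Report the maximum record length applied to this connection
  INQUIRE (UNIT=iin, RECL=lrecl)
  PRINT *, 'Maximum record length in effect: ', lrecl
  CLOSE (UNIT=iin)
END PROGRAM recl_demo
```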
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
simon



Joined: 05 Jul 2006
Posts: 299

Posted: Mon Feb 24, 2014 9:56 am

The simple program below can be used to reproduce the BACKSPACE problem. It seems that the program fails as soon as the length of the line exceeds 2**13. If the file width exceeds 2**13, BACKSPACE will only move back 8192 (2**13) characters in the file.

If I open the file (line 21) with RECL specified, the BACKSPACE seems unaffected: the program still moves back only 8192 characters in the file.

Code:
! This program identifies maximum width of file for which BACKSPACE works.
PROGRAM p
  IMPLICIT NONE
  INTEGER, PARAMETER :: iout=21
  INTEGER, PARAMETER :: iin=11
  INTEGER :: i,n
  INTEGER, DIMENSION(:), ALLOCATABLE :: r
  CHARACTER(LEN=8) :: c0,c1,c2
  n=1
  DO
    ALLOCATE (r(n))
    DO i=1,n
       r(i)=NINT(RANDOM@()*1.0d4)-1
    END DO
    OPEN (UNIT=iout,FILE='test.txt',ACTION='write',FORM='formatted',STATUS='unknown')
    WRITE (UNIT=iout,FMT='(A)') 'A'
    WRITE (UNIT=iout,FMT='(A,32768I5)') 'B',(r(i),i=1,n)
    WRITE (UNIT=iout,FMT='(A)') 'C'
    CLOSE (UNIT=iout)
!
    OPEN (UNIT=iin,FILE='test.txt',ACTION='read',FORM='formatted',STATUS='old')
    READ (UNIT=iin,FMT=*) c0
    READ (UNIT=iin,FMT=*) c1,(r(i),i=1,n)
    BACKSPACE (UNIT=iin)
    READ (UNIT=iin,FMT=*) c2
    CLOSE (UNIT=iin)
    DEALLOCATE (r)
    IF (c2/=c1) EXIT
    n=n+1
  END DO
  PRINT *, 'C1 ',c1
  PRINT *, 'C2 ',c2
  PRINT *, n,1+5*n
END PROGRAM p
JohnCampbell



Joined: 16 Feb 2006
Posts: 2615
Location: Sydney

Posted: Tue Feb 25, 2014 12:18 am

Simon,

I have probably said this before, but why would you require such long text records ?

I'd change:
WRITE (UNIT=iout,FMT='(A,32768I5)') 'B',(r(i),i=1,n)
to:
WRITE (UNIT=iout,FMT='(A,2I5)') 'B',n
WRITE (UNIT=iout,FMT='(A,2I5)') ('B',i,r(i),i=1,n)
or:
WRITE (UNIT=iout,FMT='(A,2I5)') 'B',n
WRITE (UNIT=iout,FMT='(1x,10I5)') (r(i),i=1,n)

When you go to read this, there would be no need for backspace and the overhead for file size or I/O time is not significant.
Provide a file data structure that is easy to manage. I work with survey points files with up to 100 million points and a simple text format that is easy to review saves a lot of time.
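
To illustrate the second variant, here is a sketch that writes a count followed by the values wrapped ten per line, then reads them back without any BACKSPACE. The scratch file, the array contents, and n=25 are made up for the demonstration; the two format strings are the ones suggested above (note that the I5 header limits n to 99999).

```fortran
PROGRAM wrapped_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: iu=21
  INTEGER :: i, n
  INTEGER, DIMENSION(:), ALLOCATABLE :: r, r2
  CHARACTER(LEN=1) :: c1
  n = 25                                  ! arbitrary demonstration size
  ALLOCATE (r(n))
  r = (/ (7*i, i=1,n) /)                  ! made-up data
  OPEN (UNIT=iu, STATUS='scratch', FORM='formatted', ACTION='readwrite')
  ! Header record: tag and count; data records: ten values per line
  WRITE (UNIT=iu, FMT='(A,2I5)') 'B', n
  WRITE (UNIT=iu, FMT='(1x,10I5)') (r(i), i=1,n)
  REWIND (UNIT=iu)
  ! Read the count first, allocate, then read the wrapped values;
  ! format reversion moves to a new record after every ten items
  READ (UNIT=iu, FMT='(A,I5)') c1, n
  ALLOCATE (r2(n))
  READ (UNIT=iu, FMT='(1x,10I5)') (r2(i), i=1,n)
  CLOSE (UNIT=iu)
  PRINT *, c1, n, ALL(r2 == r)
END PROGRAM wrapped_demo
```

No record in such a file is wider than 51 characters, whatever the value of n.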

There are alternatives to notepad for large files. I even have my own line editor that displays the first 1gb of a text file.
You might claim to have no control of the input file format, but if I received a file as you show, the first thing I would do (and have actually done) is write a conversion to a more manageable format and archive the originals. Putting in a few extra <CR><LF> won't cost you much.

John
simon



Joined: 05 Jul 2006
Posts: 299

Posted: Tue Feb 25, 2014 2:10 am

Thanks John,

The basic principle I am applying here is that the software I am trying to create should be able to read somebody else's file without me imposing limits on the user. I don't want to have to say to the user "sorry, if your file is wider than 8192 characters then you will have to reformat it somehow." The point is not so much that I want to be able to read or create wide files, the point is IF one has a wide file, how do you read it?

I agree that there is a strong case for using a more sensible file format. But it would be helpful to at least get an error message from BACKSPACE if it has not worked. Anyway, perhaps Paul could include in the manual somewhere a comment that there is a limit of 8192.

One bonus is that FTN95 can at least read wider files than NAGWare, for example. NAGWare baulks beyond 1024, but it complains at the stage of trying to read the line whereas FTN95 keeps going and you don't immediately realise that there has been an error.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2615
Location: Sydney

Posted: Tue Feb 25, 2014 4:03 am

Simon,

You could look at FORM='UNFORMATTED',ACCESS='TRANSPARENT'
I have found this to provide a good solution, as you can manage whatever character buffer you require. It is easy to write subroutines to read or write characters and build up strings and records.
I am not sure if you can use an internal formatted read on a very long character, such as
character buffer*20000

I read the next record into a large character array (1 character at a time) and do my own parsing for numbers etc. It works very well.
The line editor I mentioned has the following declarations which provide for large records and files:
Code:
      INTEGER*4, PARAMETER :: milion =   1000000  ! 1 million
      INTEGER*4, PARAMETER :: MAXLIN = 20*milion  ! max lines in file       20m
      INTEGER*4, PARAMETER :: MAXSTR =950*milion  ! max characters in file 950mb
!
      COMMON /FILCOM/ CSTOR(MAXSTR)
      COMMON /FILIND/ START(MAXLIN), LENGTH(MAXLIN), LINE_ORDER(MAXLIN)
!
      CHARACTER*1   CSTOR
      INTEGER       START, LENGTH, LINE_ORDER
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8210
Location: Salford, UK

Posted: Tue Feb 25, 2014 8:44 am

At a quick glance the limit looks like 32K.
All times are GMT + 1 Hour