 |
forums.silverfrost.com Welcome to the Silverfrost forums
|
View previous topic :: View next topic |
Author |
Message |
johannes
Joined: 21 Jan 2011 Posts: 65 Location: Leimen, Germany
|
Posted: Tue May 29, 2012 9:02 am Post subject: Maximum length of a character variable? |
|
|
Hi all,
here an odd situation. There are files containing lines of very large length, up to 1 to 3 million characters until EOL. The format dates back to times with punched tapes, but did survive. When you edit such a file with notepad++, the length of the line is displayed - and it is a large number.
I had no luck in reading the long line using this piece of code. It works for much smaller numbers but not for 2 millions.
INTEGER :: nlen=2000000,iline=0
CHARACTER(LEN=nlen) :: cline
DO
iline=iline+1
READ (logunit,'(A)',iostat=istatus) cline
print *,'iline,istatus=',iline,istatus,TRIM(cline)
if (istatus/=0) EXIT
ENDDO
Question: is there a limit on the length of character variables? Who has any idea to split a long record into pieces?
A test file could be provided. |
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Tue May 29, 2012 12:39 pm Post subject: |
|
|
I would recommend using "character*1 contents(big_number)" rather than "character*big_number contents".
I do not expect that there is a problem with a maximum line length.
I changed a program that reads any file to also store the contents.
Hopefully this will work. Let me know how it goes.
Code: | !
CHARACTER file_name*128, get_file_name*128, c, c_typ*4
INTEGER*4 ic, iostat, jc
integer*8 count_ic(0:256), n, nc, na, nz, nl
integer*8 :: one = 1
EXTERNAL get_file_name
!
integer*4, parameter :: million = 1000000
integer*4, parameter :: max_file_size = 50*million
integer*4 num_line, next_char, line_len, max_len
integer*4 line_index(million)
character line_content(max_file_size)
!
character ch_name(0:32)*34
data ch_name( 0) / ' 00 NUL Null character ' /
data ch_name( 1) / ' 01 SOH Start of Header ' /
data ch_name( 2) / ' 02 STX Start of Text ' /
data ch_name( 3) / ' 03 ETX End of Text ' /
data ch_name( 4) / ' 04 EOT End of Transmission ' /
data ch_name( 5) / ' 05 ENQ Enquiry ' /
data ch_name( 6) / ' 06 ACK Acknowledgment ' /
data ch_name( 7) / ' 07 BEL Bell ' /
data ch_name( 8) / ' 08 BS Backspace ' /
data ch_name( 9) / ' 09 HT Horizontal Tab ' /
data ch_name( 10) / ' 0A LF Line feed ' /
data ch_name( 11) / ' 0B VT Vertical Tab ' /
data ch_name( 12) / ' 0C FF Form feed ' /
data ch_name( 13) / ' 0D CR Carriage return ' /
data ch_name( 14) / ' 0E SO Shift Out ' /
data ch_name( 15) / ' 0F SI Shift In ' /
data ch_name( 16) / ' 10 DLE Data Link Escape ' /
data ch_name( 17) / ' 11 DC1 Device Control 1 (XON) ' /
data ch_name( 18) / ' 12 DC2 Device Control 2 ' /
data ch_name( 19) / ' 13 DC3 Device Control 3 (XOFF) ' /
data ch_name( 20) / ' 14 DC4 Device Control 4 ' /
data ch_name( 21) / ' 15 NAK Negative Acknowledgement ' /
data ch_name( 22) / ' 16 SYN Synchronous idle ' /
data ch_name( 23) / ' 17 ETB End of Transmission Block' /
data ch_name( 24) / ' 18 CAN Cancel ' /
data ch_name( 25) / ' 19 EM End of Medium ' /
data ch_name( 26) / ' 1A SUB Substitute ' /
data ch_name( 27) / ' 1B ESC Escape ' /
data ch_name( 28) / ' 1C FS File Separator ' /
data ch_name( 29) / ' 1D GS Group Separator ' /
data ch_name( 30) / ' 1E RS Record Separator ' /
data ch_name( 31) / ' 1F US Unit Separator ' /
data ch_name( 32) / ' 7F DEL Delete ' /
!
! Open the text file as transparent
!
file_name = get_file_name ()
!
open (unit = 11, &
file = file_name, &
status = 'OLD', &
form = 'UNFORMATTED', &
access = 'TRANSPARENT', &
iostat = iostat)
write (*,2000) 'Opening file ',trim(file_name),' iostat = ',iostat
2000 format (a,a,a,i0)
!
! Read all characters and lines
!
count_ic = 0
n = 0
nl = 0
nc = 0
na = 0
nz = 0
!
next_char = 1
num_line = 1
line_len = 0
max_len = 0
line_index(num_line) = next_char
!
do
read (unit=11, iostat=iostat) c
if (iostat /= 0) exit
!
n = n + one
ic = ichar(c)
count_ic(ic) = count_ic(ic)+one
!
|
|
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Tue May 29, 2012 12:41 pm Post subject: |
|
|
Code: | !
select case (ic)
case (10)
nl = nl + one ! line control characters ; 10:LF indicates end of line
num_line = num_line + 1
if (num_line < million) line_index(num_line) = next_char
max_len = max (max_len, line_len)
line_len = 0
case (13)
nl = nl + one ! line control characters ; 10:LF indicates end of line
case (0:9,11,12,14:31)
nc = nc + one ! control characters
case (32:126)
na = na + one ! normal character
line_content(next_char) = c
if (next_char < max_file_size) next_char = next_char + 1
line_len = line_len + 1
case (127:256)
nz = nz + one ! other character
case default
write (*,*) 'Unrecognised character : ichar =',ic
end select
end do
!
write (*,2001) trim(file_name)
2001 format (/'Count of active characters in file ',a// 'Ichar C Char_Count')
2002 format (i5,a5,b'zz,zzz,zzz,zz#',2x,a)
do ic = 256,0,-1
if (count_ic(ic) < one) cycle
if (ic >= 32 .AND. ic < 128) then
write (*,2002) ic, char(ic), count_ic(ic)
else if (ic < 32 .or. ic == 128) then
jc = min (ic,32)
c_typ = ch_name(jc)(4:7)
write (*,2002) ic, c_typ, count_ic(ic), trim (ch_name(jc)(10:))
else
c_typ = '???'
if (ic==10) c_typ = '<LF>'
if (ic==13) c_typ = '<CR>'
write (*,2002) ic, c_typ, count_ic(ic)
end if
end do
!
write (*,*) ' '
if (iostat == -1) then
write (*,2003) n,' characters read from file ',trim(file_name)
else
write (*,2004) 'end of file after',n,' characters : IOSTAT = ',iostat
end if
2003 format (b'zz,zzz,zzz,zz#',a,a)
2004 format (a,b'zz,zzz,zzz,zz#',a,i0)
write (*,2003) na, ' normal text characters 32-126'
write (*,2003) nl, ' line control characters 10,13'
write (*,2003) nc, ' other control characters 0-31'
write (*,2003) nz, ' non-printable characters 127-255'
nl = max (count_ic(10), count_ic(13))
write (*,2003) nl, ' lines identified if text file'
!
if (count_ic(10) == nl .and. count_ic(13) == 0) then
write (*,2000) ' NOTE : No <CR> in file : not typical DOS text file'
end if
!
if ( count_ic(10) /= count_ic(13)) then
if ( count_ic(10) == 0) then
write (*,2000) ' Only <CR> used for end of line'
else if ( count_ic(13) == 0) then
write (*,2000) ' Only <LF> used for end of line'
else
write (*,2000) ' Inconsistent <LF> and <CR> count found'
end if
end if
!
write (*,*) ' '
write (*,2003) num_line-1, ' lines stored'
write (*,2003) max_len, ' maximum line length identified'
write (*,2003) next_char-1, ' text characters identified in file'
!
end
|
Last edited by JohnCampbell on Tue May 29, 2012 12:44 pm; edited 1 time in total |
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Tue May 29, 2012 12:43 pm Post subject: |
|
|
We realy need a better way of attaching code examples !
Code: | character*128 function get_file_name ()
character*132 fname
integer*4 i,n
!
call command_line (fname)
!
n = 0
do i = 1,len(fname)
if (fname(i:i) /= '"') then
n = n+1
if (n==i) cycle
fname(n:n) = fname(i:i)
fname(i:i) = ' '
else
fname(i:i) = ' '
end if
end do
!
get_file_name = fname
end function get_file_name
|
You might want to change the code to include control characters and ic>=127 in the line if that suits.
I think the important issues are use a character*1 array and open the file for access = 'TRANSPARENT'
John |
|
Back to top |
|
 |
johannes
Joined: 21 Jan 2011 Posts: 65 Location: Leimen, Germany
|
Posted: Tue May 29, 2012 3:35 pm Post subject: |
|
|
Hello John,
thank you for the brilliant idea. Unformatted reading is the key, Transparent seems not to be necessary.
This is my final piece of code in a subroutine.
Regards,
Johannes
Code: | SUBROUTINE READ_longline(lugerb,cline,n,istatus)
! read very long lines character by character unformatted
INTEGER, INTENT(IN) :: lugerb
CHARACTER(LEN=*), INTENT(OUT) :: cline ! line in a long character variable
INTEGER, INTENT(OUT) :: n ! number of good characters in cline
INTEGER :: iline=0,istatus,ic,ncmax
CHARACTER :: c ! aux character
SAVE iline
! Reading gerbfile quasi line by line as Unformatted character by character
iline=iline+1
ncmax=LEN(cline)
cline=' '
n=0 ! number of characters in line
do
read (unit=lugerb, iostat=istatus) c ! does not need a counter
if (istatus /= 0) THEN ! last line does not end with LF
!print *,cline(1:n)
EXIT
endif
n=n+1 ! n'th character read successfully
ic = ichar(c)
if (ic>=32.and.ic<=126) THEN ! normal character
IF (n<ncmax) THEN
cline(n:n)=c
ELSE
print *,'READ_longline: number of characters in cline exceeded'
istatus=-333
RETURN
ENDIF
else if (ic==10) THEN ! line control characters ; 10:LF indicates end of line
n=n-1
!print *,cline(1:n) ! don't use the LF Character itself
EXIT
else
print *,'READ_longline: unrecognized character:',c,' in line',iline ! should not happen
EXIT
endif
enddo
RETURN
END |
|
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Tue May 29, 2012 11:33 pm Post subject: |
|
|
I'm surprised that you do not need TRANSPARENT. I've always assumed you would.
To check, report the number of characters read from the file and compare this to the file size.
John |
|
Back to top |
|
 |
johannes
Joined: 21 Jan 2011 Posts: 65 Location: Leimen, Germany
|
Posted: Wed May 30, 2012 8:56 am Post subject: |
|
|
Hi John,
you are right, access=transparent is the only correct way:
OPEN (unit=lugerb,file=TRIM(file),form='unformatted',access='transparent',iostat=ierror) .
BTW for later readers: The Subject 'maximum length of character variables?' was not correct.
Better: 'maximum length of formatted records?' This seems to be around some 10.000
Regards
johannes |
|
Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|