forums.silverfrost.com Welcome to the Silverfrost forums
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
Posted: Wed Oct 19, 2016 5:15 pm Post subject: IOSTAT = 52 |
|
|
I was trying to read what I thought was a plain ASCII file, and the program crashed. The value returned by IOSTAT was 52 instead of 0, which means "invalid character in field". It turns out that the "ASCII" file I received is actually UTF-8: when I opened it, I found a strange character at the very first position of the first line of data. My question is: is FTN95 unable to read UTF-8 files, only ANSI? And if that is the case, what can I do to work with this type of file?
Agustin |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7916 Location: Salford, UK
|
Posted: Wed Oct 19, 2016 6:02 pm Post subject: |
|
|
What do your open and read statements look like? |
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
Posted: Wed Oct 19, 2016 6:26 pm Post subject: |
|
|
Code: |
open(37,file=filename,status='old',action='read')
ndata=0
do
  read(37,*,iostat=stat) x1,y1
  if(stat/= 0) exit
  ndata=ndata+1
end do
close(37) |
|
|
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Wed Oct 19, 2016 6:40 pm Post subject: |
|
|
If you can post the file so it can be downloaded (i.e. share on Dropbox or GoogleDrive), I'd be happy to take a look at it in detail and report back, either here or directly to you. Just post the link here.
Bill |
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
|
|
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Wed Oct 19, 2016 7:17 pm Post subject: |
|
|
You need to set the permissions to "Anyone with a link". If you need to restrict access, you can use (wahorger@gmail.com) to authorize. Either way is fine. |
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
|
|
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Wed Oct 19, 2016 7:47 pm Post subject: |
|
|
The first 3 bytes of the file are non-ASCII: EF BB BF is the UTF-8 byte order mark (BOM). The remainder of the file is standard ASCII with CR/LF between the data lines.
You could handle this in a number of ways, but let's assume you don't want to edit the file in any way.
1. You could read the first line with the format (A3,A), so that the first three characters get swallowed and ignored. Then read the data portion of that line with list-directed I/O applied to the character variable (an internal read), and read the remainder of the file with ordinary list-directed I/O. This seems to be the easiest solution. I tested this code against your file, and it correctly reported ndata=1999 and istat=-1 (EOF).
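Code: |
character*3 ignore_me    ! receives the 3-byte UTF-8 BOM
character*32 read_me     ! receives the rest of the first line
open(37,file='utf-8-file.dat',status='old',action='read')
ndata=0
! first line: skip the BOM, then parse the numbers from the text
read(37,"(a3,a)")ignore_me,read_me
read(read_me,*)x1,y1
print *,"x1,y1=",x1,y1
ndata=1
! the remaining lines are plain ASCII
do
  read(37,*,iostat=istat) x1,y1
  if(istat/= 0) exit
  ndata=ndata+1
end do
print *,istat,ndata
close(37)
end
|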
Bill |
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
Posted: Wed Oct 19, 2016 9:19 pm Post subject: |
|
|
Thanks, Bill! I will try your code, because I do not want to edit every file that comes from my lab instrument before running my program. Until now I was importing the data as an ASCII file into, say, Origin and then exporting it as ASCII again. The funny thing is that I didn't know the "ASCII" file provided by the instrument was actually UTF-8. Origin had no problem with it at all, but it seems that Fortran cannot deal with this type of file directly. Good to know that there is a way to overcome this issue. Thanks again!
Agustin |
|
|
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Wed Oct 19, 2016 9:53 pm Post subject: |
|
|
Glad to be of some assistance. |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Wed Oct 19, 2016 10:14 pm Post subject: |
|
|
You could open the file and overwrite the first 3 characters.
The following program appears to do this. (I first tried access=transparent, but it did not work.)
It needs more work to include if (iostat/=0) ...
Code: |
character file_name*40
character ignore_me*1
integer*4 iostat, i
!
call get_command_argument (1, file_name)
! file_name = 'utf-8-file.dat'
! open as 1-byte direct-access records so single bytes can be rewritten
open (unit = 11, &
      file = file_name, &
      status = 'OLD', &
      form = 'UNFORMATTED', &
      access = 'DIRECT', &
      recl = 1, &
      iostat = iostat)
!
! overwrite the first 3 bytes with blanks
ignore_me = ' '
do i = 1,3
  write (unit=11, rec=i, iostat=iostat) ignore_me
end do
close (unit=11)
end |
|
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
Posted: Wed Oct 19, 2016 11:53 pm Post subject: |
|
|
Well, it seems that I was not quite clear, because I did not mention what happens next. The complete sequence is this: I open the file and count the data points, then I use that count to allocate the array and read the data once again. Bill's solution fails, and so does John's. The code I have for these steps is:
Code: |
type coordenate
  real*4 x
  real*4 y
end type coordenate
type(coordenate),dimension(:),allocatable :: data_point
real*4 x1,y1
character*129 filename
integer stat,i
open(37,file=filename,status='old',action='read')
ndata=0
do
  read(37,*,iostat=stat) x1,y1
  if(stat/= 0) exit
  ndata=ndata+1
end do
close(37)
if(allocated(data_point)) deallocate(data_point)
allocate(data_point(ndata))
open(37,file=filename,status='old',action='read')
do i=1,ndata
  read(37,*) data_point(i)%x,data_point(i)%y
end do
close(37)
|
|
|
|
|
aebolzan
Joined: 06 Jul 2007 Posts: 229 Location: La Plata, Argentina
|
Posted: Thu Oct 20, 2016 12:29 am Post subject: |
|
|
IT WORKS! Sorry, it was my fault: I made a mistake when adding John's code, and that was why the subroutine failed (I did not notice that John had changed filename to file_name). Now it works fine. THANKS!
Time to go to bed. It seems that my eyes do not see well at night...
But how can I detect, with Fortran, whether an ASCII file is UTF-8 or not? The program now works fine for UTF-8 files, but if I get an ANSI file, I will be erasing the first three characters of the data!
Agustin |
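One possible way to answer this question, given as a minimal sketch and not as a tested FTN95 solution: open the file the same way John's program does (unformatted, direct access, one-byte records) and compare the first three bytes with EF BB BF. The module name `bom_check`, the function name `has_utf8_bom` and the use of unit 11 are illustrative choices, not something from the thread. The sketch assumes that `recl = 1` means one byte, as John's program relies on, and that `char()` accepts codes above 127.

```fortran
module bom_check
  implicit none
contains
  ! Returns .true. when the file starts with the bytes EF BB BF
  ! (the UTF-8 byte order mark), .false. otherwise or on any I/O error.
  logical function has_utf8_bom (file_name)
    character(len=*), intent(in) :: file_name
    integer, parameter :: bom(3) = (/ 239, 187, 191 /)   ! EF BB BF
    character(len=1)   :: c
    integer            :: i, ios

    has_utf8_bom = .false.
    ! unit 11 is an arbitrary choice; recl = 1 is assumed to be one byte
    open (unit=11, file=file_name, status='old', action='read', &
          form='unformatted', access='direct', recl=1, iostat=ios)
    if (ios /= 0) return

    do i = 1, 3
      read (unit=11, rec=i, iostat=ios) c
      if (ios /= 0) exit               ! file shorter than 3 bytes
      if (c /= char(bom(i))) exit      ! byte differs from the BOM
    end do
    ! the loop index only passes 3 when all three bytes matched
    has_utf8_bom = (i > 3)
    close (unit=11)
  end function has_utf8_bom
end module bom_check
```

With something like this, the overwrite loop in John's program could be guarded by `if (has_utf8_bom(file_name))`, so that an ANSI file is left untouched.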
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Thu Oct 20, 2016 12:52 am Post subject: |
|
|
The following works without changing the file:
Code: |
type coordenate
  real*4 x
  real*4 y
end type coordenate
type(coordenate),dimension(:),allocatable :: data_point
!
real*8 x1,y1
character*129 filename
integer i
integer*4 count_lines, ndata
external count_lines
!
filename='\tmp\utf-8-file.dat'
!
ndata = count_lines (filename)
write (*,*) ndata,' lines identified'
!
if (allocated(data_point)) deallocate(data_point)
allocate(data_point(ndata))
!
do i=1,ndata
  call get_xy (x1,y1,i)
  data_point(i)%x = x1
  data_point(i)%y = y1
end do
write (*,*) ndata,' lines recovered'
!
close(37)
end

integer*4 function count_lines (filename)
character filename*129
character line*80
integer*4 i, iostat
open (37,file=filename,status='old',action='read', iostat=iostat)
write (*,*) 'Opening file :',trim(filename),' iostat=',iostat
!
do i = 1,1000000
  read (37,fmt='(a)', iostat=iostat) line
  if ( iostat /= 0 ) then
    write (*,*) 'iostat =',iostat,' at line',i
    if ( iostat < 0 ) exit
  end if
end do
rewind (37)
count_lines = i-1
end function count_lines

subroutine get_xy (x1,y1,i)
real*8 x1,y1
integer*4 i
integer*4 iostat
character line*80
!
x1 = -1
y1 = -1
read (37,fmt='(a)', iostat=iostat) line
if ( iostat /= 0 ) then
  write (*,*) 'error reading file : iostat =',iostat,' at line',i
  if ( iostat < 0 ) return
end if
!
call clean_line (line, i)
!
read (line,fmt='(2f30.0)',iostat=iostat) x1,y1
if ( iostat /= 0 ) then
  write (*,*) 'error reading from line : iostat =',iostat,' at line',i
  if ( iostat < 0 ) return
end if
!
end subroutine get_xy

subroutine clean_line (line, i)
!
! check for numeric line, removing parity characters
!
character line*(*), c
integer*4 i, j,k
!
do j = 1,len_trim(line)
  c = line(j:j)
  k = ichar (c)
  if ( k > 127 ) then
    write (*,*) 'parity set in line',i,j,' ',c,k-128
    line(j:j) = ' '
    cycle
  end if
  if ( index ('0123456789.+-, ',c) > 0 ) cycle
  write (*,*) 'unrecognised character :',c,k
end do
!
end subroutine clean_line |
I would use real*8 for the x values you are reading.
You could improve clean_line to do more cleaning.
John |
|
|
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Thu Oct 20, 2016 2:38 am Post subject: |
|
|
I appreciate the work that John has done, but it seems like a lot of work to do something simple.
BTW, I ran the program I posted on the data set you provided, and it worked, so I'd appreciate knowing what is different in what you ran and what error(s) you got. If you run it on a non-UTF-8 file then yes, it will fail. That wasn't the question posed.
To see what kind of file you have, execute the read of the first line as I have outlined. Then look at the data retrieved, and if it is equal to the UTF-8 header, read the file as UTF-8. Otherwise it is ASCII; perform a REWIND on the file and begin again.
One way to handle every case is to read every character, bypassing the UTF-* header (if any) to get to the data, and then reconstruct every line of data regardless of the header contents.
It can be done, but why? If the data are naturally constrained (UTF-8 or not), then use the constraints and go forward. That is easily done, easily documented (for the next poor soul dealing with the data), and it gets the job done more quickly.
Bill |
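The first-line check described above could be sketched roughly as follows. This is a variation, not Bill's exact procedure: it always reads the first line into a character buffer, blanks out the three header bytes if they are there, and parses the numbers from the buffer, so no REWIND is needed in either case. The module name `first_line`, the routine names `read_first_pair` and `count_pairs`, and the 132-character line buffer are assumptions made for illustration; unit 37 is taken from the code earlier in the thread.

```fortran
module first_line
  implicit none
contains
  ! Reads the first data line from an already opened formatted unit and
  ! returns the two values on it, ignoring a leading UTF-8 BOM if present.
  subroutine read_first_pair (unit, x, y, ios)
    integer, intent(in)  :: unit
    real,    intent(out) :: x, y
    integer, intent(out) :: ios
    character(len=132)   :: line   ! assumed long enough for one data line
    character(len=3)     :: bom

    bom = char(239)//char(187)//char(191)    ! EF BB BF
    x = 0.0
    y = 0.0
    read (unit, '(a)', iostat=ios) line
    if (ios /= 0) return
    if (line(1:3) == bom) line(1:3) = ' '    ! blank out the header bytes
    read (line, *, iostat=ios) x, y          ! list-directed internal read
  end subroutine read_first_pair

  ! Counts the x,y pairs in a file, with or without a UTF-8 BOM.
  integer function count_pairs (file_name)
    character(len=*), intent(in) :: file_name
    real    :: x, y
    integer :: ios

    count_pairs = 0
    open (37, file=file_name, status='old', action='read', iostat=ios)
    if (ios /= 0) return
    call read_first_pair (37, x, y, ios)
    do while (ios == 0)
      count_pairs = count_pairs + 1
      read (37, *, iostat=ios) x, y          ! remaining lines are plain ASCII
    end do
    close (37)
  end function count_pairs
end module first_line
```

The second pass in Agustin's program could use the same idea: call `read_first_pair` for `data_point(1)` and keep the existing list-directed reads for the remaining points.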
|