|
forums.silverfrost.com Welcome to the Silverfrost forums
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Tue Jan 13, 2015 7:55 pm Post subject: Sporadic error code when opening binary, direct access file |
|
|
I'm getting an error code (IOSTAT=) of 10005. This error is not documented, so I'm at a loss to trap and deal with it.
I am opening a file for exclusive access (read and write).
Any clues? Is there a list of these extended kinds of error codes?
Thanks!
Bill |
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7916 Location: Salford, UK
|
Posted: Wed Jan 14, 2015 8:51 am Post subject: |
|
|
Can you reproduce the error in a small program that we can read?
In theory one of the error reporting routines should give the reason for the error but this error number seems high.
Offhand I don't know which routine to use, but I am thinking of routines like DOS_ERROR_MESSAGE@. |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Sat Jan 17, 2015 4:10 am Post subject: |
|
|
I found the root cause of the problem, at least the event that precipitates it.
I was accessing a file while, in the background, my "cloud service" was trying to save the file. I open the file as new to force the creation, immediately close it, then open it later in the code. It was sporadic because most of the time it sneaked past the cloud service.
I had opened the direct access file as DENYRW. That seemed to be the best at the time, full contention lockout. But then this odd error code showed up.
I have changed it to DENYWR (allow reading by other programs) and have only received the proper error codes since then. |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Mon Jan 19, 2015 6:39 am Post subject: |
|
|
Again, proven wrong by events. With cloud service turned off, I'm getting 10005 AND 10002 errors, always on the same file. 10005 predominates.
I am trying to open an unformatted, direct access file as "UNKNOWN". I know that if the file already exists, I should delete it, so that's what I do. Here is the code segment:
Code: |
 2200 CONTINUE
C --- DIRECT ACCESS
      IF(OLD_NEW.EQ.2 .AND. SEQ_DA.EQ.2) THEN
        CALL ERASE@(trim(FNAME),ICHECK)
c --- 2=file not found
        IF(ICHECK.NE.0.and.icheck.ne.2)THEN
          PRINT *,"DA: Delete of ",trim(fname),' code=',icheck
          CALL DOS_ERROR_MESSAGE@(ICHECK,TTBUFF)
          PRINT *,"MSG:",TRIM(TTBUFF)
          GO TO 2900
        else
          print *,"Deleted ",trim(fname)
        ENDIF
      ENDIF
      ICHECK=0 ! ENSURE THE RETURN CODE IS SET APPROPRIATELY
      OPEN(UNIT=IUNIT,FILE=TRIM(FNAME),STATUS=FILE_STATUS(OLD_NEW),
     $ ACCESS=FILE_ACCESS(SEQ_DA),RECL=IISIZE,FORM=FILE_FORM(SEQ_DA),
     $ SHARE=FILE_CONTENTION(ACCESS_STYLE),IOSTAT=ICHECK)
|
The error is occurring on Unit=15. This unit number may change as the program executes and opens/closes files. For this error, however, unit 15 is always the one chosen, repeatedly, as I process each of the 167 files.
Since I have added extra print statements to trace the problem, the problem seems to have changed (a bit) and occurs less frequently.
Is it possible that there is a "race" condition where I am deleting the file then trying to open it as new, yet the OS has not yet actually deleted the file, and the IO code reflects that? |
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7916 Location: Salford, UK
|
Posted: Mon Jan 19, 2015 11:11 am Post subject: |
|
|
You could try calling SLEEP1@(val) to see if this helps.
Make val suitably large to start with, say 1 second. |
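The suggestion above could be sketched as a bounded retry loop around the OPEN. This is only an illustration, not code from the original program: the names open_with_retry, max_tries and pause_seconds are hypothetical, the OPEN specifiers are simplified (no SHARE=), and SLEEP1@ is the FTN95 routine mentioned in this thread, assumed here to take a REAL number of seconds.

```fortran
! Sketch only: retry an OPEN a limited number of times, pausing between tries.
! open_with_retry, max_tries and pause_seconds are hypothetical names.
subroutine open_with_retry(iunit, fname, irecl, max_tries, pause_seconds, icheck)
  implicit none
  integer, intent(in)          :: iunit, irecl, max_tries
  character(len=*), intent(in) :: fname
  real, intent(in)             :: pause_seconds
  integer, intent(out)         :: icheck
  integer :: itry, ntries

  ntries = max(1, max_tries)          ! always make at least one attempt
  do itry = 1, ntries
     open(unit=iunit, file=trim(fname), status='unknown', access='direct', &
          form='unformatted', recl=irecl, iostat=icheck)
     if (icheck == 0) return          ! opened successfully
     print *, 'OPEN attempt', itry, ' failed, IOSTAT=', icheck
     ! FTN95-specific pause (assumed REAL seconds); skipped after the last try
     if (itry < ntries) call sleep1@(pause_seconds)
  end do
  ! On exit from the loop, icheck holds the IOSTAT of the final failed attempt
end subroutine open_with_retry
```

A call such as `call open_with_retry(15, fname, iisize, 10, 1.0, icheck)` would try for roughly ten seconds before giving up, leaving the last error code in icheck for the caller to report.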
|
IanLambley
Joined: 17 Dec 2006 Posts: 490 Location: Sunderland
|
Posted: Mon Jan 19, 2015 2:02 pm Post subject: |
|
|
Instead of using the erase@ subroutine, try opening the file and closing it with Status='delete'.
It probably doesn't matter how you open the file, but any file can be opened as direct access with a record length of 1.
e.g.
Code: |
subroutine kill_file(name,ichan)
character*(*) name
open(unit=ichan,file=name,status='unknown',access='direct',recl=1)
close(unit=ichan,status='delete')
end
|
Or you could wait for the file to be deleted after a call to erase@, e.g.
Code: |
subroutine wait_for_kill_file(name,error_code)
character*(*) name
logical*4 fexists@
integer*4 error_code
do while (fexists@(name,error_code))
enddo
end
|
or
Code: |
subroutine kill_file_and_wait(name,error_code1,error_code2)
character*(*) name
logical*4 fexists@
integer*4 error_code1,error_code2
call erase@(name,error_code1)
do while (fexists@(name,error_code2))
enddo
end
|
These are untested suggestions.
Regards
Ian |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Mon Jan 19, 2015 6:51 pm Post subject: |
|
|
Ian, good suggestions. It looks like I'll have to do something like that (read below).
Paul, I was using a 1 second delay before attempting to retry the operation. Sometimes it never worked (within 60 seconds). Other times it would succeed after a few (4-10) retries.
First, before changing the OPENing code, I took the exclusive access off of this file, just letting it remain a regular file. Still failed, but much more rarely. In the 167 file openings, it first failed on file 57. Same bizarre status code (10005).
Second, I tried to open the file as "TRANSPARENT", but that did not seem to work with record numbers in the READ/WRITE statements. I abandoned this.
Next, I recoded the open. The significant change is that I did not try to delete the file before opening it, like I did when the file is to be opened as NEW (UNKNOWN). I did not trap any errors. However, accessing the file itself now appears to be exceedingly slow, which was unexpected. Still trying to track that down, as other files that are opened as OLD, direct access don't seem to be slow in their access.
By way of explanation, deleting the file allowed me to reuse it for different temporary purposes as needed. However, applying a different record length to an existing file would cause an error to be thrown (IOSTAT=105 or 108). So, if the file is to be opened as new, I delete it first to prevent this error.
I even tried placing the file on a different logical drive, just to make sure it was not some fluke of where the file resided. No difference in performance; still slow.
Lastly, I implemented a delete operation just prior to the OPEN for a status of NEW (UNKNOWN). This code segment is below. Better, but no joy. The 10005 error still occurs. Referring to the tracing statements in the code, the following happens:
File is found to exist.
File is deleted.
File is found not to exist.
Initial open results in IOSTAT=10005
Try again to open the file, IOSTAT=10005
Code stops trying.
The code that is running at this time is in a fairly tight loop. It finds the target file to be opened from a list of files, then opens that file, which causes the temporary file to be opened, and processes the file contents into the direct access, unformatted file I'm having an issue with. At the end of writing this work file, the work file is closed and control is returned to the looping section. If the direct access file cannot be opened, the lower-level routine quits and returns an error indicator, stopping the loop.
Any additional insight would be much appreciated. Yes, the code looks horrible. Will be cleaned up once I understand what can and cannot work!!
Code: |
      ICHECK=0 ! ENSURE THE RETURN CODE IS SET APPROPRIATELY
      if(old_new.eq.2) then
        if(fexists@(trim(fname),icheck)) then
          if(icheck .ne. 0) print *,"fexists@ returned",icheck
          call erase@(trim(fname),icheck)
          print *,"Deleting file before 'new':",trim(fname),icheck
          do 2211 ii=1,10
            if(.not.fexists@(trim(fname),icheck)) go to 2212
            call sleep1@(0.2)
 2211     continue
 2212     print *,"Waited to delete",ii
        endif
      endif
      OPEN(UNIT=IUNIT,FILE=TRIM(FNAME),STATUS=FILE_STATUS(OLD_NEW),
     $ ACCESS=FILE_ACCESS(SEQ_DA),RECL=IISIZE,FORM=FILE_FORM(SEQ_DA),
     $ SHARE=FILE_CONTENTION(ACCESS_STYLE),IOSTAT=ICHECK)
      if(icheck .ne. 0) then !don't know what the issue was, but can't be good.
        print *,"1st open icheck=",icheck," file=",trim(fname)
        if(old_new .eq. 1) then ! attempting to open an old file
          if(icheck .eq. 2 ) go to 2905 ! attempt to open an old file, and it wasn't there.
c --- unexpected status for an old file
          print *,"da Open:","Status:",icheck," File:",trim(fname),
     $ "Option:",iopt
          go to 2905
|
|
IanLambley
Joined: 17 Dec 2006 Posts: 490 Location: Sunderland
|
Posted: Mon Jan 19, 2015 7:08 pm Post subject: |
|
|
For temporary files, open with status='scratch' and no file name. Closing them will then delete them automatically. |
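A minimal sketch of the scratch-file approach, in standard Fortran 90. The unit number, array sizes and program name are illustrative assumptions only. Note that no FILE= specifier is given, so the file cannot be found by name from other parts of a program.

```fortran
! Sketch only: a direct access, unformatted scratch file.
program scratch_demo
  implicit none
  integer, parameter :: iunit = 15      ! illustrative unit number
  integer :: icheck, irecl
  real :: record(4), readback(4)

  record = (/ 1.0, 2.0, 3.0, 4.0 /)
  inquire(iolength=irecl) record        ! record length in the processor's units

  ! No FILE= specifier: the processor decides where the scratch file lives
  open(unit=iunit, status='scratch', access='direct', form='unformatted', &
       recl=irecl, iostat=icheck)
  if (icheck /= 0) then
     print *, 'Scratch OPEN failed, IOSTAT=', icheck
     stop
  end if

  write(iunit, rec=1) record            ! write record 1
  read(iunit, rec=1) readback           ! read it back by record number
  print *, readback

  close(iunit)                          ! a scratch file is deleted on CLOSE
end program scratch_demo
```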
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Mon Jan 19, 2015 7:43 pm Post subject: |
|
|
Yes, but that defeats the program intent. Open/create the file, write all the data, then allow other program components to access the file as (and if) needed.
It is only temporary in the sense that after the program exits, these data are no longer needed. |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Mon Jan 19, 2015 9:07 pm Post subject: |
|
|
Since nothing seemed to work, I got desperate. I turned off the Anti-Virus software (McAfee), and had my cloud service on.
I also modified the code slightly to open the temporary file, write two records and close it as I cycled through a list of files.
With Full AntiVirus scanning turned on, 166 file open/write 2/close cycles with 2 errors (both 10005's).
With Scanning set to only "Programs and documents", processed 2000+ file open/write 2/close cycles with 1 error (10005).
With Scanning turned off entirely, 3154 open/write 2/close cycles with no errors. While not definitive, it's a sign!
This doesn't get to root cause, but it is at least a strong indication that the error I'm seeing is associated with the anti-virus software McAfee and scanning of files in real-time. Perhaps this will be helpful to someone else out there.
Since even waiting for the file does not seem to always correct the problem, there's not much I can do to trap the event and get past it.
For now, I'm leaving the real-time scan turned off. |
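For anyone wanting to repeat this kind of comparison, a stand-alone test along the following lines might do. It is a sketch under assumptions, not the code used for the counts above: the file name, unit number, record size and cycle count are all hypothetical, and the OPEN is simplified to standard Fortran with no SHARE= specifier.

```fortran
! Sketch only: repeat open / write 2 records / close-and-delete, counting
! every nonzero IOSTAT. Run once with real-time scanning on and once with it off.
program open_close_stress
  implicit none
  integer, parameter :: iunit = 15, ncycles = 3000          ! illustrative values
  character(len=*), parameter :: fname = 'stress_test.tmp'  ! hypothetical name
  integer :: i, icheck, irecl
  integer :: nfail_open, nfail_write, nfail_close
  real :: record(16)

  record = 0.0
  inquire(iolength=irecl) record        ! record length in the processor's units
  nfail_open = 0
  nfail_write = 0
  nfail_close = 0

  do i = 1, ncycles
     open(unit=iunit, file=fname, status='unknown', access='direct', &
          form='unformatted', recl=irecl, iostat=icheck)
     if (icheck /= 0) then
        nfail_open = nfail_open + 1
        print *, 'cycle', i, ' OPEN IOSTAT=', icheck
        cycle                           ! nothing to write or close this time
     end if

     write(iunit, rec=1, iostat=icheck) record
     if (icheck == 0) write(iunit, rec=2, iostat=icheck) record
     if (icheck /= 0) nfail_write = nfail_write + 1

     close(iunit, status='delete', iostat=icheck)
     if (icheck /= 0) then
        nfail_close = nfail_close + 1
        print *, 'cycle', i, ' CLOSE IOSTAT=', icheck
     end if
  end do

  print *, 'cycles:         ', ncycles
  print *, 'OPEN failures:  ', nfail_open
  print *, 'WRITE failures: ', nfail_write
  print *, 'CLOSE failures: ', nfail_close
end program open_close_stress
```

Keeping the three failure counts separate makes it easier to see whether the scanner interferes with creating the file, writing to it, or deleting it.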
|
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2388 Location: Yateley, Hants, UK
|
Posted: Tue Jan 20, 2015 2:05 pm Post subject: |
|
|
My answer may possibly be somewhat 'off topic'.
Firstly, some antivirus software is a nightmare to use with FTN95. My experience was with Avast, where FTN95 itself would run, but executables created with it were rejected on the grounds that the publisher was unrecognised. Kaspersky works.
A further problem arises if the target subdirectory is locked for some reason. This could happen on a server, or network attached storage (NAS), because it was being accessed by another user.
I wrote some software for students to use on University laptops while out in the field. These machines were 'conveniently' configured to turn themselves off after a while if they checked and found that they weren't connected to the University network (to compound the problem, they had to be connected to a subnetwork in another campus some miles away, so they weren't even usable at base).
In the scramble to save work, I discovered that the same computer support techie had converted the default folder for the program to save files as read only, on the grounds that students couldn't be trusted to put anything on the hard drive. This negated my best efforts to give student users a second chance to save their work, both on program exit and windows exit, and led me into the library routines in FTN77 (documentation under 'Support' for FTN95 on this website) to check file attributes. Having an old DOS manual to hand is useful for understanding those attribute settings!
You may find one of those old FTN77 routines helps you.
Eddie |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Tue Jan 20, 2015 5:04 pm Post subject: |
|
|
Thanks, Eddie! This was an interesting story. When you've been in this business a long time, the number of weird happenings and fixes gets longer and longer. I have some of my own from the mid- to late-70's, some on mainframes, some on the first S-100 bus computers.
There is not an issue with the folder. There are no permissions/attributes that would directly impact the FTN95 executables. The "drive" that the testing folders are on is a mapped drive, although not actually on a network. I did, in the config file, move the temporary file to an unmapped drive, and it failed there. So, most likely, that is not the root cause.
That said, when I turned off McAfee real-time scanning altogether, and ran 3154 file repetitions, the problem did not manifest itself. Not definitive, to be sure, so I am devising a test program that I will run with and without McAfee enabled, and with it disabled, run overnight.
In my former life, I was working in aerospace, and the focus for any failure is to get to root cause. I seem to have a penchant (or perhaps, a curse) to find issues, identify the root cause, and devise a solution. Even if the root cause is not fully identified, if I can find an acceptable workaround, I'll be happy. |
|
wahorger
Joined: 13 Oct 2014 Posts: 1217 Location: Morrison, CO, USA
|
Posted: Wed Jan 21, 2015 5:04 pm Post subject: |
|
|
One more piece of evidence: I just had a failure to delete a file using the CLOSE statement. Looks like I'll have to bullet-proof that one as well. |
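One possible way to harden the CLOSE, sketched in standard Fortran 90: trap the status with IOSTAT= and then confirm with INQUIRE whether the file is really gone. The helper name close_and_delete and its arguments are hypothetical, and what to do when the file survives (retry, report, carry on) is left to the caller.

```fortran
! Sketch only: close a unit with STATUS='DELETE' and verify the result.
! close_and_delete and still_there are hypothetical names.
subroutine close_and_delete(iunit, fname, icheck, still_there)
  implicit none
  integer, intent(in)          :: iunit
  character(len=*), intent(in) :: fname
  integer, intent(out)         :: icheck
  logical, intent(out)         :: still_there

  ! IOSTAT= keeps a failed delete from aborting the program
  close(unit=iunit, status='delete', iostat=icheck)
  if (icheck /= 0) then
     print *, 'CLOSE/delete failed, IOSTAT=', icheck, ' file=', trim(fname)
  end if

  ! Check independently whether the file still exists on disk
  inquire(file=trim(fname), exist=still_there)
  if (still_there) print *, 'File still present after CLOSE: ', trim(fname)
end subroutine close_and_delete
```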
|