forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Sporadic error code when opening binary, direct access file

 
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Tue Jan 13, 2015 7:55 pm

I'm getting an error code (IOSTAT=) of 10005. This error is not documented, so I'm at a loss to trap and deal with it.

I am opening a file for exclusive access (read and write).

Any clues? Is there a list of these extended kinds of error codes?

Thanks!
Bill
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Wed Jan 14, 2015 8:51 am

Can you reproduce the error in a small program that we can read?

In theory one of the error reporting routines should give the reason for the error but this error number seems high.

Offhand, I don't know which routine to use, but I am thinking of routines like DOS_ERROR_MESSAGE@.
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Sat Jan 17, 2015 4:10 am

I found the root cause of the problem, at least the event that precipitates it.

I was accessing a file while, in the background, my "cloud service" was trying to save the file. I open the file as new to force the creation, immediately close it, then open it later in the code. It was sporadic because most of the time it sneaked past the cloud service.

I had opened the direct access file as DENYRW. That seemed to be the best at the time, full contention lockout. But then this odd error code showed up.

I have changed it to DENYWR (allow reading by other programs) and have only received the proper error codes since then.
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Mon Jan 19, 2015 6:39 am

Again, proven wrong by events. With cloud service turned off, I'm getting 10005 AND 10002 errors, always on the same file. 10005 predominates.

I am trying to open an unformatted, direct access file as "UNKNOWN". I know that if the file already exists, I should delete it, so that's what I do. Here is the code segment:
Code:
 2200 CONTINUE
C --- DIRECT ACCESS
      IF(OLD_NEW.EQ.2 .AND. SEQ_DA.EQ.2) THEN
        CALL ERASE@(trim(FNAME),ICHECK)
c --- 2=file not found
        IF(ICHECK.NE.0.and.icheck.ne.2)THEN
          PRINT *,"DA: Delete of ",trim(fname),' code=',icheck
          CALL DOS_ERROR_MESSAGE@(ICHECK,TTBUFF)
          PRINT *,"MSG:",TRIM(TTBUFF)
          GO TO 2900
        else
          print *,"Deleted ",trim(fname)
        ENDIF
      ENDIF
      ICHECK=0 ! ENSURE THE RETURN CODE IS SET APPROPRIATELY
      OPEN(UNIT=IUNIT,FILE=TRIM(FNAME),STATUS=FILE_STATUS(OLD_NEW),
     $   ACCESS=FILE_ACCESS(SEQ_DA),RECL=IISIZE,FORM=FILE_FORM(SEQ_DA),
     $   SHARE=FILE_CONTENTION(ACCESS_STYLE),IOSTAT=ICHECK)


The error is occurring on Unit=15. This unit number may change as the program executes and opens/closes files. For this error, however, unit 15 is always the one chosen, repeatedly, as I process each of the 167 files.

Since I have added extra print statements to trace the problem, the problem seems to have changed (a bit) and occurs less frequently.

Is it possible that there is a "race" condition where I am deleting the file then trying to open it as new, yet the OS has not yet actually deleted the file, and the IO code reflects that?
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Mon Jan 19, 2015 11:11 am

You could try calling SLEEP1@(val) to see if this helps.
Make val suitably large to start with, say 1 second.
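
A minimal sketch of the delay-and-retry idea, written in standard free-form Fortran rather than with FTN95-specific calls. The routine names open_with_retry and pause_seconds, the file name, the unit number and the retry settings are all illustrative assumptions, not taken from the original program. The pause is a simple busy-wait on SYSTEM_CLOCK; under FTN95 the SLEEP1@ call suggested above could presumably be substituted.

```fortran
program retry_open_demo
  implicit none
  integer :: ios, attempts

  ! Hypothetical settings: unit 15, record length 64, up to 10 tries, 0.2 s apart.
  call open_with_retry(15, 'work_da.tmp', 64, 10, 0.2, ios, attempts)
  print *, 'IOSTAT =', ios, ' after ', attempts, ' attempt(s)'
  if (ios == 0) close (15, status='delete')

contains

  ! Try to open a direct-access, unformatted file, pausing between attempts.
  ! On return, ios holds the IOSTAT of the last attempt (0 on success).
  subroutine open_with_retry(iunit, fname, reclen, max_tries, delay_s, ios, tries)
    integer, intent(in) :: iunit, reclen, max_tries
    character(len=*), intent(in) :: fname
    real, intent(in) :: delay_s
    integer, intent(out) :: ios, tries

    ios = -1
    do tries = 1, max_tries
      open (unit=iunit, file=fname, status='unknown', access='direct', &
            form='unformatted', recl=reclen, iostat=ios)
      if (ios == 0) return
      call pause_seconds(delay_s)
    end do
    tries = max_tries   ! the loop index runs one past the last attempt
  end subroutine open_with_retry

  ! Busy-wait for roughly the requested number of seconds.
  subroutine pause_seconds(seconds)
    real, intent(in) :: seconds
    integer :: t0, t1, rate, cmax, elapsed

    call system_clock(t0, rate, cmax)
    if (rate <= 0) return                       ! no usable clock
    do
      call system_clock(t1)
      elapsed = t1 - t0
      if (elapsed < 0) elapsed = elapsed + cmax ! clock counter wrapped
      if (real(elapsed) / real(rate) >= seconds) exit
    end do
  end subroutine pause_seconds

end program retry_open_demo
```

Capping the number of attempts keeps the caller from hanging if the file never becomes available, and returning the last IOSTAT lets the caller report the code it actually saw.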
IanLambley



Joined: 17 Dec 2006
Posts: 490
Location: Sunderland

Posted: Mon Jan 19, 2015 2:02 pm

Instead of using the erase@ subroutine, try opening the file and closing it with Status='delete'.

It probably doesn't matter how you open the file, but any file can be opened as direct access with a record length of 1.

e.g.
Code:

      subroutine kill_file(name,ichan)
      character*(*) name
      open(unit=ichan,file=name,status='unknown',access='direct',recl=1)
      close(unit=ichan,status='delete')
      end


Or you could wait for the file to be deleted after a call to erase@, e.g.
Code:

      subroutine wait_for_kill_file(name,error_code)
      character*(*) name
      logical*4 fexists@
      integer*4 error_code
      do while (fexists@(name,error_code))
      enddo
      end


or

Code:

      subroutine kill_file_and_wait(name,error_code1,error_code2)
      character*(*) name
      logical*4 fexists@
      integer*4 error_code1,error_code2
      call erase@(name,error_code1)
      do while (fexists@(name,error_code2))
      enddo
      end


These are untested suggestions.
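
The wait loops above spin forever if the delete never takes effect. A hedged variant, using only standard Fortran (INQUIRE and CLOSE with STATUS='DELETE') and a bounded number of polls, might look like the sketch below. The routine name delete_and_wait, the file name, the unit number and the poll limit are assumptions made for illustration.

```fortran
program delete_and_wait_demo
  implicit none
  logical :: gone
  integer :: ios

  ! Create a small file so there is something to delete.
  open (unit=20, file='kill_me.tmp', status='replace', iostat=ios)
  if (ios == 0) then
    write (20, '(a)') 'temporary'
    close (20)
  end if

  call delete_and_wait(20, 'kill_me.tmp', 50, gone)
  print *, 'file gone: ', gone

contains

  ! Delete fname, then poll at most max_polls times until it no longer exists.
  ! gone is .true. if the file is absent on return.
  subroutine delete_and_wait(iunit, fname, max_polls, gone)
    integer, intent(in) :: iunit, max_polls
    character(len=*), intent(in) :: fname
    logical, intent(out) :: gone
    logical :: present
    integer :: ios, i

    inquire (file=fname, exist=present)
    gone = .not. present
    if (gone) return

    ! Open and close with STATUS='DELETE'; both steps are trapped with IOSTAT.
    open (unit=iunit, file=fname, status='old', iostat=ios)
    if (ios == 0) close (unit=iunit, status='delete', iostat=ios)

    do i = 1, max_polls
      inquire (file=fname, exist=present)
      gone = .not. present
      if (gone) return
      ! A pause between polls would normally go here.
    end do
  end subroutine delete_and_wait

end program delete_and_wait_demo
```

The caller can then decide what to do when gone comes back false, instead of being stuck in the loop.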

Regards
Ian
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Mon Jan 19, 2015 6:51 pm

Ian, good suggestions. It looks like I'll have to do something like that (read below).

Paul, I was using a 1 second delay before attempting to retry the operation. Sometimes, it never worked (60 seconds). Other times, after a few retries (4-10), it would succeed.

First, before changing the OPENing code, I took the exclusive access off of this file, just letting it remain a regular file. Still failed, but much more rarely. In the 167 file openings, it first failed on file 57. Same bizarre status code (10005).

Second, I tried to open the file as "TRANSPARENT", but that did not seem to work with record numbers in the READ/WRITE statements. I abandoned this.

Next, I recoded the open. The significant change is that I did not try to delete the file before opening it, like I did when the file is to be opened as NEW (UNKNOWN). I did not trap any errors. However, accessing the file itself now appears to be exceedingly slow, which was unexpected. Still trying to track that down, as other files that are opened as OLD, direct access don't seem to be slow in their access.

By way of explanation, deleting the file allowed me to reuse it for different temporary purposes as needed. However, opening an existing file with a different record length causes an error (IOSTAT=105 or 108). So, if the file is to be opened as new, I delete it first to prevent this error.
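
One standard-Fortran alternative to the delete-then-open sequence is STATUS='REPLACE', which asks the runtime to discard any existing file of that name and create a fresh one in a single OPEN. The sketch below reuses one file name with two different record lengths, each computed with INQUIRE(IOLENGTH=). The file name, unit number and array sizes are illustrative assumptions, and whether this avoids the 10005 status under FTN95 has not been checked here.

```fortran
program replace_demo
  implicit none
  integer :: ios, reclen
  integer :: small(4)
  double precision :: big(16)

  small = 7
  big = 2.0d0

  ! First use: record length sized for the small integer array.
  inquire (iolength=reclen) small
  open (unit=40, file='work_da.tmp', status='replace', access='direct', &
        form='unformatted', recl=reclen, iostat=ios)
  if (ios /= 0) stop 'first open failed'
  write (40, rec=1) small
  close (40)

  ! Second use: same name, different record length. REPLACE discards the
  ! old file, so the earlier record length should no longer matter.
  inquire (iolength=reclen) big
  open (unit=40, file='work_da.tmp', status='replace', access='direct', &
        form='unformatted', recl=reclen, iostat=ios)
  if (ios /= 0) then
    print *, 'second open failed, IOSTAT =', ios
    stop
  end if
  write (40, rec=1) big
  close (40, status='delete')
  print *, 'both opens succeeded'
end program replace_demo
```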

I even tried placing the file on a different logical drive, just to make sure it was not some fluke of where the file resided. No difference in performance; still slow.

Lastly, I implemented a delete operation just prior to the OPEN for a status of NEW (UNKNOWN). This code segment is below. Better, but no joy. The 10005 error still occurs. Referring to the tracing statements in the code, the following happens:

File is found to exist.
File is deleted.
File is found not to exist.
Initial open results in IOSTAT=10005
Try again to open the file, IOSTAT=10005
Code stops trying.

The code that is running at this time is in a fairly tight loop: it finds the target file to be opened from a list of files, then opens that file, which causes the temporary file to be opened, and processes the file contents into the direct access, unformatted file I'm having an issue with. After this work file has been written, it is closed and control returns to the looping section. If the direct access file cannot be opened, the lower-level routine quits and returns an error indicator, stopping the loop.

Any additional insight would be much appreciated. Yes, the code looks horrible. Will be cleaned up once I understand what can and cannot work!!

Code:
      ICHECK=0 ! ENSURE THE RETURN CODE IS SET APPROPRIATELY
      if(old_new.eq.2) then
        if(fexists@(trim(fname),icheck)) then
          if(icheck .ne. 0) print *,"fexists@ returned",icheck
          call erase@(trim(fname),icheck)
          print *,"Deleting file before 'new':",trim(fname),icheck
          do 2211 ii=1,10
            if(.not.fexists@(trim(fname),icheck)) go to 2212
            call sleep1@(0.2)
 2211     continue
 2212     print *,"Waited to delete",ii
        endif
      endif
      OPEN(UNIT=IUNIT,FILE=TRIM(FNAME),STATUS=FILE_STATUS(OLD_NEW),
     $ ACCESS=FILE_ACCESS(SEQ_DA),RECL=IISIZE,FORM=FILE_FORM(SEQ_DA),
     $ SHARE=FILE_CONTENTION(ACCESS_STYLE),IOSTAT=ICHECK)
      if(icheck .ne. 0) then !don't know what the issue was, but can't be good.
        print *,"1st open icheck=",icheck," file=",trim(fname)
        if(old_new .eq. 1) then ! attempting to open an old file
          if(icheck .eq. 2 ) go to 2905 ! attempt to open an old file, and it wasn't there.
c --- unexpected status for an old file
          print *,"da Open:","Status:",icheck," File:",trim(fname),
     $ "Option:",iopt
          go to 2905
IanLambley



Joined: 17 Dec 2006
Posts: 490
Location: Sunderland

Posted: Mon Jan 19, 2015 7:08 pm

For temporary files, open with status='scratch' and no file name. Closing them will then delete them automatically.
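
As a small illustration of the scratch-file approach, the following standard Fortran sketch opens an unnamed direct-access scratch file, writes a record, reads it back and closes the file. The unit number and buffer size are arbitrary choices for the example.

```fortran
program scratch_demo
  implicit none
  integer :: ios, reclen
  real :: buf(8), back(8)

  buf = 1.5
  ! Let the compiler report the record length needed for one buffer.
  inquire (iolength=reclen) buf

  ! No FILE= specifier: the runtime chooses and manages the file itself.
  open (unit=30, status='scratch', access='direct', form='unformatted', &
        recl=reclen, iostat=ios)
  if (ios /= 0) then
    print *, 'scratch open failed, IOSTAT =', ios
    stop
  end if

  write (30, rec=2) buf
  read (30, rec=2) back
  print *, 'round trip ok: ', all(back == buf)

  close (30)   ! a scratch file is deleted when it is closed
end program scratch_demo
```

Because the file has no name, other program components can reach it only through the open unit, which is the limitation raised in the next reply.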
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Mon Jan 19, 2015 7:43 pm

Yes, but that defeats the program intent. Open/create the file, write all the data, then allow other program components to access the file as (and if) needed.

It is only temporary in the sense that after the program exits, these data are no longer needed.
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Mon Jan 19, 2015 9:07 pm

Since nothing seemed to work, I got desperate. I turned off the Anti-Virus software (McAfee), and had my cloud service on.

I also modified the code slightly to open the temporary file, write two records and close it as I cycled through a list of files.

With full anti-virus scanning turned on, 166 file open/write 2/close cycles with 2 errors (both 10005's).

With Scanning set to only "Programs and documents", processed 2000+ file open/write 2/close cycles with 1 error (10005).

With Scanning turned off entirely, 3154 open/write 2/close cycles with no errors. While not definitive, it's a sign!

This doesn't get to root cause, but it is at least a strong indication that the error I'm seeing is associated with the anti-virus software McAfee and scanning of files in real-time. Perhaps this will be helpful to someone else out there.
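
A stress test along the lines described above (open, write two records, close, repeat) could be sketched as follows in standard Fortran. The file name, unit number, record contents and cycle count are assumptions for illustration and are not the original test code. It counts any non-zero IOSTAT from OPEN, WRITE or CLOSE, so it can be run with real-time scanning on and off and the failure counts compared.

```fortran
program open_close_stress
  implicit none
  integer, parameter :: ncycles = 1000, iunit = 50
  integer :: i, ios, nfail, last_code, reclen
  real :: rec(16)

  rec = 0.0
  nfail = 0
  last_code = 0
  inquire (iolength=reclen) rec

  do i = 1, ncycles
    open (unit=iunit, file='stress_da.tmp', status='unknown', &
          access='direct', form='unformatted', recl=reclen, iostat=ios)
    if (ios /= 0) then
      nfail = nfail + 1
      last_code = ios
      print *, 'cycle', i, ' open failed, IOSTAT =', ios
      cycle
    end if

    ! Write two records, stopping at the first failure.
    write (iunit, rec=1, iostat=ios) rec
    if (ios == 0) write (iunit, rec=2, iostat=ios) rec
    if (ios /= 0) then
      nfail = nfail + 1
      last_code = ios
      print *, 'cycle', i, ' write failed, IOSTAT =', ios
    end if

    ! Delete on close so every cycle starts from a missing file.
    close (iunit, status='delete', iostat=ios)
    if (ios /= 0) then
      nfail = nfail + 1
      last_code = ios
      print *, 'cycle', i, ' close/delete failed, IOSTAT =', ios
    end if
  end do

  print *, ncycles, ' cycles, ', nfail, ' failures, last code ', last_code
end program open_close_stress
```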

Since even waiting for the file does not seem to always correct the problem, there's not much I can do to trap the event and get past it.

For now, I'm leaving the real-time scan turned off.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Tue Jan 20, 2015 2:05 pm

My answer may possibly be somewhat 'off topic'.

Firstly, some antivirus software is a nightmare to use with FTN95. My experience was with Avast, where FTN95 itself would run, but executables created with it were rejected on the grounds that the publisher was unrecognised. Kaspersky works.

A further problem arises if the target subdirectory is locked for some reason. This could happen on a server, or on network attached storage (NAS), because it was being accessed by another user.

I wrote some software for students to use on University laptops while out in the field. These machines were 'conveniently' configured to turn themselves off after a while if they checked and found that they weren't connected to the University network (to compound the problem, they had to be connected to a subnetwork in another campus some miles away, so they weren't even usable at base).

In the scramble to save work, I discovered that the same computer support techie had converted the default folder for the program to save files as read only, on the grounds that students couldn't be trusted to put anything on the hard drive. This negated my best efforts to give student users a second chance to save their work, both on program exit and windows exit, and led me into the library routines in FTN77 (documentation under 'Support' for FTN95 on this website) to check file attributes. Having an old DOS manual to hand is useful for understanding those attribute settings!

You may find one of those old FTN77 routines helps you.

Eddie
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Tue Jan 20, 2015 5:04 pm

Thanks, Eddie! This was an interesting story. When you've been in this business a long time, the number of weird happenings and fixes gets longer and longer. I have some of my own from the mid- to late-70's, some on mainframes, some on the first S-100 bus computers.

There is not an issue with the folder. There are no permissions/attributes that would directly impact the FTN95 executables. The "drive" that the testing folders are on is a mapped drive, although not actually on a network. I did, in the config file, move the temporary file to an unmapped drive, and it failed there. So, most likely, that is not the root cause.

That said, when I turned off McAfee real-time scanning altogether and ran 3154 file repetitions, the problem did not manifest itself. That is not definitive, to be sure, so I am devising a test program that I will run both with McAfee enabled and with it disabled, the latter overnight.

In my former life, I was working in aerospace, and the focus for any failure is to get to root cause. I seem to have a penchant (or perhaps, a curse) to find issues, identify the root cause, and devise a solution. Even if the root cause is not fully identified, if I can find an acceptable workaround, I'll be happy.
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Wed Jan 21, 2015 5:04 pm

One more evidence item: I just had a failure to delete a file using the CLOSE statement. Looks like I'll have to bullet-proof that one as well.
All times are GMT + 1 Hour