Topic: allocate and /check in Support

JohnCampbell

Posts: 2526 Sydney

21 May 2015 1:13 #16324

I have a problem using ALLOCATE and mixing /opt with /check: I get an integer overflow error using FTN95 Ver 7.1, although this problem has been around for years. My main program is compiled with /opt (and, I think, /debug as well), while the subroutine I am checking is compiled with /check. In general, I am using /check to check the subroutine argument lists. The following is a cut-down example, hopefully with no errors!

! test_call.f90
!  Program to produce integer overflow error
!
      real*8,    allocatable, dimension(:,:) :: EPROP
      integer*4 :: Num_Mat = 5
      integer*4 num, stat
!
      ALLOCATE ( EPROP(16,NUM_MAT),  stat=stat )
!
      CALL B2MAT_read   (NUM_MAT, EPROP, num)
!
      end

      LOGICAL FUNCTION EQUAL_DUMMY (VARIABLE)
!
      real*8 :: DUMMY = -.99898d00  ! special coordinate for undefined coordinates
!
      REAL*8  VARIABLE
!
      equal_dummy = ( variable == DUMMY )
      RETURN
      END FUNCTION EQUAL_DUMMY

      SUBROUTINE SET_DUMMY (VARIABLE)
!
      real*8 :: DUMMY = -.99898d00  ! special coordinate for undefined coordinates
!
      REAL*8  VARIABLE
!
      variable = DUMMY
      RETURN
      END SUBROUTINE SET_DUMMY

! test_called.f90
      SUBROUTINE B2MAT_read (NUM_MAT, EPROP, num)
!
      INTEGER*4 NUM_MAT, num
      REAL*8    EPROP(16,*)
!
      INTEGER*4 I
      LOGICAL   equal_dummy
      EXTERNAL  equal_dummy
!
      CALL material_read (2, NUM_MAT, EPROP,  num)
      write (*,*) 'material_read  <',num_mat,' >',num
!
      DO I = 1,NUM_MAT
         IF ( equal_dummy (EPROP(1,I)) ) cycle
         EPROP(6,I) = 0.5*EPROP(1,I)/(1.+EPROP(2,I))         ! PR > G
      END DO
!
      RETURN
!
      END SUBROUTINE B2MAT_read

      SUBROUTINE material_read (NTYPE, NUM_MAT, EPROP, num)
!
      INTEGER*4 NTYPE, NUM_MAT, num
      REAL*8    EPROP(16,*)
      INTEGER*4 I
!
      DO I = 1,NUM_MAT
         EPROP(:,I) = 0
         call set_dummy (EPROP(1,I))
      END DO
      num = i
!
      RETURN
!
      END SUBROUTINE material_read

ftn95 test_call /opt
ftn95 test_called /check
slink test_call.obj test_called.obj
test_call

Running this batch file produces the error. The error occurs when accessing the logical function.

This logical function has a history of being a patch to remove warnings about testing (real.eq.real) but is now crashing with tests on allocated arrays.

I think the problem relates to /check having trouble with the information about the array EPROP, which is being provided from the call. If the main program is compiled with /check, all is OK; however, with /opt, /debug or no option, it fails. I hope this is a good small example of a problem that has annoyed me for years. This problem limits the usability of /check, especially when applying selective use of /check in a larger program.

John

mecej4

Posts: 1914

21 May 2015 1:08 #16329

An EXE produced by linking a number of .OBJ files when only some of them have been compiled with /CHECK (and other related options such as /UNDEF) has always been a touch-and-go item, with FTN95 as well as other Fortran compilers. I console myself with the following reasoning: 'In order to permit checking dummy argument variables, the caller has to provide information regarding those arguments such as subscript ranges, allocation state, definability, etc. If the caller is not compiled with /CHECK but the callee is, the callee will still reference information in the expected places, expecting the caller to have provided that information, so the program will probably fail'.

There is a workaround which may or may not resolve your situation. Build a DLL with all the previously debugged subprograms in it. In order to be able to link the DLL, you will need to make sure that the code to be used for the DLL has no external references that are not resolved by linking to SALFLIBC.DLL.

For this to function properly, you should not have any I/O to the same file in both the DLL and the EXE client of the DLL, because there will be two instances of the I/O runtime, one in the EXE and another in the DLL, and these I/O operations are performed independently.

For the test example, you will have to move the function and the subroutine to the second file. For your convenience, here is the rearranged code. The main program in jcm.f90:

! test_call.f90
!  Program to produce integer overflow error
!
      real*8,    allocatable, dimension(:,:) :: EPROP
      integer*4 :: Num_Mat = 5
      integer*4 num, stat
!
      ALLOCATE ( EPROP(16,NUM_MAT),  stat=stat )
!
      CALL B2MAT_read   (NUM_MAT, EPROP, num)
!
      end

and the code for building the DLL, jcs.f90:

! test_called.f90
      SUBROUTINE B2MAT_read (NUM_MAT, EPROP, num)
!
      INTEGER*4 NUM_MAT, num
      REAL*8    EPROP(16,*)
!
      INTEGER*4 I
      LOGICAL   equal_dummy
      EXTERNAL  equal_dummy
!
      CALL material_read (2, NUM_MAT, EPROP,  num)
      write (*,*) 'material_read  <',num_mat,' >',num
!
      DO I = 1,NUM_MAT
         IF ( equal_dummy (EPROP(1,I)) ) cycle
         EPROP(6,I) = 0.5*EPROP(1,I)/(1.+EPROP(2,I))         ! PR > G
      END DO
!
      RETURN
!
      END SUBROUTINE B2MAT_read

      SUBROUTINE material_read (NTYPE, NUM_MAT, EPROP, num)
!
      INTEGER*4 NTYPE, NUM_MAT, num
      REAL*8    EPROP(16,*)
      INTEGER*4 I
!
      DO I = 1,NUM_MAT
         EPROP(:,I) = 0
         call set_dummy (EPROP(1,I))
      END DO
      num = i
!
      RETURN
!
      END SUBROUTINE material_read

      LOGICAL FUNCTION EQUAL_DUMMY (VARIABLE)
!
      real*8 :: DUMMY = -.99898d00  ! special coordinate for undefined coordinates
!
      REAL*8  VARIABLE
!
      equal_dummy = ( variable == DUMMY )
      RETURN
      END FUNCTION EQUAL_DUMMY

      SUBROUTINE SET_DUMMY (VARIABLE)
!
      real*8 :: DUMMY = -.99898d00  ! special coordinate for undefined coordinates
!
      REAL*8  VARIABLE
!
      variable = DUMMY
      RETURN
      END SUBROUTINE SET_DUMMY

I built the DLL using

ftn95 jcs.f90
slink /DLL jcs.obj /export:B2MAT_READ

and built the program using

ftn95 /check jcm.f90
slink jcm.obj jcs.dll

JohnCampbell

Posts: 2526 Sydney

21 May 2015 11:46 #16331

Mecej4,

Thanks very much for your comments, although I should have explained my problem better. With FTN95, I almost always use different compile options for different files in the same project. The code can be divided into a few types.

Managers: this is most of the code, which I compile with /DEBUG (/MANAGE), so that the traceback can keep track of where they are. This code component does very little work but does organise the program. It is between 95% and 99% of the written code.

Workers: this is where all the work (computation) is done, and I use /P6 /OPT to try to get better performance. These are a few small routines, which have been around for a long time, often kept in the back-room libraries. Vec_sum_sse and Vec_add_sse are typical examples. They account for 99%+ of the run time.

There are also new managers and workers being introduced, which require supervision, typically compiled with /CHECK. These are often low-level positions and should in time become unsupervised and provide a valuable addition to the organisation.

Mecej4, unfortunately, for the example I have provided, you have placed the new code that requires supervision (/check) in the jcs.f90 .dll, while your jc_manager.f90 does not require /check. I think the solution is that jc_manager.f90, which is typically existing code, needs a /debug-like option to provide sufficient information to the new /check code for adequate supervision. The new code is not ready to move out to the .dll annex until probation is over. You have certainly identified the problem and hopefully there can be a solution.

/CHECK code needs to identify what information has been provided by the calling “manager” to not produce integer overflow. Perhaps an alternative /DEBUG, say /SUPERVISE is required for the manager that provides some extra information for new workers. The problem I have is that this is a large organisation and I don’t want all managers to have the extra burden of work associated with /SUPERVISE for new code being developed. These managers often call on many others, so the distinction between /MANAGE and /SUPERVISE can be difficult to implement for the one routine or routine tree.

Paul, does this better identify the problem that is occurring? Can the source of the integer overflow be corrected?

John

PaulLaidler

Posts: 7977 Salford, UK

22 May 2015 2:13 #16332

Here is an outline of what I think is happening.

CHECKed code and unCHECKed code can often be mixed, but when calling a subprogram you are likely to have problems when the subprogram is CHECKed and the caller is not.

In the present case B2MAT_read is CHECKed and is doing a bounds check on an array that has been passed to it. But the bounds are not being passed (because the main program is not CHECKed). Consequently a spurious value is multiplied by 16 leading to the overflow. You can see this in the explist.
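
The arithmetic behind that overflow can be sketched as follows. This is only an illustration, not the code FTN95 generates: the value 200000000 is a hypothetical stand-in for whatever garbage the CHECKed callee reads where it expects a bound, and the names spurious_bound and total_elems are invented for the sketch. The product is formed in 64-bit arithmetic so the sketch itself cannot overflow.

```fortran
program overflow_sketch
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)   ! 32-bit integer kind
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit integer kind
  integer(i4) :: spurious_bound   ! hypothetical garbage read as a bound
  integer(i8) :: total_elems

  spurious_bound = 200000000_i4

  ! First dimension (16) times the bogus second bound, done in 64 bits
  total_elems = 16_i8 * int(spurious_bound, i8)

  write (*,*) 'element count        :', total_elems
  write (*,*) 'largest 32-bit value :', huge(spurious_bound)
  write (*,*) 'exceeds 32-bit range :', total_elems > int(huge(spurious_bound), i8)
end program overflow_sketch
```

Any bogus bound larger than huge(1_i4)/16 (about 134 million) would push the same product past the 32-bit limit.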

FTN95 is designed to handle mixed CHECKed and unCHECKed code but only when the unCHECKed code is being called (e.g. is in a library) and not the other way round. So for example, you write and test a function using /CHECK and when fully tested you remove /CHECK and keep the function in a library. You then call the library function from other CHECKed code and so on in the development cycle.

JohnCampbell

Posts: 2526 Sydney

23 May 2015 6:24 #16333

Paul,

Yes, the problem is occurring where /check code is being called from /debug code and the call argument list has arrays.

Can there be a change to detect this situation and avoid integer overflow failures?

I am aware that when using /check in the calling code, the array size information is transferred, even where the Fortran syntax does not provide this information; for example, an array declared as 'real*8 EPROP(16,*)' will have its allocated size information provided. The problem appears to be in the called routine, which does not identify that this extra information has not been provided. Is it possible to check that this information is there and not crash?

I previously used 'real*8 EPROP(16,*)', assuming it provided some flexibility with the second dimension, provided it does not exceed the maximum space allocated. I assumed that using * removed the checking of the upper bound limit for the second dimension, although recent experience of array size information being provided, as outlined above, may show this approach to be wrong.
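
For comparison, here is a minimal sketch of the explicit-shape alternative to (16,*), in which the callee takes the second bound from its own argument list. The routine name material_read_explicit is hypothetical, and nothing in this thread confirms whether FTN95's /check then avoids the overflow when the caller is not CHECKed; the sketch only shows the standard Fortran difference between the two declarations.

```fortran
! Hypothetical variant of material_read: EPROP is declared with explicit
! shape (16,NUM_MAT) instead of assumed size (16,*).
      SUBROUTINE material_read_explicit (NUM_MAT, EPROP, num)
      IMPLICIT NONE
      INTEGER*4 NUM_MAT, num
      REAL*8    EPROP(16,NUM_MAT)
      INTEGER*4 I
!
      DO I = 1,NUM_MAT
         EPROP(:,I) = 0
      END DO
      num = NUM_MAT
!     UBOUND on the last dimension is legal here; it is not for (16,*)
      write (*,*) 'second bound seen by callee:', ubound(EPROP,2)
      END SUBROUTINE material_read_explicit

      PROGRAM explicit_shape_sketch
      IMPLICIT NONE
      real*8, allocatable, dimension(:,:) :: EPROP
      integer*4 :: Num_Mat = 5
      integer*4 num, stat
!
      ALLOCATE ( EPROP(16,NUM_MAT), stat=stat )
      IF (stat /= 0) STOP 'allocation failed'
      CALL material_read_explicit (NUM_MAT, EPROP, num)
      write (*,*) 'columns initialised:', num
      END PROGRAM explicit_shape_sketch
```

The trade-off is the one described above: the flexibility of leaving the last dimension open is lost, since NUM_MAT must now match what the caller actually allocated.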

The solution may be that I am more careful with mixing /debug and /check, but a more robust approach would be easier to use.

John

PaulLaidler

Posts: 7977 Salford, UK

23 May 2015 6:59 #16334

I think that FTN95 is designed to tolerate the situation where a CHECKed subprogram is called from unCHECKed code, so what we have here should strictly be considered a bug. But the point is that significant aspects of the checking mechanism cannot be applied in this context and the user may be lulled into a false sense of security.

However, you can switch off the related checking with '/CHECK /INHIBIT_CHECK 6' and this will leave other (local) checks in place.