Topic: Access violation error in 64-bit; works in 32-bit

ahalls_dsc

Posts: 16

31 Jul 2019 9:51 #24114

Hi all. I have a program which works fine when compiled in 32-bit, yet terminates with an access violation error when compiled as 64-bit.

The main thing that piques my senses is that the violation occurs at 00000002071E05A0 when it should be addressing 00000003071E05A0. Note the similarities.

I don't want to duplicate what I've written elsewhere (though can for SEO or other purposes, if admin/mods so wish), so you can access the steps taken to investigate the issue via this link (Stack Overflow).

Many thanks for taking the time. I'm still a newbie so please go easy on me, and I'll do my best to provide additional information!

mecej4

Posts: 1911

31 Jul 2019 11:25 #24116

Please provide complete code that we can compile and link in order to run and reproduce the access error.

The code that you posted at stackoverflow is incomplete and cannot be compiled without errors. You also posted two versions of the include file. Please name the one that you used when you built the EXE.

ahalls_dsc

Posts: 16

31 Jul 2019 1:34 #24119

Quoted from mecej4 Please provide complete code that we can compile and link in order to run and reproduce the access error.

Thanks for the response. My apologies - will work on a reproducible example and return.

For info: the include file is relocmon.inc; the second example is with the variables replaced with constants, for clarity.

ahalls_dsc

Posts: 16

1 Aug 2019 2:18 #24128

@mecej4

As noted in the other thread, I have been able to reproduce the error originally mentioned in the Stack Overflow link.

For posterity, the git repo containing the code is here (Bitbucket). I have added a 'no_inputs' branch that does away with the zone.def input file and check.

Apologies if it's not strictly the minimal reproducible example, but it's heavily simplified from the version it's based on, and my expertise is limited!

mecej4

Posts: 1911

1 Aug 2019 3:09 #24129

Posterity need not go to all that bother. There is no need to make readers go to other sites and download many files. Here is a short reproducer for the compiler bug with /64:

      SUBROUTINE ddmodel14()

      INTEGER,PARAMETER :: LXTZN = 1714, MXAV = 191
      REAL, save, DIMENSION(lxtzn,lxtzn,mxav) :: ddfuncval

      call random(ddfuncval(:,:,1))

      END SUBROUTINE

You probably do not need to wait for the next release of the compiler. Instead, you may remove the programming errors in your code, some of which I mentioned above, and use a different way to allocate the array DDFUNCVAL.

The newer set of sources that you posted at Bitbucket can be compiled with the current release of FTN95, with or without /64. The 32-bit EXE will not run on my Windows 10 PC. The 64-bit EXE runs to completion with seemingly no error, but compiling with /undef /64 and running turns up some undefined variables. Please follow up and fix these errors. Note also that this behaviour is not consistent with the description in your Stack Overflow post.

ahalls_dsc

Posts: 16

1 Aug 2019 4:06 #24130

Understood, thanks.

This rectifies the compiler bug, which is the subject of the other thread. This thread, however, is referring to an access violation error in program execution.

I have now reduced the code to a minimal reproducible example:

      PROGRAM ML14ERROR   
      CALL ddmodel14()
      contains   
      SUBROUTINE ddmodel14()  

        INTEGER :: origzn, destzn
        INTEGER,PARAMETER :: MXZMA = 1713, LXTZN = 1714, MXAV = 183
        INTEGER,PARAMETER :: JTMPREL = 1003, av = 1
        REAL(KIND=2) :: RANDOM@
        REAL,dimension (1:mxav,lxtzn,lxtzn,JTMPREL:JTMPREL):: znzndaav

        DO origzn=1,lxtzn
          DO destzn=1,lxtzn
            znzndaav(av,origzn,destzn,JTMPREL) = RANDOM@()
          END DO
        END DO

        DO origzn=1,mxzma
          DO destzn=1,mxzma
            ! This is where the error occurs
            znzndaav(av,origzn,lxtzn,JTMPREL)=
     $         znzndaav(av,origzn,lxtzn,JTMPREL)+
     $         znzndaav(av,origzn,destzn,JTMPREL)

          ENDDO
        ENDDO

        WRITE(6,*)'No errors'
      END SUBROUTINE
      END PROGRAM

There is an access violation when mxav=183, yet this does not occur when mxav=182.

EDIT: there are some strange things happening when compiling to Win32; the issue I'm seeing with x64 does reproduce, though.

mecej4

Posts: 1911

1 Aug 2019 4:36 #24131

Look into the memory needed for array ZNZNDAAV. It is too big for addressing using 32-bits. Likewise, it is too large even for 64-bits because it is a local variable that is allocated on the stack.

When your program uses variables that are beyond the address ranges and size limits of the operating system and the compiler, you cannot trust any error messages that you see, nor can you trust the output of the program even with no error messages.

ahalls_dsc

Posts: 16

1 Aug 2019 7:20 #24132

Quoted from mecej4 Look into the memory needed for array ZNZNDAAV. It is too big for addressing using 32-bits. Likewise, it is too large even for 64-bits because it is a local variable that is allocated on the stack.

When your program uses variables that are beyond the address ranges and size limits of the operating system and the compiler, you cannot trust any error messages that you see, nor can you trust the output of the program even with no error messages.

Thanks for your patience and a clear resolution! Very helpful.

So, in order to solve once and for all, I should either split into smaller arrays, or port to f95 and use an allocatable array?

mecej4

Posts: 1911

1 Aug 2019 11:04 #24133

To answer the last question, one needs to know the purpose of the program.

The short test program fills a huge 4-D array with random numbers and does a little bit of averaging on a boundary. The result is neither output nor used in a subsequent calculation. The effect of running the program is nil.

Do you see why a clear and complete problem definition is needed?

DanRRight

Posts: 2877 South Pole, Antarctica

2 Aug 2019 12:49 #24134

The code is just 180 * 1700 *1000 * 4 = 1,224,000,000 bytes or only ~1GB in size. Compiling it

ftn95 aaa.for /full_debug /64  >z
slink64 aaa.obj /stack:2064 >zz
sdbg64 aaa.exe

with a 2 GB stack does not work (access violation crash). Making the stack larger than 2 GB does not even open the file. A 64-bit compiler should not work this way. 1 GB for a 64-bit compiler is almost exactly like 1 byte for a 32-bit one!!!

Imagine the crazy world we are living in: everyone (Microsoft, Intel) wants to control everything and calls it 'advancement' when they increase the stack to the equivalent of 3 bytes, then 4 bytes, or increase your addressable space to 32 bytes. Remember how just a few years back we discussed how to read a 3 GB file? 3 GB in the 64-bit world is like 3 bytes in the 32-bit one. Intel also has a restriction, on most processors a few years old, equivalent to 32 bytes; some have 64 bytes, and I've heard the new AMD will allow us a whopping 128 BYTES equivalent. BYTES! Want more? Buy their $50K 92xx processors. It's like printing money on a laser printer.

mecej4

Posts: 1911

2 Aug 2019 2:57 #24135

Quoted from DanRRight The code is just 180 * 1700 *1000 * 4 = 1,224,000,000 bytes or only ~1GB in size.

Please check again. The array size in bytes is

 183*1714*1714*1*4 = 2150466672 > 2 GB

Furthermore, in his larger program (and perhaps more so in his actual full size program) he has a number of other large arrays. It will be necessary to plan and reduce the memory footprint, and strike a balance between how much data is in disk files and what part of that can be brought into memory for processing.
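The arithmetic above can be checked with a short program. This is only a sketch: it assumes a default REAL occupies 4 bytes, and the names i8, real_bytes and limit are mine, not from the posted code. It uses a 64-bit integer kind so that the byte count itself cannot overflow, and it compares the two MXAV values mentioned earlier (182 and 183) against 2**31 bytes.

```fortran
      program array_size_check
      implicit none
      ! 64-bit integer kind, so the byte count cannot overflow
      integer, parameter :: i8 = selected_int_kind(18)
      integer(i8), parameter :: lxtzn = 1714_i8
      ! Assumption: default REAL occupies 4 bytes
      integer(i8), parameter :: real_bytes = 4_i8
      integer(i8), parameter :: limit = 2_i8**31
      integer(i8) :: mxav, nbytes

      do mxav = 182_i8, 183_i8
        ! The fourth dimension (JTMPREL:JTMPREL) has extent 1
        nbytes = mxav*lxtzn*lxtzn*real_bytes
        write(6,'(a,i0,a,i0)') 'mxav = ', mxav, '  bytes = ', nbytes
        if (nbytes >= limit) then
          write(6,'(a)') '  at or above 2**31 bytes'
        else
          write(6,'(a)') '  below 2**31 bytes'
        end if
      end do
      end program array_size_check
```

With MXAV = 182 the array needs 2138715488 bytes, just under 2**31 = 2147483648; with MXAV = 183 it needs 2150466672 bytes, just over. That matches the observation that 182 runs and 183 fails, although by itself it does not prove where the limit is enforced.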

JohnCampbell

Posts: 2526 Sydney

2 Aug 2019 4:35 #24136

Apart from the other programming errors, it is foolish to put such large arrays on the stack. Their size should be confirmed and if required they should be placed on the heap via ALLOCATE.
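A minimal sketch of the ALLOCATE approach, using the dimensions from the short example posted earlier. The program name and the variable istat are illustrative only. It assumes a 64-bit build and roughly 2.2 GB of free memory; where the storage actually comes from is up to the compiler, though allocatable arrays are typically placed on the heap rather than the stack.

```fortran
      program allocate_sketch
      implicit none
      integer, parameter :: lxtzn = 1714, mxav = 183
      integer, parameter :: jtmprel = 1003
      ! Allocatable array: sized at run time, typically on the heap
      real, allocatable :: znzndaav(:,:,:,:)
      integer :: istat

      ! stat= lets the program report a failure instead of crashing
      allocate(znzndaav(mxav,lxtzn,lxtzn,jtmprel:jtmprel), stat=istat)
      if (istat /= 0) then
        write(6,*) 'Allocation failed, stat = ', istat
        stop
      end if

      ! Define every element before any of them is read
      znzndaav = 0.0
      write(6,*) 'First element = ', znzndaav(1,1,1,jtmprel)

      deallocate(znzndaav)
      end program allocate_sketch
```

Checking the status code means that an array which is too large produces a readable message naming the failure, rather than an access violation at some later line.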

There are also other coding errors, such as the use of uninitialized variables and arrays. These errors should be corrected, before claiming there is a compiler error.

Why present this in two forums ?

JohnCampbell

Posts: 2526 Sydney

2 Aug 2019 4:37 #24137

Quoted from DanRRight Imagine the crazy world we are living in: everyone (Microsoft, Intel) wants to control everything and calls it 'advancement' when they increase the stack to the equivalent of 3 bytes, then 4 bytes, or increase your addressable space to 32 bytes. Remember how just a few years back we discussed how to read a 3 GB file? 3 GB in the 64-bit world is like 3 bytes in the 32-bit one. Intel also has a restriction, on most processors a few years old, equivalent to 32 bytes; some have 64 bytes, and I've heard the new AMD will allow us a whopping 128 BYTES equivalent. BYTES! Want more? Buy their $50K 92xx processors. It's like printing money on a laser printer.

Dan, what is a terabyte of random numbers ?

DanRRight

Posts: 2877 South Pole, Antarctica

2 Aug 2019 4:43 (Edited: 2 Aug 2019 8:02) #24138

Yes, the array is a bit larger than 2 GB. It does not matter: /stack is capped at ~2 GB, the same maximum that was possible on a 32-bit system (up to 3+ GB with /3GB), beyond which it does not work. It is impossible to set 3024 or 10024. Who set this limit, and for what damn purpose, I do not know. Of course allocatables will work, but why the heck set such a small stack limit, and what is the stack designed for? It now seems we have started playing the same stupid games with the stack that we played all these years on 32-bit systems. Programs will start crashing with no diagnostics of why this is happening or which large array caused the crash.

By the way, the newer linker with an INI file gives the same results. What is more, the older linker without an INI file and without /stack:xxxx behaves the same way as with /stack:2048. So the /stack option either does not work or is set to 2 GB no matter what, i.e. it is useless with or without an INI file.

PaulLaidler

Posts: 7974 Salford, UK

2 Aug 2019 7:29 #24139

The SAVE attribute makes this array storage permanent. It does not go on the stack. When the subroutine is re-entered, data from the previous call is preserved.

It is quite possible that the author does not need/intend to use SAVE in this context.

There should be no problem if SAVE is not used even with very large arrays.

SAVE in one sense is equivalent to using a COMMON block or MODULE for the array but there is currently a limitation of 1GB on SAVEd data which does not apply to COMMON/MODULE data.

SAVEd data is always initialised and packed into the executable so it makes sense to limit the amount you can have.
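As a rough sketch of the MODULE alternative mentioned above: the array is declared in a module and accessed with USE, so it is neither a SAVEd local nor a stack variable. The module name ddfunc_data and the tiny dimensions are placeholders chosen for illustration, not taken from the original program.

```fortran
      module ddfunc_data
      implicit none
      ! Tiny illustrative sizes; the thread uses 1714 and 191
      integer, parameter :: lxtzn = 4, mxav = 2
      real :: ddfuncval(lxtzn,lxtzn,mxav)
      end module ddfunc_data

      subroutine ddmodel14()
      use ddfunc_data
      implicit none
      ! Module data keeps its value between calls
      ddfuncval(1,1,1) = ddfuncval(1,1,1) + 1.0
      write(6,*) 'ddfuncval(1,1,1) = ', ddfuncval(1,1,1)
      end subroutine ddmodel14

      program module_sketch
      ! USE in the main program keeps the module data in scope
      use ddfunc_data
      implicit none
      ddfuncval = 0.0
      call ddmodel14()
      call ddmodel14()
      end program module_sketch
```

The second call reports 2.0, showing that the value set by the first call is preserved without the SAVE attribute on a local array.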

DanRRight

Posts: 2877 South Pole, Antarctica

2 Aug 2019 8:09 (Edited: 2 Aug 2019 8:24) #24140

Quoted from JohnCampbell Dan, what is a terabyte of random numbers ?

John, who knows what this code was written for. Maybe it is just a demo. Or perhaps the author is writing a secure media eraser that continuously generates random numbers for multi-pass erasure of existing data on hard drives and SSDs? Or initializing a PIC code from some random state. PIC codes can easily deal with terabytes; we generated 40 TB just this month. Plus, there are infinitely many reasons to do things the wrong way, and anyone has the right to do so.

ahalls_dsc

Posts: 16

2 Aug 2019 8:15 #24141

Quoted from mecej4 To answer the last question, one needs to know the purpose of the program.

The short test program fills a huge 4-D array with random numbers and does a little bit of averaging on a boundary. The result is neither output nor used in a subsequent calculation. The effect of running the program is nil.

Do you see why a clear and complete problem definition is needed?

The labelled errant code is one line of many hundreds in one module, of many hundreds more of the entire program.

Quoted from JohnCampbell Apart from the other programming errors, it is foolish to put such large arrays on the stack. Their size should be confirmed and if required they should be placed on the heap via ALLOCATE.

There are also other coding errors, such as the use of uninitialized variables and arrays. These errors should be corrected, before claiming there is a compiler error.

Why present this in two forums ?

I don't think I suggested it was a compiler error - though maybe the fact it's in this forum makes that implicit.

The full program aims to relocate a country's worth of employment and population using a distance deterrence function. If I'm understanding this point in the program correctly, the array is necessary to be able to optimise the entire modelled area at once. The modelled area we are working with is many times larger than any previously, hence the overflow and the need for optimisation.

Thanks for confirming that we need to use ALLOCATE. I'm not a programmer (had you guessed?); I'm just debugging to aid the development process, and learning.

Quoted from PaulLaidler The SAVE attribute makes this array storage permanent. It does not go on the stack. When the subroutine is re-entered, data from the previous call is preserved.

It is quite possible that the author does not need/intend to use SAVE in this context.

There should be no problem if SAVE is not used even with very large arrays.

SAVE in one sense is equivalent to using a COMMON block or MODULE for the array but there is currently a limitation of 1GB on SAVEd data which does not apply to COMMON/MODULE data.

SAVEd data is always initialised and packed into the executable so it makes sense to limit the amount you can have.

The issue with SAVE was a new one introduced through my implementation, without proper forethought as to how moving it from a COMMON block would affect the program. As we can see from the final example I shared, it wasn't needed to demonstrate what turned out to be a rookie error regarding stack size.

DanRRight

Posts: 2877 South Pole, Antarctica

2 Aug 2019 8:37 #24142

One positive thing I found while playing with this demo and my own programs, which were causing stack overflows in 64-bit: though /stack:2048 did not solve the original poster's problem, it did help the OpenGL parts of my programs, which previously hit stack overflows at minuscule array dimensions. I use slink64 with an INI file.

PaulLaidler

Posts: 7974 Salford, UK

2 Aug 2019 11:46 #24143

mecej4

I was mistaken. A simple change to FTN95 provides for an improved error message with the name of the offending array together with the line number where the array is declared.

mecej4

Posts: 1911

2 Aug 2019 12:03 #24144

Paul, that is a very welcome piece of news, and is certain to cause many users to look forward to the next version of the compiler.

As we have seen recently from this thread and others, users are applying the Silverfrost compiler to larger and more ambitious projects. As a result, we are seeing more instances of pushing against the boundaries of what the compiler can handle. It is gratifying that in almost every case the design of the new 64-bit FTN95 appears to be such that a fix for a reported problem is being made in time for the next release.