Silverfrost Forums

Support for 4gb available memory on 64-bit OS

2 Dec 2011 1:58 #9336

Paul,

I have developed a memory mapping program which identifies all of the available memory in the range 0 to 4gb using FTN95. With Windows 7_64 (and XP_64), memory between 2gb and 4gb is available via ALLOCATE if the program is compiled as ftn95 program /link. (This is a useful extension to FTN95 when running on a 64-bit OS.) However, if I use ftn95 program /debug /link, only memory between 0gb and 2gb is available. Can you indicate why /debug excludes the availability of this extended memory? I use /debug extensively to get better call-back error reports of code addresses (line numbers) if the program crashes. Is there a possibility that memory between 2gb and 4gb on a 64-bit OS could be made available when using the /debug option to provide line numbers?
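
For reference, the probe amounts to something like the sketch below; the step size and names are illustrative only, not the actual program:

    ! Much-simplified sketch: keep ALLOCATEing larger blocks until the
    ! request fails, then report the largest size that succeeded.
    program probe_memory
      implicit none
      integer, parameter :: step = 8*1024*1024   ! 64 MB of real*8 per step
      real*8, allocatable :: block(:)
      integer :: n, stat
      n = 0
      do
        n = n + step
        allocate (block(n), stat=stat)
        if (stat /= 0) exit
        deallocate (block)
      end do
      write (*,*) 'Largest single ALLOCATE was about', &
                  real(n - step)*8.0/2.0**20, ' MB'
    end program probe_memory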

John

2 Dec 2011 8:32 #9338

/check and /debug use a different memory allocation process and this would explain the difference.

Given time and some investigation, it ought to be possible for me to provide you with an option that forces the alternative allocation process, but this may bypass some of the memory checking features (particularly for /check).

2 Dec 2011 9:50 #9339

Paul,

The difference I am reporting is between /debug and no option, not between /debug and /check. (I think that with the /3gb option, /debug also cancelled it out.) I am looking for an option that includes code line number references but still allows the extended addressing beyond 2gb. Does /debug provide more than code line number referencing for call-back reporting and sdbg? I thought any code checking would have required at least /check, and I have not found /debug to carry a noticeable run time penalty, which I took to mean there are minimal run time checks associated with /debug. My approach is to use /debug for the bulk of the code, then use /opt for the few areas of code (typically in static libraries) where the bulk of the computation takes place. Mixing the compilation options through .obj files turns off the extended memory access. It would still be good to have the extended memory addressing available at least with /debug. I'm not sure why the restriction applies.
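
In batch-file terms the build looks roughly like this (the file names are placeholders, and I am assuming SLINK will accept the object files directly on its command line):

    rem Sketch only: solver.f95 stands for the compute-heavy code,
    rem main.f95 for everything else.
    ftn95 solver.f95 /opt
    ftn95 main.f95 /debug
    slink main.obj solver.obj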

John

2 Dec 2011 1:03 #9340

I have logged this for investigation. I cannot do anything right now but hopefully soon.

6 Dec 2011 7:28 #9349

Paul,

I note you are unable to do anything right now, but I am puzzled as to why /debug should restrict ALLOCATE access to memory above 2gb. My understanding of /debug is that it only provides an index structure for source files and code line numbers.

Having access to memory above 2gb is not as good as full 64-bit, but it can extend the life of 32-bit access to Clearwin+ graphics.

John

6 Dec 2011 8:30 #9350

Apparently /debug does restrict the memory access to 2GB, perhaps for no good reason, but I need to work through the relevant code before I can comment.

13 Dec 2011 5:53 #9371

One reason would be the top-bit NULL pointer checking. I must admit I didn't realise it restricted memory in /debug.

14 Dec 2011 11:31 #9378

Robert,

/debug restricting memory was a problem with /3gb, although I did not find /3gb very useful. Now with Win x64, FTN95 gives access to an extra 2gb of memory, which allows me to develop 64-bit-style solutions with FTN95, albeit limited to a 2gb ALLOCATE pool. As I much prefer the code checking of FTN95 to other compilers, it is a pity that I cannot use /check or /debug (in any .obj or library) when scaling up above 2gb. It would be good to at least have line number reports while developing this code. For my applications, I find that big memory problems typically involve only one large array, so it is the maximum ALLOCATE array size and not the total memory that is important. /3gb did not increase the maximum array size. I had considered trying to use two ALLOCATE arrays with FTN95 or /3gb, above and below the 2gb address, but saw little future in this approach.

I would also recommend an x64 OS for FTN95 and other 32-bit applications, as the better memory management and disk buffering improve performance.

Thank you for your help.

John

19 Dec 2011 7:17 #9384

Well, this has been the most painful problem with FTN95 for me for almost a decade. I actually stopped active code development because of it, pondering whether I have to switch compilers, change my methods, or start the new code by completely rebuilding the old one (which may take another decade).

I expected this team to be the first to embrace 64-bit, as they were the first to make a 32-bit Fortran compiler able to virtually allocate the whole 2 GB back in the '80s, when the maximum RAM on a PC was limited to the notorious 640 K. Well... now even the free G77 is a 64-bit compiler, while PCs can take 50 GB of RAM. Not to mention the hope for something like the older 'virtual common', to be able to allocate 4 billion times more RAM than these 4GB in a snap.

20 Dec 2011 11:47 #9385

Dan,

I have looked at the Intel and Portland 64-bit compilers and neither of them allows COMMON above 2gb. I think the limit could be that Microsoft does not allow statically defined arrays above 2gb in the linking facility they provide for 64-bit .exe files. The only way I can get more than 2gb is via ALLOCATE, which I have successfully used, although it has required re-writing my memory management approaches. I have developed an approach that uses a memory pool sized to the physical memory available, as the Windows virtual memory approach does not work very well. Once you run out of memory, 32-bit is as good as 64-bit. I've also found that the vector instructions could be a significant factor in improved run time performance vs FTN95. It is interesting to see the focus that Intel has on the Polyhedron benchmark set, which appears to cover a fairly narrow range of Fortran computation.
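
The pool idea, much simplified and with illustrative names only, is along these lines: one large ALLOCATE at start-up, sized from the physical memory available, with slices handed out by index rather than by repeated ALLOCATE/DEALLOCATE:

    ! Much-simplified sketch of the memory pool; names and sizes are
    ! illustrative only.
    module work_pool
      implicit none
      real*8, allocatable :: pool(:)
      integer :: next_free = 1
    contains
      subroutine init_pool (n_words)
        integer, intent(in) :: n_words   ! chosen from the physical memory available
        allocate (pool(n_words))
      end subroutine init_pool
      function claim (n_words) result (first)
        integer, intent(in) :: n_words
        integer :: first
        if (next_free + n_words - 1 > size(pool)) stop 'work_pool exhausted'
        first = next_free                ! caller uses pool(first : first+n_words-1)
        next_free = next_free + n_words
      end function claim
    end module work_pool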

John

22 Dec 2011 4:53 #9388

It's like '640K is enough for anyone' LOL Same mental shezzz over and over again.

Yes, for some completely unjustifiable reason, 64-bit IVF does not allow static arrays >2GB, only allocatable ones. I hope that if FTN95 ever becomes 64-bit it will not follow them (after all, they have their own C compiler). I'm afraid that the million-times-per-run allocation and deallocation of huge arrays (sparse, by the way) in my code will slow it down tremendously.

3 Feb 2012 9:37 #9581

'Once you run out of memory, 32bit is as good as 64-bit.'

No it's not, since standard 64-bit PCs have at least 8GB (workstations here have 24GB by default), which is significantly more than what 32-bit systems (FTN95) can use.

But official statements regarding 64bit support are plain depressing.

3 Feb 2012 10:44 #9582

I can't recall where I said that, but what I meant was that when my 64-bit program requires more memory than is installed, it goes to a disk based solution. In this case both the 32-bit and 64-bit solutions have similar solution times. For the FE problems I solve, 1.5gb of addressable memory for my out-of-core solution is adequate. Recently, I have found that running programs requiring 2gb of memory and using 5gb for disk caching provides good buffering of disk I/O, with performance times similar to the 64-bit solver. My recent conclusion is that the 32-bit solver still has some life left.

My view of this has changed over the years. In 2002, with only 2gb of physical memory, there was no significant amount of memory available for disk caching. By about 2006, with processor improvement outpacing disk I/O, I did a lot of work to minimise disk I/O, which identified the attraction of 64-bit solutions. Now in 2012, improved disk caching and SSDs mean that 32-bit approaches are more effective. 64-bit still has an advantage, as for a new type of analysis it is easier to write an in-core 64-bit solution than to develop a 32-bit out-of-core solution. I find that my own control of the 'out-of-core' solution is much better than the OS paging, although I am yet to try a 64-bit paged solution using an SSD. I also think that there needs to be a new definition of the 64-bit memory implementation for linking. Why can't we have COMMON > 2gb?

John

6 Feb 2012 7:28 #9587

'Why can't we have COMMON > 2gb?'

Apple patented it 😃

As to using an SSD -- my advice is to use a RAMdisk (or RAMdrive) instead with 64-bit Windows. That's 10 times faster (6GB/sec) and lets you dedicate a lot of RAM to the RAMdrive; with 8GB modules on a 6-slot motherboard that's 48GB. The one I use and like most is the QSoft RAMdisk, which is the speed king and is free. It's so good that I plan to pay the author anyway, once he implements delayed writes to back up the RAMdrive to the hard drive almost in real time without a speed drop. You do not lose your RAMdrive content when you reboot the computer, and there is very little chance of losing anything besides the current write stream when the computer crashes.

There also exists a RAMdisk that allows >4GB to be allocated to the RAMdisk under 32-bit Windows, but I have not tried it.

6 Feb 2012 11:22 #9588

Dan,

Thanks very much for the advice. I am about to get an SSD and will be able to test this option. I also looked for the QSoft RAMdisk, but was directed to a download site that McAfee did not like. I am interested in understanding the relative read and write speeds for these options. Windows 7 (and to a lesser extent XP) appears to provide good disk caching if you don't demand too much memory for your program, which is a simple option. Last year I tested one of the cheap netbooks with an SSD, but its performance was extremely bad for disk I/O. I am not sure why; I assumed that the bandwidth of the disk I/O service was the problem. We continue to learn!

John

8 Feb 2012 8:31 #9589

'For the FE problems I solve, 1.5gb of addressable memory for my out of core solution is adequate.'

That's nice for you, and I really hope you can stick to that requirement for a long time. For other people the requirements increase over time: the model size doubles or whatever, and once all means of avoiding temporary memory have been exhausted you simply can't live without allocating a contiguous array that is larger than 2GB. Even if you can get your customers to tweak their models as far as possible to make things work for now, I have to tell mine that there is no viable solution and that things are plainly not future-proof. I don't have to spell out what that means.

8 Feb 2012 11:44 #9590

Sebastian,

Thanks very much for your feedback. I think I understand what you are saying. I know I cannot speak for your field, but in my areas of computation there are alternatives: out-of-core solutions can be done. However, moving to in-core can result in significant speed increases, offering the opportunity for larger problems or more iterations and greater accuracy. One thing that has worried me is that, if the matrix size is proportional to n^3, it does not take much scaling to run out of physical memory, even in a 64-bit solution (doubling the model dimension multiplies the storage by eight). Perhaps out-of-core and a 256gb SSD offer a more practical solution when scaling up the problem. One thing I have noticed recently is that, with improved disk caching reducing the effective disk I/O delays, my focus has returned to computational speed. Having access to the vector instructions could speed up run times, especially for dot_product and the other array intrinsics.

I don't know about you, but I do not have the budget to test many options, so I have to target solutions which suit the broad range of problems I am solving. I'd be interested in other experiences.

John

22 Feb 2012 3:18 #9664

I have investigated John's original query on this thread.

FTN95 allocates memory in the conventional way for /debug. It uses its own restricted allocation approach for /check. This means that the compiler does not limit the memory allocation for /debug.

SLINK cannot distinguish object code created by the compiler using /debug from that created using /check, so it assumes the worst case. If you are running SLINK directly (rather than via the compiler's /LINK) then you can use /3gb on the SLINK command line when you know that /debug but not /check has been used.
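
For example, something along these lines (the program name is a placeholder; this is a sketch only):

    rem Compile with /debug but not /check, then run SLINK directly
    rem with /3gb on its command line.
    ftn95 myprog.f95 /debug
    slink myprog.obj /3gb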

I have added /3gb to the SLINK command invoked by FTN95 (with /LINK). This means that, in the next release of FTN95, you will get the 3gb when /link is used together with /debug but without /check (when /debug is not used, /3gb is the SLINK default anyway).

22 Feb 2012 10:39 #9669

Paul,

Thank you for following that up. It will provide both the extended memory access and better error reports when testing code. I will update my batch files and see what happens with the next release. When running on XP_64 or Windows7_64, you actually get access to 4gb with ALLOCATE, which is even more useful.

John
