Silverfrost Forums

3gb switch update ?

29 Sep 2008 1:34 #3855

Paul,

I also tried various block sizes in your program, with unusual results. While 80mb blocks give 2,768mb, the allocated memory reduces with bigger block sizes. Beyond 597mb, 600mb blocks give only 2 blocks and fail at only 1,887mb, while 1,000mb blocks give up to 2,097mb. 1,200mb and above gives nothing !!

Unfortunately, I'm from old Fortran and am not familiar with pointers !

All this with FTN95_NEW_MEMORY

!     Last change:  JDC  29 Sep 2008   9:00 am
!  Program to test memory sizes
!
   real*8,    pointer :: gib(:) 
!
   integer*4  i, alstat, block_size
   integer*4, parameter :: mega_byte = 1024*1024/8
   integer*8, parameter :: eight     = 8
   integer*8  size, b8 
!
   write (*,*) 'megabyte ?'
   read  (*,*) i
   block_size =  i*mega_byte   ! defines 1 blocks
!
!   block_size =   80*mega_byte   ! defines 33 blocks 2,768,240,640    2,852m fail
!   block_size =  200*mega_byte   ! defines 12 blocks 2,516,582,400    2,726m fail
!   block_size =  300*mega_byte   ! defines 7 blocks  2,202,009,600    2,516m fail
!   block_size =  400*mega_byte   ! defines 5 blocks  2,097,152,000    2,516m fail
!   block_size =  500*mega_byte   ! defines 4 blocks  2,097,152,000    2,621m fail
!   block_size =  550*mega_byte   ! defines 3 blocks  1,730,150,400    2,306m fail
!   block_size =  597*mega_byte   ! defines 3 blocks  1,877,999,616    2,503m fail
!   block_size =  600*mega_byte   ! defines 2 blocks  1,258,291,200    1,887m fail
!   block_size =  800*mega_byte   ! defines 2 blocks  1,677,721,600    2,516m fail
!   block_size = 1000*mega_byte   ! defines 2 blocks  2,097,152,000    3,115m fail
!   block_size = 1100*mega_byte   ! defines 1 blocks  1,153,433,600    2,306m fail
!
!   block_size = 1200*mega_byte   ! defines no blocks
   b8         = block_size
   size       = 0
!
   do i = 1,1000
     allocate (gib(block_size),stat=alstat) 
     size = size + eight*b8
     print*, i, size, alstat
     if (alstat /= 0) exit
   end do 
   end
29 Sep 2008 4:05 (Edited: 29 Sep 2008 4:09) #3856

Paul,

I again modified your example to allocate multiple real*8 arrays and test the gaps between them using LOC.

Using 312,000,000 byte chunks, the allocation started at about 258mb. It allocated 4 blocks from 258mb to 1448mb, then left a 41mb gap at 1449mb and another 261mb gap at 1787mb, and finally allocated 3 blocks from 2048mb to 2940mb.

Do you know why it started at 258mb, and what the 2 other gaps are ? I'd expect that if I used 80mb blocks, there may be other gaps. This also shows a maximum contiguous area of 1200mb without any influence. Presumably when we get arrays of 1.8gb, some of these gaps are moved around.

The following is the code I used, which gave these interesting results.

!     Last change:  JDC  29 Sep 2008    1:30 pm
!  Program to test memory sizes
!   Allocate the blocks then test for contiguous memory
!
!   real*8,    pointer :: gib(:) 
     real*8, allocatable, dimension(:) :: gib_1
     real*8, allocatable, dimension(:) :: gib_2
     real*8, allocatable, dimension(:) :: gib_3
     real*8, allocatable, dimension(:) :: gib_4
     real*8, allocatable, dimension(:) :: gib_5
     real*8, allocatable, dimension(:) :: gib_6
     real*8, allocatable, dimension(:) :: gib_7
     real*8, allocatable, dimension(:) :: gib_8
!
     integer*4  i, alstat, block_size
     integer*4, parameter :: mega_byte = (1000*1000)/8 ! 10^6 looks better
     integer*8  size_8
!
     write (*,*) 'megabyte ?'    ! 313 for all
     read  (*,*) i
     block_size =  i*mega_byte   ! defines 1 blocks
!
     i    = 0
     size_8 = 0
!
     allocate (gib_1(block_size),stat=alstat)
     call report (i, gib_1, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_2(block_size),stat=alstat)
     call report (i, gib_2, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_3(block_size),stat=alstat)
     call report (i, gib_3, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_4(block_size),stat=alstat)
     call report (i, gib_4, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_5(block_size),stat=alstat)
     call report (i, gib_5, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_6(block_size),stat=alstat)
     call report (i, gib_6, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_7(block_size),stat=alstat)
     call report (i, gib_7, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
     allocate (gib_8(block_size),stat=alstat)
     call report (i, gib_8, size_8, block_size, alstat)
      if (alstat /= 0) goto 100
!
  100 continue
     end

!     subroutine report (n, gib, size_8, block_size, alstat)
29 Sep 2008 4:06 #3857

Again there was too much code:

     subroutine report (n, gib, size_8, block_size, alstat)
!
     integer*4 n, block_size, alstat
     real*8    gib(block_size)
     integer*8 size_8
!
     integer*8, parameter :: zero     = 0
     integer*8, parameter :: eight    = 8
     integer*8, parameter :: one_gb   = 1024*1024*1024
     integer*8  b8, m1, m2, m0, m_alloc, m_gap, two_gb, four_gb
     intrinsic  loc
     save       m0
!
     b8   = block_size
     n    = n + 1
     size_8 = size_8 + eight*b8
!     print*, n, size_8, alstat
      if (alstat /= 0) return
!
     m1 = loc (gib(1))
     m2 = loc (gib(block_size))
!
     two_gb  = one_gb + one_gb
     four_gb = two_gb + two_gb
     if (m1 < zero) m1 = m1 + four_gb
     if (m2 < zero) m2 = m2 + four_gb
!
     m2      = m2 + eight
     m_alloc = (m2-m1)
     m_gap   = 0
     if (n > 1) m_gap = m1 - m0
     m0      = m2
!
     write (*,1001) 'memory location', n, m1, m2, m_alloc, m_gap
1001 format (a,i3,2i12,i11, i10)
     end subroutine
30 Sep 2008 8:28 #3860

Paul,

I again took your program and tried to extend the 'cheating' on multiple allocates for pointer arrays, to test the memory availability. I put the program in as a subroutine, expecting that when it exits all the allocated memory would be automatically deallocated (fortran standard ?). This does not appear to work. However, with a block size of 20mb, it shows in more detail which areas of memory are not available. The program also appears to confirm that the /3gb switch is providing 3gb of accessible memory. I'm assuming the contiguous areas can be addressed as merged blocks. I will take the real*8 array program from an earlier post and confirm whether this works.

It remains to be able to shift the 'unavailable parts' to a better area to maximise the size of contiguous available memory. Isn't this what Slink does ?

!     Last change:  JDC  30 Sep 2008    9:53 am
!
   call test_contiguous (20)
   call test_contiguous (40)   ! this call failed
   call test_contiguous (80)
   call test_contiguous (120)
   call test_contiguous (200)
   call test_contiguous (400)
   call test_contiguous (600)
   call test_contiguous (800)
   end

   subroutine test_contiguous (block_mb)
!
!  subroutine to test memory sizes and contiguous allocation
! 
   integer*4  block_mb
!
   real*8,    pointer :: gib(:) 
! 
   integer*4  i, alstat, block_size
   integer*4, parameter :: mega_byte = 1024*1024/8 
   integer*8, parameter :: zero      = 0
   integer*8, parameter :: four      = 4 
   integer*8, parameter :: eight     = 8 
   integer*8, parameter :: one_gb    = 1024*1024*1024
   integer*8, parameter :: four_gb   = one_gb*four
   integer*8  size, b8, js,je
   real*8     mgs, mge, mgg, mgl
! 
   write (*,*) 'Testing block size ',block_mb,'mb'
   block_size = block_mb*mega_byte    ! defines 1 block in real*8
! 
   b8   = block_size*eight
   size = 0 
   mgl  = 0    ! last end
! 
   do i = 1,1000 
     allocate (gib(block_size),stat=alstat) 
! cumulative size (bytes), counting the failed request as in the earlier tests
     size = size + b8                                                   
! on failure gib has no new target, so report and stop before taking addresses
     if (alstat /= 0) then
       write (*,2001) i, size, alstat
       exit
     end if
! start and end address (bytes); loc can return a negative 32-bit value above 2gb
     js = loc(gib(1))           ; if (js < zero) js = js + four_gb      
     je = loc(gib(block_size))  ; if (je < zero) je = je + four_gb      
! start and end address (mb)
     mgs = dble(js) / 1024./1024.                                     
     mge = dble(je) / 1024./1024.                                     
! gap size (mb)
     mgg = mgs - mgl
     mgl = mge
     write (*,2001) i, size, alstat, js,je, mgs, mgl, mgg
2001 format (i5, i12, i5, 2i12, 3f10.2)
   end do 
   end do 
! hopefully release allocated memory
   end subroutine test_contiguous
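
On the '(fortran standard ?)' question above: as far as I know, Fortran 95 only deallocates unsaved local ALLOCATABLE arrays automatically when a procedure returns. Memory obtained through a POINTER stays allocated unless it is explicitly deallocated. The following is a minimal sketch of the two cases, not a tested FTN95 program; the routine names with_allocatable and with_pointer are made up for illustration.

```fortran
program dealloc_demo
   implicit none
   integer :: k
   ! call each routine a few times; neither should accumulate memory
   do k = 1, 3
      call with_allocatable (1000000)
      call with_pointer (1000000)
   end do
contains
   subroutine with_allocatable (n)
      integer, intent(in) :: n
      real*8, allocatable :: work(:)
      integer :: alstat
      allocate (work(n), stat=alstat)
      if (alstat /= 0) return
      work = 0.0d0
      ! no deallocate needed: an unsaved local allocatable array
      ! is released automatically on return
   end subroutine with_allocatable

   subroutine with_pointer (n)
      integer, intent(in) :: n
      real*8, pointer :: work(:)
      integer :: alstat
      allocate (work(n), stat=alstat)
      if (alstat /= 0) return
      work = 0.0d0
      ! a pointer target is NOT released on return, so free it explicitly
      deallocate (work)
   end subroutine with_pointer
end program dealloc_demo
```

In test_contiguous each ALLOCATE in the loop gives gib a new target, so the earlier blocks can no longer be reached through gib and cannot be deallocated afterwards.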
1 Oct 2008 2:35 #3862

Paul,

I took the memory test subroutine and placed it in the large program I am wanting to use for extended memory. I ran it at the start for a number of different working array sizes, with and without /3gb switch in slink.

  1. without the 3gb in slink, only 2gb of memory is accessible.

  2. with 300mb working array, i get a memory usage of : first 371mb is taken by program (300mb + 71mb); at 1429mb, there is a 24.6mb gap; at 1486mb, a 3.5mb gap; at 1874mb, a 46.1mb gap; at 1984mb, a 63.9mb gap

  3. with 1000mb working array, a similar set of gaps

  4. with 1600mb working array, i get a memory usage of : first 1664mb is taken by program (1600mb + 64mb); gaps at 1429mb and 1486mb have been overwritten, ie omitted; at 1889mb, the 46.1mb gap has been reduced to 30.8mb; at 1984mb, a 63.9mb gap

  5. with 1750mb, similar (.map version)

ie, the gaps have been removed and do not appear to have been shifted

I did not run the program past the memory test. There appear to be 3 parts of memory reserved for a small program, but only 2 for a larger program. If the program encroaches on the 2 beyond 1889mb, then it becomes an illegal .exe. What are these 3 parts of memory being used for ? Can they be relocated, perhaps to 2900mb or possibly to 64mb, with the .bss section placed after them ?

I am attaching to an email the memory availability dump I generated from the memory scanning subroutine for these 5 tests and the last .map file for the program.

I did not run the program. I have not yet tested the earlier real*8 array test to see if the memory allocated is useable.

Does this make any sense ?

Regards

John

6 Oct 2008 10:55 #3875

Ian

I have not managed to find your sample code. Can you send it again please to my hotmail account.

Paul

6 Oct 2008 9:10 #3881

Paul,

It appears that the attachments may have been deleted, as they contained .exe files. I will send it again, without the .exe. I have included all the recent test programs. Please let me know if the attachments are not there this time. Otherwise I will have to use my home email account !

John

8 Oct 2008 4:08 #3888

Paul,

I have done a review of the large program I am wanting to run in 3gb of memory. The following is an approximate breakdown of the memory usage in the .map file:-

0.64mb .text - which is my code
0.12mb .data - which is salflibc/kernel32
0.69mb .salfdbg - presumably due to link map and /debug compilation
1.93mb .bss - my common excluding the big array

The total memory usage is 3.44 mb, excluding the big array

In summary, the memory usage, excluding the big array would be 0.1% of memory.

To use anything significantly above 2gb, the big array must be 99.9% of this. Either I must be able to have a single array bigger than 2gb or I have a major rewrite to use 2 big arrays or 'chunks' in my algorithm.
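
The 'chunks' option could look something like the sketch below: one logical array is stored as several separately allocated blocks, and a 64-bit global index is mapped to a chunk number and an offset inside that chunk. This is only an illustration under assumptions. The module name chunked_array, the routines chunks_create, chunks_set and chunks_get, and the chunk_len value are all invented here, and nothing in this thread confirms that the chunks would add up to 2.4gb on a particular machine.

```fortran
module chunked_array
   implicit none
   ! elements per chunk (assumed value: 8,000,000 bytes of real*8)
   integer*8, parameter :: chunk_len = 1000000
   type chunk
      real*8, pointer :: v(:)
   end type chunk
contains
   ! allocate enough chunks to hold n elements; alstat /= 0 on failure
   subroutine chunks_create (c, n, alstat)
      type (chunk), pointer :: c(:)
      integer*8, intent(in) :: n
      integer, intent(out)  :: alstat
      integer :: k, nchunk
      nchunk = int ((n + chunk_len - 1) / chunk_len)
      allocate (c(nchunk), stat=alstat)
      if (alstat /= 0) return
      do k = 1, nchunk
         allocate (c(k)%v(chunk_len), stat=alstat)
         if (alstat /= 0) return
      end do
   end subroutine chunks_create

   ! store x at global index i (1-based)
   subroutine chunks_set (c, i, x)
      type (chunk), pointer :: c(:)
      integer*8, intent(in) :: i
      real*8, intent(in)    :: x
      integer :: k, j
      k = int ((i - 1) / chunk_len) + 1       ! which chunk
      j = int (mod (i - 1, chunk_len)) + 1    ! position inside the chunk
      c(k)%v(j) = x
   end subroutine chunks_set

   ! fetch the value at global index i (1-based)
   function chunks_get (c, i) result (x)
      type (chunk), pointer :: c(:)
      integer*8, intent(in) :: i
      real*8  :: x
      integer :: k, j
      k = int ((i - 1) / chunk_len) + 1
      j = int (mod (i - 1, chunk_len)) + 1
      x = c(k)%v(j)
   end function chunks_get
end module chunked_array

program chunk_demo
   use chunked_array
   implicit none
   type (chunk), pointer :: big(:)
   integer   :: alstat
   integer*8 :: n
   n = 2500000                 ! small demo size: 3 chunks
   call chunks_create (big, n, alstat)
   if (alstat /= 0) stop 'allocation failed'
   call chunks_set (big, n, 42.0d0)
   print *, chunks_get (big, n)
end program chunk_demo
```

The chunk size is a trade-off: the 80mb test earlier in the thread reached a larger total than the 600mb test, which suggests smaller chunks fit more easily around the reserved areas, at the cost of an index calculation on every access.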

The example problem I have been targeting to use for this /3gb approach requires an array of about 2.4gb in size. Will this be a possibility ?

Are any other forum users interested in this /3gb facility ?

Paul, thanks again for your assistance with this.

John

8 Oct 2008 7:17 #3889

John

My understanding is that you cannot have a single array of more than 2GB. Apart from rewriting your code, the only possible way forward would be to try running your program under WOW on a 64 bit machine and operating system. I don't know if anyone has tried this but you may be able to get 4GB this way. Alternatively you would have to switch to a 64 bit compiler, machine and operating system.

Paul

11 Oct 2008 4:52 #3895

Quoted from PaulLaidler

try running your program under WOW

World of Warcraft? 😄 For me 32bit is dead 15 years ago when I painfully hit its 1.6GB limit. Great that 32bit is finally dead starting with this fall shopping season. Not a single laptop in Best Buy or Frys right now is sold with 32bit XP (the only exception is the pocket EEE laptop); all come with 64bit Vista and 3-4 GB of RAM. For $99 total you can upgrade your desktop PC motherboard and 64bit processor, and for $49 you'll get 4GB of RAM (this month's prices in Frys). You can get 64bit OEM Vista cheaply or use 64bit XP. The only thing lagging is 64bit compilers. What is the general obstacle to rewriting a compiler for a 64bit OS?

Quoted from JohnCampbell

Are any other forum users interested in this /3gb facility ?

Waste of time; I am sure you are pushing the wrong button. With 64bit compiler you will get not factor 1.5-2 but many orders of magnitude more (well, of course Microsoft will restrict you, just so you don't forget who is the boss). That will totally change the way you think about your future algorithms.

11 Oct 2008 5:11 #3896

Quoted from DanRRight

Quoted from PaulLaidler

try running your program under WOW

World of Warcraft? 😄 For me 32bit is dead 15-20 years ago when I painfully hit its 1.6GB limit and survived only thanks to Virtual Common, which only Salford had in their compilers. Great that 32bit is finally dead for everyone starting with this fall shopping season. Not a single laptop in Best Buy or Frys right now is sold with 32bit XP (the only exception is the pocket EEE laptop); all come with 64bit Vista and 3-4 GB of RAM. For just $99 total everyone can upgrade their desktop PC motherboard and 64bit processor, and for $49 you'll get 4GB of RAM (this month's prices in Frys). You can get 64bit OEM Vista cheaply or use 64bit XP. The only thing lagging is 64bit compilers. What is the general obstacle to rewriting a compiler for a 64bit OS? AFAIK Intel Fortran is already 64bit.

Quoted from JohnCampbell

Are any other forum users interested in this /3gb facility ?

Waste of time; I am sure you are pushing the wrong button. With 64bit compiler you will get not factor 1.5-2 but many orders of magnitude more (well, of course Microsoft will restrict you, just so you don't forget who is the boss). That will totally change the way you think about your future algorithms.

13 Oct 2008 12:08 #3897

Dan,

Thank you for the feedback, although I find your comments somewhat extreme. 'For me 32bit is dead 15-20 years ago'. That's 1988-1993. I certainly was not contemplating virtual disk space of >2gb back then, and 3gb came out with XP.

It is good to see that there are a lot of 64-bit pc's being sold now, although where I work is still committed to 32bit architecture.

'With 64bit compiler you will get not factor 1.5-2 but many orders of magnitude more'. Hard to believe, but I would like to know any comparison of run-time performance. You would need to differentiate between compute only performance and applications that involved significant disk accessing to run on 32-bit machines.

However, the disk I/O is what is causing me to consider 64-bit. The sample problem I am trying to target, can run on existing 32-bit machines, but when it requires 'disk buffering' with blocks of about 1gb in size, there is a lot of time to think wouldn't it be good if this info could stay in memory, especially when you see the price of extra memory.

FTN95-64 would be a great thing to have. To me, certainly much better than FTN95.net. Unfortunately .net appears to have downsized a number of fortran vendors. There are a number of new features in fortran 95 + that I am yet to use (or need), without yet contemplating the benefits of .net. It's a shame that .net has changed the fortran landscape so much.

I am certainly going to investigate what is available in XP64 computing.

Thanks again Dan for your thoughts.

John

13 Oct 2008 11:27 #3898

John

I have had a look at your program big_array.f95 which gives the error report '<Program> is not a valid Win32 program'.

I suspect that this is a result of the amount of memory being allocated directly in the code.

If you allocate the memory dynamically (you will also have to avoid using COMMON for the arrays) then the dynamic allocation also fails because the chunks are too large.

As before, the point is that you can get more than 2GB in total but you will be limited in chunk size by the existing usage.

I will try to get a more informed opinion about this but it may take time.

Paul

13 Oct 2008 8:52 #3899

Paul,

Can we identify all the reserved parts of memory and shift them in Slink, to provide fewer bigger chunks of memory to allocate ? There is the heap and the stack. What others ? Where is the code of all the routines in salflibc.dll ? Is this in the OS above 3gb ?

John

13 Oct 2008 9:13 #3900

As far as I know there is no way to do this. The system can move 'movable' memory as it deems fit but other programs can allocate fixed memory that no-one can move.

As I keep saying, I am not an expert on this matter and I will try to get a more informed opinion.

25 Oct 2008 10:57 #3916

I now have a better understanding of the issues involved in memory management.

When an executable starts up, it initially has access to the full address space that is available and it does not matter if other executables are already running. The full address space (not used by the operating system) is normally 2GB but this can be increased to 3GB.

However, when the executable loads up it will invoke DLLs, e.g. salflibc.dll and Microsoft DLLs, and there is no way for SLINK to control how the memory used by the DLLs is allocated. The result is that the programmer has to make do with whatever the operating system can provide.

Regarding your program (big_array.f95), this allocates memory for a large array directly (i.e. at compile time rather than via ALLOCATE at run time). The direct memory allocation is also in the main program which means that the array effectively has the SAVE attribute. SAVEd memory is allocated on a stack that can be increased via a SLINK command. The critical value is the stack reserve that has a default size of 50MB. You can increase this but you will be limited in how far you can push this value for essentially the same reasons that you can only go so far with dynamic memory allocation at runtime.

For your purposes it would be simpler to use dynamic memory allocation (via ALLOCATE) since this would give you a better control and feel for what is happening.
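
A minimal sketch of what that switch might look like, assuming the large array currently lives in COMMON with a fixed size. big_array.f95 is not reproduced in this thread, so the module name big_data, the array name big and the 300mb size are placeholders, not the real code.

```fortran
module big_data
   implicit none
   ! replaces the COMMON block: an allocatable array cannot be in COMMON
   real*8, allocatable, save :: big(:)
end module big_data

program big_array_dynamic
   use big_data
   implicit none
   integer, parameter :: mega_byte = 1024*1024/8   ! real*8 elements per mb
   integer :: n_mb, alstat
   n_mb = 300                                       ! requested size in mb
   allocate (big(n_mb*mega_byte), stat=alstat)
   if (alstat /= 0) then
      ! the program can report the failure itself instead of
      ! being rejected by the operating system at load time
      write (*,*) 'could not allocate ', n_mb, ' mb, stat = ', alstat
      stop
   end if
   big = 0.0d0
   write (*,*) 'allocated ', n_mb, ' mb'
   deallocate (big)
end program big_array_dynamic
```

With stat= the request can also be retried with a smaller size, which is how the test programs earlier in the thread probe the available memory.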

The error message that you get (Program is not a valid Win32 program) is simply a result of the stack being too small. This is a message from the operating system so it is not under our control.

I hope that this helps. I know that this may not solve your problems but I will have to leave it to others with more experience in these matters to advise you further.

26 Oct 2008 10:33 #3917

John,

How about setting up a ramdisk? This technology has been around a long time. http://www.superspeed.com/desktop/ramdisk.php has one that they claim can use unmanaged space (i.e. above 4Gb in XP/Vista 32). You are still going to have to transfer huge chunks of RAM contents, and you need a lot of RAM, but you should see some improvement in run times.

Eddie

27 Oct 2008 5:58 #3919

Paul, Thanks for your update. I will need to think about it for a while.

Eddie, I shall follow up what you have said. The idea of an 8gb virtual disk sounds great. I'm not sure of the reality of it all.

John
