Fails to save arrays > 4GB
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Tue Jul 18, 2023 11:50 am

Well, I see the same crashing behavior on Windows and Linux. These are two completely independent installations which do not know about each other. Plato was not used. Can anyone else try to run my read/write test?

On Linux I noticed an inconsistency in the latest update: the MOD files now use capital letters. On Windows that does not matter.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8011
Location: Salford, UK

Posted: Tue Jul 18, 2023 12:53 pm

Dan

Which version of FTN95 are you using, and what are the dates of clearwin64.dll and salflibc64.dll?
mecej4



Joined: 31 Oct 2006
Posts: 1896

Posted: Tue Jul 18, 2023 1:24 pm

DanRRight wrote:
Well, I have the same crashing behavior on Windows and Linux. These are two absolutely independent installations which do not know about each other. Plato was not used. Can anyone else try to run my read/write test?


Dan, this thread now spans ten months and contains multiple test programs with different objectives. The posts report multiple problems that were encountered with different versions of the compiler. If you wish others to reproduce an issue that you see, I think you should post a test program, name the compiler versions and build options used, and give a short description of the failure(s) that you observe.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2580
Location: Sydney

Posted: Tue Jul 18, 2023 1:51 pm

Dan,

I have now run your program on page 4, "Posted: Wed Jul 12, 2023 11:44 am"

It runs to completion, as did Paul's example, but with significantly different performance rates. I do not get your "Access violation"
I think Paul is correct to ask if you have consistent .exe and .dll files in your path.

Thinking about an aspect of this problem, I also changed your "Method 1" to write larger chunks of arr4 using:
Code:
      num = 0
      do i1 = 1,nB, ni
        i2 = min ( i1+ni-1,nB )
        write (11,err=910) Arr4(:,i1:i2)
        num = num+1
      end do

This provides a compromise that doesn't need extremely long records.
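
For readers who want to try the chunked loop above in isolation, here is a minimal self-contained sketch. The array shape (11 default reals per column, i.e. 44 bytes assuming 4-byte reals), the chunk width ni, the reduced nB, the stream-access OPEN and the file name chunk_test.dat are all assumptions made for illustration; they are not taken from Dan's actual test program.

```fortran
! Sketch only: shape, chunk width and file name are assumed values.
program chunked_write
  implicit none
  integer, parameter :: nA = 11        ! 11 default reals = 44 bytes per column (assumes 4-byte reals)
  integer, parameter :: nB = 1000000   ! kept small here; the thread used about 2.e8 columns
  integer, parameter :: ni = 10000     ! columns per write (hypothetical chunk width)
  real, allocatable :: Arr4(:,:)
  integer :: i1, i2, num, ios

  allocate (Arr4(nA,nB), stat=ios)
  if (ios /= 0) stop 'allocation failed'
  Arr4 = 1.0

  open (unit=11, file='chunk_test.dat', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open failed'

  ! Write ni columns at a time; the last chunk may be shorter.
  num = 0
  do i1 = 1, nB, ni
    i2 = min(i1+ni-1, nB)
    write (11, iostat=ios) Arr4(:,i1:i2)
    if (ios /= 0) stop 'write failed'
    num = num + 1
  end do
  close (11)

  print *, 'number of writes =', num   ! 100 writes for the values above
end program chunked_write
```

Each write transfers a contiguous section of the array, so ni trades record length against the number of library calls.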

Another problem with these test examples is the significant variation in the GB/sec estimates. The problems come not only from using CPU time, but also from the test environment changing through the sequence of write, then read, tests. The basic test approach does not test the same read or write conditions each time, due to other influences, especially OS buffering and file re-use.

Code:
 Trying to allocate GB of RAM :          8.80000000000   
 Allocation success
 Trying to save the data Method 1
 Write OK. Speed of write Method 1 =    0.411696   
 =====================
 ================ N O W    R E A D ====================
 READ OK. Speed of read Method 1 =    0.457886   
 =============================
 Trying to save the data Method 2
 Write OK.  Speed of write  Method 2=     3.16404   
 =====================
 ================ N O W    R E A D ==================
 READ OK. Speed of read   Method 2 =     5.80619   
 ======================
**** PAUSE: File LargeFile.dat created OK
Press ENTER to continue:
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Tue Jul 18, 2023 8:46 pm

John, finally! That was the only thing I was asking for: to check whether my latest test works with the latest FTN95.

I installed the latest FTN95 update; all dates on the files are 06/28 and 06/29. The update was from 8.92, and my support contract was and still is current.

I then ran my latest test, posted a week ago (on my screen the date of that post is shown as Tue Jul 11, 2023 5:44 pm; elsewhere the date and hour may vary, but the minutes have to be 44).

After the confirmations that everything works, I found the error in my compilation settings, and now all works (I had edited them and forgot to put /64 back. Maybe it is time to make /64 the default? Other compilers switched long ago).
Code:

 Trying to allocate GB of RAM :          8.80000000000   
 Allocation success
 Trying to save the data Method 1
 Write OK. Speed of write Method 1 =    0.554156   
 =====================
 ================ N O W    R E A D ====================
 READ OK. Speed of read Method 1 =    0.575916   
 =============================
 Trying to save the data Method 2
 Write OK.  Speed of write  Method 2=     4.97175     
 =====================
 ================ N O W    R E A D ==================
 READ OK. Speed of read   Method 2 =     10.3530 


The timings for the fast Method 2 are a bit odd but stable to within 10%. Probably caching related. That's a different matter, though.
Thanks to all.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2580
Location: Sydney

Posted: Thu Jul 20, 2023 1:45 pm

Dan,

So the answer is the latest FTN95 does not crash with your latest example.

There are 2 problems with this test example which you should also consider.

1) CPU_TIME is not appropriate for estimating disk I/O performance. This can be clearly seen from your method 2, where the processor is waiting for the I/O operations to complete. It is demonstrated by the difference between CPU time and wall clock time, which I included in my modified example.

2) The order of the disk tests is also an issue, as repeating the test changes the disk and memory environment. For the method 2 read, the data is the identical (large) block that was just written, so the information is already in the I/O buffer; apparently Windows I/O recognises this and does not do any read. This GByte/sec figure is not a realistic read rate estimate.
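
The first point can be checked with a small sketch that times one large write with both CPU_TIME and SYSTEM_CLOCK. This is not the modified example mentioned above; the buffer size, the file name timer_test.dat and the use of 64-bit integer arguments to SYSTEM_CLOCK are assumptions, and the rate printed still includes whatever buffering the operating system does.

```fortran
! Sketch only: compares CPU time with wall clock time around one large write.
program timer_compare
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 50000000     ! 200 MB if default reals are 4 bytes (assumed)
  real, allocatable :: buf(:)
  integer(i8) :: tick0, tick1, rate
  real :: cpu0, cpu1
  double precision :: wall, cpu, gb
  integer :: ios

  allocate (buf(n), stat=ios)
  if (ios /= 0) stop 'allocation failed'
  buf = 1.0

  call system_clock (count_rate=rate)
  call system_clock (count=tick0)
  call cpu_time (cpu0)

  open (unit=11, file='timer_test.dat', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open failed'
  write (11) buf
  close (11)

  call cpu_time (cpu1)
  call system_clock (count=tick1)

  wall = dble(tick1 - tick0)/dble(rate)  ! elapsed time, includes waiting for I/O
  cpu  = dble(cpu1 - cpu0)               ! processor time used by this program only
  gb   = 4.0d0*dble(n)/1.0d9

  print '(a,f10.3)', ' wall clock seconds = ', wall
  print '(a,f10.3)', ' cpu seconds        = ', cpu
  if (wall > 0.0d0) print '(a,f10.3)', ' GB/s by wall clock = ', gb/wall
  if (cpu  > 0.0d0) print '(a,f10.3)', ' GB/s by cpu_time   = ', gb/cpu
end program timer_compare
```

If the CPU-based figure comes out noticeably higher than the wall-clock one, that gap is the waiting time that CPU_TIME does not count.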

Stream I/O provides portability between FTN95 and Gfortran.

Perhaps the next horizon is endianness !!
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Fri Jul 21, 2023 12:41 am

Paul, based on your timings I was scratching my head over what kind of computer you ran this test on. I do not know whether such a computer exists yet :) Aliens may have one that runs Test 1 with such a huge boost. Was this because of caching? But I have not seen such a tremendous caching effect on Test 1; all my runs were well below 1 GB/s. Were some secret compiler options used?

Code:
Trying to allocate GB of RAM : 8.80000000000
Allocation success
Trying to save the data Method 1
Write OK. Speed of write Method 1 = 1.36699
=====================
================ N O W R E A D ====================
READ OK. Speed of read Method 1 = 3.04432
=============================
Trying to save the data Method 2
Write OK. Speed of write Method 2= 6.54884
=====================
================ N O W R E A D ==================
READ OK. Speed of read Method 2 = 1.23239
======================
**** PAUSE: File LargeFile.dat created OK


If this is a timer problem, maybe it would be good to introduce a new, more realistic timer.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8011
Location: Salford, UK

Posted: Fri Jul 21, 2023 6:51 am

Dell Inspiron 27 7710 All-in-One

Processor 12th Gen Intel(R) Core(TM) i7-1255U 1.70 GHz
Installed RAM 16.0 GB (15.7 GB usable)
System type 64-bit operating system, x64-based processor
Pen and touch No pen or touch input is available for this display

Windows 11 Home

512 GB Solid State Drive (M.2 SSD) + 1 TB Serial ATA (SATA)
Intel Iris Xe Graphics
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Fri Jul 21, 2023 8:35 am

I have no explanation for your numbers versus those from my AMD 5950X processor with 128GB RAM and WD850X NVMe storage, other than that you have DDR5 and I have DDR4:

https://nanoreview.net/en/cpu-compare/intel-core-i7-1255u-vs-amd-ryzen-9-5950x

But I have never seen even super-duper fast memory give more than a 10% difference.

Have you modified my test? I see some signs of edits.

Does anybody here have a 12th or 13th generation Intel Core processor and see such speeds with Method 1? Or an AMD 5950X/7950X with DDR5 and PCIe 5.0? Use my test from page 4 above, nothing else.


Last edited by DanRRight on Fri Jul 21, 2023 9:26 pm; edited 3 times in total
JohnCampbell



Joined: 16 Feb 2006
Posts: 2580
Location: Sydney

Posted: Fri Jul 21, 2023 11:42 am

DanRRight wrote:
If this is a timer problem, maybe it would be good to introduce a new, more realistic timer.


There is already a timer that solves the timer problem: SYSTEM_CLOCK

I have provided a modified version of your test that shows the ratio of CPU time to wall clock time and also tests a variety of block sizes.

https://www.dropbox.com/s/kxk7e0z1fbiuyq4/read_write2.f90?dl=0

https://www.dropbox.com/s/exqqpwzsb0efiwg/stream_tests.log?dl=0

You were estimating much higher transfer rates than were real, because CPU_TIME does not count the time the CPU spends waiting on the I/O buffers.

With your old method 1, you were writing 200 million records of 44 bytes, which is 200 million writes. That is why it was so much slower. It is not hard to do fewer writes of larger records, as I have demonstrated in this example.
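
The arithmetic behind those figures can be laid out in a few lines. This is only a sketch of the counting; the chunk sizes tried are arbitrary choices for illustration. Note that the total of 8.8e9 bytes does not fit in a default 32-bit integer, which is why a 64-bit integer kind is used.

```fortran
! Sketch only: counts how many writes are needed for various chunk sizes.
program write_counts
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8), parameter :: rec_bytes = 44_i8          ! bytes per record, from the post above
  integer(i8), parameter :: n_rec     = 200000000_i8   ! 200 million records
  integer(i8) :: per_write, n_writes
  integer :: k

  print *, 'total bytes =', rec_bytes*n_rec            ! 8800000000, i.e. 8.8 GB

  print *, 'records/write   bytes/write   number of writes'
  per_write = 1_i8
  do k = 1, 5
    n_writes = (n_rec + per_write - 1_i8)/per_write    ! integer ceiling division
    print *, per_write, per_write*rec_bytes, n_writes
    per_write = per_write*100_i8                       ! 1, 100, 1e4, 1e6, 1e8
  end do
end program write_counts
```

Going from 1 to 10,000 records per write cuts the number of writes from 200 million to 20,000, while each write is still only 440,000 bytes.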

The method 2 read performance is very interesting, as there is actually no read from the file, as the file buffers contain this last record from the write. So not a true read test.

It is not a timer problem but a test problem.
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Tue Jul 25, 2023 6:53 am

Paul, your test results raise a few questions because they are very unusual for Test 1. What memory type does your PC have, DDR4 or DDR5? Is the bus PCIe 4 or PCIe 5? Is the storage NVMe or a hard drive? Did you edit my test from page 4? Can you repeat my test one more time, also measuring the time with a wristwatch (approximate is OK), or using SYSTEM_CLOCK (substituting it for my timer, not adding it to the test)? JohnC claims this is a problem with the timer. That claim looks doubtful for the Test 1 results: the test runs for a long time, so the differences between timers should be small.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8011
Location: Salford, UK

Posted: Tue Jul 25, 2023 7:51 am

My machine is "out of the box" as described above. I don't recall needing to change your code.

With my limited knowledge in this area I think that the main issue is that it has "512 GB Solid State Drive (M.2 SSD)" and 16 GB of RAM.

Also "12th Gen Intel(R) Core(TM) i7-1255U 1.70 GHz".
JohnCampbell



Joined: 16 Feb 2006
Posts: 2580
Location: Sydney

Posted: Tue Jul 25, 2023 1:13 pm

Dan,

Just publish a test that uses system_clock rather than cpu_time. If the processor is idle waiting for I/O interrupts, cpu_time simply gives you a wrong result. It is BS to persist with a cpu_time test.

A 12th gen motherboard may benefit from a PCIe 4.0 NVMe connection.
DanRRight



Joined: 10 Mar 2008
Posts: 2863
Location: South Pole, Antarctica

Posted: Tue Jul 25, 2023 10:26 pm

cpu_time - bad timer?
JohnCampbell



Joined: 16 Feb 2006
Posts: 2580
Location: Sydney

Posted: Wed Jul 26, 2023 8:03 am

DanRRight wrote:
cpu_time - bad timer?


Why ask the question again?
You have been quoting the disk I/O transfer rate as GBytes per second, not bytes per used processor cycle.

Clearly, the time when the processor is waiting for disk I/O should not be excluded from the estimate of the disk I/O transfer rate.

The other problem with these tests is that we are reporting the rate for transferring information to or from the operating system file buffers. We don't know how well this equates to the disk reading or writing speeds. I am not aware of how to measure whether a read or write action includes the time for the buffers to be emptied.

For the method 2 read, we are basically retrieving the same block of information that was just written. The OS should recognise that this is still available in the buffer, so no disk access is required, unless the disk buffers are not big enough. The 2 PCs I use have 32 GB or 64 GB of memory, so my tests never exhaust the disk buffers. Paul's PC with 16 GB of memory might exhaust the buffers and so report a lower read rate.

Yet another problem is that there are also buffers in the SSD drives, so we can get very different times if we overflow the SSD (faster memory) buffers.

All considered, it is very difficult to know what is being reported, which I think the SSD manufacturers rely on when reporting rates.

I also have an HP notebook with an SSD but only 8 GB of memory. This reports much lower transfer rates, which is not surprising.

I think we can conclude that writing 2.e8 buffers of 44 bytes is much slower than writing one large block, but there is possibly/probably a middle-ground block size that better suits the Fortran unformatted read/write library.
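
One way to look for that middle ground is to sweep the block size and time each pass with SYSTEM_CLOCK. The sketch below is a hypothetical illustration, not the read_write2.f90 program linked earlier in the thread: the array shape, the list of block sizes and the file name sweep_test.dat are assumed, and the test array is kept small. As noted above, the rates it prints still reflect OS and SSD buffering, not only the disk.

```fortran
! Sketch only: times the same write with several chunk widths using wall clock time.
program block_sweep
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: nA = 11, nB = 2000000            ! about 88 MB with 4-byte reals (assumed)
  integer, parameter :: sizes(4) = (/ 1, 100, 10000, 1000000 /)  ! columns per write (assumed)
  real, allocatable :: Arr4(:,:)
  integer(i8) :: t0, t1, rate
  integer :: k, ni, i1, i2, ios
  double precision :: secs, gb

  allocate (Arr4(nA,nB), stat=ios)
  if (ios /= 0) stop 'allocation failed'
  Arr4 = 1.0
  gb = 4.0d0*dble(nA)*dble(nB)/1.0d9

  call system_clock (count_rate=rate)
  do k = 1, size(sizes)
    ni = sizes(k)
    open (unit=11, file='sweep_test.dat', access='stream', form='unformatted', &
          status='replace', iostat=ios)
    if (ios /= 0) stop 'open failed'

    call system_clock (count=t0)
    do i1 = 1, nB, ni
      i2 = min(i1+ni-1, nB)
      write (11) Arr4(:,i1:i2)
    end do
    close (11)                                           ! close is inside the timed region
    call system_clock (count=t1)

    secs = dble(t1 - t0)/dble(rate)
    if (secs > 0.0d0) print '(a,i8,a,f10.3)', ' columns per write =', ni, '   GB/s =', gb/secs
  end do
end program block_sweep
```

Because every pass rewrites the same file, later passes may also benefit from file re-use, so the order of the block sizes is itself a variable in this test.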

Write-then-read tests always benefit from pre-charged disk buffering.
A different test could be reading a terabyte stream file, but the logistics of doing the test can be difficult.
And what would it show ?

Remaining puzzled by the results !!
All times are GMT + 1 Hour
Page 6 of 7

 