forums.silverfrost.com

mecej4 · Joined: 31 Oct 2006 Posts: 1899

For the following program, SCC /64 generates two false warnings.

DanRRight · Posted: Tue Dec 13, 2016 12:27 am Post subject:

Mecej4, Since you are familiar with SCC I have the following suggestion/request if you have some free time. Can the CrystalDiskMark be compiled successfully with C and show how it works? How it checks I/O speed ? By this we will know how read/write speed test works, where are bottlenecks and is there potential for improvement.

Question #1 is: test shows the read or write speeds 10 GB per second on RAMDrives. That means that the read/write itself (as overhead) must go with even faster speeds! Is this true with C ?

http://crystalmark.info/software/CrystalDiskMark/index-e.html

mecej4 · Joined: 31 Oct 2006 Posts: 1899

That is a full-fledged Windows GUI program, and I do not think that SCC can compile the thing from source code without a lot of pampering. Besides, why on earth do you want to compile it from source?

Frankly, I do not understand your fixation on I/O benchmarks when there are so many other aspects of your programs that are more worthy of your attention.

A disk I/O benchmark program is justified in shoving random data to and fro and timing the movement. You cannot do the same, however, in any real application that does something useful. Real programs tend to consume and/or produce buckets of data. If you want to assess how fast your application would perform with almost infinite I/O speed, simply set the output file to NUL: (on Windows; /dev/null on Unix/Linux) and time a simulation run of your program. If the time of the run is not drastically less than it was with output to a real file, you will have proved that you are barking up the wrong tree.

You can also try this MS Technet command-line utility to time I/O to a specific file of your choice:

https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223

DanRRight · Posted: Fri Dec 16, 2016 8:16 am Post subject:

I more or less know how my app will behave with infinite I/O speed. It will go at least ~2-3x faster. I can stretch additional factor of 2 probably by switching off some extra not always needed calculations during the load. And that's the reason of my interest. My Fortran loading speed with even *unformatted* load is hell annoying because it's slow, around 300KB/s.

This is the program which visualizes existing data, and there is many TBs of data. When you try to find something in this forest, ideally visualization must go with instant speed because a lot of data you click on is just not what you need to find. As a result you even do not want to touch the data so sickening boring the loading process is. As soon as data is loaded the OpenGL visualization is almost instant thanks to its very good OpenGL implementation and fast hardware (thanks to realistic 3D games).

With few Fortran compilers we do not see speeds faster than mentioned above with any settings. C code like with this benchmark though somehow shows speeds 30x faster. Question remains - how C reach that speeds and why Fortran can't ?

mecej4 · Joined: 31 Oct 2006 Posts: 1899

That note clears up some questions. It also clarifies that by using I/O devices and software that are "30X faster", your effective overall speed gain may be about 2X. And, because the I/O is mostly input of massive amounts of data, you cannot use the NUL device to test the best achievable speed.

Other speed-ups such as those coming from avoiding or delaying calculations are not relevant at this point of the discussion. You can implement them or not, independently of the I/O problem and solution.

This kind of situation is standard in searching in a database. The usual solution is to compile an index to the data tables. These indices are much smaller than the main tables, and so one can search fast in the index and, when an exact or partial match is found, the corresponding portion of the main table is read into memory and processed further.

The indices do take time to build, but they need to be rebuilt/refreshed only when the new data is loaded or old data is deleted. Therefore, for the "create once, use many times" scenario, they are definitely worthwhile.

To define and create an effective index, you have to know your data intimately, and you must have a good idea of the access patterns of your users (including yourself). You have probably used an old dictionary that had thumb indices cut into the edge of the pages. So, if you want to look up 'Dan', you put your thumb on the 'D' notch and open the book. The same idea should be tried on your data.

Your reaction?

DanRRight · Posted: Sat Dec 17, 2016 4:34 am Post subject:

I still hope for getting 5x from just the software I/O speed bump alone. Because if C really can read GBs per second Fortran literally MUST do that even faster, this is what users expect from Fortran - to beat all others in speed in science and engineering area.

If this will fail - the only other way to me to speedup the navigation would be to make small thumbnail images of all parameters same like with the photography. I can not imagine how it is possible to make indexing for fast search other way.

mecej4 · Joined: 31 Oct 2006 Posts: 1899

JohnCampbell · Joined: 16 Feb 2006 Posts: 2619 Location: Sydney

Dan,

I would like to agree with mecej4.

In the benchmarking I did for you, I showed that even basic numerical conversion of text, with no file I/O processes about 100 million bytes per second. ( some gFortran are very poor and convert F and ES at 4 MB/sec; while FTN95 /64 and FTN95 "/32" do much better)
With a processor clock rate of 3 giga hertz, I don't see how you could achieve multiple giga bytes per second. ( The C code rate claims don't look realistic or if are real they can't be utilised by even basic processing of the info. Your quoting multiple GB is not feasible, as you can not process them at that speed.)
When quoting transmission rates, there is always the difference between MB (mega bytes) and Mb (mega bits) or Gb (giga bits), so there is always the uncertainty of what speed is really being quoted.

My impression was that you were struggling with 1 MB (megabyte per second) read and processing, which could be increased to 50 to 100 with stream I/O on HDD or 200-500 MB with SSD.
BUT, as you can only process the characters at about 100 MB, does it matter ?

Also, you have not identified the source of this data, How do you get it ?
If it is via the internet, the transmission rate for receiving the files is much slower than you can read them from disk.

In summary, you need to identify where the bottleneck is, and I doubt if it is with SSD or HDD transmission rates. It will probably be with processing or receiving the files.

It sounds to me that you need to have multiple PC's to process all the different files into summary or indexed forms.

John

DanRRight · Posted: Sun Dec 18, 2016 3:23 am Post subject:

Ok, mecej4 and agreeing with you John,

Please show me read and write speeds at least a half what CrystalDiskMark measures, or 5-6 GBytes per second in my case on ramdrives (yes bytes not bits per second like all C tests show) with any your methods using Fortran and then we will continue conversation about Fortran delivering almost the same speeds as C.

I even don't read and process characters, John, I use unformatted read. You are welcome to use it too for your tests to make your life easier. Processing speed after data was loaded should be absolutely different topic and is not discussed here.
PM me your address and I will send you 12 beers for the effort. Smile

mecej4 · Joined: 31 Oct 2006 Posts: 1899

Dan, I think that you are still tilting at windmills, as you can see with these tiny example programs. Both write a 64 MByte "binary" file. I ran the programs on a laptop with an i5-4200U CPU and a 128 MB ramdisk.

The Fortran code:

mecej4 · Joined: 31 Oct 2006 Posts: 1899

Dan, there is something else that I don't understand about your "problem statement". You said in #3 that your reading speed with unformatted Fortran files was 300 KB/s, and that you needed to process "terabytes of data". If so, you will have to run your computer nonstop 24/7 for over five weeks for one run. Really? Are you doing this now?

DanRRight · Posted: Mon Dec 19, 2016 4:51 am Post subject:

LOLOLOL Now i know why you did not want my present. Clearly like me on my North Pole you have no lack of booze in your place Smile

. Well, a holiday season, anyway.

This KB/s was of course a typo. I mentioned MB/s before many times but not this one.

Please keep going. Still 1.8GB/s on my PC, not 5-6 let alone 10-12 but the steps are encouraging.

mecej4 · Joined: 31 Oct 2006 Posts: 1899

DanRRight · Posted: Mon Dec 19, 2016 1:53 pm Post subject:

Something deeply wrong is here

1) Does the CrystalDiskMark test use parallelism to leave all the test results here in shameful misery?

2) How in the world it is possible to write 12GB of data to disk drive in one single second while it is not possible to load same 12 GB in second into RAM of the computer? Fun also is that RAMdrive is made out the same RAM and generally in practice I/O was never faster then RAM bandwidth.

Also as a note, ReadF@ and ReadFA@ may be fast to read big chunk or data but they are still very slow in reading line by line (10 numbers or ~160 characters per line)

PaulLaidler · Posted: Mon Dec 19, 2016 3:26 pm Post subject:

mecej4

Thanks for the feedback. I have made a note of your original post.