 |
forums.silverfrost.com Welcome to the Silverfrost forums
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Thu Jan 14, 2021 7:19 am Post subject: |
|
|
Dan,
I bought a Ryzen 5900X in December 2020 and it is much faster than my previous i7-8700K. It is between 50% and 100% faster for my FE analysis, depending on the type of calculation.
I first got a 5900X + 64GB of 3600MHz memory, but it kept crashing on multi-thread calcs. I changed to 3200MHz memory and it no longer crashes. Presumably the quality of the silicon in the 3600MHz memory was the problem. I am not sure of the silicon quality of the 5900X !!
For my large array calcs with many threads, having only 2 memory channels is a significant bottleneck. I don't get much better performance above threads = cores (which is similar with the 8700K).
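A back-of-envelope sketch of why two memory channels cap a streaming calculation no matter how many threads are used. It assumes DDR4-3200 (3200 mega-transfers per second), 64-bit (8-byte) channels, and a dot-product-like loop that reads two 8-byte reals from memory per multiply. These are theoretical peaks chosen for illustration, not measurements from the machine above, and the function names are hypothetical.

```python
# Theoretical memory-bandwidth ceiling for a streaming kernel.
# Assumptions (not measured): DDR4-3200, 8-byte-wide channels,
# 16 bytes read from memory per multiply (two 8-byte reals).

def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second)."""
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

def memory_bound_gflops(bandwidth_gb_s, bytes_per_multiply):
    """Multiplies per second (in 10^9) if every operand comes from DRAM."""
    return bandwidth_gb_s / bytes_per_multiply

bw = peak_bandwidth_gb_s(channels=2, mega_transfers_per_s=3200)
dot_ceiling = memory_bound_gflops(bw, bytes_per_multiply=16)

print(f"peak bandwidth : {bw:.1f} GB/s")
print(f"streaming limit: {dot_ceiling:.1f} Gflop/s, shared by all threads")
```

Under these assumptions the ceiling is set by the memory system, so adding threads only helps while the data being reused stays in cache.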
I tried to find an easy problem to define and apply OpenMP! I have been testing large matrix multiply using my own code:
C[15000,12000] = A[15000,11000] x B[11000,12000] (see equation.com),
where partitioning is essential to reduce the memory<>cache bottleneck. (Vectors must be in cache for AVX to work efficiently and there are 3 levels of cache!) My main measure of performance is the number of floating point multiplies per second, as Gflop/s (10^9 flop/sec). My partitioning approaches produce 50 Gflop/s on the i7-8700 and 100 Gflop/s on the 5900X. These are significantly slower than the claimed MKL DGEMM performance (250+ Gflop/s for similar i5 processors), which I cannot approach, even allowing for MKL benchmarks counting additions. (Equation.com report 22 Gflop/s for Opteron and 9.7 Gflop/s for Xeon, which is slow.)
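A minimal sketch of what partitioning a matrix multiply looks like, plus the multiplies-per-second measure described above. This is not the poster's code: it is plain single-threaded Python on nested lists (row-major, whereas Fortran arrays are column-major, so a Fortran version would order the loops differently), and `matmul_blocked`, `multiply_gflops` and the block size are hypothetical names and values chosen for illustration.

```python
# Blocked (tiled) matrix multiply: C = A x B, processed one tile at a time
# so that the pieces of A, B and C in use can stay in cache.

def matmul_blocked(a, b, block=2):
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):            # tile of rows of A and C
        for k0 in range(0, k, block):        # tile of the shared dimension
            for j0 in range(0, m, block):    # tile of columns of B and C
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for kk in range(k0, min(k0 + block, k)):
                        aik = a[i][kk]
                        row_b = b[kk]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += aik * row_b[j]
    return c

def multiply_gflops(n, k, m, seconds):
    """Multiplies per second in units of 10^9, counting multiplies only."""
    return n * k * m / seconds / 1e9

# Number of multiplies in C[15000,12000] = A[15000,11000] x B[11000,12000]
multiplies = 15000 * 11000 * 12000
print(f"{multiplies:.3e} multiplies")
```

For the sizes quoted, 1.98 x 10^12 multiplies at 100 Gflop/s corresponds to about 19.8 seconds, and at 50 Gflop/s to about 39.6 seconds.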
Interestingly, the Ryzen shows significant variability in Gflop/s vs threads for my coding approaches, especially as threads exceed cores. The i7-8700 similarly stalls as threads exceed cores. This is an area I need to investigate further. My next processor will have more memory channels.
OpenMP with large arrays is not an easy coding problem. (large is array size >> cache size)
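To put "array size >> cache size" into numbers, here is a rough sketch for the matrix sizes quoted above. It assumes 8-byte reals, and the cache sizes (32 KiB L1 data, 512 KiB L2, 64 MiB total L3) are assumed values for a 5900X-class processor, not figures from the post. The tile estimate simply asks how large a square tile can be if one tile each of A, B and C must fit in a given cache; all names are hypothetical.

```python
import math

BYTES_PER_REAL = 8  # assumption: double precision

def array_bytes(rows, cols):
    return rows * cols * BYTES_PER_REAL

sizes = {
    "A": array_bytes(15000, 11000),
    "B": array_bytes(11000, 12000),
    "C": array_bytes(15000, 12000),
}
total_bytes = sum(sizes.values())

# Assumed cache sizes in bytes (illustrative, not from the post).
caches = {"L1": 32 * 1024, "L2": 512 * 1024, "L3": 64 * 1024**2}

def max_square_tile(cache_bytes):
    """Largest t such that three t x t tiles of 8-byte reals fit in cache."""
    return math.isqrt(cache_bytes // (3 * BYTES_PER_REAL))

ratio_to_l3 = total_bytes / caches["L3"]
tiles = {name: max_square_tile(size) for name, size in caches.items()}

for name, nbytes in sizes.items():
    print(f"{name}: {nbytes / 1e9:.3f} GB")
print(f"total: {total_bytes / 1e9:.3f} GB, about {ratio_to_l3:.0f}x the assumed L3")
print("largest square tile per cache level:", tiles)
```

Under these assumptions the three matrices together are well over fifty times the L3 cache, which is why the multiply has to be organised around tiles rather than whole rows or columns.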
I will try to post some results when I can better describe the problem.
You can't just buy a different processor and use it. There is lots of tuning to do. |
 |
DanRRight
Joined: 10 Mar 2008 Posts: 2927 Location: South Pole, Antarctica
|
Posted: Sun Jan 17, 2021 6:53 am Post subject: |
|
|
John,
So in summary you have twice as many cores in the Ryzen and a 50 to 100% increase vs Intel? Does this mean that the Ryzen single core performance is around the same as with Intel ?
Unfortunately I do not have anyone nearby with PCs that have more memory channels. I have access to a 10000 core Linux supercomputer which uses older Intel 12 core Xeon processors, which would not be so interesting to test, and the code we use is written in C. Fortran version 19 with AVX should be there too, but there is no one to ask how to use it; the good sysadmin left the team.
The only person I know, from contacting him a few years ago, who has broad access to all the processors that exist in the world and who is also interested in testing them is Ian Cutress from Anandtech. He is a UK guy by the way, a former scientist, a nice and easygoing person, at least he was before he started interviewing all the top CEOs in the IT industry. Try to convince him to run the test on 4, 6 and 8 memory channel computers. His own 3D particle movement code got huge benefits from AVX512. Plus he knew a former Intel engineer who tuned his code with AVX to get a 3-4x, even 5x, increase in performance vs no AVX. If he finds that some processors significantly favor cache size, memory channels or AVX for such an important task as linear algebra, I am sure there will be a huge buzz in the industry. For the last few years he has touted his AVX speed increase on Intel processors vs AMD, which do not have AVX512, and Intel clearly liked this. When we implemented AVX512 in our codes, though, the increase in performance was just 20% or less. |
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Tue Jan 19, 2021 5:29 am Post subject: Re: |
|
|
DanRRight wrote: | Does this mean that the Ryzen single core performance is around the same as with Intel ? |
I think that is too general a question. Ryzen is probably better, but I am comparing to Intel 8th gen.
I am finding the Ryzen 5900X to be significantly faster than the i7-8700K for the test cases I am considering. However, there is considerable variability in the Ryzen performance.
My test cases involve large arrays/vectors: 100 MB to 3.5 GB. They appear to be too big to show a benefit from 2x cache size (which I was hoping would be a plus).
At present (still in the learning phase), the variability in Ryzen performance appears to be due to a combination of variability in boost frequency and higher temperature with many threads. (High-Gflop matrix multiply is a compute-intensive calculation.) I selected a Noctua D15 air cooler; a higher capacity water cooler might mitigate this. (I did not expect this to be as significant a problem with 7nm silicon.)
My other test case, an actual FEA calculation, does show at least 50% improvement vs the 8700, which is a plus for Ryzen. |
 |
DanRRight
Joined: 10 Mar 2008 Posts: 2927 Location: South Pole, Antarctica
|
Posted: Tue Jan 19, 2021 8:20 am Post subject: |
|
|
Noctua is a good air cooler, one of the best, but I still recommend using a water cooler from a good, reliable company. |
 |
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2403 Location: Yateley, Hants, UK
|
Posted: Thu Jan 21, 2021 11:32 pm Post subject: |
|
|
JC,
Is that a self-build, or a commercial pre-built system? If you built it, what case and fans did you use? A system built into a tower case shouldn't have thermal throttling.
Eddie |
 |
DanRRight
Joined: 10 Mar 2008 Posts: 2927 Location: South Pole, Antarctica
|
Posted: Fri Jan 22, 2021 12:01 am Post subject: |
|
|
Also, I suggest finding some PC sellers on the internet with Threadrippers and asking them to run your benchmark. For example, the 3960X has 2x more cores, 2x more cache and 2x more memory channels. The processor also costs 2x as much as the 5900X and consumes 2.5x more at peak, but the whole PC will probably cost just 30-50% more if you build it yourself. There are a lot of testers on the internet who might be interested. Threadripper Pro is also coming in a few months (I do not see that the first samples of it are much faster, though).
On the net I find some insanely expensive prebuilt workstations for $8k with Threadrippers; I would build them myself for 2-3x less. |
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Sun Mar 07, 2021 6:11 am Post subject: Re: |
|
|
John-Silver wrote: | We ain't got any better almost a quarter of a century later ! |
John-S,
I think you know I don't agree !
I am presently analysing vibration transmission from an underground train tunnel into a high-rise above, using linear elastic direct integration to estimate the transfer mobility. 150,000 nodes and 5,000 time steps for 10 different frequencies would not have been practical in the 80's, but I am getting useful results now.
Rather than NASTRAN, I use my own software, and it has been interesting to undertake the computational effort required. |
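For readers unfamiliar with the term, direct integration means stepping the equations of motion forward in time rather than solving via modes. The sketch below is only an illustration on a single undamped degree of freedom using the explicit central difference scheme. The post does not say which scheme the poster's software uses, and a real model would have many degrees of freedom with mass, damping and stiffness matrices; `central_difference` and its parameters are hypothetical.

```python
import math

def central_difference(omega, dt, steps, u0=1.0, v0=0.0):
    """Integrate u'' + omega^2 u = 0 with the central difference scheme.

    Returns the displacement history [u_0, u_1, ..., u_steps].
    The scheme is stable only while omega * dt < 2.
    """
    a0 = -omega**2 * u0
    # Fictitious displacement one step before t = 0 to start the scheme.
    u_prev = u0 - dt * v0 + 0.5 * dt * dt * a0
    u = u0
    history = [u0]
    for _ in range(steps):
        u_next = 2.0 * u - u_prev - dt * dt * omega**2 * u
        u_prev, u = u, u_next
        history.append(u)
    return history

# One full period of a 1 Hz oscillator sampled with 1000 steps.
omega = 2.0 * math.pi
history = central_difference(omega, dt=0.001, steps=1000)
print(f"displacement after one period: {history[-1]:.6f}")
```

With a time step this small relative to the period, the displacement returns very close to its starting value after one cycle.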