Silverfrost Forums

AVX512 and Linear Algebra

10 Jul 2021 5:41 #28074

I suspected that the fashion for adding LEDs could be harmful. With just two RAM modules, or smaller-capacity ones, cooling is less of a problem. But a 4-pack of 32GB modules, overvolted (by the XMP specification, not by my choice), clearly runs much hotter, and the heatsinks reduce the gap between them. Good thing I did not buy memory with LED plastic decoration on top of the heatsinks. I will try intentionally raising the temperature inside the case; if the crash rate increases, I will either reduce the memory frequency or voltage at the cost of CAS16 --> CAS18, or install one more fan. Damn, it is a mess when your work disappears due to crashes. It is like a return to the old DOS/Windows 95 days.

RAM overclock visibly increases the RAMdisk speed in tests but could actually be useless for me, as this is not a bottleneck in most cases.

/* There is an Intel 7980XE processor: 18 cores, AVX512. Prices have dropped to $400+ on eBay. Overclockable. Right now several people are buzzing about it on the net. Search YouTube for 'This MUST Be Fake'

https://www.youtube.com/watch?v=arcGebjgM_k&t=6s

I think that if dual-CPU motherboards for this chip exist, it could be an even better choice than AMD, which has no AVX512. It requires an older processor and motherboards with previous-generation PCIe, but in most cases 3.5GB/second for NVMe is also very good.

17 Jul 2021 8:02 #28088

Quoted from DanRRight On single core tests AMD is often even slower than my old laptops

I have only found this to happen for a .exe tuned for the Intel processor. All my tests tuned for Ryzen Zen3 are faster for both single-core and multi-thread, although optimising for the available cache is my latest challenge.

My understanding is that Zen3 does not support AVX512, and that AVX2 performance requires the data to be in cache. As the memory bandwidth is about 50 GB/sec, PCIe does not look to be a solution for AVX efficiency.

Unfortunately I don't have the budget for, or access to, an i9-7980XE or i9-10920X to find out what performance they would provide.
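As a rough illustration of why the data has to sit in cache, the sketch below estimates the ceiling that main-memory bandwidth places on a simple kernel. It assumes a double-precision dot product (an example chosen here, not a kernel named in the thread) and uses the 50 GB/sec figure quoted above; the function name is hypothetical.

```python
def bandwidth_bound_gflops(bandwidth_gb_s: float, bytes_per_flop: float) -> float:
    """Upper bound on GFLOP/s when every operand must be streamed from main memory."""
    return bandwidth_gb_s / bytes_per_flop


# Assumption: double-precision dot product. Each multiply-add (2 flops)
# reads two 8-byte values, so 16 bytes are moved per 2 flops.
DOT_BYTES_PER_FLOP = (2 * 8) / 2

memory_bound = bandwidth_bound_gflops(50.0, DOT_BYTES_PER_FLOP)
print(f"memory-bound dot product: {memory_bound:.2f} GFLOP/s")
```

Under these assumptions the bound is the same whatever the vector width, which is why wider AVX instructions only pay off once the working set fits in cache.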

4 Aug 2021 12:01 #28140

My AMD crap is still suddenly crashing once every day or two while almost idle. I followed some advice from other people with the same problem, but no cure so far. I am losing a lot of time due to crashes.

The next cheapest step is to try changing the motherboard. Someone did so and the crashing stopped. What mobo do you use?

4 Aug 2021 9:13 #28141

Dan,

I have used AMD exclusively for many years in my home-built computers. I find everything adequately fast these days without resorting to overclocking or even buying the latest and allegedly fastest components.

Currently, I am sitting at a computer using an ASRock B450 Pro4 board with a Ryzen 2600 and 16GB of Corsair Vengeance (!) 2400 DDR4 RAM. My main spare is slower: an Athlon 3000G on an ASRock A320M-DVS R4.0 with only 8GB RAM (same spec). Out of interest, the notionally slower computer boots much faster and has better airflow as there isn't a big graphics card in it, and the notionally faster computer won't 'fast boot' (a BIOS setting). The Athlon 3000G is only 35W. ASRock is cheap and cheerful, and one of the video out connections on the A320M doesn't work.

My previous generation computer wouldn't update Windows because of a cheap Chinese USB 3.0 expansion card for which there were no suitable drivers, otherwise I would still use it.

I have built faster computers for one of my sons who is what you might call a power user as he is in the computer games business. I would normally go for a big name brand, Gigabyte, Asus or MSI with a really hot system. It is possible to buy internal fans for RAM, and the heatsink and fan on the CPU must be properly installed and functional. Different cases have very varied arrangements for external fans, and I tend to use every available space to be better safe than sorry.

My suggestion to you is to update to the latest BIOS, Windows updates and drivers. If the BIOS allows, check the temperatures of the chipset and CPU. Buy an aftermarket heatsink & fan or use water cooling. Add heatsinks and a fan to your RAM. Use as many fans as your case permits, balancing inflow and outflow and remembering that the graphics card and PSU are outflow. Check that the heatsinks are not clogged with dust or hair (cat hair is awful). Also check the PSU. I have a 1000W PSU that is unreliable, but cheaper PSUs that work faultlessly.

If you do go for a different mainboard, consider building it in a different case (i.e. one that allows more fans) and test it with a cheaper G series Ryzen to be sure that it all works before dismantling your existing system to re-use any parts. Then transfer them one at a time, otherwise finding what may have been a faulty component is made much more difficult.

Depending on how much time you want to spend on it, consider replacing the CPU in your system with something slower (and cheaper, therefore discardable), or taking out 2 sticks of RAM to allow better airflow. Then test it. For what it's worth, my main system is in a Thermaltake Versa H21 case which has loads of fitments for fans, is dirt cheap (£30 in the UK) and has nice fittings for drives. If I were you then I would also consider that the PSU could be causing the problem.

Eddie

4 Aug 2021 1:23 #28143

Eddie, thanks for the suggestions. The problem is that the PC does not crash immediately after the changes you apply, but sometimes runs fine for a week. So you think you have finally resolved the issue, but actually nothing has changed. Only with the experience of many people on the internet is it possible to track down such rare crashes. There are a lot of people complaining about instability with the latest AMD chips. Some motherboards are also rated by users as highly unstable.

One guy updated the BIOS and bricked his motherboard, claiming that ASUS is no longer what it used to be and that its customer service is now nonexistent. I do not like to do that with the BIOS because I will lose too much time if the same happens to me. So I will instead buy another mobo and see if stability is related to the motherboard. Many now complain about the quality of motherboards. ASUS boards crashed less, according to ratings.

Another person filed an RMA with AMD and had the processor replaced, but doing that would also cost me time, and I cannot easily buy another 5950X now; even the 5900X is always out of stock. And frankly I do not want to deal with AMD anymore; it gave me much less than I expected. New Intel chips will soon surpass AMD again by reaching the same instructions-per-clock as AMD. And they will definitely have no problems with parallel libraries, not to mention that they will be overclockable to 5.3 GHz. AMD is not overclockable at all beyond its current peak. PCIe version 4 is already here with Intel. Intel sucked for the entire last decade, but now it is time for it to wake up or die.

Memory does not seem to be the reason for the crashes: I tried overvolting and undervolting, manual and auto settings, heating and cooling, and other settings that someone claimed had helped him. No difference.

Cooling is probably also not the reason: lots of fans, water cooling, a huge PC case, keeping it open or closed... Crashes always happen at light usage. Stress tests kept all components very hot, but the PC was stable. It is difficult to halve the memory; I need a pretty large amount of memory to run comfortably. It seems, though, that when a crash happens the PC is using almost 100GB of RAM. The crash error report says something about cache memory coherency being violated and claims it is a hardware problem. That same error was enough for someone to file an RMA.

Some blame the power supply too. Mine was highly rated in Amazon reviews: 1000W, high efficiency. But the brand is not well known. I will try changing it too, but so far I have never had any problems with power supplies.
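The difficulty with rare crashes can be put in numbers. The sketch below is only an approximation: it assumes crashes arrive independently at a constant average rate (a Poisson process), which may not hold if they are tied to particular workloads. The function names and the 5% threshold are illustrative choices, not anything from the thread.

```python
import math


def prob_no_crash(days_observed: float, mean_days_between_crashes: float) -> float:
    """Chance of seeing zero crashes in the window if nothing was actually fixed."""
    return math.exp(-days_observed / mean_days_between_crashes)


def days_needed(mean_days_between_crashes: float, max_false_hope: float = 0.05) -> float:
    """Crash-free days required before an unfixed system would look this good
    with probability at most max_false_hope."""
    return -mean_days_between_crashes * math.log(max_false_hope)


# Assumed rates: one crash every 1.5 or 2 days on average.
for mean_gap in (1.5, 2.0):
    print(f"mean gap {mean_gap} days: wait about {days_needed(mean_gap):.1f} crash-free days")
```

If the real crash pattern is burstier than this model assumes, a longer crash-free run is needed before trusting any change.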

5 Aug 2021 5:48 #28144

Dan,

My AMD configuration, obtained in Dec-2020, includes:
AMD Ryzen 9 5900X CPU
Asus ROG Strix B550-E Gaming Motherboard
Noctua D15 Super Cooler
64G Kingston 3600 MHz Kit
Samsung 500GB 970 EVO

When I received this it was crashing when I initiated multi-thread computation. It also crashed when installing Windows O/S upgrade which was a problem.

When it was crashing, I realised how little I knew about how to fix it!

We changed to a 64G Kingston 3200 MHz kit after 3 days and it became stable. XMP is enabled (you could turn it off and see what happens). I have updated the BIOS twice (Dec-20 and Mar-21). No crashes since the memory change.

I don't know how to identify which hardware component is failing, but there are many options.

Do you use HWMonitor? Anything could have a poor connection or an overheating problem (poor thermal paste on the cooler?).

5 Aug 2021 1:15 #28145

Quoted from DanRRight: It seems, though, that when a crash happens the PC is using almost 100GB of RAM

How much virtual memory / paging file have you allocated? I allocated more than physical memory on Win 7, although I only have 32 GB on the AMD PC. You could try 128 GB on the C: drive and see if that has any effect.

6 Aug 2021 1:45 #28148

Thanks John, for the info and suggestions. I will install this hardware health monitor. I only monitor NVMe health and temperature, and use CPU-Z to see electrical parameters, but it does not show temperatures. A stress test shows them, but you cannot run it permanently. This time I did not set the swap file size manually; it should be set automatically to the same size as the RAM, 128GB. Task Manager shows that I never go beyond 100 GB of RAM usage lately, so swapping is not needed. The thermal paste I used was, I think, the best recommended on the net.

And I may try the BIOS upgrade the next time I get furious that all my work has disappeared, kicking me back by a day. Unfortunately, the times when everyone takes vacation are typically the busiest times here, and I cannot afford to have my computers completely bricked by something like a failed BIOS upgrade. Even changing the motherboard will be a headache, as I will lose half a day at least.

/* One of the reasons for being busy, by the way, is exactly the vacation of most researchers, students and professors, and the last two also have other things to worry about: the new school year. As a result all supercomputer queues are empty, and it is a shame not to use this little-known trick. The next such opportunity will only be in the New Year holidays. I have used it for many decades 😃

6 Aug 2021 12:34 #28151

There are 29 screenshots on their website explaining how to update the BIOS on an ASUS mobo. I do not know what world these crazy manufacturers live in, when everyone else tends to update everything in one click or even without one, purely automatically in the background. And they even call this process 'EZ Update'. Totally out of their minds.
