forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

AVX512 and Linear Algebra
JohnCampbell



Joined: 16 Feb 2006
Posts: 2341
Location: Sydney

Posted: Fri Jul 09, 2021 2:12 pm

Dan,

I have not tried MKL library.

I bought a 5900X with 3600 MHz memory in Dec-20; XMP enabled but no overclocking.
Initially it kept crashing, so I replaced the memory with 3200 MHz and it has been stable ever since. I also had USB keyboard problems.
I have updated the BIOS twice and it now runs well.
Much faster than Intel for my FE work.

I am doing large (1 to 3 GB memory) matrix calculations. Performance saturates at about 10 threads, but it is still about 80% faster than the 8700K, which has a similar problem.

This could be a memory bandwidth limit, but I have tried adapting the algorithms to use L3 cache sized chunks. I use matmul tests as a way to identify ways of improving thread efficiency.

I also use a skyline linear equation solver (reducer) and multiple solution vectors with differing efficiency, but all show the memory bottleneck effect at higher thread counts.

24 threads with dual channel memory is not efficient for my large memory array calcs, but can't afford to investigate the alternatives.
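
A minimal sketch of the "cache sized chunks" idea mentioned in this post, written in Python for brevity rather than Fortran. It is not John's code: the function names and the `block` parameter are hypothetical, and a real tile size would have to be tuned to the cache of the target CPU.

```python
def matmul_naive(a, b):
    """Reference triple loop: C = A * B for matrices stored as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c


def matmul_blocked(a, b, block=64):
    """Same product, computed tile by tile.

    `block` is a hypothetical tuning parameter: pick it so that three
    block x block tiles of 8-byte reals fit in the cache level you target.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                # Work on one tile of A, B and C before moving on,
                # so the data is reused while it is still in cache.
                for i in range(i0, min(i0 + block, n)):
                    ci = c[i]
                    for p in range(p0, min(p0 + block, k)):
                        aip = a[i][p]
                        bp = b[p]
                        for j in range(j0, min(j0 + block, m)):
                            ci[j] += aip * bp[j]
    return c
```

Both versions add the terms of each C(i,j) in the same order, so the blocked result can be compared directly against the naive one when testing different tile sizes.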
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2244
Location: Yateley, Hants, UK

Posted: Fri Jul 09, 2021 4:49 pm

John,

The problem with a USB keyboard is probably not CPU related, but a mainboard issue. I expect the memory issue is similar. Such issues sometimes relate to BIOS settings, but not always, and are sometimes fixable with a BIOS update. Sometimes a RAM fault occurs because the sticks are not seated well enough.

As a 'for instance', my main machine has a Ryzen 2600 in an Asrock B450 board. This machine won't do a wakeup from the keyboard, but a much cheaper board (also Asrock, in this case an A320M) in my backup machine will, both with that CPU in it and with a cheaper, slower CPU (an Athlon 3000). It's the mainboard and its firmware that is at fault.

Dan's issues are probably also board and chipset related rather than cpu.

Incidentally, the A320M boots faster than the B450 even though the latter has M.2 and the former only a SATA SSD.
mecej4



Joined: 31 Oct 2006
Posts: 1562

Posted: Fri Jul 09, 2021 5:13 pm

One more item to check is the CMOS battery (usually a CR2032).

This February, during a cold wave we lost power and heating for almost 24 hours. After power came back, my desktop PC would not boot Windows properly. The BIOS settings were being reset to default values. Even though the PC was only four years old, the battery voltage had dropped to 0.4 V (normal: 3 V).
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Fri Jul 09, 2021 9:21 pm

Mecej4,
All sh$t is absolutely new. The ASUS mobo has no crash complaints in the reviews. RAM is XMP memory: a dual 3600 MHz CAS 16 kit and a dual CAS 18 kit.

Spacious PC case, 4 fans + 3 fans for water cooling. The NVMe drive has its own heatsink on top with a fan, which cools it to ~44C.

It crashes unexpectedly with no activity, apart from maybe 80-100 GB of memory filled by different tasks, mostly idle, like 3-4 browsers and other stuff.

Ran a stress test, no problems.

Event Viewer shows no info besides stating at login that the computer recovered from an unexpected event.

We have hot days lately, 32C at home (which I like because I hate cold weather), probably 40C inside the PC box. Memory is very hot though, because of the overvoltage to 1.45V from the usual 1.35V to achieve CAS 16. All 4 memory DIMMs are packed too close to each other, but they have heatsinks. Fans blow air very strongly to cool them and the motherboard.

Still, I suspect the memory right now. I may need to update the BIOS. People complain about sudden crashes with the latest two generations of AMD Ryzen. The motherboards could also be adding uncertainty. I understand this is not just an AMD problem, but everything together works less reliably because AMD is less used in the world, so there are fewer complaints, which is a recipe for the proverbial devilry to easily squeeze in somewhere. Same problem as with FTN95: you and a couple of others do, but otherwise few send their bug reports and suggestions for improvement :)

It may be a software problem too. My plan: remove the RAMdrive next, then half of the memory, then update the BIOS.

Will see how it behaves in the next few days, when it will be 44C outside.

John,
One of the tests I just made, with the computer lightly running other things, shows a 3.5x speedup with LAIPE on the 5950X (16 cores) vs a 4770K (4 cores) overclocked to 4.4 GHz.

On single core tests AMD is often even slower than my old laptops

By the way, I also ran the single-core test by DaveGemini; he was a funny Fortran fan who years ago produced noise and hot air on Fortran forums. The test was compiled with the Intel compiler for Intel processors 15-20 years back. It ran some subtests but refused to run the LAPACK subtest. Same problem as with MKL, I suspect.
mecej4



Joined: 31 Oct 2006
Posts: 1562

Posted: Sat Jul 10, 2021 12:01 am

Are you aware that adding "heat sinks" with plastic components can reduce heat flow and, conversely, adding "insulation" can increase heat flow? See https://www.nuclear-power.net/nuclear-engineering/heat-transfer/thermal-conduction/critical-thickness-of-insulation-critical-radius/ .

Check if the heat sinks on your memory modules have reduced the space between the modules so much as to reduce air flow.
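
The "critical thickness" effect from the linked article can be sketched numerically. This is only the textbook cylindrical case (a wire or pipe with a covering layer), not a model of flat memory modules, and the material values in the example are illustrative assumptions rather than measurements. The function names are hypothetical.

```python
import math


def critical_radius(k, h):
    """Outer radius (m) at which heat loss from a covered cylinder peaks.

    k: thermal conductivity of the covering layer, W/(m K)
    h: convective heat transfer coefficient at the outer surface, W/(m^2 K)
    """
    return k / h


def heat_loss_per_length(r_inner, r_outer, k, h, delta_t):
    """Steady heat flow (W per metre of length) through the layer plus convection."""
    r_cond = math.log(r_outer / r_inner) / (2 * math.pi * k)  # conduction resistance
    r_conv = 1.0 / (2 * math.pi * r_outer * h)                # convection resistance
    return delta_t / (r_cond + r_conv)


# Illustrative assumptions: a plastic-like layer in still air.
K_LAYER = 0.2    # W/(m K), assumed
H_AIR = 10.0     # W/(m^2 K), assumed
R_INNER = 0.005  # m, assumed bare radius
DELTA_T = 40.0   # K, assumed temperature difference

r_c = critical_radius(K_LAYER, H_AIR)
for r_outer in (R_INNER, 0.01, r_c, 0.08):
    q = heat_loss_per_length(R_INNER, r_outer, K_LAYER, H_AIR, DELTA_T)
    print(f"outer radius {r_outer:.3f} m -> {q:.1f} W/m")
```

Below the critical radius, a thicker layer increases heat loss because the extra surface area outweighs the added conduction resistance; above it, the layer behaves as insulation.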


Last edited by mecej4 on Sat Jul 10, 2021 8:57 am; edited 1 time in total
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Sat Jul 10, 2021 6:41 am

I suspected that the fashion for adding LEDs could be harmful. When there are just two RAM modules, or they are of smaller capacity, cooling is less of a problem. But clearly a 4-pack of 32GB overvolted (by XMP specification, it's not my choice) RAM modules is way hotter, and due to the heatsinks the distance between them is smaller. Good that I did not buy memory with LED plastic beautification on top of the heatsinks. I will try to intentionally raise the temperature inside the case, and if the crash rate increases I will reduce the frequency or memory voltage at the cost of CAS16 --> CAS18, or install one more fan. Damn, it is a mess when your work disappears due to crashes. It is like a return to old DOS/Windows95.

RAM overclock visibly increases the RAMdisk speed in tests, but could actually be useless for me as this is not a bottleneck in most cases.

/* There exists an Intel 7980XE processor. 18 cores. AVX512. Prices dropped to $400+ on eBay. Overclockable. Right now several people are buzzing about it on the net. Search on Youtube for "This MUST Be Fake":
https://www.youtube.com/watch?v=arcGebjgM_k&t=6s

I think that if dual-chip motherboards for this chip exist, it could be an even better choice than AMD, which has no AVX512. This requires an older processor and motherboards with previous generation PCIe, but in most cases 3.5 GB/second for NVMe is also super good.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2341
Location: Sydney

Posted: Sat Jul 17, 2021 9:02 am

DanRRight wrote:
On single core tests AMD is often even slower than my old laptops

I have only found this to happen for a .exe tuned for the Intel processor. All my tests tuned for Ryzen Zen3 are faster for both single-core and multi-thread, although optimising for the available cache is my latest challenge.

My understanding is that Zen3 does not support AVX512, and that AVX2 performance requires the data to be in cache.
As the memory bandwidth is about 50 GB/s, PCIe does not look to be a solution for AVX efficiency.

Unfortunately I don't have the budget for, or access to, an i9-7980XE or i9-10920X to find out what performance they would provide.
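
A back-of-envelope sketch of why a memory bandwidth of about 50 GB/s can cap multi-thread performance when the data does not fit in cache. The function names are hypothetical, and the per-thread demand figure is an illustrative assumption, not a measurement from this thread.

```python
import math


def bandwidth_bound_gflops(bandwidth_gb_per_s, bytes_per_flop):
    """Upper limit on sustained GFLOP/s when every operand comes from RAM."""
    return bandwidth_gb_per_s / bytes_per_flop


def threads_to_saturate(bandwidth_gb_per_s, per_thread_gb_per_s):
    """Number of threads at which the combined demand reaches the bandwidth."""
    return math.ceil(bandwidth_gb_per_s / per_thread_gb_per_s)


# Dot product of two double-precision vectors: each multiply-add pair
# (2 flops) reads two 8-byte values (16 bytes), i.e. 8 bytes per flop.
DOT_BYTES_PER_FLOP = 16 / 2
limit = bandwidth_bound_gflops(50.0, DOT_BYTES_PER_FLOP)
print(f"memory-bound dot product limit: {limit:.2f} GFLOP/s")

# Hypothetical: if one thread can stream 5 GB/s on its own, adding threads
# stops helping once their combined demand reaches the 50 GB/s total.
PER_THREAD_GB_S = 5.0  # assumed, not measured
print("threads to saturate:", threads_to_saturate(50.0, PER_THREAD_GB_S))
```

The limit is independent of the thread count, which is why reusing data while it sits in cache matters more than adding cores for this kind of calculation.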
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Wed Aug 04, 2021 1:01 am

My AMD crap is still suddenly crashing once every day or two while almost idle. I followed some advice from other people with the same problem, but no cure so far. I am losing a lot of time due to crashes.

The next cheapest step is to try changing the motherboard. Someone did so and the crashing stopped. What mobo do you use?
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2244
Location: Yateley, Hants, UK

Posted: Wed Aug 04, 2021 10:13 am

Dan,

I have used AMD exclusively for many years in my home-built computers. I find everything adequately fast these days without resorting to overclocking or even buying the latest and allegedly fastest components.

Currently, I am sitting at a computer using an Asrock B450Pro4 board with a Ryzen 2600 and 16Gb of Corsair Vengeance (!) 2400 DDR4 RAM. My main spare is slower: an Athlon 3000G on an Asrock A320M-DVS R4.0 with only 8Gb RAM (same spec). Out of interest, the notionally slower computer boots much faster and has better airflow as there isn't a big graphics card in it, and the notionally faster computer won't 'fast boot' (a BIOS setting). The Athlon 3000G is only 35W. Asrock is cheap and cheerful, and one of the video out connections on the A320M doesn't work.

My previous generation computer wouldn't update Windows because of a cheap Chinese USB 3.0 expansion card for which there were no suitable drivers, otherwise I would still use it.

I have built faster computers for one of my sons who is what you might call a power user as he is in the computer games business. I would normally go for a big name brand, Gigabyte, Asus or MSI with a really hot system. It is possible to buy internal fans for RAM, and the heatsink and fan on the CPU must be properly installed and functional. Different cases have very varied arrangements for external fans, and I tend to use every available space to be better safe than sorry.

My suggestion to you is to update to the latest BIOS, Windows Update and drivers. If the BIOS allows, check the temperatures of the chipset and CPU. Buy an aftermarket heatsink & fan or use water cooling. Add heatsinks and a fan to your RAM. Use as many fans as your case permits, balancing inflow and outflow, remembering that the graphics card and PSU are outflow. Check that the heatsinks are not clogged with dust or hair (cat hair is awful). Also check the PSU. I have a 1000W PSU that is unreliable, but cheaper PSUs that work faultlessly.

If you do go for a different mainboard, consider building it in a different case (i.e. one that allows more fans) and test it with a cheaper G series Ryzen to be sure that it all works before dismantling your existing system to re-use any parts. Then transfer them one at a time, otherwise finding what may have been a faulty component is made much more difficult.

Depending on how much time you want to spend on it, consider replacing the CPU in your system with something slower (and cheaper, therefore discardable), or taking out 2 sticks of RAM to allow better airflow. Then test it. For what it's worth, my main system is in a Thermaltake Versa H21 case which has loads of fitments for fans, is dirt cheap (£30 in the UK) and has nice fittings for drives. If I were you then I would also consider that the PSU could be causing the problem.

Eddie
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Wed Aug 04, 2021 2:23 pm

Eddie, thanks for the suggestions. The problem is that the PC does not crash immediately after the changes you apply, but may sometimes stay OK for a week. So you think you have finally resolved the issue, but actually nothing has changed. Only with the experience of many people on the internet is it possible to track down this kind of rare crash. There are a lot of people complaining about instability with the latest AMD chips. Some motherboards are also rated by users as highly unstable.

One guy installed a BIOS update and bricked his motherboard, claiming that ASUS is no longer what it was before and that its customer service is now nonexistent. I do not like to do that with the BIOS because I will lose too much time if this also happens to me. So I will instead buy another mobo and see if stability is related to the motherboard. Many now complain about the quality of motherboards. ASUS boards were crashing less according to the ratings.

Another person filed an RMA with AMD and had the processor replaced, but doing that would also cost me time, and I could not easily buy another 5950X now; even the 5900X is always out of stock. And frankly I do not like to deal with AMD anymore; it gave me much less than I expected. New Intel chips will soon surpass AMD again by reaching the same instructions-per-clock as AMD. And they will definitely have no problems with parallel libraries, not to mention that they will be overclockable to 5.3 GHz. AMD is not overclockable at all beyond its current peak. PCIe ver4 is already here with Intel. Intel sucked for the entire last decade, but now it's time for it to wake up or die.

Memory seems not to be the reason for the crashes: I tried overvolting and undervolting, manual and auto settings, heating and cooling, and other settings which someone claimed had helped him. No difference.

Cooling is probably also not the reason: lots of fans, water cooling, huge PC case, kept open or kept closed... Crashes always happen at light usage. Stress tests kept all components overheated by a lot, but the PC was stable. It is difficult to halve the memory; I need a pretty large amount of memory to run comfortably. It seems, though, that when the crash happens the PC uses almost 100GB of RAM. The crash error report says something about cache memory coherency being violated, and claims it's a hardware problem. That same reason was enough for someone to file the RMA.

Some blame the power supply too. Mine was highly rated in Amazon reviews: 1000W, high efficiency. But the brand is not well known. I will try changing it too, but so far I have never had any problems with power supplies.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2341
Location: Sydney

Posted: Thu Aug 05, 2021 6:48 am

Dan,

My AMD configuration I obtained in Dec-2020 includes:
AMD Ryzen 9 5900X CPU
Asus ROG Strix B550-E Gaming Motherboard
Noctua D15 Super Cooler
64G Kingston 3600 MHz Kit
Samsung 500GB 970 EVO

When I received this it was crashing when I initiated multi-thread computation.
It also crashed when installing Windows O/S upgrade which was a problem.

When it was crashing, I realised how little I knew about how to fix it !

We changed to the 64G Kingston 3200 MHz Kit after 3 days and it became stable. XMP is enabled (you could turn it off and see what happens).
I have updated the BIOS twice (Dec-20 and Mar-21).
No crashes since the memory update.

I don't know how to identify which hardware component is failing, but there are many options.

Do you use HWMonitor ? Anything could have a poor connection or overheating problem (poor thermal paste on the cooler?)
JohnCampbell



Joined: 16 Feb 2006
Posts: 2341
Location: Sydney

Posted: Thu Aug 05, 2021 2:15 pm

DanRRight wrote:
It seems though that when crash happens PC uses almost 100GB of RAM

How much Virtual memory / paging file have you allocated?
I allocated larger than physical memory on Win 7, although I only have 32 GB on the AMD pc.
You could try 128 GB on C: drive and see if that has any effect.
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Fri Aug 06, 2021 2:45 am

Thanks John, for the info and suggestions. I will install this hardware health monitor. I only monitor NVMe health and temperature, and use CPU-Z to see the electrical parameters, but it does not show temperature. The stress test shows that, but you cannot run it permanently. This time I did not manually set the swap file size; it should be set automatically to the same size as RAM, 128GB. Task Manager shows that I never get beyond 100 GB of RAM usage lately, so swapping is not needed. The thermal paste I used was, I think, the best recommended on the net.

And I may try the BIOS upgrade the next time I get furious that all my work has disappeared, kicking me back a day. Unfortunately, the times when everyone takes vacation are typically the busiest times here, and I cannot afford my computers being completely bricked by something like a failed BIOS upgrade. Even changing the motherboard will be a headache, when you lose half a day at least.

/* One of the reasons for being busy, by the way, is exactly the vacation of most researchers, students and professors, the last two also having other things to worry about with the new school year. As a result all supercomputer queues are empty, and it's a shame not to use this little-known trick. The next such opportunity will be only in the New Year holidays. I have used it for many decades :)
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Fri Aug 06, 2021 1:34 pm

29 screenshots on their website explaining how to update the BIOS on an ASUS mobo. I do not know in which world these crazy manufacturers live, when everyone else updates everything in one click or even without any click at all, purely automatically in the background. And they even called this process "EZ Update". Totally out of their mind.
All times are GMT + 1 Hour