Here is a good picture which explains why this compiler must support GPUs. The diagram shows how a typical modern supercomputer works. It comes from the most powerful one, called Frontier, but in essence it is almost the same for other servers and even HPC workstations.
Each baseboard has one or two server CPU chips connected to 8 GPUs via fast links. The CPU is essentially just the input/output controller for the GPUs. Supercomputers like Frontier have around 10,000 such baseboards, while high performance workstations typically have a few or even just one. Even our PC with an ordinary graphics card basically works the same way. To save on the cost of fast networking inside the baseboard, on workstations it is often enough to use the standard PCIe bus, the same way as with regular graphics cards. Slightly more complex motherboards can use NVIDIA's NVLink, which runs at 300GB/s, and somewhat more modern ones have special faster switches that allow more GPUs to work in parallel.
You can see why supercomputers with GPUs are faster than CPU based ones: first, each GPU is faster than a CPU, and second, there are up to 8 times as many GPUs as CPUs.
So essentially all modern supercomputers are GPU supercomputers, and this will be the trend for some time because GPUs are also particularly good for AI. The reason is that besides 64bit arithmetic GPUs also have 32bit, 16bit and 8bit arithmetic. Each time you halve the number of bits you get roughly double the speed, and AI often needs just 8bit and sometimes even 4bit is enough.
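To put rough numbers on the "fewer bits, more speed" point, here is a minimal sketch in Python. It assumes an idealized model where the only limit is a fixed memory bandwidth, so narrower values simply mean more values moved per second. The 1 TB/s figure and the function name are made up for illustration, and real hardware will not scale this cleanly.

```python
# Idealized model: with a fixed memory bandwidth, halving the width of each
# value doubles how many values can be moved per second.
# ASSUMPTION: the 1 TB/s figure is hypothetical and only for illustration.
BANDWIDTH_BYTES_PER_S = 1_000_000_000_000

def values_per_second(bits: int) -> int:
    """Number of values of the given bit width moved per second."""
    bytes_per_value = bits // 8
    return BANDWIDTH_BYTES_PER_S // bytes_per_value

rates = {bits: values_per_second(bits) for bits in (64, 32, 16, 8)}
speedup_vs_64 = {bits: rates[bits] / rates[64] for bits in rates}

for bits in (64, 32, 16, 8):
    print(f"{bits:2d}bit: {rates[bits]:>16,d} values/s, "
          f"{speedup_vs_64[bits]:.0f}x relative to 64bit")
```

In this simple model 8bit comes out 8 times faster than 64bit. In practice the gain depends on the chip and on whether the job is limited by memory or by arithmetic.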
With CPUs, though, something unthinkable happened a decade or two ago: Intel killed native support for single precision 32bit arithmetic and started doing everything in double precision 64bit, truncating only the final result to 32bit. Before that I used so called mixed precision, where most operations were in 32bit and only a small number in 64bit or even higher precision. Admittedly, 64bit is easier to use, with no problems with overflows or denormal numbers. But the price for that is a factor of 2 drop in speed.
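For anyone who has not met mixed precision, here is a minimal sketch of the idea in Python. It is only an illustration, not code from FTN95 or any real library. The helper `to_f32` is a made-up name that emulates 32bit rounding with the standard `struct` module. The data stays in 32bit, and the only choice is whether the running sum is kept in 32bit or in 64bit.

```python
import struct

def to_f32(x: float) -> float:
    """Round a 64bit Python float to the nearest 32bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32(values):
    """Everything in 32bit: the running total is rounded after every add."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total

def sum_mixed(values):
    """Mixed precision: 32bit data, 64bit accumulator, one rounding at the end."""
    total = 0.0
    for v in values:
        total += to_f32(v)
    return to_f32(total)

values = [0.1] * 1_000_000
exact = 100_000.0

err_f32 = abs(sum_f32(values) - exact)
err_mixed = abs(sum_mixed(values) - exact)

print("error with 32bit accumulator:", err_f32)
print("error with 64bit accumulator:", err_mixed)
```

The pure 32bit sum drifts badly because every add is rounded, while the 64bit accumulator keeps the result accurate even though the bulk of the data is still 32bit. That is the trade I mean: the expensive precision goes only where it matters.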
What is good about the new processors is that by adding more cores they are becoming as fast as the previous generations of GPUs, which are still hellishly expensive. So hopefully all of those will drop in price. Why will we still need GPUs if multi-core CPUs are becoming as powerful as GPUs? Because besides its monopoly in GPUs, NVIDIA has also leapfrogged everyone with its fast interconnect. It is easy to add 2, 3, ..., 8, ..., 16 GPUs to an existing system and improve the performance accordingly, while you will not find a single manufacturer which makes more than dual-CPU motherboards, and you will not find a way to connect even two motherboards into a parallel system. Hence with CPUs you are screwed if you try to get more performance.
One thing is particularly good for this compiler. Once it supports CUDA and GPUs, the speed of the CPU will make no difference. Everything will depend on the GPU, not on whether the CPU is old, slow or does not support multithreading. Win-win for FTN95!