First of all, it is quite likely that Sebastian and Dan are having to deal with different issues so what might satisfy one may not satisfy the other.
Second, I have no real understanding of what is going wrong in either case.
However, using masked underflow should not, in my opinion, significantly affect the numerical results. If it does then this suggests that the numerical calculation is unstable (i.e. significantly affected by round-off error).
For 32 bit REALs, the smallest exponent is 1E-38 for unmasked underflow and 1E-46 for masked underflow. If you test a computed REAL z against zero then in one case you will be testing abs(z) < 1E-38 and abs(z) < 1E-46 in the other. Clearly this is not good programming practice and the compiler warns against this. Not only will the outcome depend on the masking but (more importantly) on the KIND of the REAL.
On another track, if using masked underflow does not help, then I have one other suggestion to try. It is based on the possibility that some third party computation (from OpenGL and/or the graphics card) is leaving the coprocessor in an unclean state. The following code, placed before the crash point, will reset the coprocessor...
CODE
fclex;
finit;
fround;
EDOC
Again, this is just shooting in the dark since I don't have anything of substance to work on, not being able to reproduce the issues in question.