Quoted from mecej4
I agree completely; it would, however, be worthwhile to know how inefficient it is.
Well I have done some testing using gFortran with options -g -fimplicit-none -Wall -O3 -march=native -fopenmp
and for a number of variations of mecej4's variation on Arjen Markus' concise code.
Using FTN95 /64, it compiles but crashes at 'real, dimension(1:size(data)) :: sorted'
The results are 'surprising' in that mecej4's code is quite fast, and performs better that my quick sort for N = 2^27.
I say 'surprising' based on the sort uses both COUNT and PACK 3 times, so does a 6x scan of the data array. ( not surprising that mecej4's code is fast )
I have tested a number of variations of the quick sort:
- Markus' original sort : this has recursion within an array generation, so is very memory hungry, (although I didn't get a stack overflow ?) (pivot selection poor for sorted data)
- Mecej4's sort, divides the 2 recursive sub-sorts to seperate statements, which would have a less complex memory allocation, but still uses a lot of temporary memory (heap rather than stack ?).
- Modified pivot selection : uses median of 3 : first:middle:last to select a better pivot, although this only offers a slight improvement.
- My indexed quick sort, which uses a bubble sort when the array size < 9. This is much faster for small sorts (5.8x N= 1024, 3.7x N= 1M (220), 2.0x N=16M, but 0.93x for N=227), so I stoped there ) My sort returns an order index and maintains duplicate order.
- An improved version of mecej4's sort that uses only 2 x COUNT + PACK, avoiding DATA=pivot and also introduces small bubble sorts for 15% improvement.
Note: these tests are for a list of real*4 random numbers, so doesn't test some extreme cases (eg ordered, reverse order, lots or repetition)
So the use of COUNT and PACK for quick sort does have some advantages. Although a duplicated search, it could be that PACK and COUNT have been optimised.
Paul, It would be good to get FTN95 able to compile (assuming it is not a bug or F03+ code)
recursive function mecej4_qsort_reals ( data ) result ( sorted )
!
! real version of mecej4's qsort_int
! we assume ilt+ieq+igt == size(data)
! we need ilt or igt < size(data)
!
real, dimension(:), intent(in) :: data
real, dimension(1:size(data)) :: sorted
integer ilt, ieq, igt, n
I have created a dropbox link to my test program and a batch file (make_run) to rebuild the test using gFortran.
The program updates a file 'markus_sort.txt' with timing results. I did the tests on an i5-2300 and i7-8700K up to N = 2^27 (make sure you have enough memory for the temporary heap arrays)
https://www.dropbox.com/s/qblccz5z253u8wc/test_mecej4_sort.zip?dl=0
I have found my hybrid quick sort to be a robust sort for n = 21 to 230 in size.
My quick sort has replaced recursion with a pairs stack array, which should be of size k > log(N)/log(2), so very small.
As mecej4, has pointed out, PACK needs COUNT to be used effectively. It works very well for this sort.
Please let me know if have different findings.
John