bsr64 of q1 = 34
bsr64 of q2 = 34
bsr64 of q3 = 31
Yuck.
AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G (SSE4)
867 cycles for 100 * bsr64a, bit 31
893 cycles for 100 * bsr64b, bit 31
468 cycles for 100 * bsr64a, bit 34
460 cycles for 100 * bsr64b, bit 34
882 cycles for 100 * bsr64a, bit 31
850 cycles for 100 * bsr64b, bit 31
425 cycles for 100 * bsr64a, bit 34
445 cycles for 100 * bsr64b, bit 34
844 cycles for 100 * bsr64a, bit 31
856 cycles for 100 * bsr64b, bit 31
443 cycles for 100 * bsr64a, bit 34
440 cycles for 100 * bsr64b, bit 34
855 cycles for 100 * bsr64a, bit 31
847 cycles for 100 * bsr64b, bit 31
432 cycles for 100 * bsr64a, bit 34
437 cycles for 100 * bsr64b, bit 34
840 cycles for 100 * bsr64a, bit 31
842 cycles for 100 * bsr64b, bit 31
441 cycles for 100 * bsr64a, bit 34
445 cycles for 100 * bsr64b, bit 34
851 cycles for 100 * bsr64a, bit 31
852 cycles for 100 * bsr64b, bit 31
440 cycles for 100 * bsr64a, bit 34
439 cycles for 100 * bsr64b, bit 34
24 bytes for bsr64a, bit 31
26 bytes for bsr64b, bit 31
24 bytes for bsr64a, bit 34
26 bytes for bsr64b, bit 34
31 = eax bsr64a, bit 31
31 = eax bsr64b, bit 31
34 = eax bsr64a, bit 34
34 = eax bsr64b, bit 34
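
For readers wondering why the bit 31 case takes roughly twice as many cycles as the bit 34 case: the source of bsr64a and bsr64b is not shown in this post, so the following is only a guess at the general shape, not the actual code. One plausible reading is that a 64-bit scan built from 32-bit operations tests the high dword first and only falls through to the low dword when the high one is zero. A minimal C sketch of that idea, with hypothetical names (bsr32_sketch, bsr64_sketch) and a portable loop standing in for the real bsr instruction:

```c
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit in a NONZERO
   32-bit value. A portable loop stands in for the x86 bsr instruction. */
static int bsr32_sketch(uint32_t x)
{
    int index = 0;
    while (x >>= 1) {
        ++index;
    }
    return index;
}

/* Hypothetical 64-bit bit scan reverse built from two 32-bit halves.
   Assumption: the high dword is checked first, so values whose top bit
   lies in the low dword (e.g. bit 31) take the longer path.
   Returns -1 for a zero input. */
static int bsr64_sketch(uint64_t q)
{
    uint32_t hi = (uint32_t)(q >> 32);
    uint32_t lo = (uint32_t)q;

    if (hi != 0) {
        return 32 + bsr32_sketch(hi);   /* short path: bit 32..63 */
    }
    if (lo != 0) {
        return bsr32_sketch(lo);        /* fall-through path: bit 0..31 */
    }
    return -1;                          /* no bit set */
}
```

With this sketch, a value with only bit 34 set returns on the first branch, while a value with only bit 31 set needs the second test as well. Whether that is really what causes the gap in the timings above is an assumption.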