Topic: Retpoline

vengy

Retpoline
« on: February 28, 2018, 04:00:07 AM »
I was wondering whether the code below is truly optimized.
I suspect there are some opcode tricks that could reduce its size or improve its speed.

For indirect calls and jumps, here's the code I'm using, based on this patch: https://patchwork.kernel.org/patch/10143779/

NOSPEC_JMP MACRO target:REQ
                PUSH            target
                JMP             x86_indirect_thunk
ENDM


NOSPEC_CALL MACRO target:REQ
                LOCAL           nospec_call_start
                LOCAL           nospec_call_end

                JMP             nospec_call_end

nospec_call_start:
                PUSH            target
                JMP             x86_indirect_thunk

nospec_call_end:
                CALL            nospec_call_start
ENDM


.CODE

;; This is a special sequence that prevents the CPU speculating for indirect calls.

x86_indirect_thunk:
                CALL            retpoline_call_target

capture_speculation:
                PAUSE
                JMP             capture_speculation

retpoline_call_target:
                IFDEF WIN64
                LEA             RSP,[RSP+8]
                ELSE
                LEA             ESP,[ESP+4]
                ENDIF

                RET
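
To make the mechanism easier to follow, here is a rough trace of NOSPEC_JMP written as comments around a hypothetical call site. The name my_handler is made up for illustration, and the trace assumes the 64-bit build (WIN64 defined); the stack is listed top first.

```asm
; Hypothetical call site; my_handler is an assumed name, not part of the code above.
                LEA             RAX, my_handler
                NOSPEC_JMP      RAX

; Step by step (64-bit case, top of stack listed first):
;   PUSH RAX                     -> stack: [my_handler, ...]
;   JMP  x86_indirect_thunk
;   CALL retpoline_call_target   -> stack: [capture_speculation, my_handler, ...]
;                                   the return predictor now expects the next
;                                   RET to go to capture_speculation
;   LEA  RSP,[RSP+8]             -> stack: [my_handler, ...]
;   RET                          -> architecturally jumps to my_handler, while
;                                   speculation down the predicted return path
;                                   spins harmlessly in the PAUSE/JMP loop
```

NOSPEC_CALL works the same way, except that its inner CALL first pushes the address following the macro, so the target's own RET comes back to the caller.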

AW

Re: Retpoline
« Reply #1 on: February 28, 2018, 04:59:06 AM »
It is interesting, particularly this part

capture_speculation:
                PAUSE
                JMP             capture_speculation

which is never architecturally executed; only a mispredicted speculative return can land there.

It could (possibly?) be adapted for speed tests within tight loops, to clear out stale branch predictions.