store flags in stack?

Started by gelatine1, December 11, 2013, 12:03:17 AM


gelatine1

Hello, I was wondering whether it is possible to push and pop the EFLAGS register, or do anything similar?

jj2007

\Masm32\help\opcodes.chm (look for instructions starting with "p")

Tedd

There's PUSHFD and POPFD, but LAHF and SAHF may also be useful, depending on your needs.
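A minimal sketch of both pairs, assuming a default MASM32 SDK install and 32-bit code. The test values and the `failed` label are only for illustration. Note that LAHF/SAHF only carry SF, ZF, AF, PF and CF through AH, so the overflow flag is not preserved that way.

```asm
; Sketch only: save and restore flags around an instruction that clobbers them.
.486
.model flat, stdcall
option casemap :none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code
start:
    ; Variant 1: PUSHFD/POPFD keep the whole EFLAGS register on the stack
    mov     ecx, 1
    cmp     ecx, 2          ; 1 < 2, so CF = 1
    pushfd                  ; save EFLAGS
    add     ecx, 1          ; 1 + 1 = 2, no carry: CF is cleared
    popfd                   ; restore EFLAGS as they were after the CMP
    jnc     failed          ; CF should be set again

    ; Variant 2: LAHF/SAHF copy SF, ZF, AF, PF, CF through AH (no OF)
    mov     ecx, 1
    cmp     ecx, 2          ; CF = 1
    lahf                    ; AH = low byte of the flags
    add     ecx, 1          ; clears CF; AH must stay untouched here
    sahf                    ; restore SF, ZF, AF, PF, CF from AH
    jnc     failed

    invoke  ExitProcess, 0  ; both variants restored the carry flag
failed:
    invoke  ExitProcess, 1
end start
```

If everything works as described, the process exit code is 0.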

dedndave

operations that read the flags are reasonably fast
those that set them may not be so fast
PUSHFD - fast
POPFD - takes several clock cycles

I suspect it has to do with security (privilege-level checks, etc.)

some instructions that manipulate flags are ok - STC, CLC, CMC - a few cycles on my P4
some are slow - CLD, STD, SAHF, POPFD - 80 or more cycles on my P4
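For anyone who wants to check such numbers on their own CPU, here is a rough sketch using RDTSC. It is not the test program used in this thread, just an assumption of how a simple measurement could look. The repetition count and register choices are arbitrary, RDTSC is not serializing, and the result includes the loop overhead, so treat the figure as approximate.

```asm
; Rough timing sketch: count cycles for 100 * pushfd+popfd.
.586                        ; RDTSC needs .586 or later
.model flat, stdcall
option casemap :none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code
start:
    rdtsc                   ; EDX:EAX = time stamp counter
    mov     esi, eax        ; keep the low dword as the start value
    mov     ecx, 100        ; number of repetitions
again:
    pushfd                  ; the pair under test
    popfd
    dec     ecx             ; sets ZF for the loop branch
    jnz     again
    rdtsc
    sub     eax, esi        ; elapsed cycles, including loop overhead
    mov     edi, eax        ; result stays in EDI, e.g. for inspection in a debugger
    invoke  ExitProcess, 0
end start
```

Running it several times and taking the lowest value reduces the noise from interrupts and task switches.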

jj2007

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)

1514    cycles for 100 * pushfd+popfd
1540    cycles for 100 * pushf+popf
504     cycles for 100 * lahf+sahf

2       bytes for pushfd+popfd
4       bytes for pushf+popf
2       bytes for lahf+sahf


If you need that in an innermost loop with a million iterations, lahf/sahf is faster, but you can't use eax in that loop. Results may look different on other CPUs, though.
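To illustrate the eax restriction, here is a hypothetical loop fragment that adds two multi-dword numbers and parks the carry in AH between iterations. The register assignments (esi = source, edi = destination, ecx = dword count greater than zero) and the label are assumptions for the example only.

```asm
; Hypothetical fragment: [edi] += [esi] over ecx dwords, carry kept in AH.
    clc                     ; start with no carry
    lahf                    ; AH holds the saved flags (CF = 0)
next_dword:
    mov     edx, [esi]      ; MOV does not change the flags
    sahf                    ; bring back the carry from the previous step
    adc     [edi], edx      ; add with carry, produces a new CF
    lahf                    ; save the new carry in AH
    add     esi, 4          ; ADD and SUB destroy CF,
    add     edi, 4          ; which is why the carry is parked in AH
    sub     ecx, 1
    jnz     next_dword
    sahf                    ; CF = carry out of the most significant dword
```

Any write to eax inside the loop would overwrite AH and lose the saved carry, so the loop has to get by with the other registers.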

dedndave

prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)

10014   cycles for 100 * pushfd+popfd
10005   cycles for 100 * pushf+popf
1050    cycles for 100 * lahf+sahf

9999    cycles for 100 * pushfd+popfd
9980    cycles for 100 * pushf+popf
1080    cycles for 100 * lahf+sahf

9976    cycles for 100 * pushfd+popfd
10007   cycles for 100 * pushf+popf
1053    cycles for 100 * lahf+sahf


ok, SAHF isn't so bad   :P

FORTRANS

Hi,

   Results from a P-III, a Pentium M, and a P-MMX.  Do older CPUs
like PUSHF/POPF, and newer ones LAHF/SAHF?


pre-P4 (SSE1)

2434 cycles for 100 * pushfd+popfd
2467 cycles for 100 * pushf+popf
719 cycles for 100 * lahf+sahf

2435 cycles for 100 * pushfd+popfd
2466 cycles for 100 * pushf+popf
717 cycles for 100 * lahf+sahf

2434 cycles for 100 * pushfd+popfd
2467 cycles for 100 * pushf+popf
718 cycles for 100 * lahf+sahf

2 bytes for pushfd+popfd
4 bytes for pushf+popf
2 bytes for lahf+sahf


--- ok ---

Intel(R) Pentium(R) M processor 1.70GHz (SSE2)

2665 cycles for 100 * pushfd+popfd
2670 cycles for 100 * pushf+popf
488 cycles for 100 * lahf+sahf

2663 cycles for 100 * pushfd+popfd
2672 cycles for 100 * pushf+popf
491 cycles for 100 * lahf+sahf

2664 cycles for 100 * pushfd+popfd
2671 cycles for 100 * pushf+popf
490 cycles for 100 * lahf+sahf

2 bytes for pushfd+popfd
4 bytes for pushf+popf
2 bytes for lahf+sahf


--- ok ---

pre-P4

926 cycles for 100 * pushfd+popfd
959 cycles for 100 * pushf+popf
621 cycles for 100 * lahf+sahf

932 cycles for 100 * pushfd+popfd
947 cycles for 100 * pushf+popf
620 cycles for 100 * lahf+sahf

977 cycles for 100 * pushfd+popfd
949 cycles for 100 * pushf+popf
624 cycles for 100 * lahf+sahf

2 bytes for pushfd+popfd
4 bytes for pushf+popf
2 bytes for lahf+sahf


--- ok ---


Regards,

Steve N.

jj2007

One more: lahf/sahf combined with push/pop eax, so that eax is free again in between:
   lahf
   push eax
   pop eax
   sahf

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

2519    cycles for 100 * pushfd+popfd
2520    cycles for 100 * pushf+popf
479     cycles for 100 * lahf+sahf
992     cycles for 100 * lahf+push eax, pop eax+sahf

dedndave

Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)

9974    cycles for 100 * pushfd+popfd
10009   cycles for 100 * pushf+popf
1050    cycles for 100 * lahf+sahf
1471    cycles for 100 * lahf+push eax, pop eax+sahf

10012   cycles for 100 * pushfd+popfd
10012   cycles for 100 * pushf+popf
1051    cycles for 100 * lahf+sahf
1470    cycles for 100 * lahf+push eax, pop eax+sahf

10047   cycles for 100 * pushfd+popfd
9999    cycles for 100 * pushf+popf
1056    cycles for 100 * lahf+sahf
1468    cycles for 100 * lahf+push eax, pop eax+sahf