Hi,

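Here are results from three systems that respond somewhat differently. I had expected the second and third to be similar, so the change in the "mov+mov" test was interesting: about 200 cycles on the P-III versus a little over 400 on the Pentium M. The P-MMX really does not like the "xchg eax, [esp]" test, which is very slow there at around 7100 cycles. The P-MMX rather likes the "mov+mov" test though, running it as fast as "pop+push".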

P-MMX

pre-P4

203 cycles for 100 * pop+push

199 cycles for 100 * mov+mov

7092 cycles for 100 * xchg eax, [esp]

204 cycles for 100 * pop+push

198 cycles for 100 * mov+mov

7119 cycles for 100 * xchg eax, [esp]

201 cycles for 100 * pop+push

201 cycles for 100 * mov+mov

7095 cycles for 100 * xchg eax, [esp]

203 cycles for 100 * pop+push

201 cycles for 100 * mov+mov

7098 cycles for 100 * xchg eax, [esp]

204 cycles for 100 * pop+push

200 cycles for 100 * mov+mov

7085 cycles for 100 * xchg eax, [esp]

3 bytes for pop+push

7 bytes for mov+mov

6 bytes for xchg eax, [esp]

--- ok ---

P-III

pre-P4 (SSE1)

103 cycles for 100 * pop+push

203 cycles for 100 * mov+mov

1718 cycles for 100 * xchg eax, [esp]

102 cycles for 100 * pop+push

202 cycles for 100 * mov+mov

1719 cycles for 100 * xchg eax, [esp]

102 cycles for 100 * pop+push

201 cycles for 100 * mov+mov

1721 cycles for 100 * xchg eax, [esp]

103 cycles for 100 * pop+push

201 cycles for 100 * mov+mov

1734 cycles for 100 * xchg eax, [esp]

102 cycles for 100 * pop+push

203 cycles for 100 * mov+mov

1723 cycles for 100 * xchg eax, [esp]

3 bytes for pop+push

7 bytes for mov+mov

6 bytes for xchg eax, [esp]

--- ok ---

Intel(R) Pentium(R) M processor 1.70GHz (SSE2)

120 cycles for 100 * pop+push

408 cycles for 100 * mov+mov

1811 cycles for 100 * xchg eax, [esp]

113 cycles for 100 * pop+push

407 cycles for 100 * mov+mov

1804 cycles for 100 * xchg eax, [esp]

121 cycles for 100 * pop+push

406 cycles for 100 * mov+mov

1815 cycles for 100 * xchg eax, [esp]

119 cycles for 100 * pop+push

405 cycles for 100 * mov+mov

1809 cycles for 100 * xchg eax, [esp]

121 cycles for 100 * pop+push

406 cycles for 100 * mov+mov

1810 cycles for 100 * xchg eax, [esp]

3 bytes for pop+push

7 bytes for mov+mov

6 bytes for xchg eax, [esp]

--- ok ---
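
To put the listings above on a per-iteration footing, here is a rough sketch in Python. It only does arithmetic on the first run from each listing; the dictionary layout and the function names are made up for illustration and are not part of the test program.

```python
# First-run cycle counts per 100 repetitions, copied from the listings above.
# The layout and names here are illustrative assumptions, not the test program's.
RESULTS = {
    "P-MMX": {"pop+push": 203, "mov+mov": 199, "xchg eax, [esp]": 7092},
    "P-III": {"pop+push": 103, "mov+mov": 203, "xchg eax, [esp]": 1718},
    "Pentium M": {"pop+push": 120, "mov+mov": 408, "xchg eax, [esp]": 1811},
}
REPS = 100  # each test line reports cycles for 100 repetitions


def per_iteration(results, reps=REPS):
    """Cycles for a single repetition of each test on each CPU."""
    return {
        cpu: {test: cycles / reps for test, cycles in tests.items()}
        for cpu, tests in results.items()
    }


def relative_to(results, baseline="pop+push"):
    """Each test's cycle count as a multiple of the baseline test on the same CPU."""
    return {
        cpu: {test: cycles / tests[baseline] for test, cycles in tests.items()}
        for cpu, tests in results.items()
    }


# Print one line per CPU and test: cycles per repetition and ratio to pop+push.
ratios = relative_to(RESULTS)
for cpu, tests in per_iteration(RESULTS).items():
    for test, cycles in tests.items():
        print(f"{cpu:10s} {test:16s} {cycles:6.2f} cycles  x{ratios[cpu][test]:.1f}")
```

Dividing by 100 gives roughly 71 cycles per "xchg eax, [esp]" repetition on the P-MMX against about 17 on the P-III and 18 on the Pentium M. That cost is consistent with xchg on a memory operand always being treated as a locked operation.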

Cheers,

Steve N.