GetThreadContext (64 bit)

Started by FlySky, March 09, 2015, 02:24:23 AM

Antariy

Quote from: vertograd on March 12, 2015, 06:53:16 AM
Thank you, Alex
That's an interesting and helpful link!

Here it is:

:t

And thank you for finding the source! :biggrin:
The description in the blog was attractive, so I thought it might be some unique technique and wanted to see the source, but it was unavailable from the link given in the blog. Now we can see that it uses a pretty straightforward, a bit "oversimplified" way. Still, seeing the code is a good thing: now we know WHAT exactly that code is. Without the code one could wonder "which algo was it? maybe it is some revolutionary thing?", but now we see that the techniques used are more or less well known to some of the members. Even so, this code may be useful to someone, as it is simple and straightforward, not entangled with advanced techniques etc.

Antariy

Quote from: rrr314159 on March 12, 2015, 01:13:22 PM
I'll be darned - this is exactly the idea we've been beating to death in the laboratory! I'd say William Chan stole it from me, except he's 9 years prior, so it would be a hard sell. But this algo has major drawbacks,

Quote from: Zach saw
Note though, that you'll need to give it 16-byte aligned memory and it copies in 128-byte blocks.

Also prefetchnta seems useless, and movntdq worse-than-useless on my modern machine. Admittedly only tried them once but also saw ref's saying the same thing, that modern processors don't get much from them. (Of course u can't trust ref's)

Yes, actually it was not as useful as it was claimed to be, even on not very modern hardware (you know, the "loud words" about "technology advances" are usually mostly just words with little truth in them).

Quote from: rrr314159 on March 12, 2015, 01:13:22 PM
I find that incrementing edi and esi midway through the list of mov's is better. Keeps the max offset down to 30h, no reason it should make a difference, but seems to help. And of course u should dec ebx long b4 the jnz branch, maximizes processor's ability to predict branch correctly in advance. Minor points, of course; see laboratory thread for a couple dozen more if interested

Keeping the offset from getting too big does influence the timing, yes; there is no "obvious reason" for it, but it does.

A much bigger point: the code doesn't support "precise copying" - it copies only in 128-byte blocks and doesn't handle tails of less than 128 bytes. Very simple code.
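To make the "tail" point concrete, here is a minimal sketch in plain C (not the blog's code, and not tuned for speed). The helper name copy_blocks_with_tail is hypothetical, and the memcpy of one block only stands in for the eight 16-byte moves of the unrolled loop being discussed. It shows the one step the simple version leaves out: copying the leftover bytes after the last whole 128-byte block.

```c
#include <stddef.h>
#include <string.h>

enum { BLOCK = 128 };  /* block size of the copy loop discussed above */

/* Hypothetical helper: copy n bytes, 128 at a time, then the tail. */
static void copy_blocks_with_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t blocks = n / BLOCK;  /* number of whole 128-byte blocks */
    size_t tail   = n % BLOCK;  /* 0..127 bytes left after the blocks */

    while (blocks--) {
        /* stands in for the eight 16-byte moves of the unrolled loop */
        memcpy(d, s, BLOCK);
        d += BLOCK;
        s += BLOCK;
    }

    /* the part the simple version lacks: copy the remaining bytes */
    while (tail--) {
        *d++ = *s++;
    }
}
```

With n = 291, for example, this copies two whole blocks (256 bytes) and then a 35-byte tail; a block-only routine would either stop at 256 or overrun to 384.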

Quote from: rrr314159 on March 12, 2015, 01:13:22 PM
Dunno what this is doing here, would be more relevant over in the laboratory, but it was such a surprise to see it I had to comment.

It was on the blog that was given as a reference in the Wikipedia article that Jochen pointed to; that should be clear from the posts above. And, being a "Real Lazy Coder" (TM), I did not bother to post that link in the memcopy thread, as it was not open in the browser. I had read that thread earlier too (I did not post there, as I tend to agree with Hutch's and Jochen's point of view on that subject), so I knew that topic was going on in the forum at the time, and that's why I posted the link to some "unknown memcopy" algo here.

hutch--

Something you must do with some values when you are using /LARGEADDRESSAWARE is to write the value to a 64 bit register then write the register to the 64 bit variable. It is not an assembler issue but part of the Win64 ABI.

LiaoMi

Quote from: hutch-- on December 31, 2018, 06:05:20 PM
Something you must do with some values when you are using /LARGEADDRESSAWARE is to write the value to a 64 bit register then write the register to the 64 bit variable. It is not an assembler issue but part of the Win64 ABI.

Reply #16 on: March 13, 2015, 05:59:04 AM »  :biggrin:

hutch--

 :biggrin:

Strangely enough I read ordinary English with no problems.  :P