GetThreadContext (64 bit)

Started by FlySky, March 09, 2015, 02:24:23 AM


FlySky

Guys,

Seems I am having an issue with GetThreadContext returning error code 0x3E6:

ERROR_NOACCESS 998 (0x3E6)
Invalid access to memory location.

I can't figure out what is causing this:

1) I retrieve the thread ID.

2) Then I open a handle to it (eax is the thread identifier):
invoke OpenThread, THREAD_SET_CONTEXT | THREAD_GET_CONTEXT | THREAD_QUERY_INFORMATION | THREAD_SUSPEND_RESUME, NULL, eax
mov [MainThreadIdHandle], eax
This call succeeds.

3) I then suspend the thread using:
invoke SuspendThread, [MainThreadIdHandle]

4) I then try to retrieve the context of the thread using:
mov [ctx.ContextFlags], CONTEXT_AMD64 | CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_FLOATING_POINT
invoke GetThreadContext, [MainThreadIdHandle], offset ctx

It always fails with error code 3E6 on all of my running x64 processes.

Any ideas?



Yuri

HANDLEs are 64-bit in x64, so I think you should store rax rather than eax in MainThreadIdHandle. Otherwise the high dword of it will be some random garbage when you pass it to GetThreadContext.
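
To make the point about eax versus rax concrete, here is a minimal C sketch (not from the original posts). store_low_dword and store_qword are hypothetical helpers that mimic "mov [slot], eax" and "mov [slot], rax"; the values used with them are made up and no Windows API is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mimics "mov [slot], eax".
   Only the low 32 bits of the slot are replaced; the upper 32 bits
   keep whatever they contained before the store. */
static void store_low_dword(uint64_t *slot, uint64_t value)
{
    assert(slot != NULL);
    *slot = (*slot & 0xFFFFFFFF00000000ULL) | (uint32_t)value;
}

/* Hypothetical helper: mimics "mov [slot], rax".
   All 64 bits of the slot are replaced. */
static void store_qword(uint64_t *slot, uint64_t value)
{
    assert(slot != NULL);
    *slot = value;
}
```

If the slot happened to hold zero beforehand, both stores leave the same value behind, which is one way a truncated store can go unnoticed.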

FlySky

I've changed the code as you suggested, saving rax (64 bit handle) instead of just the lower dword of it.
The end result is still the same, a very weird situation.

qWord

Quote from: FlySky on March 09, 2015, 02:24:23 AM
Seems I am having an issue with GetThreadContext returning error code 0x3E6
The return type is BOOL.
Quote from: msdn
If the function succeeds, the return value is nonzero.

FlySky

Sorry guys,

I expressed myself badly.
The return value is 0, meaning failure, and calling GetLastError gives me the error code 3E6.
Maybe it's a problem with my header file, where the CONTEXT is defined larger than it should be, so more bytes are accessed than allowed and
that causes the "Invalid access to memory location" error.
Will keep you guys informed.

Yuri

It looks like the CONTEXT structure must start at a 16-byte boundary, otherwise the call fails.
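
A 16-byte boundary simply means the address is a multiple of 16. A minimal C sketch of that test, for illustration only: is_aligned is a hypothetical helper name, not a Windows API, and the code does not call GetThreadContext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the address p is a multiple of
   alignment, 0 otherwise. alignment must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Calling is_aligned(&ctx, 16) just before the API call would be one way to confirm whether a given structure instance meets the requirement.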

FlySky

What does that mean, Yuri, in relation to Donkey's header file?

The context structure is defined in winnt.h:

CONTEXT STRUCT

    //
    // Register parameter home addresses.
    //
    // N.B. These fields are for convience - they could be used to extend the
    //      context record in the future.
    //

   P1Home DQ
   P2Home DQ
   P3Home DQ
   P4Home DQ
   P5Home DQ
   P6Home DQ

    //
    // Control flags.
    //

   ContextFlags DD
   MxCsr DD

    //
    // Segment Registers and processor flags.
    //

   SegCs DW
   SegDs DW
   SegEs DW
   SegFs DW
   SegGs DW
   SegSs DW
   EFlags DD

    //
    // Debug registers
    //

   Dr0 DQ
   Dr1 DQ
   Dr2 DQ
   Dr3 DQ
   Dr6 DQ
   Dr7 DQ

    //
    // Integer registers.
    //

   Rax DQ
   Rcx DQ
   Rdx DQ
   Rbx DQ
   Rsp DQ
   Rbp DQ
   Rsi DQ
   Rdi DQ
   R8 DQ
   R9 DQ
   R10 DQ
   R11 DQ
   R12 DQ
   R13 DQ
   R14 DQ
   R15 DQ

    //
    // Program counter.
    //

   Rip DQ

    //
    // Floating point state.
    //

   UNION
      FltSave XMM_SAVE_AREA32
      STRUCT
         Header DB 16*2 DUP ; M128A
         Legacy DB 16*8 DUP ; M128A
         Xmm0 M128A
         Xmm1 M128A
         Xmm2 M128A
         Xmm3 M128A
         Xmm4 M128A
         Xmm5 M128A
         Xmm6 M128A
         Xmm7 M128A
         Xmm8 M128A
         Xmm9 M128A
         Xmm10 M128A
         Xmm11 M128A
         Xmm12 M128A
         Xmm13 M128A
         Xmm14 M128A
         Xmm15 M128A
      ENDS
   ENDUNION

    //
    // Vector registers.
    //

   VectorRegister DB 16*26 DUP ; M128A
   VectorControl DQ

    //
    // Special debug control registers.
    //

   DebugControl DQ
   LastBranchToRip DQ
   LastBranchFromRip DQ
   LastExceptionToRip DQ
   LastExceptionFromRip DQ
ENDS

dedndave

i don't know what the syntax is for GoAsm
but, for Masm....

    ALIGN   16

ctxt CONTEXT <>


by the way, nice catch, Yuri   :t

Yuri

Thanks, Dave. :icon_cool:

The syntax for GoAsm is the same. There is nothing to change in the headers, FlySky, only align the structure definition in your source code.
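
For comparison, the same idea in C, as a rough sketch rather than anything from the original posts. Blob is a hypothetical stand-in for a structure that needs 16-byte alignment (it is not the real CONTEXT), and align_up16 is a hypothetical helper for the case where the alignment of the storage cannot be declared directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure that must sit on a 16-byte boundary. */
struct Blob { unsigned char bytes[64]; };

/* Option 1: request the alignment at the definition (C11),
   the C counterpart of putting ALIGN 16 before the data definition. */
static _Alignas(16) struct Blob aligned_blob;

/* Option 2: over-allocate by 15 bytes and round the address up
   to the next multiple of 16. */
static unsigned char raw[sizeof(struct Blob) + 15];

static void *align_up16(void *p)
{
    assert(p != NULL);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + 15) & ~(uintptr_t)15);
}
```

Here aligned_blob plays the role of "ctxt CONTEXT <>" placed after ALIGN 16, while align_up16(raw) yields a usable 16-byte aligned address inside the oversized buffer.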

jj2007

Not sure if it's relevant (source):

Quote
A bug in the translation layer of the x64 version of WoW64[1][2] also renders all 32-bit applications that rely on the Windows API function GetThreadContext incompatible. Such applications include application debuggers, call stack tracers (e.g. IDEs displaying call stack) and applications that use garbage collection (GC) engines

FlySky

Thanks for all the replies.
You seem to have nailed it perfectly, Yuri :t.
Aligning the structure definition fixed the issue!


Antariy

Quote from: jj2007 on March 12, 2015, 04:10:11 AM
Not sure if it's relevant (source):

Quote
A bug in the translation layer of the x64 version of WoW64[1][2] also renders all 32-bit applications that rely on the Windows API function GetThreadContext incompatible. Such applications include application debuggers, call stack tracers (e.g. IDEs displaying call stack) and applications that use garbage collection (GC) engines

It's not relevant here (in that case the call to GetThreadContext returns wrong contents rather than failing, as described in the further references linked from Wikipedia), but it's very useful info. Thank you for pointing that out, Jochen :t
Btw, one of the blogs referenced on Wikipedia has a post about memcpy, http://zachsaw.blogspot.com/2010/11/fast-memcpy-for-large-blocks.html. That post refers to another link, but it gives "no such file", so the code isn't available.

GoneFishing

Thank you, Alex
That's an interesting and helpful link!

Quote from: Antariy on March 12, 2015, 06:28:49 AM
...
Btw, one of the blogs referenced on Wikipedia has a post about memcpy, http://zachsaw.blogspot.com/2010/11/fast-memcpy-for-large-blocks.html. That post refers to another link, but it gives "no such file", so the code isn't available.
Here it is:



/*
  Copyright(C) 2006, William Chan
  All rights reserved.

      Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are met:
   
      1) Redistributions of source code must retain the above copyright
        notice, this list of conditions and the following disclaimer.
      2) Redistributions in binary form must reproduce the above copyright
        notice, this list of conditions and the following disclaimer in the
        documentation and/or other materials provided with the distribution.
      3) Redistributions of source code must be provided at free of charge.
      4) Redistributions in binary forms must be provided at free of charge.
      5) Redistributions of source code within another distribution must be
        provided at free of charge including the distribution which is
        redistributing the source code. Also, the distribution which is
        redistributing the source code must have its source code
        redistributed as well.
      6) Redistribution of binary forms within another distribution must be
        provided at free of charge.

      THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
    IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
    THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS
    BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
    CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
    SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
    INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
    CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
    ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.
*/

void X_aligned_memcpy_sse2(void* dest, const void* src, const unsigned long size_t)
{

  __asm
  {
    mov esi, src;    //src pointer
    mov edi, dest;   //dest pointer

    mov ebx, size_t; //ebx is our counter
    shr ebx, 7;      //divide by 128 (8 * 128bit registers)


    loop_copy:
      prefetchnta 128[ESI]; //SSE2 prefetch
      prefetchnta 160[ESI];
      prefetchnta 192[ESI];
      prefetchnta 224[ESI];

      movdqa xmm0, 0[ESI]; //move data from src to registers
      movdqa xmm1, 16[ESI];
      movdqa xmm2, 32[ESI];
      movdqa xmm3, 48[ESI];
      movdqa xmm4, 64[ESI];
      movdqa xmm5, 80[ESI];
      movdqa xmm6, 96[ESI];
      movdqa xmm7, 112[ESI];

      movntdq 0[EDI], xmm0; //move data from registers to dest
      movntdq 16[EDI], xmm1;
      movntdq 32[EDI], xmm2;
      movntdq 48[EDI], xmm3;
      movntdq 64[EDI], xmm4;
      movntdq 80[EDI], xmm5;
      movntdq 96[EDI], xmm6;
      movntdq 112[EDI], xmm7;

      add esi, 128;
      add edi, 128;
      dec ebx;

      jnz loop_copy; //loop please
    loop_copy_end:
  }
}


rrr314159

I'll be darned - this is exactly the idea we've been beating to death in the laboratory! I'd say William Chan stole it from me, except he's 9 years prior, so it would be a hard sell. But this algo has major drawbacks,

Quote from: Zach saw
Note though, that you'll need to give it 16-byte aligned memory and it copies in 128-byte blocks.

Also prefetchnta seems useless, and movntdq worse-than-useless on my modern machine. Admittedly only tried them once but also saw ref's saying the same thing, that modern processors don't get much from them. (Of course u can't trust ref's)

I find that incrementing edi and esi midway through the list of mov's is better. Keeps the max offset down to 30h, no reason it should make a difference, but seems to help. And of course u should dec ebx long b4 the jnz branch, maximizes processor's ability to predict branch correctly in advance. Minor points, of course; see laboratory thread for a couple dozen more if interested
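
The restrictions quoted above (16-byte aligned memory, 128-byte blocks) can be checked before the fast path is taken. The following C sketch is only an illustration: fits_block_copy and copy_checked are hypothetical names, and the fast routine is passed in as a function pointer rather than assumed to exist. The non-zero size check is there because, as far as the posted loop reads, its body runs once before the counter is tested, so a size below 128 would not stop after zero iterations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 128, REQUIRED_ALIGNMENT = 16 };

/* Returns 1 when the arguments meet the assumptions of a 128-byte block
   copy built on movdqa/movntdq: both pointers 16-byte aligned and the
   size a non-zero multiple of 128. */
static int fits_block_copy(const void *dest, const void *src, size_t size)
{
    return ((uintptr_t)dest % REQUIRED_ALIGNMENT) == 0
        && ((uintptr_t)src % REQUIRED_ALIGNMENT) == 0
        && size != 0
        && (size % BLOCK_SIZE) == 0;
}

/* Signature assumed for the fast routine (hypothetical). */
typedef void (*block_copy_fn)(void *dest, const void *src, size_t size);

/* Uses the fast routine only when its preconditions hold,
   otherwise falls back to the standard memcpy. */
static void copy_checked(void *dest, const void *src, size_t size,
                         block_copy_fn fast)
{
    assert(dest != NULL && src != NULL);
    if (fast != NULL && fits_block_copy(dest, src, size))
        fast(dest, src, size);
    else
        memcpy(dest, src, size);
}
```

A caller would pass its own SSE2 routine as the last argument; passing NULL always takes the memcpy path.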

Dunno what this is doing here, would be more relevant over in the laboratory, but it was such a surprise to see it I had to comment.

BTW Yuri, dedndave's right re GetThreadContext 16-byte alignment, nice catch

MichaelW

The structure (external) alignment is only one of the potential problems. For the compiled structure to work correctly its internal layout must be precisely as the Microsoft compilers would lay it out. No problem for a good compiler, but...

Below is the output for this source:

#include <windows.h>
#include <stdio.h>
#include <stddef.h>

int __cdecl main(void)
{
    CONTEXT context;
    printf("sizeof(CONTEXT)                 \t%I64d\n", sizeof(CONTEXT));
    printf("__alignof(context)              \t%I64d\n\n", __alignof(context));
    printf("offsetof(p1Home)                \t%I64d\n", offsetof(CONTEXT,P1Home));
    printf("offsetof(p2Home)                \t%I64d\n", offsetof(CONTEXT,P2Home));
    printf("offsetof(p3Home)                \t%I64d\n", offsetof(CONTEXT,P3Home));
    printf("offsetof(p4Home)                \t%I64d\n", offsetof(CONTEXT,P4Home));
    printf("offsetof(p5Home)                \t%I64d\n", offsetof(CONTEXT,P5Home));
    printf("offsetof(p6Home)                \t%I64d\n", offsetof(CONTEXT,P6Home));   
    printf("offsetof(p6Home)                \t%I64d\n", offsetof(CONTEXT,P6Home));
    printf("offsetof(ContextFlags)          \t%I64d\n", offsetof(CONTEXT,ContextFlags));   
    printf("offsetof(MxCsr)                 \t%I64d\n", offsetof(CONTEXT,MxCsr));   
    printf("offsetof(SegCs)                 \t%I64d\n", offsetof(CONTEXT,SegCs));     
    printf("offsetof(SegDs)                 \t%I64d\n", offsetof(CONTEXT,SegDs));   
    printf("offsetof(SegEs)                 \t%I64d\n", offsetof(CONTEXT,SegEs));   
    printf("offsetof(SegFs)                 \t%I64d\n", offsetof(CONTEXT,SegFs));   
    printf("offsetof(SegGs)                 \t%I64d\n", offsetof(CONTEXT,SegGs));   
    printf("offsetof(SegSs)                 \t%I64d\n", offsetof(CONTEXT,SegSs));   
    printf("offsetof(EFlag)s                \t%I64d\n", offsetof(CONTEXT,EFlags));   
    printf("offsetof(Dr0)                   \t%I64d\n", offsetof(CONTEXT,Dr0));   
    printf("offsetof(Dr1)                   \t%I64d\n", offsetof(CONTEXT,Dr1));   
    printf("offsetof(Dr2)                   \t%I64d\n", offsetof(CONTEXT,Dr2));   
    printf("offsetof(Dr3)                   \t%I64d\n", offsetof(CONTEXT,Dr3));   
    printf("offsetof(Dr6)                   \t%I64d\n", offsetof(CONTEXT,Dr6));   
    printf("offsetof(Dr7)                   \t%I64d\n", offsetof(CONTEXT,Dr7));   
    printf("offsetof(Rax)                   \t%I64d\n", offsetof(CONTEXT,Rax));   
    printf("offsetof(Rcx)                   \t%I64d\n", offsetof(CONTEXT,Rcx));   
    printf("offsetof(Rdx)                   \t%I64d\n", offsetof(CONTEXT,Rdx));   
    printf("offsetof(Rbx)                   \t%I64d\n", offsetof(CONTEXT,Rbx));   
    printf("offsetof(Rsp)                   \t%I64d\n", offsetof(CONTEXT,Rsp));   
    printf("offsetof(Rbp)                   \t%I64d\n", offsetof(CONTEXT,Rbp));   
    printf("offsetof(Rsi)                   \t%I64d\n", offsetof(CONTEXT,Rsi));   
    printf("offsetof(Rdi)                   \t%I64d\n", offsetof(CONTEXT,Rdi));   
    printf("offsetof(R8)                    \t%I64d\n", offsetof(CONTEXT,R8));   
    printf("offsetof(R9)                    \t%I64d\n", offsetof(CONTEXT,R9));   
    printf("offsetof(R10)                   \t%I64d\n", offsetof(CONTEXT,R10));   
    printf("offsetof(R11)                   \t%I64d\n", offsetof(CONTEXT,R11));   
    printf("offsetof(R12)                   \t%I64d\n", offsetof(CONTEXT,R12));   
    printf("offsetof(R13)                   \t%I64d\n", offsetof(CONTEXT,R13));   
    printf("offsetof(R14)                   \t%I64d\n", offsetof(CONTEXT,R14));   
    printf("offsetof(R15)                   \t%I64d\n", offsetof(CONTEXT,R15));   
    printf("offsetof(Rip)                   \t%I64d\n\n", offsetof(CONTEXT,Rip));   
    printf("sizeof(M128A)                   \t%I64d\n", sizeof(M128A));   
    printf("__alignof(FltSave)              \t%I64d\n", __alignof(context.FltSave));   
    printf("sizeof(FltSave)                 \t%I64d\n", sizeof(context.FltSave));
    printf("sizeof(DUMMYSTRUCTNAME)         \t%I64d\n\n", sizeof(context.Header) +
                                                          sizeof(context.Legacy) + 
                                                          sizeof(context.Xmm0) +
                                                          sizeof(context.Xmm1) +
                                                          sizeof(context.Xmm2) +
                                                          sizeof(context.Xmm3) +
                                                          sizeof(context.Xmm4) + 
                                                          sizeof(context.Xmm5) +
                                                          sizeof(context.Xmm6) +
                                                          sizeof(context.Xmm7) +
                                                          sizeof(context.Xmm8) +
                                                          sizeof(context.Xmm9) + 
                                                          sizeof(context.Xmm10) +
                                                          sizeof(context.Xmm11) +
                                                          sizeof(context.Xmm12) +
                                                          sizeof(context.Xmm13) +
                                                          sizeof(context.Xmm14) + 
                                                          sizeof(context.Xmm15));
    printf("offsetof(FltSave)               \t%I64d\n", offsetof(CONTEXT,FltSave));   
    printf("offsetof(Header[2])             \t%I64d\n", offsetof(CONTEXT,Header));   
    printf("offsetof(Legacy[8])             \t%I64d\n", offsetof(CONTEXT,Legacy));   
    printf("offsetof(Xmm0)                  \t%I64d\n", offsetof(CONTEXT,Xmm0));   
    printf("offsetof(Xmm1)                  \t%I64d\n", offsetof(CONTEXT,Xmm1));   
    printf("offsetof(Xmm2)                  \t%I64d\n", offsetof(CONTEXT,Xmm2));   
    printf("...\n");   
    printf("offsetof(Xmm15)                 \t%I64d\n", offsetof(CONTEXT,Xmm15));   
    printf("offsetof(VectorRegister)        \t%I64d\n", offsetof(CONTEXT,VectorRegister));   
    printf("offsetof(VectorControl)         \t%I64d\n", offsetof(CONTEXT,VectorControl));   
    printf("...\n");   
    printf("offsetof(LastExceptionFromRip)  \t%I64d\n\n", offsetof(CONTEXT,LastExceptionFromRip));   
    return 0;
}

/* THESE FROM WINNT.H:

typedef struct DECLSPEC_ALIGN(16) _M128A {
    ULONGLONG Low;
    LONGLONG High;
} M128A, *PM128A;

typedef struct DECLSPEC_ALIGN(16) _XSAVE_FORMAT {
    WORD ControlWord;
    WORD StatusWord;
    BYTE TagWord;
    BYTE Reserved1;
    WORD ErrorOpcode;
    DWORD ErrorOffset;
    WORD ErrorSelector;
    WORD Reserved2;
    DWORD DataOffset;
    WORD DataSelector;
    WORD Reserved3;
    DWORD MxCsr;
    DWORD MxCsr_Mask;
    M128A FloatRegisters[8];
#ifdef _WIN64
    M128A XmmRegisters[16];
    BYTE Reserved4[96];
#else
    M128A XmmRegisters[8];
    BYTE Reserved4[192];
    DWORD StackControl[7];
    DWORD Cr0NpxState;
#endif
} XSAVE_FORMAT, *PXSAVE_FORMAT;

typedef struct DECLSPEC_ALIGN (16) _CONTEXT {
    DWORD64 P1Home;
    DWORD64 P2Home;
    DWORD64 P3Home;
    DWORD64 P4Home;
    DWORD64 P5Home;
    DWORD64 P6Home;
    DWORD ContextFlags;
    DWORD MxCsr;
    WORD SegCs;
    WORD SegDs;
    WORD SegEs;
    WORD SegFs;
    WORD SegGs;
    WORD SegSs;
    DWORD EFlags;
    DWORD64 Dr0;
    DWORD64 Dr1;
    DWORD64 Dr2;
    DWORD64 Dr3;
    DWORD64 Dr6;
    DWORD64 Dr7;
    DWORD64 Rax;
    DWORD64 Rcx;
    DWORD64 Rdx;
    DWORD64 Rbx;
    DWORD64 Rsp;
    DWORD64 Rbp;
    DWORD64 Rsi;
    DWORD64 Rdi;
    DWORD64 R8;
    DWORD64 R9;
    DWORD64 R10;
    DWORD64 R11;
    DWORD64 R12;
    DWORD64 R13;
    DWORD64 R14;
    DWORD64 R15;
    DWORD64 Rip;
    union {
        XMM_SAVE_AREA32 FltSave;
        struct {
            M128A Header[2];
            M128A Legacy[8];
            M128A Xmm0;
            M128A Xmm1;
            M128A Xmm2;
            M128A Xmm3;
            M128A Xmm4;
            M128A Xmm5;
            M128A Xmm6;
            M128A Xmm7;
            M128A Xmm8;
            M128A Xmm9;
            M128A Xmm10;
            M128A Xmm11;
            M128A Xmm12;
            M128A Xmm13;
            M128A Xmm14;
            M128A Xmm15;
        } DUMMYSTRUCTNAME;
    } DUMMYUNIONNAME;
    M128A VectorRegister[26];
    DWORD64 VectorControl;
    DWORD64 DebugControl;
    DWORD64 LastBranchToRip;
    DWORD64 LastBranchFromRip;
    DWORD64 LastExceptionToRip;
    DWORD64 LastExceptionFromRip;
} CONTEXT, *PCONTEXT;
*/

Compiled to a 64-bit app with Pelles C Version 8.00.33 Release Candidate #7 (Win64):

sizeof(CONTEXT)                         1232
__alignof(context)                      16

offsetof(p1Home)                        0
offsetof(p2Home)                        8
offsetof(p3Home)                        16
offsetof(p4Home)                        24
offsetof(p5Home)                        32
offsetof(p6Home)                        40
offsetof(p6Home)                        40
offsetof(ContextFlags)                  48
offsetof(MxCsr)                         52
offsetof(SegCs)                         56
offsetof(SegDs)                         58
offsetof(SegEs)                         60
offsetof(SegFs)                         62
offsetof(SegGs)                         64
offsetof(SegSs)                         66
offsetof(EFlag)s                        68
offsetof(Dr0)                           72
offsetof(Dr1)                           80
offsetof(Dr2)                           88
offsetof(Dr3)                           96
offsetof(Dr6)                           104
offsetof(Dr7)                           112
offsetof(Rax)                           120
offsetof(Rcx)                           128
offsetof(Rdx)                           136
offsetof(Rbx)                           144
offsetof(Rsp)                           152
offsetof(Rbp)                           160
offsetof(Rsi)                           168
offsetof(Rdi)                           176
offsetof(R8)                            184
offsetof(R9)                            192
offsetof(R10)                           200
offsetof(R11)                           208
offsetof(R12)                           216
offsetof(R13)                           224
offsetof(R14)                           232
offsetof(R15)                           240
offsetof(Rip)                           248

sizeof(M128A)                           16
__alignof(FltSave)                      16
sizeof(FltSave)                         512
sizeof(DUMMYSTRUCTNAME)                 416

offsetof(FltSave)                       256
offsetof(Header[2])                     256
offsetof(Legacy[8])                     288
offsetof(Xmm0)                          416
offsetof(Xmm1)                          432
offsetof(Xmm2)                          448
...
offsetof(Xmm15)                         656
offsetof(VectorRegister)                768
offsetof(VectorControl)                 1184
...
offsetof(LastExceptionFromRip)          1224
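
The offsets printed above can also be pinned down at compile time, so that a wrong layout stops the build instead of failing at run time. A minimal sketch, under stated assumptions: MiniContext is a hypothetical cut-down copy of the first fields of CONTEXT written with fixed-width C types, not the real Windows structure, and the expected numbers are taken from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the start of the x64 CONTEXT layout. */
struct MiniContext {
    uint64_t P1Home, P2Home, P3Home, P4Home, P5Home, P6Home;
    uint32_t ContextFlags;
    uint32_t MxCsr;
    uint16_t SegCs, SegDs, SegEs, SegFs, SegGs, SegSs;
    uint32_t EFlags;
    uint64_t Dr0;
};

/* Each check fails the compilation if the field is not where the
   program output above says it should be. */
static_assert(offsetof(struct MiniContext, ContextFlags) == 48, "ContextFlags offset");
static_assert(offsetof(struct MiniContext, MxCsr) == 52, "MxCsr offset");
static_assert(offsetof(struct MiniContext, SegCs) == 56, "SegCs offset");
static_assert(offsetof(struct MiniContext, SegSs) == 66, "SegSs offset");
static_assert(offsetof(struct MiniContext, EFlags) == 68, "EFlags offset");
static_assert(offsetof(struct MiniContext, Dr0) == 72, "Dr0 offset");
```

The same pattern, applied to the full structure with the real offsets, would catch a header translation whose fields drift from the Microsoft layout.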



Well Microsoft, here's another nice mess you've gotten us into.