Win7; detecting shutdown in an MASM program

MtheK · November 27, 2014, 05:54:41 AM

I've determined at least one thing that is created when the first
SM_SHUTTINGDOWN call is done in the main thread at run time.

Using WinDbg with PROCessEXPlorer (PROCEXP)/Handle view and:

INVOKE NtQuerySystemInformation,
SystemHandleInformation,...

both before and after the call (in PROCEXP, a new handle is green in
the 1st interval), along with batch PSLISTs, etc., and then comparing
all their results, shows that an un-named infinite (Create)Event
handle, protected from close, was added:

d0 0b 00 00 0c 01 2c 00 a8 bc c0 8a 03 00 1f 00
Handle 00000028 = HANDLE_FLAG_PROTECT_FROM_CLOSE
(so can't be closed until thread exit?)
Type Event
Attributes 0
GrantedAccess 0x1f0003:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState,ModifyState
HandleCount 2
PointerCount 4
Name <none>
No object specific information available
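
For anyone who wants to check the raw 16 bytes against the WinDbg lines, here is a minimal sketch that splits the entry into fields. It ASSUMES the 32-bit handle-entry layout found in unofficial ntdll headers (Microsoft does not document it as stable), and it ASSUMES that attribute bit 01h in that entry means protect-from-close. The function names are made up for illustration. Note that the handle field decodes to 2Ch while the WinDbg output above lists 00000028; the sketch does not try to explain that difference.

```python
import struct

# ASSUMPTION: 32-bit entry layout (16 bytes) as given in unofficial ntdll
# headers: pid, backtrace index, type index, attributes, handle, object, access.
ENTRY = struct.Struct("<HHBBHII")

# Standard and event-specific access rights from the Windows SDK headers.
ACCESS_BITS = [
    (0x00010000, "DELETE"),
    (0x00020000, "READ_CONTROL"),
    (0x00040000, "WRITE_DAC"),
    (0x00080000, "WRITE_OWNER"),
    (0x00100000, "SYNCHRONIZE"),
    (0x00000001, "EVENT_QUERY_STATE"),
    (0x00000002, "EVENT_MODIFY_STATE"),
]

def decode_entry(hex_dump):
    """Split one 16-byte handle entry into named fields."""
    fields = ENTRY.unpack(bytes.fromhex(hex_dump))
    names = ("pid", "backtrace", "type_index", "attributes",
             "handle", "object", "access")
    return dict(zip(names, fields))

def access_names(mask):
    """List the access rights whose bits are set in mask."""
    return [name for bit, name in ACCESS_BITS if mask & bit]

entry = decode_entry("d0 0b 00 00 0c 01 2c 00 a8 bc c0 8a 03 00 1f 00")
print("pid=%d type=%d attr=%02Xh handle=%Xh access=%Xh" % (
    entry["pid"], entry["type_index"], entry["attributes"],
    entry["handle"], entry["access"]))
print(", ".join(access_names(entry["access"])))
```

The access mask comes out as the same seven rights that WinDbg lists under GrantedAccess 0x1f0003.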

Unfortunately, closing that handle, after un-protecting it, makes
no difference. PROCEXP shows it gone (from grey to nothing) and my
before and after handle lists do NOT increase or change (the # of
handles for my PID stay the same, and the # of events stay the same)
after subsequent calls. So a new Event handle is NOT made, even tho
the previous Event handle is now gone. However, the abend handler
(ABENDXIT) is STILL NOT driven at phase 1 PowerOff (before the
shutdown .wav and "program close" dark screen), so something else is
probably set somewhere which the force-killer also uses, probably
first.

This explains why my "subtask" thread CAN do the call and exit, yet
still get ABENDXIT driven for the main thread which does NOT do the
call (to keep PROCessMONitor (PROCMON) alive long enough for me to
trace everything during shutdown).

This is probably part of what the force-killer uses when it decides
to screw you by NOT driving ABENDXIT at phase 1 PowerOff.

I also found 2 more calls that act as SM_SHUTTINGDOWN does:
INVOKE PlaySound
INVOKE MessageBeep
in that, if coded, ABENDXIT will NOT be driven at phase 1 PowerOff!
Finding those took effort: ended up comparing the "imports" with a
program lacking the SM_CALL (wasn't being killed), eliminating the
duplicates, and testing each leftover until ABENDXIT was driven.
No documentation anywhere on what commands to not even code for this?

Another interesting item is that CreateProcess will get an rc19
(ERROR_WRITE_PROTECT) when EventLog13 is written. And why is WTS
starting my task so near powering off and not driving ABENDXIT?

When NOT doing the call in the main thread, to get ABENDXIT driven,
the time given does not appear to be alterable. I tried adding these
registry entries with changed values and re-booted:

HKEY_CURRENT_USER\Control Panel\Desktop
PowerOffTimeout
WaitToKillAppTimeout

WaitToKillTimeout (default 5000ms)
WaitToKillServiceTimeout
(NOT changed, since already at 12000ms and not a factor at phase
1 PowerOff since TESTFK.exe gets killed if it waits more than 5
seconds with -1073741510/C000013Ah)
HungAppTimeout (default 5000ms)

I can then see them changed (2 of the last 3 set to 10000 and
15000ms respectively) in my "subtask" thread using:

INVOKE SystemParametersInfo,SPI_GET...

dynamically, but all to no avail as the OS still gives 5 seconds.

I ended up removing the SM_CALL from all my production programs.
I kept 1 test program using it, mainly as a point of reference. It
is also "read-only" in that it has no cleanup to do, outside of the
lost .BAT ECHO and occasionally even my own program, which is still
annoying. I appropriately renamed it to "SCREWJOB", and all it does
is capture all lost "heartbeats", which say when (apx 1s) it was last
OK. The results are "very interesting":
. kicked up SLEEP interval to 499ms
. at first PowerOn of a NEW day, then immediate PowerOff, it usually
does NOT get killed; this day "span" may be just over an hour?
. subsequent PowerOn/PowerOffs in the same "span" are mostly killed
. manually X'ing the window and re-starting sometimes is NOT killed
It appears that a dynamic timeout for a certain span is being used?

So, in summary, if you need to do cleanup at phase 1 PowerOff, you
can pick your poison:

. if you want 100% (so far since the change) assurance that you won't
be killed by the force-killer, use SetConsoleCtrlHandler ONLY, which
gives at least 5 seconds (un-alterable?).

However, since, once that thread exits, the main thread (if still
active) and its parent are killed within 2 THOUSANDths of a second,
if you are expecting an ECHO from the parent (ie: cmd.exe which runs
the .BAT), DON'T COUNT ON IT!!! Most of the time, nothing occurs! If
you're lucky, you may get the ERRORLEVEL 1073807364/40010004h. This
means that ALL info you want to collect MUST be collected by YOUR
program's ABENDXIT; at least you are given a decent amount of time
for this. In retrospect, this answers the original question of why I
was getting killed. As long as I ensure the handler thread exits
LAST, all is OK. Also, the handler thread can independently capture
fields in the main thread, as for debugging.

The ERRORLEVEL that is passed (when the ECHO DOES work) is from the
LAST thread that exits, so sync'ing up w/the main thread may be
necessary.

Making your program a WTS SERVICE, even with the imbedded SM_CALL,
will also seemingly spare you from the phase 1 force-killer.

. if you want the .BAT ECHO as well, then using the SM_CALL, with the
proper pre-setup beforehand, lets you race the force-killer, and, in
my case, usually win, say, 9 out of 10 times w/a quick SLEEP interval.

However, you DO risk getting killed, either your program, and/or
your parent. You risk losing data integrity (LOST DATA) if ANY(!)
"long" (apx 2 TENTHs of a second!) wait occurs, perhaps due to:

1. erratic 'INVOKE Sleep' results (if using this before the PowerOff)
2. hang in DeregisterEventSource
this was moved to a separate "subtask" thread so that, if
killed, oh well, except that it still prevents the PID from
exiting, which prevents the parent from running. Fortunately, if
the call is simply not made, leaving it to the OS, then there is
no hang, and the PID exits normally?!
3. who knows what else (that 10%)!!!

Merely running PROCMON seems to extend this "long" wait briefly,
which, ALSO, does NOT appear to be alterable (haven't found a
registry entry for this "WAITTOKILL" time that works, especially now
that I have the proof by being able to keep PROCMON alive?).

Too bad the OS, which considers killing "dangerous", doesn't seem
to log its own killings for data integrity purposes! IBM mainframe
always has, even at the expense of delays. Perhaps that says it all.
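
For reference, the two numeric exit codes quoted in this post are easier to recognize in hex. A minimal sketch of the conversion follows; the function names are made up for illustration, and the table holds only the two values from this post, with their names as given in the Windows SDK's ntstatus.h.

```python
# The two codes quoted in this post, with their names from ntstatus.h.
KNOWN_CODES = {
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0x40010004: "DBG_TERMINATE_PROCESS",
}

def as_unsigned32(errorlevel):
    """Map a signed ERRORLEVEL onto the unsigned 32-bit value it encodes."""
    return errorlevel & 0xFFFFFFFF

def describe(errorlevel):
    """Return the code as MASM-style hex plus its name, if it is in the table."""
    code = as_unsigned32(errorlevel)
    return "%08Xh %s" % (code, KNOWN_CODES.get(code, "(not in table)"))

for rc in (-1073741510, 1073807364):
    print(rc, "=", describe(rc))
```

So the kill after a 5 second wait reports the Ctrl-C exit status, and the ERRORLEVEL that cmd.exe sometimes passes on is the terminate-process status.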

MtheK · December 25, 2014, 10:40:35 PM

I believe I may have found a way to help identify if a program is
vulnerable to the PowerOff force-killer or not (at least in my case).

If your .exe (specifically, your .map) contains
user32.dll
then you may be killed rather quickly in this scenario, depending on
whether the command(s) used is a "trigger" command that won't drive
ABENDXIT (SetConsoleCtrlHandler) at phase 1 PowerOff.

In my case, I've gathered logs from my PROD external monitor since
December. This PROD version, as well as all my other programs that
have an ABENDXIT, do NOT have SM_SHUTTINGDOWN assembled into them.
However, I've also been running a "CLONE" in parallel w/the PROD
version, the ONLY difference in source being:

.asm
GETSHUTSTATFLAGM     EQU   1         ; (to imbed SM_SHUTTINGDOWN, else 0)

.lst (PROD)
                             2         IF GETSHUTSTATFLAGM EQ 1
                             2  ;* WARNING: INCLUDING THIS IN CODE WILL NOT(!) DRIVE ABENDXIT @ POWEROFF!!!...
                             2           INVOKE GetSystemMetrics,
                             2                 SM_SHUTTINGDOWN               ; x'2000'
                             2         ELSE
 000008B5  B8 00000000       2           MOV   EAX,0                  ; NEVER UPDATE ABENDWB FROM HERE!
                             2         ENDIF

.lst (CLONE)
                             2         IF GETSHUTSTATFLAGM EQ 1
                             2  ;* WARNING: INCLUDING THIS IN CODE WILL NOT(!) DRIVE ABENDXIT @ POWEROFF!!!...
 000008B5  68 00002000     *        push   +000002000h
 000008BA  E8 00000000 E   *        call   GetSystemMetrics
                             2           INVOKE GetSystemMetrics,
                             2                 SM_SHUTTINGDOWN               ; x'2000'
                             2         ELSE
                             2           MOV   EAX,0                  ; NEVER UPDATE ABENDWB FROM HERE!
                             2         ENDIF

.map (PROD)
N/A

.map (CLONE)
 0001:0000cfba       _GetSystemMetrics@4        0040dfba f   user32:user32.dll
5 in total.

and using a higher SLEEP interval (.499s; to induce more killings?).

The results are "very interesting":
0. log records are "paired" per run
. by PID; has a start and either an end OR a lost heartbeat record

1. the PROD runs
. 150 start records, 150 "paired" end records
. ZERO lost heartbeats detected (100% NOT killed)
. last one on NOV13 when SM_SHUTTINGDOWN was removed
. all detections via ABENDXIT

2. the CLONE runs
. 157 start records, 157 "paired" end records
. 68 lost heartbeats detected ( 57% NOT killed; just over HALF!)
. all allowed detections via SM_SHUTTINGDOWN
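
The pairing and the percentage above can be reproduced with a few lines. This is only a sketch: the (pid, kind) record format and the function names are ASSUMED for illustration (the real log layout is not shown here), and it assumes a PID is not reused within one log.

```python
from collections import defaultdict

def summarize(records):
    """Pair log records by PID.

    records: iterable of (pid, kind) with kind in "start", "end", "lost"
    (hypothetical format). Returns (runs, killed), where a killed run is
    one whose start is paired with a lost-heartbeat record.
    """
    kinds_by_pid = defaultdict(set)
    for pid, kind in records:
        kinds_by_pid[pid].add(kind)
    started = [kinds for kinds in kinds_by_pid.values() if "start" in kinds]
    killed = sum(1 for kinds in started if "lost" in kinds)
    return len(started), killed

def survival_percent(runs, killed):
    """Percentage of runs that were NOT killed, rounded to a whole number."""
    return round(100.0 * (runs - killed) / runs)

print("PROD ", survival_percent(150, 0), "% NOT killed")
print("CLONE", survival_percent(157, 68), "% NOT killed")
```

With the counts above, 157 runs and 68 lost heartbeats leave 89 survivors, which is where the 57% comes from.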

For the rc19 (ERROR_WRITE_PROTECT) problem, the external monitor
isn't started like WTS starts my other tasks (at startup as opposed
to every 10min), so it doesn't need the SPECIAL TECHNIQUE I had to
come up with to preserve data integrity if the rc19 does NOT occur
(which is actually WORSE):

. if I'm a fraction of a second earlier, I actually start, log data,
then get killed (DUH! 1 SECOND before the screen goes black and the
power is physically removed)
. I've also been on the other side of the fence, when I was started a
tenth AFTER(!) EV13; in this case, it didn't get very far. The single
output file only had the 1st .BAT ECHO. :(

Since an Event only lasts until the PID exits, but I need something
that would cover the length of a typical shutdown, I came up with
the following.

Now, I run an isolated program first, which fakes what ABENDXIT
would do; set ERRORLEVEL 9944-7. This way, the main program never
starts, so data integrity is no longer an issue.

Basically, I have a "global" .txt file containing the system TOD
and GetTickCount, written by the external monitor when IT gets its
ABENDXIT driven. Under WTS, the isolated program waits 3 minimum
SLEEP (15ms) intervals, checking after each one whether its ABENDXIT
handler was driven (possibly not, due to the next check) and whether
SM_SHUTTINGDOWN is set. If neither happened, it reads the file and,
after a few checks, if the recorded time is within a few seconds
(default=30), it fakes the shutdown detection itself. The .BAT sees
the bad RC and exits. One second later, the power is gone!!!
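
The decision itself boils down to something like the sketch below. It is NOT the actual program: the function name, the parameter names, and the use of plain seconds for the TOD are all ASSUMED for illustration, and the "few checks" (including any GetTickCount comparison) are left out because they are not spelled out here.

```python
def should_fake_shutdown(handler_driven, shutting_down, stamp_tod, now_tod,
                         window_s=30.0):
    """Decide whether to fake the shutdown detection.

    handler_driven: True if the ABENDXIT handler was driven during the waits.
    shutting_down:  True if SM_SHUTTINGDOWN was seen set during the waits.
    stamp_tod:      time (seconds) the monitor wrote to the shared file,
                    or None if the file could not be read.
    now_tod:        current time in the same units.
    window_s:       how recent the stamp must be (the post's default is 30).
    """
    if handler_driven or shutting_down:
        return False  # a real detection happened, nothing to fake
    if stamp_tod is None:
        return False  # no stamp, so no evidence of a shutdown in progress
    age = now_tod - stamp_tod
    return 0.0 <= age <= window_s  # only a fresh stamp counts

# Hypothetical example: stamp written 24 seconds before the task starts.
print(should_fake_shutdown(False, False, 1000.0, 1024.0))
```

A stale stamp (say, left over from yesterday's PowerOff) falls outside the window, so the main program starts normally.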

This works quite well on my machines. In my case, a WTS task runs
every 10min from 12:05. If I PowerOff at 12:04:36 (length of my
TESTFK, .wav, 5 of 7 seconds of blue screen, etc), based on an
external clock and EventLogs, WTS starts my task at 12:05:00.002 and
the power is removed at 12:05:03.715 (EV13). Since I now detect
shutdown myself with my program, and so don't start it, now there is
no chance of loss of data integrity. The .BAT ECHO (ie: NOT killed!)
even appends the date, time, and the fake RC to a .txt file:
CHECKSHU Tue 12/23/2014 12:05:03.20 TERMINATED by Ctl-C! / 9947
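
To put numbers on how tight this is, the three times quoted above can simply be subtracted. A small sketch follows; the helper name is made up, and it ASSUMES all three times come from clocks that agree with each other, which may not hold exactly.

```python
from datetime import datetime

def seconds_between(earlier, later):
    """Elapsed seconds between two same-day HH:MM:SS.fff clock strings."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()

task_start = "12:05:00.002"  # WTS starts the task
echo_line = "12:05:03.20"    # time in the CHECKSHU line
power_off = "12:05:03.715"   # power removed (EV13)

print("task start to power off: %.3f s" % seconds_between(task_start, power_off))
print("ECHO line to power off:  %.3f s" % seconds_between(echo_line, power_off))
```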

I have to say, though, that all this runs perfectly, in spite of
most everything else probably being gone. I can envision why programs may
want to run here, but there should be an option in WTS for this, to
prevent loss of data integrity when killed. Of course, if ABENDXIT
was driven during this time, or SM_SHUTTINGDOWN was set, OR BOTH(!),
this would not be required. :)
