Win7 W calls to the registry are MUCH SLOWER than A calls

Started by MtheK, July 16, 2015, 11:36:02 PM

MtheK

  I don't know why my registry W calls are about 10-fold slower?!

  Same ASCII-built MASM program, doing these W calls over A calls
based on a switch in an options file:

         INVOKE RegOpenKeyEx,
         INVOKE RegOpenKeyExW,

         INVOKE RegEnumKeyEx,
         INVOKE RegEnumKeyExW,

collected under TICKIO (the sum of the tick differences measured from just
before to just after each call); everything else is collected under TICKDIFF.
The A calls are the base (full data structures).

---

  When I run this by itself, hours into a boot, all classes, keys
and sub-levels, normal (non-WTS) PRI, ELEVATED/ADMIN (to eliminate
many RC2's from VirtualStore), minimum interference from other PIDs
(all WTS temp-disabled), they run this long (LT2B):


                                                                                       (if non-ASCII in key)
                                                               / <==  no W call "auto-conversion"  (show errors)
testloa2 Wed 07/15/2015 14:26:52.76 N starting...
testloa2 Wed 07/15/2015 14:27:18.89 N ended.
REGTODAYE.log
0000015736 0000007742 0000336668 0000336668 0000000002 0000000052 0000000000
     #ticks     #ticks    #A Open    #A Enum     #A bad    #W Open    #W Enum
                           w/data
   TICKDIFF  +  TICKIO = apx resp (+.bat,O/H,tick res,etc) = apx ECHO diffs

                                                               / <==  W call auto-conversion (dft) per Class
                                                               /                       (do not show errors)
testloa2 Wed 07/15/2015 14:31:16.47   starting...
testloa2 Wed 07/15/2015 14:31:44.20   ended.
0000016755 0000007612 0000336670 0000336670 0000000004 0000000004 0000050133
Apx the same.                                                      (in 2 Classes) 

                                                                / <==  mirror w/W calls (do not show errors)
testloa2 Wed 07/15/2015 14:28:13.37 Y starting...
testloa2 Wed 07/15/2015 14:30:27.94 Y ended.
0000020775 0000110859 0000336670 0000336670 0000000004 0000000004 0000336670
Apx 6-fold slower.



  When I run 7 at a time (separate DIRs, async (via DOS Prompt
START)), using the normal A calls, the 7th run takes this long:


testloa2 Wed 07/15/2015 14:53:21.82 N starting...
testloa2 Wed 07/15/2015 14:54:44.94 N ended.
0000031283 0000024846 0000336668 0000336668 0000000002 0000000052 0000000000
PROCEXP:            CPUR time = 17.862s


but using W calls, the 7th run takes this long:


testloa2 Wed 07/15/2015 15:24:17.84   starting...
testloa2 Wed 07/15/2015 15:25:50.22   ended.
0000041524 0000022920 0000336670 0000336670 0000000004 0000000004 0000050133
Apx the same.       CPUR time = 16.52s

testloa2 Wed 07/15/2015 15:12:55.12 Y starting...
testloa2 Wed 07/15/2015 15:22:42.35 Y ended.
0000060321 0000523154 0000336670 0000336670 0000000004 0000000004 0000336670
Apx 10-fold slower. CPUR time = 2m 38.138s (apx 9x)


  I get similar results from multiple tests. All 7 are between 6-10x slower
(probably due to various interference). I can see my time increasing due
to all the stuff I have to mirror to make this work, but the "I/O"
increase in the registry W calls w/full mirroring (21x) seems absurd!
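
The ratios quoted above can be re-derived from the log records. A minimal Python sketch, assuming (as the column legend suggests) that the first two fields of each record are TICKDIFF and TICKIO; the helper name `tick_fields` is made up for illustration:

```python
def tick_fields(record: str) -> tuple[int, int]:
    """Return (TICKDIFF, TICKIO), assumed to be the first two fields of a record."""
    fields = record.split()
    return int(fields[0]), int(fields[1])

# Records copied from the 7-at-a-time runs above (7th run of each set).
a_run = "0000031283 0000024846 0000336668 0000336668 0000000002 0000000052 0000000000"
w_run = "0000060321 0000523154 0000336670 0000336670 0000000004 0000000004 0000336670"

a_diff, a_io = tick_fields(a_run)
w_diff, w_io = tick_fields(w_run)

io_ratio = w_io / a_io                            # time inside the registry calls only
total_ratio = (w_diff + w_io) / (a_diff + a_io)   # TICKDIFF + TICKIO

print(f"TICKIO ratio: {io_ratio:.1f}x")
print(f"total ratio:  {total_ratio:.1f}x")
```

With these two records the TICKIO ratio comes out at about 21x and the combined ratio at about 10x, matching the figures in the text.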

---

  So, related to this little trick:

http://masm32.com/board/index.php?topic=4404.0

has anyone noticed such a HUGE drag in performance when using W calls
over A calls for files? Maybe it's just a registry thing? Could all
W calls be so slow? And if so, why? The translation/CPUR? Tune-able?
Could some caching method be a factor? Maybe it's only noticeable when
many hundreds of thousands of requests are made? Maybe it's by Class?
Perhaps that's another test for another time.

  For me, it seems that it's best to make an ASCII-built MASM program,
use the 8dot3 name for file names when unicode, and only use the W calls
when required yet try to maintain performance if possible. Whew!


jj2007

Interesting, especially in the light of the official mantra that everything under the hood of Windows is Unicode. One reason might be the slowness of case-insensitive instr() type functions in Windows.

adeyblue

I don't know why you're seeing what you're seeing but apart from the WinInet functions, the A functions allocate memory, convert the input to unicode and call the W version.

I have a C++ program which scans the registry to get some stats. I added some timings to it to see, here's what I get for the A functions:
Total Enum Calls: 635942
Total Enum Fails: 0
Total Enum Time (QPC Ticks): 8713439
Total Open Calls: 635942
Total Open Fails: 880
Total Open Time (QPC Ticks): 28296448

And this is the W:
Total Enum Calls: 635946
Total Enum Fails: 0
Total Enum Time (QPC Ticks): 4974513
Total Open Calls: 635946
Total Open Fails: 880
Total Open Time (QPC Ticks): 20551422

Enum is RegEnumKeyEx, Open is RegOpenKeyEx. It's a 64-bit program, and not running as admin or high priority or anything.
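
The call pattern described above (A converts, then calls W) can be modelled with a toy Python sketch. Everything here is a hypothetical stand-in: `enum_key_w`, `enum_key_a`, the `SUBKEYS` list and the `cp1252` code page are assumptions for illustration, not the real API or its implementation.

```python
# Hypothetical stand-ins: they model the call pattern only, not the real API.
SUBKEYS = ["CLSID", ".txt", "Ключ"]   # pretend key names, stored as Unicode

def enum_key_w(index: int) -> str:
    """Stand-in for a W call: returns the stored Unicode name unchanged."""
    return SUBKEYS[index]

def enum_key_a(index: int, codepage: str = "cp1252") -> bytes:
    """Stand-in for an A call: call the W version, then convert the result."""
    wide_name = enum_key_w(index)
    # Characters missing from the code page are replaced, so the name can be lossy.
    return wide_name.encode(codepage, errors="replace")

print(enum_key_a(0))   # plain ASCII name survives the conversion
print(enum_key_a(2))   # non-ASCII name is degraded by the conversion
```

Under this model the A path can only add work on top of the W path, which is consistent with the timings above where W is the faster of the two.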

MtheK

  Hhhmmm...

  I was able to narrow it down to class 1 (HKEY_CLASSES_ROOT) as the culprit:


testloa2 Thu 07/16/2015 13:09:10.98 1,1,N starting...
0000003766 0000002536 0000090540 0000090540 0000000000 0000000000 0000000000
testloa2 Thu 07/16/2015 13:09:17.84 1,1,N ended.
testloa2 Thu 07/16/2015 13:09:40.76 1,1,Y starting...
0000005480 0000105889 0000090540 0000090540 0000000000 0000000000 0000090540
testloa2 Thu 07/16/2015 13:11:32.72 1,1,Y ended.

testloa2 Thu 07/16/2015 13:11:41.80 1,2,N starting...
0000000000 0000000000 0000000025 0000000025 0000000000 0000000000 0000000000
testloa2 Thu 07/16/2015 13:11:42.49 1,2,N ended.
testloa2 Thu 07/16/2015 13:11:46.93 1,2,Y starting...
0000000015 0000000000 0000000025 0000000025 0000000000 0000000000 0000000025
testloa2 Thu 07/16/2015 13:11:47.51 1,2,Y ended.

testloa2 Thu 07/16/2015 13:11:51.71 1,3,N starting...
0000000922 0000000482 0000025953 0000025953 0000000001 0000000026 0000000000
testloa2 Thu 07/16/2015 13:11:54.41 1,3,N ended.
testloa2 Thu 07/16/2015 13:11:58.93 1,3,Y starting...
0000001200 0000000594 0000025954 0000025954 0000000002 0000000002 0000025954
testloa2 Thu 07/16/2015 13:12:02.02 1,3,Y ended.

testloa2 Thu 07/16/2015 13:12:06.56 1,4,N starting...
0000008927 0000003413 0000169902 0000169902 0000000000 0000000000 0000000000
testloa2 Thu 07/16/2015 13:12:21.68 1,4,N ended.
testloa2 Thu 07/16/2015 13:12:26.06 1,4,Y starting...
0000011715 0000003604 0000169902 0000169902 0000000000 0000000000 0000169902
testloa2 Thu 07/16/2015 13:12:43.34 1,4,Y ended.

testloa2 Thu 07/16/2015 13:12:50.72 1,5,N starting...
0000000016 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000
testloa2 Thu 07/16/2015 13:12:52.02 1,5,N ended.
testloa2 Thu 07/16/2015 13:12:56.14 1,5,Y starting...
0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000
testloa2 Thu 07/16/2015 13:12:56.81 1,5,Y ended.

testloa2 Thu 07/16/2015 13:13:00.97 1,6,N starting...
0000001283 0000001306 0000050243 0000050243 0000000001 0000000026 0000000000
testloa2 Thu 07/16/2015 13:13:04.28 1,6,N ended.
testloa2 Thu 07/16/2015 13:13:08.54 1,6,Y starting...
0000002323 0000001203 0000050244 0000050244 0000000002 0000000002 0000050244
testloa2 Thu 07/16/2015 13:13:13.09 1,6,Y ended.


which also explains why the dft option is just as fast (in my case, classes
3 and 6 had the rc2's, forcing W calls for the remainder of each class),
but since I get no errors (rc=2) in the class 1 A calls, I don't know where
to go from here, other than to just try to avoid W calls in class 1 when
possible, as I ended up doing by coincidence.

  Hopefully, W calls for files don't have whatever this is. Perhaps my
systems are just missing a performance fix.  I'll search around. Thankx...


dedndave

HKEY_CLASSES_ROOT is a virtualized hive
it's created by overlaying several other keys
(for example, HKEY_LOCAL_MACHINE\Software\Classes and HKEY_CURRENT_USER\Software\Classes)

in other words, HKCR doesn't really exist, per se
it is created at boot time

so - perhaps you could increase the size of virtual memory to create a larger working buffer
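
To make the overlay idea concrete, here is a toy Python model of enumerating a merged view of two subkey lists. It assumes case-insensitive key names and that the per-user copy wins when a key exists in both places; the function name `merged_subkeys` and the sample key names are invented, and this is not the actual Windows implementation.

```python
def merged_subkeys(per_user: list[str], per_machine: list[str]) -> list[str]:
    """Merge two subkey lists into one sorted view without duplicates."""
    user = sorted(per_user, key=str.upper)
    machine = sorted(per_machine, key=str.upper)
    merged, i, j = [], 0, 0
    while i < len(user) and j < len(machine):
        u, m = user[i].upper(), machine[j].upper()
        if u == m:                 # key exists in both: the per-user copy wins
            merged.append(user[i])
            i += 1
            j += 1
        elif u < m:
            merged.append(user[i])
            i += 1
        else:
            merged.append(machine[j])
            j += 1
    merged.extend(user[i:])        # whatever is left on either side
    merged.extend(machine[j:])
    return merged

hkcu_classes = [".txt", "MyApp.Document"]   # made-up per-user keys
hklm_classes = [".TXT", ".bmp", "CLSID"]    # made-up per-machine keys
print(merged_subkeys(hkcu_classes, hklm_classes))
```

The point of the sketch is that each step of a merged enumeration has to compare entries from both sources, so it does more work than enumerating a single real key.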

MtheK

  I think I may have found the problem via 'WinDbg wt'.

  I set a BP where I detected 4 "high"s in a row (see below), g, wait
for break, stepped thru, then did 'wt's, which, for EnumW, would just
go on and on and on (w/o -nc) as compared to EnumA, which had its
own lags (1st 2 a second or so) but didn't go on and on and on:



         CALL  REGENUMW
507778 instructions were executed in 507777 events (0 from other threads)

most instructions:
kernel32!EnumStateChooseNext                          1  165010  165010  165010
kernel32!EnumClassKey                              3837      30      30      30

Calls  System Call
3839  ntdll!KiFastSystemCall


         CALL  REGENUM
32384 instructions were executed in 32383 events (0 from other threads)

worst human-detected ("clocked") lags:
ntdll!RtlAllocateHeap                                 2      44      44      44
ntdll!RtlFreeHeap                                     2      44      44      44

most instructions:
ntdll!RtlFillMemoryUlong                              2    8199    8203    8201
ntdll!RtlCompareMemoryUlong                           1    8206    8206    8206
ntdll!_SEH_prolog4                                    9      21      21      21

Calls  System Call
    3  ntdll!KiFastSystemCall

It was in key CLSID\{xxx}.


  Stepping thru this produced some VERY interesting results:


772f6d5f e830fdffff      call    kernel32!EnumSubtreeStateClear (772f6a94)
772f6d64 8bc6            mov     eax,esi
772f6d66 5f              pop     edi
772f6d67 5e              pop     esi
772f6d68 5b              pop     ebx
772f6d69 5d              pop     ebp
772f6d6a c21c00          ret     1Ch
772f6d6d 90              nop
772f6d6e 90              nop
772f6d6f 90              nop
772f6d70 90              nop
772f6d71 90              nop
kernel32!EnumStateChooseNext:

Breakpoint 3 hit
eax=0012fb00 ebx=0012fb34 ecx=0012faec edx=00000004 esi=001ce1a0 edi=00000f01
eip=772f6d72 esp=0012fab8 ebp=0012fae0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
kernel32!EnumStateChooseNext:
772f6d72 8bff            mov     edi,edi

772f6d74 55              push    ebp
772f6d75 8bec            mov     ebp,esp
772f6d77 51              push    ecx
772f6d78 8b4510          mov     eax,dword ptr [ebp+10h]
772f6d7b 8365fc00        and     dword ptr [ebp-4],0

stack trace:
0012fab4 772f6d2e 001ce1a0 00000f01 00000f01 kernel32!EnumStateChooseNext (FPO: [Non-Fpo])

0:000> !vprot 0012fb00
BaseAddress:       0012f000
AllocationBase:    00030000
AllocationProtect: 00000004  PAGE_READWRITE
RegionSize:        00001000
State:             00001000  MEM_COMMIT
Protect:           00000004  PAGE_READWRITE
Type:              00020000  MEM_PRIVATE

0012fab8 2e 6d 2f 77 a0 e1 1c 00 01 0f 00 00 01 0f 00 00 02 0f 00 00 01  .m/w.................
0012facd 00 00 00 00 fb 12 00 00 00 00 00 d8 65 37 77 dc 65 37 77 14 fb  ............e7w.e7w..
0012fae2 12 00 c1 6c 2f 77 01 0f 00 00 02 0f 00 00 01 00 00 00 5c fb 12  ...l/w............\..
0012faf7 00 20 01 00 00 54 fb 12 00 01 00 00 00 d8 fc 12 00 5c fb 12 00  . ...T...........\...
0012fb0c e0 fc 12 00 00 00 00 00 80 fc 12 00 3e 6c 2f 77 a0 e1 1c 00 3a  ............>l/w....:
0012fb21 00 00 00 01 0f 00 00 01 00 00 00 5c fb 12 00 20 01 00 00 01 00  ...........\... .....
0012fb36 00 00 00 00 00 00 94 f1 40 00 40 e7 40 00 00 00 00 00 40 e7 40  ........@.@.@.....@.@

then, it does this loop, on and on and on:
0:000> pct
eax=0012fd2c ebx=00000001 ecx=00000001 edx=77d96344 esi=001ce1a0 edi=8000001a
eip=7732e588 esp=0012fccc ebp=0012fce0 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
kernel32!EnumStateChooseNext+0x3e:
7732e588 e80785fcff      call    kernel32!EnumSubtreeStateClear (772f6a94)
0:000>
eax=001ce2f8 ebx=00000001 ecx=0000003a edx=77d96344 esi=001ce1a0 edi=8000001a
eip=772f6ddf esp=0012fcc8 ebp=0012fce0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
kernel32!EnumStateChooseNext+0xb8:
772f6ddf e859000000      call    kernel32!EnumClassKey (772f6e3d)
0:000>
eax=00000000 ebx=00000001 ecx=0012fc94 edx=77d96344 esi=001ce1a0 edi=00000000
eip=772f6e0a esp=0012fcc4 ebp=0012fce0 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000206
kernel32!EnumStateChooseNext+0x116:
772f6e0a e8be030000      call    kernel32!EnumStateCompareSubtrees (772f71cd)

until end of loop (takes awhile):
0:000> pt
eax=00000000 ebx=0012fd60 ecx=00000001 edx=77d96344 esi=001ce1a0 edi=00000f0a
eip=772f6e35 esp=0012fce4 ebp=0012fd0c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
kernel32!EnumStateChooseNext+0x145:
772f6e35 c21800          ret     18h
0:000> p
eax=00000000 ebx=0012fd60 ecx=00000001 edx=77d96344 esi=001ce1a0 edi=00000f0a
eip=772f6d2e esp=0012fd00 ebp=0012fd0c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
kernel32!EnumStateGetNextEnum+0x42:
772f6d2e 33c9            xor     ecx,ecx

and shortly back to my program:
eax=00000000 ebx=00413000 ecx=772f88d1 edx=00000000 esi=00000000 edi=00000000
eip=772f88d1 esp=0012ff14 ebp=0012ff94 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
kernel32!RegEnumKeyExW+0x177:
772f88d1 c22000          ret     20h
0:000>
eax=00000000 ebx=00413000 ecx=772f88d1 edx=00000000 esi=00000000 edi=00000000
eip=00404332 esp=0012ff38 ebp=0012ff94 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
REGTODAY!mainCRTStartup+0x3332:
00404332 e879000000      call    REGTODAY!mainCRTStartup+0x33b0 (004043b0)

  Then, if I remove the latter BP and just go thru my own BP for
EnumW, holding down g (PF5) for awhile, then do 'wt' again, it's back
to "normal" (IE: like EnumA in # of system calls). The key is still
in CLSID\, but different names in brackets.

  But wait, there's more. What's behind Door#3? "A watery grave",
courtesy ENigma! If I hold PF5 down some more, then do 'g ; wt -nc',
WHAM!!! BAM!!! It's back to VERY SLOW and thousands of system calls
(about a minute or so...zzzzzzzzzzzzz...hey, wake up...sorry, you
must have been boring)! Still in CLSID\. Again, then normal. Again,
then slow (maybe after a few (or more!) PF5's this time).  Then
normal. Then slow.  And it's not just CLSID\ any more.
Yadda-yadda-yadda! It's almost(!) funny! Lovely!

  So, I split my 2 EnumW calls into 2 separate buckets and added a
count of the # of "high"s instead of crashing. Now, it shows that:
. the calls using the pre-set key (ie: HKEY_CLASSES_ROOT)
  . are the fewest
  . while ticks are lower
  . but has the highest average and "high"s
. the deeper-down calls (using a handle from RegOpenKeyExA or W)
  . are 17x more
  . while ticks are 7x more
  . with the average 3050x less and 740x less "high"s.
While g'ing thru each pre-set EnumW call, I noticed that the # of
System Calls kept increasing, so, probably, all remaining successive
calls got progressively worse.

  From an ADMIN Y \7 class1 run:

   ticks             total        min        max    QPC min        max        avg
EnumW deeper   0000405878-0000000000-0000000499@0000000005-0001086314-0000000008
EnumW pre-set  0000055465-0000000016-0000000530@0000000016-0001161058-0000024399

EnumA (base)   0000005223-0000000000-0000000188@0000000006-0000384590-0000000037

   counts
EnumA          0000090560-0000090560-0000000000-0000000000

                              "high"s
EnumW deeper   0000085615-0000000003
EnumW pre-set  0000004945-0000002221

  While I was at it (and since MS didn't answer this question that
someone asked), I tried to open the pre-set key, but all it did was
to return the same thing I passed into it. I tried a run anyway,
adding a close at the end, but it made no difference.


###

  PROCMON didn't help much (when it didn't crash!). There are these
"abnormal" entries scattered in the trace, for both the A and W
calls:


REGTODAY.exe    3252    1464    RegQueryKeySecurity     BUFFER TOO SMALL        HKCR\*\OpenWithList             0.0000027       1452    5484    7:22:33.7231937 AM
7       REGTODAY.exe    mainCRTStartup + 0x367a, REGTODAY.ASM(2538)     0x40467a       
00003675  E8 00000000 E   *        call   RegQueryInfoKeyA
                             1           INVOKE RegQueryInfoKey,      ; top-level subkey
                             1                 KYHANDLE,
                             1                 OFFSET REGCLASS,
                             1                 OFFSET REGCLASSL,
                             1                 NULL,
                             1                 OFFSET REGSUBKEYS,
                             1                 OFFSET REGMSUBKEYL,    ; size of largest subkey
                             1                 OFFSET REGMCLASSL,
                             1                 OFFSET REGVALUES,
                             1                 OFFSET REGMVALUENL,
                             1                 OFFSET REGMVALUEL,
                             1                 OFFSET REGSECDESC,
                             1                 OFFSET SLEEPONEOF       ; is GMT
0000367A  0B C0           *        or  eax, eax


but these internal calls don't appear to come back to my program as,
like the below VALUE change, I terminate w/any un-expected error
(only accept SUCCESS and NO_MORE_ITEMS).

  Otherwise, both A and A/W runs appear the same; perhaps the
timings-ratio of keys are noticeable, but on thousandths, it's hard
to quantify. Any interference is possible (a balancing algo; see
below?), but that usually averages out statistically, which is what
I report on.

###

  I forced crashes when the W QPC diff reached 4 consecutive "high"s
(10,000), but w/o a "SYSTRACE" (a la IBM MVS; a system programmers'
best friend. Does this even exist on the PC? Basically it would be an
internal enhanced PROCMON trace for all PIDs.  Maybe in a BSOD
.dmp?), I can't tell what they did specifically.
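
The trigger described above (stop after 4 consecutive "high" QPC differences) can be sketched in Python as follows. The threshold of 10,000 and the run length of 4 come from the text; treating a value equal to the threshold as "high", the function name `scan` and the sample data are assumptions.

```python
HIGH = 10_000   # QPC-tick threshold for a "high" call
RUN = 4         # consecutive "high"s that trigger the stop

def scan(samples: list[int]):
    """Return (min, max, avg, number of highs, index where the first run of RUN highs completes)."""
    highs = streak = 0
    first_run_end = None
    for i, ticks in enumerate(samples):
        if ticks >= HIGH:
            highs += 1
            streak += 1
            if streak == RUN and first_run_end is None:
                first_run_end = i
        else:
            streak = 0             # any normal call breaks the run
    avg = sum(samples) // len(samples)
    return min(samples), max(samples), avg, highs, first_run_end

samples = [7, 12000, 8, 15000, 11000, 30000, 10500, 6]   # made-up QPC deltas
print(scan(samples))
```

Counting the "high"s instead of stopping at the first run, as done later for the two EnumW buckets, only needs the `highs` counter from the same loop.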

  Snap dumps of the non-ADMIN non-excluding runs (the worst (see
below)) all showed basically the same as 'wt', in EnumW:


0:000> .frame /r 0
00 0012fb2c 77c34b9c ntdll!KiFastSystemCallRet
eax=00000000 ebx=77c34b90 ecx=0012fb30 edx=77c36344 esi=006e6fe0 edi=006e6fe4
eip=77c36344 esp=0012fb30 ebp=0012fb2c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCall:
77c36340 8bd4            mov     edx,esp
77c36342 0f34            sysenter
ntdll!KiFastSystemCallRet:
77c36344 c3              ret

0012fb30 764e6e68 00000066 00001716 00000001 ntdll!ZwEnumerateKey+0xc
0:000> .frame /r 1
01 0012fb30 764e6e68 ntdll!ZwEnumerateKey+0xc
eax=00000000 ebx=77c34b90 ecx=0012fb30 edx=77c36344 esi=006e6fe0 edi=006e6fe4
eip=77c34b9c esp=0012fb34 ebp=0012fb5c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!NtEnumerateKey:
77c34b90 b874000000      mov     eax,74h
77c34b95 ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
77c34b9a ff12            call    dword ptr [edx]
77c34b9c c21800          ret     18h


###

;MAX_VALUE_NAME EQU 16383+1         ; 16K (PROD) (Key)
MAX_VALUE_NAME EQU (2*(16383+1))-2  ; works OK
;MAX_VALUE_NAME EQU (2*(16383+1))-1 ; gets rc234 (ERROR_MORE_DATA) on 1st call

  as stupid as this sounds, maybe x'7FFF' (+ASCIIZ) wraps to 0
    (15bit) so it thinks the user wants the size? No max specified in
    API doc, but something similar IS in RegEnumValue/lpcchValueName
    doc (SHORT):


00001177 00007FFF               REGNAMEL    DWORD MAX_VALUE_NAME
         MOV   REGNAMEL,MAX_VALUE_NAME     ; reset length if chg'd by prev call
         INVOKE RegEnumKeyEx,         ; top-level subkeys
               REGKEY,
               REGINDEX,              ; record#
               OFFSET REGNAME,
               OFFSET REGNAMEL,       ; this gets set to # of CHARs (4 bytes)
               NULL,
               OFFSET REGCLASS,
               OFFSET REGCLASSL,
               OFFSET SLEEPONEOF
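
The arithmetic behind the two MAX_VALUE_NAME settings, and the wrap-around guess made above, can be checked with a few lines of Python. The 15-bit mask is only that guess written out, not documented API behaviour.

```python
ok_len = (2 * (16383 + 1)) - 2    # MAX_VALUE_NAME setting that works
bad_len = (2 * (16383 + 1)) - 1   # setting that gets rc 234 (ERROR_MORE_DATA)

print(hex(ok_len), hex(bad_len))

# Guess being illustrated: length + 1 (for the terminator) kept in only 15 bits.
MASK_15 = 0x7FFF
ok_plus_nul = (ok_len + 1) & MASK_15
bad_plus_nul = (bad_len + 1) & MASK_15
print(ok_plus_nul, bad_plus_nul)
```

Under that guess the working value stays at 0x7FFF after adding the terminator, while the failing value wraps to 0, which would look like a request for the required size.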


;MAX_KEY_LENGTH EQU 255+1           ; (PROD) (Class)
MAX_KEY_LENGTH EQU (4*(16383+1))-1  ; works OK but is slower

actual stg is in .data? (.map):
; 0003:00002000 000cc800H (837632) .bss                    DATA  (PROD)

0003:00002000 000f2800H (993280) .bss                    DATA

In any case, if a value per RegQueryInfoKey/lpcMaxSubKeyLen is
bigger than my VALUE buffer, I terminate w/an error.

###

  Based on various tests, 16-folding the VALUE buffers (max allowed?
did it even use/need it?) and changing the KEY buffers (256x, then
same), didn't change TICKIO, but made TICKDIFF (my time) HUGELY (24x,
then 8x) slower! This was on me tho. Back then, I nulled my buffers
before the calls in case this would fix my unicode problem, which it never
did.  Since I use the length anyway, now, there isn't any reason to
null them any more, and now, this HUGE increase of mine no longer
exists w/bigger buffers, so this suggestion helped tune my program;
thankx.

  After splitting my 6 cmds (Open,Close,Enum2&1 w/W&A) into their
own buckets, and VALUE and KEY buffers back to PROD, RegEnumKeyExW is
still clearly the offender in all Y tests for whatever reason.

  My final results for the virtualized class 1:


no exclusions (ie: ignoring VirtualStore):

ADMIN,     \7, N:
testloa2 Thu 07/23/2015 15:04:20.25 QCD3 7, ,1,N starting...
testloa2 Thu 07/23/2015 15:04:37.83 QCD3 7, ,1,N ended.
0000005302 0000000000 0000000000-0000000000-0000000000 0000007373-0000000000-0000000171 0000000000-0000000000-0000000000 0000000673-0000000000-0000000078 0000000000-0000000000-0000000000@0000000000-0000000000-0000000000 0000000000-0000000000-0000000000@0000000000-0000000000-0000000000 0000003749-0000000000-0000000140@0000000006-0000280705-0000000017/0000090560-0000090560-0000000000-0000000000 0000000000-0000000000-0000000000-0000000000-0000000000
     #ticks   TICKLOST    #ticks1=OpenW  min        max    #ticks2=OpenA                    #ticks3=CloseW (N/A)             #ticks4=CloseA                   #ticks5=EnumW2                    QPCmin     QPCmax     QPCavg    #ticks7=EnumW1                    QPCmin     QPCmax     QPCavg    #ticks6=EnumA                     QPCmin     QPCmax     QPCavg    #A Open    #A Enum     #A bad     #non-A    #W Open   #W Enum2    #"high"   #W Enum1    #"high"

ADMIN,     \7, Y:
testloa2 Thu 07/23/2015 15:04:51.42 QCD3 7, ,1,Y starting...
testloa2 Thu 07/23/2015 15:12:24.08 QCD3 7, ,1,Y ended.
0000063654 0000000000 0000000000-0000000000-0000000000 0000031420-0000000000-0000002402 0000000000-0000000000-0000000000 0000001059-0000000000-0000000140 0000349605-0000000000-0000000499@0000000005-0001095093-0000000007 0000057519-0000000016-0000000359@0000000017-0000782219-0000023871 0000006556-0000000000-0000000172@0000000006-0000396028-0000000019/0000090560-0000090560-0000000000-0000000000 0000000000-0000085615-0000000003-0000004945-0000002258

exclusions:

ADMIN,     \7, Y
testloa2 Thu 07/23/2015 15:17:19.34 QCD3 7, ,1,Y starting...
testloa2 Thu 07/23/2015 15:25:57.57 QCD3 7, ,1,Y ended.
0000078680 0000000000 0000000000-0000000000-0000000000 0000035290-0000000000-0000002278 0000000000-0000000000-0000000000 0000001527-0000000000-0000000187 0000396320-0000000000-0000000406@0000000005-0000893787-0000000007 0000070642-0000000015-0000000359@0000000015-0000778331-0000015479 0000006168-0000000000-0000000188@0000000006-0000384625-0000000018/0000090515-0000090516-0000000000-0000000000 0000000000-0000085571-0000000003-0000004945-0000002112

no exclusions

non-ADMIN, \7, Y (the worst of all)
testloa2 Thu 07/23/2015 15:29:13.89 QCD3 7, ,1,Y starting...
testloa2 Thu 07/23/2015 15:55:30.00 QCD3 7, ,1,Y ended.
0000107167 0000000000 0000000420-0000000000-0000000125 0000146745-0000000000-0000000905 0000000000-0000000000-0000000000 0000003442-0000000000-0000000203 0001290367-0000000000-0000000405@0000000005-0000902225-0000000008 0000067417-0000000000-0000000327@0000000016-0000716190-0000054362 0000027656-0000000000-0000000234@0000000006-0000525967-0000000017/0000204537-0000204537-0000000076-0000000000 0000002027-0000199592-0000001679-0000004945-0000002198

NOTE: Occasionally, a Y \7 run takes 1/4th as long (4x Cycles Delta).
I believe it's due to the Dynamic TIDs' PRI (PROCEXP) occasionally
changing from 8 (Normal) to 9 and back. Perhaps a "balancing algo"
gets temp-"stuck"?


Interesting that the OpenA and EnumA calls are also seemingly
affected by doing EnumW calls, kinda like quantum entanglement.
And what's up with the non-ADMIN non-excluding Y runs???!!!
The worst of all! In VirtualStore? Triple what's ALREADY many-fold
when I do NOT exclude it. Geez!

  Unless I'm doing something wrong (results match RegEdt32 whether or
not I do W calls), at least it seems that, based on your 64bit C++
run, a performance fix for this may exist somewhere, so that's good.
Maybe it's just 32bit? If anyone knows a KB#, that would be nice.

  Thankx to all who helped.

rrr314159

All this reminds me what a poorly designed OS Windows is; except the parts they stole from Unix and MacOS, 30 years ago when Bill was one of the best crooks in the business, and not a bad programmer. The registry is a disaster, those values belong in (human-readable ASCII) .ini files.
I am NaN ;)

jj2007

Quote from: rrr314159 on July 25, 2015, 02:02:26 AM
The registry is a disaster, those values belong in (human-readable ASCII) .ini files.

3rPi,

Although I fully agree, you exaggerate, as usual: A Google search for windows "registry" pile of shit yields a miserable 2,190,000 results. So it can't be that bad 8)

I wonder if it's fair use to quote such a long para from a former Windows developer?

Quote
Disclaimer: I worked on Windows 98, 98SE, and 2000. I quit before having to spend much time working on ME, so that one's not my fault.

Windows definitely had a lot of warts, and we all acknowledged it. But we were all pretty passionate about trying to make it as good as we possibly could. It's not a trivial problem to solve.

The biggest culprits were typically bad drivers (yeah, the pre-WDM driver model sucked, but even I managed to write VXDs that didn't crash). Those were exceptionally adept at taking out the system - usually by getting confused and jumping off into space like some kind of Worst of Star Trek episode.

The other culprit was bad hardware. The amount of hardware that simply flagrantly violated the bus PCI/ISA/etc bus specs amazed us. Fully 30% of a certain core system component was code that was implemented to work around a certain hardware vendor's crappy hardware.

The final culprit was bad software. Especially virus checkers - don't bother running a kernel debugger and a virus checker at the same time. There was so much major software out there that relied on bugs or undefined behavior that parts of the OS had to detect the crap software and make specific exceptions just so the app wouldn't crash.

The only people I really don't blame are end users that installed a ton of crappy software and couldn't figure out why their computer didn't work. They'd usually been sold a bill of sale by the worst snake oil salesmen who made cherry flavored cyanide, then blamed the bottle manufacturer when people died drinking it.

That, friends, is the reality of being a company trying to sell a product that's as widely compatible as possible. You make compromises, swallow your pride, and make shit work as well as possible.

And no one defends the registry. Even the VP of the Windows division once publicly said, basically "we screwed that one up pretty well."

rrr314159

Windows registry pile of crap - 3,890,000
Windows registry load of shit - 20,300,000 (!)

Keep going, you'll be up to 7.1 billion b4 2 long

It's all very well and good to admit they "screwed up that one" re. the Registry but you simply can't have good software with such a mistake in such a central role. It's like saying someone is very healthy except, admittedly, for the pancreatic cancer

[edit] actually I'm using "Bing" - got pukably sick of Google's little cartoons - so your mileage may vary

MtheK

  Interesting that, on Win10, with no W calls, about 150K entries,
a run takes 30-40 seconds, but WITH W calls, it takes 30-40 MINUTES!?
Very strange...perhaps C++ and MASM 32bit console programs are treated
differently, possibly similar to my return code problem?

Vortex

Quote from: rrr314159 on July 25, 2015, 02:02:26 AM
All this reminds me what a poorly designed OS Windows is; except the parts they stole from Unix and MacOS, 30 years ago when Bill was one of the best crooks in the business, and not a bad programmer. The registry is a disaster, those values belong in (human-readable ASCII) .ini files.

Hi NaN,

I agree with you. The registry is total crap. Even the Windows boot configuration data ( BCD ) is a registry hive. Why not plain text files for all the configurations? Easier to understand and modify.

hutch--

Registry = disaster. Make anything installed in the registry immovable.

Registry = security leak. How to find what software is installed, search the registry.

Private format INI files in the same directory are safe, reliable and portable.

I have never trusted the registry and have never written an app that uses it.

rrr314159

Quote from: Vortex
Why not plain text files for all the configurations?

- M$ wants Windows to be hard to understand and modify so you have to pay them for upgrades etc, and they can charge companies outrageous consultant fees ...

Quote from: Vortex
Easier to understand and modify.

- you've answered your own question ;)

npnw

Hutch,

Similar to what Unix, linux was doing.... probably why Microsoft didn't do it. Although bad design choice.