Unaligned memory copy test piece.

Started by hutch--, December 06, 2021, 08:34:55 PM


LiaoMi

Quote from: jj2007 on December 16, 2021, 01:17:53 AM
WMIC works, but GetLogicalProcessorInformation returns rubbish :sad:

---------------------------
GetLogicalProcessorInformation output:
---------------------------
eax 0

pinfo.NodeNumber 0

b:pinfo.Cache.Level 00000000

b:pinfo.Cache.LineSize 0000000000000000

b:pinfo.Cache._Size 00000000000000000000000000000000

b:pinfo.Cache._Type 00000000000000000000000000000000

$Err$() The data area passed to a system call is too small.__


---------------------------
OK   Cancel   
---------------------------

jj2007

I've seen this error, but 400 bytes works on Win7-64. Try version 2, with a 1000-byte buffer
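The "data area passed to a system call is too small" message is the text of ERROR_INSUFFICIENT_BUFFER. Rather than guessing 400 or 1000 bytes, the documented pattern is to let GetLogicalProcessorInformation report the required length and then retry. Below is a minimal C sketch of that pattern, not the code from the attachment. The names `query_fn`, `query_with_retry`, `win_query` and `fake_query` are invented for illustration, and `fake_query` is only a stand-in so the logic can be exercised off Windows.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical callback contract: 1 = success, 0 = buffer too small
   (*len receives the required byte count), -1 = any other failure. */
typedef int (*query_fn)(void *buf, unsigned long *len);

/* Ask for the size first, then allocate and retry (hypothetical helper).
   Returns a malloc'ed buffer of *out_len bytes, or NULL on failure. */
void *query_with_retry(query_fn query, unsigned long *out_len)
{
    unsigned long len = 0;
    void *buf = NULL;
    int tries;

    assert(query != NULL && out_len != NULL);
    for (tries = 0; tries < 4; ++tries) {
        int rc = query(buf, &len);
        void *grown;
        if (rc == 1) {               /* buf now holds len valid bytes */
            *out_len = len;
            return buf;
        }
        if (rc != 0 || len == 0)     /* a real error, not "too small" */
            break;
        grown = realloc(buf, len);   /* grow to the size the call asked for */
        if (grown == NULL)
            break;
        buf = grown;
    }
    free(buf);
    *out_len = 0;
    return NULL;
}

#ifdef _WIN32
/* Thin adapter around the real API. */
int win_query(void *buf, unsigned long *len)
{
    DWORD n = *len;
    BOOL ok = GetLogicalProcessorInformation(
        (PSYSTEM_LOGICAL_PROCESSOR_INFORMATION)buf, &n);
    *len = n;
    if (ok)
        return 1;
    return GetLastError() == ERROR_INSUFFICIENT_BUFFER ? 0 : -1;
}
#endif

/* Stand-in for testing: pretends that 4 records of 24 bytes are needed. */
int fake_query(void *buf, unsigned long *len)
{
    const unsigned long needed = 4 * 24;
    if (buf == NULL || *len < needed) {
        *len = needed;
        return 0;
    }
    memset(buf, 0, needed);
    *len = needed;
    return 1;
}
```

On Windows the call would be `query_with_retry(win_query, &len)`; the returned length divided by the record size gives the number of records to walk.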

LiaoMi

Quote from: jj2007 on December 16, 2021, 03:57:49 AM
I've seen this error, but 400 bytes works on Win7-64. Try version 2, with a 1000-byte buffer

The program closes right after starting; the message is the same under the debugger.

jj2007

It works on my new machine, Win10:

Wmic:
L2CacheSize  L3CacheSize
1024         4096


GetLogicalProcessorInformation output:
eax             1
pinfo.NodeNumber        3
b:pinfo.Cache.Level     00000000
b:pinfo.Cache.LineSize  0000000000000000
b:pinfo.Cache._Size     00000000000000000000000000000001
b:pinfo.Cache._Type     00000000000000000000000000000000
$Err$()         Operazione completata.__

HSE

Hi JJ!

Could this be a problem with WOW64?

What happens with the program in 64 bits?

LATER:
Apparently the structure is different:
typedef struct _SYSTEM_LOGICAL_PROCESSOR_INFORMATION {
  ULONG_PTR                      ProcessorMask;
  LOGICAL_PROCESSOR_RELATIONSHIP Relationship;
  union {
    struct {
      BYTE Flags;
    } ProcessorCore;
    struct {
      DWORD NodeNumber;
    } NumaNode;
    CACHE_DESCRIPTOR Cache;
    ULONGLONG        Reserved[2];
  } DUMMYUNIONNAME;
} SYSTEM_LOGICAL_PROCESSOR_INFORMATION, *PSYSTEM_LOGICAL_PROCESSOR_INFORMATION;
Equations in Assembly: SmplMath

jj2007

Quote from: HSE on December 16, 2021, 11:26:56 AM
Could this be a problem with WOW64?

It does not crash on my Win7-64 and Win10 machines.

Quote: Apparently the structure is different:

I'm afraid my C skills are not sufficient to translate it correctly to MASM syntax :sad:

HSE

Quote from: jj2007 on December 16, 2021, 12:40:02 PM
I'm afraid my C skills are not sufficient to translate it correctly to MASM syntax :sad:

:biggrin: :biggrin: I hope somebody can.

TimoVJL

Modified code from here:
https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getlogicalprocessorinformation
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL3CacheSize: 3145728

GetLogicalProcessorInformation results:
Number of NUMA nodes: 1
Number of physical processor packages: 1
Number of processor cores: 2
Number of logical processors: 4
Number of processor L1/L2/L3 caches: 4/2/1
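The summary above comes from walking the returned records and counting them by relationship type and cache level. The following C sketch shows only that counting step. `Record`, `Totals` and `count_records` are simplified, made-up stand-ins rather than the Windows structures, and the logical-processor count (which the Microsoft sample derives from the bits set in ProcessorMask) is left out. The relationship values 0 to 3 match the LOGICAL_PROCESSOR_RELATIONSHIP enumeration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values of RelationProcessorCore, RelationNumaNode, RelationCache and
   RelationProcessorPackage in LOGICAL_PROCESSOR_RELATIONSHIP. */
enum { REL_CORE = 0, REL_NUMA = 1, REL_CACHE = 2, REL_PACKAGE = 3 };

/* Simplified stand-in record: only the fields needed for counting. */
typedef struct {
    uint32_t Relationship;
    uint8_t  CacheLevel;   /* meaningful only when Relationship == REL_CACHE */
    uint32_t CacheSize;    /* bytes */
} Record;

typedef struct {
    unsigned cores, packages, numa;
    unsigned cache[3];     /* number of L1, L2, L3 records */
} Totals;

/* Count records per relationship type and per cache level. */
Totals count_records(const Record *rec, size_t count)
{
    Totals t = {0};
    size_t i;

    assert(rec != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        switch (rec[i].Relationship) {
        case REL_CORE:    t.cores++;    break;
        case REL_NUMA:    t.numa++;     break;
        case REL_PACKAGE: t.packages++; break;
        case REL_CACHE:
            if (rec[i].CacheLevel >= 1 && rec[i].CacheLevel <= 3)
                t.cache[rec[i].CacheLevel - 1]++;
            break;
        default:          /* other relationship types are ignored */
            break;
        }
    }
    return t;
}
```

Fed with records matching the listing above (four L1, two L2, one L3), the `cache` array comes out as 4/2/1.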

EDIT:

32-bit
SYSTEM_LOGICAL_PROCESSOR_INFORMATION 24 18h bytes
Relationship     +4h 4h
Cache.Level      +8h 1h
Cache.Size       +Ch 4h
64-bit
SYSTEM_LOGICAL_PROCESSOR_INFORMATION 32 20h bytes
Relationship     +8h 4h
Cache.Level      +10h 1h
Cache.Size       +14h 4h
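The offsets in this EDIT can be reproduced with a small C sketch. It uses fixed-width stand-in types instead of windows.h (`CacheDesc`, `InfoUnion`, `Slpi32`, `Slpi64`, `check_layout` and `print_layout` are invented names) and assumes a compiler that aligns 64-bit integers on 8 bytes, as MSVC does. The `Reserved` member is what forces the union onto an 8-byte boundary in the 64-bit layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for CACHE_DESCRIPTOR (12 bytes). */
typedef struct {
    uint8_t  Level;
    uint8_t  Associativity;
    uint16_t LineSize;
    uint32_t Size;
    uint32_t Type;              /* PROCESSOR_CACHE_TYPE enum, 4 bytes */
} CacheDesc;

/* Stand-in for the union; Reserved makes it 16 bytes, 8-byte aligned. */
typedef union {
    struct { uint8_t Flags; } ProcessorCore;
    struct { uint32_t NodeNumber; } NumaNode;
    CacheDesc Cache;
    uint64_t  Reserved[2];
} InfoUnion;

typedef struct {                /* 32-bit build: ULONG_PTR is 4 bytes */
    uint32_t  ProcessorMask;
    uint32_t  Relationship;
    InfoUnion u;
} Slpi32;

typedef struct {                /* 64-bit build: ULONG_PTR is 8 bytes */
    uint64_t  ProcessorMask;
    uint32_t  Relationship;     /* followed by 4 bytes of padding */
    InfoUnion u;
} Slpi64;

/* Aborts if the compiler's layout differs from the table in the post. */
void check_layout(void)
{
    assert(sizeof(Slpi32) == 0x18 && sizeof(Slpi64) == 0x20);
    assert(offsetof(Slpi32, Relationship)  == 0x4);
    assert(offsetof(Slpi32, u.Cache.Level) == 0x8);
    assert(offsetof(Slpi32, u.Cache.Size)  == 0xC);
    assert(offsetof(Slpi64, Relationship)  == 0x8);
    assert(offsetof(Slpi64, u.Cache.Level) == 0x10);
    assert(offsetof(Slpi64, u.Cache.Size)  == 0x14);
}

/* Prints the record size and the Cache.Size offset for both layouts. */
void print_layout(void)
{
    printf("32-bit: %u bytes, Cache.Size at +%Xh\n",
           (unsigned)sizeof(Slpi32),
           (unsigned)offsetof(Slpi32, u.Cache.Size));
    printf("64-bit: %u bytes, Cache.Size at +%Xh\n",
           (unsigned)sizeof(Slpi64),
           (unsigned)offsetof(Slpi64, u.Cache.Size));
}
```

A MASM translation has to match the layout for its own bitness: a 32-bit program steps through the buffer in 24-byte records, a 64-bit program in 32-byte records.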
May the source be with you

HSE

Thanks Timo  :thumbsup:

processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 262144
processorL3CacheSize: 6291456

GetLogicalProcessorInformation results:
Number of NUMA nodes: 1
Number of physical processor packages: 1
Number of processor cores: 4
Number of logical processors: 8
Number of processor L1/L2/L3 caches: 8/4/1


Biterider's structure translation is:
  SYSTEM_LOGICAL_PROCESSOR_INFORMATION struct
    ProcessorMask ULONG_PTR ?
    Relationship LOGICAL_PROCESSOR_RELATIONSHIP ?
    union
      struct ProcessorCore
        Flags BYTE ?
      ends
      struct NumaNode
        NodeNumber DWORD ?
      ends
      Cache CACHE_DESCRIPTOR <>
      Reserved ULONGLONG 2 dup (?)
    ends
  SYSTEM_LOGICAL_PROCESSOR_INFORMATION ends

TimoVJL

@Greenhorn, please post your AMD Ryzen 7 3700X results.

These are interesting:
AMD Ryzen™ 5 5600G L3 16 MB
AMD Ryzen™ 7 5700G L3 16 MB


Greenhorn

AMD Ryzen 7 3700X

processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432
processorL1CacheSize: 32768
processorL1CacheSize: 32768
processorL2CacheSize: 524288
processorL3CacheSize: 33554432

GetLogicalProcessorInformation results:
Number of NUMA nodes: 1
Number of physical processor packages: 1
Number of processor cores: 8
Number of logical processors: 16
Number of processor L1/L2/L3 caches: 32/16/16
Kole Feut un Nordenwind gift en krusen Büdel un en lütten Pint.

TimoVJL

I was after this test
http://masm32.com/board/index.php?topic=9691.msg106349#msg106349

EDIT: Columns (1) to (9):
(1) AMD Ryzen 5 3400G with Radeon Vega Graphics (SSE4)
(2) AMD Ryzen 7 3700X 8-Core Processor (SSE4)
(3) AMD Ryzen 9 5950X 16-Core Processor (SSE4)
(4) Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz (SSE4)
(5) Intel(R) Core(TM) i3-10110U CPU @ 2.10GHz (SSE4)
(6) Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
(7) Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (SSE4)
(8) Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (SSE4)
(9) 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz (SSE4)

The first number in each row is the kCycles count on the fastest CPU (the column showing 1,00); the other columns are multiples of it.

                                           kCycles     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
kCycles for 1 * rep movsd                    83453    2,86    2,32    1,00    2,85    2,42    2,89    2,16    2,58    1,04
kCycles for 1 * movlps qword ptr [esi+8*e   131239    2,06    1,31    1,15    2,19    1,91    2,32    1,68    1,73    1,00
kCycles for 1 * movaps xmm0, oword ptr [e   119079    1,98    1,40    1,18    2,35    2,06    2,54    1,70    1,93    1,00
kCycles for 1 * movdqa + movntdq             85881    1,82    1,17    1,00    2,11    2,12    2,42    2,23    1,70    1,16
kCycles for 1 * movdqu + movntdq             87420    1,78    1,16    1,00    2,07    2,15    2,38    2,11    1,67    1,02
kCycles for 1 * movdqu + movntdq + mfence    85314    1,84    1,18    1,00    2,12    2,31    2,44    2,00    1,71    1,09

Greenhorn

Ah, OK ...

AMD Ryzen 7 3700X 8-Core Processor              (SSE4)
++++++++-++9 of 20 tests valid,
252562 kCycles for 1 * rep movsb
193705 kCycles for 1 * rep movsd
172087 kCycles for 1 * movlps qword ptr [esi+8*ecx]
167271 kCycles for 1 * movaps xmm0, oword ptr [esi]
100892 kCycles for 1 * movdqa + movntdq
101051 kCycles for 1 * movdqu + movntdq
100887 kCycles for 1 * movdqu + movntdq + mfence

191904 kCycles for 1 * rep movsb
192633 kCycles for 1 * rep movsd
171663 kCycles for 1 * movlps qword ptr [esi+8*ecx]
163020 kCycles for 1 * movaps xmm0, oword ptr [esi]
100933 kCycles for 1 * movdqa + movntdq
100811 kCycles for 1 * movdqu + movntdq
101287 kCycles for 1 * movdqu + movntdq + mfence

193081 kCycles for 1 * rep movsb
192589 kCycles for 1 * rep movsd
171437 kCycles for 1 * movlps qword ptr [esi+8*ecx]
163055 kCycles for 1 * movaps xmm0, oword ptr [esi]
100982 kCycles for 1 * movdqa + movntdq
100927 kCycles for 1 * movdqu + movntdq
101013 kCycles for 1 * movdqu + movntdq + mfence

191832 kCycles for 1 * rep movsb
192769 kCycles for 1 * rep movsd
171349 kCycles for 1 * movlps qword ptr [esi+8*ecx]
163022 kCycles for 1 * movaps xmm0, oword ptr [esi]
100896 kCycles for 1 * movdqa + movntdq
100825 kCycles for 1 * movdqu + movntdq
101005 kCycles for 1 * movdqu + movntdq + mfence

21 bytes for rep movsb
21 bytes for rep movsd
37 bytes for movlps qword ptr [esi+8*ecx]
42 bytes for movaps xmm0, oword ptr [esi]
44 bytes for movdqa + movntdq
44 bytes for movdqu + movntdq
47 bytes for movdqu + movntdq + mfence


--- ok ---
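The source of the benchmarked loops is not shown in this part of the thread. For readers who want to see what a "movdqu + movntdq + mfence" copy roughly looks like, here is a minimal sketch in C with SSE2 intrinsics. `copy_stream` is an invented name, the loop is not the author's test piece, and it needs an x86 or x86-64 compiler. movntdq requires a 16-byte aligned destination, so only the source may be unaligned.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies n bytes from src (any alignment) to dst (16-byte aligned)
   using unaligned loads and non-temporal stores. */
void copy_stream(void *dst, const void *src, size_t n)
{
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;
    size_t blocks = n / 16;
    size_t i;

    assert(((uintptr_t)dst & 15) == 0);   /* movntdq faults otherwise */
    for (i = 0; i < blocks; ++i) {
        __m128i v = _mm_loadu_si128((const __m128i *)(s + 16 * i)); /* movdqu  */
        _mm_stream_si128((__m128i *)(d + 16 * i), v);               /* movntdq */
    }
    _mm_mfence();   /* make the streaming stores visible before returning */
    memcpy(d + 16 * blocks, s + 16 * blocks, n % 16);   /* remaining tail */
}
```

The non-temporal stores bypass the cache, which fits the near-identical timings of the three movntdq variants in the listing above; `_mm_sfence()` is the lighter fence usually paired with streaming stores, while `_mm_mfence()` mirrors the third test name.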