SHA256-Hash and some "special" characters

Started by cobold, August 23, 2014, 05:42:02 AM

cobold

Hello,

in the last two days I've written a program which calculates the SHA-256 hash for a string or a file (you can choose which you like). I made some tests (NULL string, "a", "abc" and so on) and everything was fine. Then I found a SHA-256 calculator on the web and entered some words, which brings me to the topic:

The hash-values are the same for the online-program and my program.
BUT whenever the string contains a German 'Umlaut' (hope you know what I mean [Ä, Ü and so on]) the hash values are completely different. Also, when there is a CR/LF the hash values don't match.

Now, I don't think that there's something wrong with my program - I suppose that it has something to do with character encoding:
For some reason, the website encodes umlauts and carriage return/linefeeds in a different way than my program does.

What can I do to solve this? Any hint is appreciated! thanks guys

FORTRANS

Hi,

   Many web sites are on Unix/Linux servers.  Windows uses the
CR/LF pair to end a line of text in a text file. Linux uses a single LF
to do the same thing.  You could hash a string without a CR to
see if it then matches.

Regards,

Steve N.
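
A quick way to check the line-ending idea is to hash the same text once with CR/LF and once with a bare LF. The following is only a minimal sketch in Python, not cobold's program; the helper names `sha256_hex` and `normalize_newlines` and the sample text are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as uppercase hex."""
    return hashlib.sha256(data).hexdigest().upper()

def normalize_newlines(data: bytes) -> bytes:
    """Collapse Windows CR/LF pairs to a single LF (hypothetical helper)."""
    return data.replace(b"\r\n", b"\n")

# Same visible text, different bytes at the line break.
windows_text = b"line one\r\nline two"
unix_text = b"line one\nline two"

# The extra CR byte changes the input, so the digests differ.
print(sha256_hex(windows_text) == sha256_hex(unix_text))  # False

# After normalizing the line endings the inputs are byte-identical.
print(sha256_hex(normalize_newlines(windows_text)) == sha256_hex(unix_text))  # True
```

If the web page normalizes line endings before hashing (an assumption, the thread does not confirm it), the same normalization on the local side should make the two results agree.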

dedndave

characters with umlauts are also used in languages other than German
maybe the code page has something to do with how they are stored   :biggrin:

cobold

Hi,

thanks for the answers, guys. Yes, it's the character encoding. In the meantime I have tested some binary files and the hashes always match.

rgds

dedndave

well - character encoding may not be the whole solution
but, it may be a clue
maybe it's hashed as words instead of bytes - something like that
also - the UTF file type may have something to do with it

Gunther

Hi cobold,

do you use UTF8 encoding for the strings with umlauts?

Gunther
You have to know the facts before you can distort them.

qWord

I guess that these online SHA-256 calculators are implemented in JavaScript, which (AFAIK) uses UTF-16 and UTF-8 encoding per the ECMAScript standard. So, as already said, I would also try it with UTF-8 instead of byte-sized characters.
MREAL macros - when you need floating point arithmetic while assembling!
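
To see why the encoding matters, it may help to look at the bytes that actually reach the hash function. The sketch below (Python, with a hypothetical helper `digest_for`) encodes the single character Ä in a few encodings and hashes each result. Which encodings the web calculator and cobold's program really use is not stated in the thread, so the list is only an assumption about likely candidates on a Western European Windows system.

```python
import hashlib

def digest_for(text: str, encoding: str) -> tuple:
    """Encode text, then return (raw bytes, uppercase SHA-256 hex digest)."""
    raw = text.encode(encoding)
    return raw, hashlib.sha256(raw).hexdigest().upper()

# Candidate encodings (assumed): Windows ANSI, OEM console, UTF-8, UTF-16.
ENCODINGS = ("cp1252", "cp850", "utf-8", "utf-16-le")

# U+00C4 is the letter A with umlaut.
for enc in ENCODINGS:
    raw, digest = digest_for("\u00C4", enc)
    print(f"{enc:10} {raw.hex(' ').upper():8} {digest}")
```

Each encoding yields a different byte sequence for Ä (for example C4 in cp1252 but C3 84 in UTF-8), so each line shows a different digest. Plain ASCII text such as "abc" is byte-identical in cp1252, cp850 and UTF-8, which would explain why those test strings matched.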

MichaelW

I don't know anything about the 'Umlaut' or encoding problems, but did you test your implementation against the Windows Cryptographic Service Provider?

;==============================================================================
    include \masm32\include\masm32rt.inc
    include \masm32\include\advapi32.inc
    includelib \masm32\lib\advapi32.lib
;==============================================================================

    ;ALG_CLASS_HASH  equ 4 SHL 13
    ;ALG_TYPE_ANY    equ 0
    ;ALG_SID_SHA_256 equ 12
    ;CALG_SHA_256    equ ALG_CLASS_HASH or ALG_TYPE_ANY or ALG_SID_SHA_256
    CALG_SHA_256 equ 0000800ch

;==============================================================================
    .data
        hProv       dd 0
        hHash       dd 0
        dwDataLen   dd 0
        ;----------------------------------------------------------------------
        ; Test sample from:
        ; http://csrc.nist.gov/groups/ST/toolkit/documents/Examples/SHA256.pdf
        ;----------------------------------------------------------------------
        szSample    db "abc",0
        szExpected  db "BA7816BF8F01CFEA414140DE5DAE2223B00361A3"
                    db "96177A9CB410FF61F20015AD",0
    .code
;==============================================================================
start:
;==============================================================================

    invoke CryptAcquireContext, ADDR hProv,
                                NULL,
                                NULL,
                                PROV_RSA_AES,
                                CRYPT_VERIFYCONTEXT
    .IF eax == 0
        printf("%s\n",LastError$())
    .ENDIF

    invoke CryptCreateHash, hProv, CALG_SHA_256, 0, 0, ADDR hHash
    .IF eax == 0
        printf("%s\n",LastError$())
    .ENDIF

    ; Hash only the 3 characters of "abc"; SIZEOF also counts the terminating zero.
    invoke CryptHashData, hHash, ADDR szSample, SIZEOF szSample - 1, 0
    .IF eax == 0
        printf("%s\n",LastError$())
    .ENDIF

    ; First call with a NULL buffer only returns the digest size in dwDataLen.
    invoke CryptGetHashParam, hHash, HP_HASHVAL, NULL, ADDR dwDataLen, 0
    .IF eax == 0
        printf("%s\n",LastError$())
    .ENDIF
    printf("%d\n",dwDataLen)
    mov edi, alloc(dwDataLen)

    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    .IF eax == 0
        printf("%s\n",LastError$())
    .ENDIF

    printf("%s\n",ADDR szSample)
    printf("%s\n",ADDR szExpected)

    xor ebx, ebx
    .WHILE ebx < dwDataLen
        movzx edx, BYTE PTR [edi+ebx]
        printf("%02X", edx)
        inc ebx
    .ENDW
    printf("\n\n")

    invoke CryptDestroyHash, hHash

    invoke CryptReleaseContext, hProv, 0

    free edi

    inkey "Press any key to exit..."
    exit
;==============================================================================
end start


32
abc
BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD
BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD


And there is more test data here.
Well Microsoft, here's another nice mess you've gotten us into.

cobold

Yes, I did test that and other test vectors too. It works. Thanks for the sample. Perhaps I wouldn't have done this work if I had known that you can do it from the OS ;-(