Google - HTTP Status code error - 503

Started by guga, April 28, 2019, 09:55:41 AM


guga

Hi guys

I'm writing a translator that uses Google's server to translate a huge text (around 45 MB) from English to Portuguese.

So far I have managed to make it work, but I get a weird HTTP status code 503 error after downloading some chunks of data. How can I overcome this?

The main function I built to download chunks of data from Google (or any other site) is:



Proc XMLDownloadtoFile:
    Arguments @pUrlString, @pSzAgent, @pSzFeedHeaders
    Local @hFeed, @hFeedURL, @XML_Size, @IsMemFilled, @lpdwNumberOfBytesAvailable, @XML_lpdwNumberOfBytesRead, @OldMemBuffer, @StoredBuffer, @CurMem, @StatusCode, @BuffLen, @dwIndex
    Uses edi, ecx, edx, esi, ebx

    mov D@OldMemBuffer 0
    mov D@lpdwNumberOfBytesAvailable 0
    mov D@XML_Size 0
    mov D@IsMemFilled 0
    mov D@XML_lpdwNumberOfBytesRead 0
    mov D@StoredBuffer 0
    mov D@StatusCode 0
    mov D@BuffLen 4
    mov D@dwIndex  0

    ;call 'wininet.InternetOpenA' D@pSzAgent, &INTERNET_OPEN_TYPE_DIRECT, &NULL, &NULL, 0
    call 'wininet.InternetOpenA' D@pSzAgent, &INTERNET_OPEN_TYPE_PRECONFIG, &NULL, &NULL, 0
    On eax = 0, ExitP
    mov D@hFeed eax

    call 'wininet.InternetOpenUrlA' eax, D@pUrlString, D@pSzFeedHeaders, 0-1, &INTERNET_FLAG_RELOAD__&INTERNET_FLAG_PRAGMA_NOCACHE, 0
    ...If eax <> 0
        mov D@hFeedURL eax
        lea ebx D@StatusCode
        lea ecx D@BuffLen
        lea edx D@dwIndex
        call 'wininet.HttpQueryInfoA' eax, &HTTP_QUERY_STATUS_CODE__&HTTP_QUERY_FLAG_NUMBER, ebx, ecx, edx
        If D@StatusCode <> 200

            ; Report Http Status Code Error
            call HttpErrorCode D@StatusCode
            call 'wininet.InternetCloseHandle' D@hFeedURL
            call 'wininet.InternetCloseHandle' D@hFeed
            ;mov eax D@StoredBuffer
            xor eax eax
            ExitP
        End_If
        mov eax D@hFeedURL
        ..While eax <> 0

            lea edx D@lpdwNumberOfBytesAvailable
            call 'wininet.InternetQueryDataAvailable' D@hFeedURL, edx, 0, 0

            ..If D@lpdwNumberOfBytesAvailable = 0
                xor eax eax
            ..Else

                mov ecx D@lpdwNumberOfBytesAvailable
                add ecx D@XML_Size
                ;call AllocateMemory ecx
                mov D@CurMem 0 | lea eax D@CurMem
                call 'RosMem.VMemAlloc' eax, ecx
                mov D@StoredBuffer eax

                .If D@IsMemFilled = &TRUE

                    call CopyMemory eax, D@OldMemBuffer, D@XML_Size
                    add eax D@XML_Size | mov B$eax 0
;                    mov esi D@OldMemBuffer
;                    mov edi eax
;                    While B$esi <> 0 | movsb | End_While
                    ;call FreeMemory D@OldMemBuffer
                    call 'RosMem.VMemFree' D@OldMemBuffer
                    mov D@OldMemBuffer 0

                .End_If

                mov eax D@StoredBuffer
                add eax D@XML_Size
                lea edx D@XML_lpdwNumberOfBytesRead
                call 'wininet.InternetReadFile' D@hFeedURL, eax, D@lpdwNumberOfBytesAvailable, edx
                mov ecx D@XML_lpdwNumberOfBytesRead
                add D@XML_Size ecx
                mov D@IsMemFilled &TRUE
                move D@OldMemBuffer D@StoredBuffer

            ..End_If
        ..End_While

        call 'wininet.InternetCloseHandle' D@hFeedURL
        call 'wininet.InternetCloseHandle' D@hFeed
        mov eax D@StoredBuffer

    ...Else
        call 'wininet.InternetCloseHandle' D@hFeed
        xor eax eax
    ...End_If

EndP

Proc HttpErrorCode:
    Arguments @ErrorValue
    Structure @StringtoAdd 64, @StringtoAdd_DataDis 0
    Uses ebx, ecx, edx

    C_call 'msvcrt.sprintf' D@StringtoAdd, {B$ 'HTTP Error Code value = %d', 0}, D@ErrorValue
    call 'User32.MessageBoxA' &NULL, D@StringtoAdd, { B$ "Connection Error", 0}, &MB_ICONERROR

EndP



Despite its name, XMLDownloadtoFile does not only download XML files; it is used to download any kind of file.


The main problem is that after some chunks of the text have been translated, the server returns this annoying error. I tried to avoid it by calling a sleep function before the main call to XMLDownloadtoFile, like this:



Proc FullTranslate:
    Arguments @pString
    Local @FirstPass, @StringLen, @pReturnTranslatedBuffer, @TmpBuffSize, @TmpOutBuffer, @TranslatedText, @CharsCount, @EndString
    Uses esi, edi, ecx, ebx

    mov D@FirstPass 0
    ; 1st we get the total size of our string to be translated
    call StrLenProc D@pString | mov D@StringLen eax


    ...If D@StringLen <= MAX_GOOGLE_BYTES ; Google's limit is only 5000 chars. If our text is 5000 chars or less, we take this branch and translate it in one request

        ;lea eax D@TranslatedText | mov D@TranslatedText 0
        lea eax D@pReturnTranslatedBuffer | mov D@pReturnTranslatedBuffer 0
        call CreateTextTranslateGoogle D@pString, eax ; <----------- This function contains the buffer allocation routines and the call to XMLDownloadtoFile
        If eax = 0-1
            mov eax D@pReturnTranslatedBuffer
        Else_If eax <> 0

        ;mov esi D@TranslatedText

        ;mov D@pReturnTranslatedBuffer esi
        ;mov eax D@pReturnTranslatedBuffer
            mov eax D@pReturnTranslatedBuffer
        End_If

    ...Else ; <--- Otherwise, if the text contains more than 5000 chars, we translate it in chunks, taking care of sentence boundaries.
            ; <--- That is, when a chunk reaches 5000 chars, we search backward for a "." (end of sentence) and translate only up to that point
        mov D@CharsCount 0

        mov esi D@pString
        shl eax 2 | Align_On 4 eax | mov D@TmpBuffSize eax
        lea eax D@TmpOutBuffer | mov D@TmpOutBuffer 0
        call CreateOutputDataBase eax, D@TmpBuffSize
        ;mov edi eax | mov D@CopyStart edi
        mov edi eax | mov D@pReturnTranslatedBuffer edi
        mov ecx D@StringLen
        mov edx esi | add edx ecx | mov D@EndString edx

        .Do
            lea eax D@TranslatedText | mov D@TranslatedText 0
            call GetTranslationChunck esi, eax, D@EndString ; <----------- This function contains the buffer allocation routines and the call to XMLDownloadtoFile
            If eax = 0
                jmp L9>
            Else_If eax = 0-1 ; <---------------------------------------- This is the result of HTTP status code 503. I set eax to 0-1 only to distinguish the error type and avoid crashing
                call 'RosMem.VMemFree' D@TranslatedText
                mov eax D@pReturnTranslatedBuffer
                ExitP
            End_If
            add esi eax
            sub ecx MAX_GOOGLE_BYTES
            ;add D@CharsCount eax
            call 'KERNEL32.Sleep' 500 ; <----- A sleep to try to avoid the 503 error

            ;..If ecx > MAX_GOOGLE_BYTES
                ;mov eax eax
                mov ebx D@TranslatedText
                .If D@FirstPass <> 0
                    ; The new chunk at ebx starts with '[[["'. Skip the first two '[' so only one '[' remains
                    add ebx 2 ; we are now at the last '[' of the '[[[' prefix
                    ; The previous chunk, already copied before edi, ends with '],null,"en"]'. Go back to that point
                    Do
                        dec edi
                    Loop_Until D$edi = '],nu'

                    ; it must be replaced with ],[
                    mov B$edi "," | inc edi
                .Else
                    mov D@FirstPass 1
                .End_If

                ;call CopyString D@TranslatedText, edi
                call CopyString ebx, edi
                add edi eax

            ;..End_If
            ;call CopyString D@TranslatedText, edi
            ;add edi eax
            ; free the memory allocated for the translated chunk
            call 'RosMem.VMemFree' D@TranslatedText

inc D$DummyTextPass

        ;.Loop_Until ecx =<s MAX_GOOGLE_BYTES;= D@StringLen
        .Loop_Until B$esi = 0
L9:
   ;     mov ecx esi | sub ecx D@pString
        .If B$esi <> 0;ecx > 0
            ; we still have remainder
            lea eax D@TranslatedText | mov D@TranslatedText 0
            call CreateTextTranslateGoogle esi, eax
            If eax = 0
                ExitP
            Else_If eax = 0-1
                call 'RosMem.VMemFree' D@TranslatedText
                mov eax D@pReturnTranslatedBuffer
                ExitP
            End_If
            mov esi D@TranslatedText

            ; The new chunk at esi starts with '[[["'. Skip the first two '[' so only one '[' remains

            add esi 2 ; we are now at the last '[' of the '[[[' prefix
            ; The previous chunk, already copied before edi, ends with '],null,"en"]'. Go back to that point
            Do
                dec edi
            Loop_Until D$edi = '],nu'

            ; it must be replaced with ],[
            mov B$edi "," | inc edi

            call CopyString esi, edi
            call 'RosMem.VMemFree' D@TranslatedText
        .End_If
        ;mov esi D@CopyStart
        mov eax D@pReturnTranslatedBuffer

    ...End_If

EndP
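
The byte-level splicing above (backing up to '],nu' and skipping '[[') could also be done by parsing each reply as JSON and concatenating the translated segments. Here is a minimal sketch in Python rather than RosAsm, assuming each reply has the shape [[["translated","source",...],...],null,"en"] that the code comments describe. The sample replies and the merge_replies name are made up for illustration and are not taken from any Google documentation.

```python
import json

def merge_replies(replies):
    """Concatenate the translated segments of several raw replies.

    Assumes each reply is JSON whose first element is a list of
    segments, and that each segment's first element is the translation.
    """
    parts = []
    for raw in replies:
        data = json.loads(raw)
        for segment in data[0]:
            parts.append(segment[0])
    return "".join(parts)

# Hypothetical replies, shaped like the ones the comments above describe.
reply1 = '[[["Olá mundo. ","Hello world. ",null,null,1]],null,"en"]'
reply2 = '[[["Meu nome é Guga\\r\\n","My name is Guga\\r\\n",null,null,1]],null,"en"]'
merged = merge_replies([reply1, reply2])
```

A side effect of parsing instead of splicing is that escapes such as \r\n inside the reply are decoded into real carriage return and line feed characters.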


The strings to translate are simply passed to the Google server like this:

https://translate.googleapis.com/translate_a/single?client=gtx&sl=en&tl=pt&dt=t&q=Hello world. My name is Guga

Or I can also use URL escape characters. This is what the EncodeString function actually does, but for Google's purposes it seems only the carriage return and line feed need to be converted to %0D%0A; all the other characters seem to be translated fine. One minor issue is that the resulting text may contain odd sequences not necessarily related to UTF-8, such as a literal \r\n (the escaped form of carriage return and line feed).

So the question is: how do I stop Google from returning this status 503 error, so I can translate the whole text in one run?
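
For comparison, here is a minimal sketch in Python (not RosAsm) of building the request URL with every reserved character percent-encoded, not only carriage return and line feed. The build_url helper is a made-up name; the query parameters are the ones from the example URL above.

```python
from urllib.parse import quote

BASE = "https://translate.googleapis.com/translate_a/single"

def build_url(text, source="en", target="pt"):
    """Build the request URL, percent-encoding the text to translate."""
    # safe="" makes quote() escape "/" as well; CR and LF become %0D and %0A.
    return (f"{BASE}?client=gtx&sl={source}&tl={target}&dt=t"
            f"&q={quote(text, safe='')}")

url = build_url("Hello world.\r\nMy name is Guga")
```

Encoding everything avoids surprises when the text contains characters such as "&" or "#", which would otherwise cut the query short.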


Btw, an example of how the XMLDownloadtoFile function is used:

[lpszHeaders5: B$ "Host: translate.googleapis.com
Accept: */*

", 0]

call XMLDownloadtoFile D@StringtoTranslate,  {B$ "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko", 0}, lpszHeaders5
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

aw27

I don't understand what you posted because my linguistic capabilities have a limit. However, I can tell you that you receive from Google in direct relation to what you pay for. Its translation API will produce errors after too many uses.
Last century, I had an application called Babel Fish Direct and I had to discontinue it after AltaVista (Google) started causing issues. Then they ended up removing the API; I understand they have a new one, but history always repeats.

TimoVJL

Do you recreate the session between calls / translations?
May the source be with you

guga

Quote from: AW on April 28, 2019, 07:32:41 PM
I don't understand what you posted because my linguistic capabilities have a limit. However, I can tell you that you receive from Google in direct relation to what you pay for. Its translation API will produce errors after too many uses.
Last century, I had an application called Babel Fish Direct and I had to discontinue it after AltaVista (Google) started causing issues. Then they ended up removing the API; I understand they have a new one, but history always repeats.

Babel Fish Direct? Sounds interesting. Do you still have it? I would like to take a look and see how you managed to download from Babel Fish. Maybe it can be used instead of Google Translate. Which server is used to perform the translation? (I mean, the strings that are passed to the InternetOpenUrl API.)

guga

Quote from: TimoVJL on April 28, 2019, 10:26:19 PM
Do you recreate the session between calls / translations?

Hi Timo

Not sure I understood. What do you mean by recreating sessions? What I did was split the translation into chunks, each staying under the limit of 5000 chars.

For huge files, the steps are:

1 - Get the 1st chunk, containing a maximum of 5000 chars.

2 - Look inside the chunk for its true end: search for the last "?", "!" or "." before the limit, since those characters mark the real end of a sentence. So, if the last "." "?" or "!" in the limited text (5000 chars) is at position 4700, it translates the text from pos 0 (the beginning) to 4700. Pos 4700 is saved so the work can continue from there.

3 - After translating the chunk (4700 bytes, for example), it restarts from pos 4700, takes the next 5000 chars (so up to a max pos of 9700) and repeats the search for "?", "." and "!". It then translates the 2nd chunk. Ex: the 2nd chunk may have its last "?", "." or "!" at pos 9251.

4 - This cycle of finding the true end and translating repeats until the end of the huge text (the terminating 0 byte).
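
The steps above can be sketched as follows. This is an illustration in Python rather than the RosAsm code from the first post, and it counts characters where the original counts bytes. The split_chunks name and the hard-cut fallback for a window with no sentence terminator are assumptions of the sketch.

```python
MAX_CHARS = 5000  # per-request limit mentioned in this thread

def split_chunks(text, limit=MAX_CHARS):
    """Split text into pieces of at most `limit` chars, each ending at
    the last '.', '?' or '!' inside the window when one exists."""
    chunks = []
    pos = 0
    while pos < len(text):
        end = min(pos + limit, len(text))
        if end < len(text):
            # Search backward for a sentence terminator inside the window.
            cut = max(text.rfind(c, pos, end) for c in ".?!")
            if cut >= pos:
                end = cut + 1  # keep the terminator in this chunk
            # Otherwise no terminator was found: hard cut at `end`.
        chunks.append(text[pos:end])
        pos = end
    return chunks

sample = "One. Two? Three! Four without end"
pieces = split_chunks(sample, limit=12)
```

With a limit of 12, the sample splits after "Two?" and after "Three!", and the unterminated tail is cut at the limit.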


One minor question: how can I allow pausing and resuming the download/translation? And how do I restart the internet connection after it hits the 503 error? I tried using the Sleep API to force it to wait 5 minutes before entering XMLDownloadtoFile again, but it didn't work.
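
One possible way to get pause/resume is to save the position reached in the source text after each translated chunk and read it back on the next run. A minimal sketch in Python (not RosAsm) follows; the checkpoint file name and both helper names are invented for the example.

```python
import json
import os
import tempfile

def save_checkpoint(path, offset):
    """Record how many source characters have been translated so far."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, path)  # rename over the old file so it is never half-written

def load_checkpoint(path):
    """Return the saved offset, or 0 when no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)["offset"]
    except FileNotFoundError:
        return 0

ckpt = os.path.join(tempfile.mkdtemp(), "translate.ckpt")
start = load_checkpoint(ckpt)    # first run: nothing saved yet
save_checkpoint(ckpt, 4700)      # after the first chunk is done
resumed = load_checkpoint(ckpt)  # next run continues from here
```

The translated output would need to be appended to its own file as each chunk completes, so that a resumed run only has to continue from the saved offset.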

TimoVJL

I meant opening / closing the connection between queries.
How many times does it allow you to use the service?

aw27

I may have it somewhere but it is in Delphi and closed source.
Anyway, it used a protocol called SOAP, which may have disappeared in the meantime because I have not heard about it in a long time.

guga

Quote from: AW on April 29, 2019, 12:01:47 AM
I may have it somewhere but it is in Delphi and closed source.
Anyway, it used a protocol called SOAP, which may have disappeared in the meantime because I have not heard about it in a long time.

Thanks a lot. If you manage to find it and post it, I'll take a look. I had never heard of the SOAP protocol before.

guga

Quote from: TimoVJL on April 28, 2019, 11:56:33 PM
I meant opening / closing the connection between queries.
How many times does it allow you to use the service?

I'm not sure; it depends. Without a sleep call, it stops working after around 30-40 open/close cycles, so it translates only about 200 KB of text before it stops (the translation is fast: a couple of seconds for 200 KB). If I use a sleep call and wait around 10-13 seconds between each 5000-char chunk, it goes further but stops after translating around 2 MB of text. The main problem with sleeping is that it then takes 1 hour to translate only 2 MB of text (or less).
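
Instead of a fixed sleep between requests, a common pattern is to wait only after a 503 and double the wait on each consecutive failure. Below is a minimal sketch in Python (not RosAsm). The fetch callback is a stand-in for a function like XMLDownloadtoFile that would also report the status code, and the simulated replies are made up. Nothing here guarantees that the server will accept requests again: a 5-minute wait did not help earlier in this thread, and waiting cannot help if the server demands a CAPTCHA.

```python
import time

def fetch_with_backoff(fetch, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning 503, doubling the wait each time.

    `fetch` is a caller-supplied function returning (status_code, body).
    """
    delay = base_delay
    for attempt in range(max_tries):
        status, body = fetch()
        if status != 503:
            return status, body
        if attempt < max_tries - 1:
            sleep(delay)   # wait before the next attempt
            delay *= 2     # 1s, 2s, 4s, ...
    return status, body    # still 503 after max_tries attempts

# Simulated server: two 503 replies, then success.
replies = iter([(503, ""), (503, ""), (200, "ok")])
waits = []
result = fetch_with_backoff(lambda: next(replies), sleep=waits.append)
```

Passing the sleep function as a parameter keeps the example testable without real waiting; in a real run the default time.sleep would be used.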

Perhaps forcing it to send a different IP (a renewed one, or different machine information) to the server for each 5000-char translation would fix it, but I don't know how to change the IP address programmatically, or what information to send to the Google server so that I can continue translating without being stopped or blocked.

guga

I read that the error is related to CAPTCHA solving. But how do I do that in MASM? I mean, whenever I hit the 503 error, should I open the web page to solve the CAPTCHA and then go back?

https://support.google.com/websearch/answer/86640

And here there seems to be some info about those CAPTCHAs:
http://codewa.com/question/107867.html

How can I make it appear in an HTTP web dialog?

Here are some other examples similar to what I'm doing, except that mine is for large text files.
https://www.codeproject.com/Articles/12711/Google-Translator?msg=5161148#xx5161148xx

aw27

I don't think you can handle that easily in any programming language; very likely it will be close to impossible.
Google wants you to pay for heavy usage.

TimoVJL

Quote from: guga on April 29, 2019, 02:33:14 AM
How can I make it appear in an HTTP web dialog?
keywords: AtlAxWinInit CreateWindow("AtlAxWin"

EDIT: doesn't work with the translate.google.com CAPTCHA  :(

guga

Thanks a lot Timo.

Can you make it work? I mean, it shows the CAPTCHA but does not return after I press the Verify button.


Btw, here are some of the translated files.

Parte01_en - The part of the English text to be translated.
Parte01_PT_Decoded - The same part as above, translated to Portuguese.

I couldn't upload them here due to the size, so I posted them at https://we.tl/t-DOtUDR8R9p

aw27

You can also test on this site (actually, it is mine but I have never used it, lol):
http://tests.vgpt.com/captach.aspx

The CAPTCHA has a "site key" and a "secret key", which we get for free from Google. So, this is not a demo like the other page from Google. On the other hand, the ATL control host fails to load it: it chokes on the JavaScript.