News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

realloc failing if preallocated malloc is too low

Started by cyrus, September 03, 2023, 01:39:39 PM

Previous topic - Next topic

cyrus

I posted a thread last week where I am using a pointer to iterate through my buffer in a recv loop on a socket connection to download a file. In that thread, I allocated a fixed size for a file. But in reality, that size will vary, and so I implemented a realloc routine when the initial bytes I did malloc reaches that size. For example. r12 is the counter of the full bytes received in the buffer. It keeps adding up each loop until the file is received. Lets say the file is 10mb. With malloc, I would malloc 1mb initially. And to set up my realloc implementation, I copy that initial size I malloc'd to r14. In each recv, I am requesting to receive 1024 bytes. So the formula is...when r12 is greater than or equal to r14, meaning, when 1mb of data has been copied (or possibly up to 1024 bytes over, because the recv can output variable bytes, not always 1024), then the reallocer function shall add r12 to r14, and call realloc, effectively doubling up each time. So if I wanted to download a 10mb file. It would reallocate as such:

1mb -> 2mb
2mb -> 4mb
4mb -> 8mb
8mb -> 16mb

The issue comes when the file is much larger, say 2.5 gb. In that manner, there would be a lot of reallocations. I noticed that my program was crashing, and not outputting the information to a file. So I added a printf statement if realloc returned 0 and sure enough, it causes an infinite loop. I tend to do "do while loops" on any allocations because for some programs, ya just can't simply exit the program as most people do, so I force it back to reallocate.

However, If I pre-allocate 10mg with malloc, then I can download this 2.5 gb file every time and never get any reallocation errors. Why is that? Is it my code? I doubt it. Or is it something I am not understanding about dynamic memory management? It should be allocating enough memory eventually since I am basically instructing it to. It just seems like because it has to reallocate more times at 1mb initially as opposed to 10mb, instead of it taking LONGER like i expect, it just simply doesn't want to and gets caught in an infinite loop. Can someone please explain why this is happening.

This code is extremely robust when I pre-allocate 10mb from the beginning if i want to download large files. if i want to download files up to around 100mb, then pre-allocating 1mb works fine but the 100mb files do sometimes hang up about 20% of the time.

;;;; Call recv to start receiving data in a loop until all bytes are received

    xor r14, r14
    mov r14, 100000h       ; initial size of 1 MB
    ;mov r14, 1000000h       ; initial size of 10 MB
    xor r12, r12           ; total; counter of bytes returned from recv function; total_bytes_recv

    ; call recv in a loop:
    recv_data:
        xor r9, r9             ; 4th argument: flags; set to null
        xor r8, r8             ;
        mov r8w, 400h          ; 3rd argument: 0x400 = 1024 bytes = length, in bytes, of the data in buffer pointed to by the buf parameter.
        mov rdx, bufptr        ; 2nd argument: pointer to our buf for recv to write to
        mov rcx, socket_desc       ; 1st argumnt: socket descriptor of pasv port
        sub rsp, 20h
        call recv
        add rsp, 20h           ; must return stack space or stack overflow will occur
 
        add bufptr, rax        ; increment the pointer of the allocated memory block; add rax (num of bytes returned from recv) to the pointer

        add r12, rax           ; r12 is the total counter so each successful call to recv adds num of bytes returned from rax to it

        cmp r12, r14           ; r14 is the initial size but reach reallocation doubles it
        jge reallocer          ; r12 is the total size. when r12 ever reaches or exceeds r14, it must reallocate

        cmp rax, 0             ; if there are bytes returned from recv, then get more until the file is downloaded
        jg recv_data
        cmp rax, -1            ; sometimes this can return -1, so just resume
        je recv_data

        jmp end_recv           ; if 0 returns from recv, then we're done so jump to the end


    reallocer:
    xor rcx, rcx
    xor rdx, rdx
    add r14, r12               ; add total bytes to initial size which is the current size we have allocated; this doubles each allocation

    mov rdx, r14               ; new size: r14 now is twice the size of what r12 is, which is double so this is what we will attempt to reallocate
    mov rcx, buf
    sub rsp, 20h
    call realloc
    add rsp, 20h              ; must add stack space back or stack overflow
    cmp rax, 0                 ; if null, repeat until we get our allocated memory block
    je realloc_err             ; jump back to realloc_err which shall print the error but resume to reallocate; this never seems to ever reallocate
    mov buf, rax
    mov bufptr, rax
    add bufptr, r12
    jmp recv_data

    realloc_err:
        lea rcx, realloc_err_msg       ; just prints 'realloc error'
        sub rsp, 20h
        call printf
        add rsp, 20h
        jmp reallocer

    end_recv:

jj2007

You are not providing complete code, and I am too old to spend time on creating a working example from your snippet, so this is based on intuition and experience only:
- do not think the OS has unlimited resources
- there is someting called "heap fragmentation"
- I don't see any need to reallocate something here

fearless

I think if you know the size of the file beforehand and can determine its size then preallocate mem. Multiples of 4k page sizes, 4096bytes would be ideal. (https://stackoverflow.com/questions/7850059/what-is-the-best-memory-buffer-size-to-allocate-to-download-a-file-from-internet)

I wouldnt realloc at all, just use the buffer, read into it, write out the bytes to file, continue and read again until no more bytes, or the function returns that it has finished and read the left over bytes into the buffer and write out to file.

I think realloc'ing a lot over a huge file download will lead to the memory manager eventually not able to provide a sufficient large enough chunk. If its a 32bit program then you probably will reach some error before 2GB, probably something over 1GB will trigger a realloc error at some point, but it will vary and wont be consistent. Thats why its best to just use a buffer and keep using it, instead of resizing it.

Here is an example using InternetReadFile and WriteFile, I took out some error checking and callback calls for download progress bars etc to limit the example to just show the main loop etc.

;------------------------------------------------------------------------------------------------
; NetReadUrlWriteLocal
;------------------------------------------------------------------------------------------------
NetReadUrlWriteLocal PROC hOpenUrl:DWORD, hLocalFile:DWORD, dwRemoteFileSize:DWORD
    LOCAL TotalBytesRead:DWORD
    LOCAL BytesRead:DWORD
    LOCAL pDataBuffer:DWORD
    LOCAL BytesWritten:DWORD
    LOCAL bLoopAgain:DWORD
   
    ; read data into a buffer and write out to our file
    mov TotalBytesRead, 0
    mov BytesRead, 1 ; set to 1 to trigger our continue condition in the WHILE loop, BytesRead will change to reflect actual bytes read.
    mov pDataBuffer, 0
   
    Invoke GlobalAlloc, GMEM_FIXED or GMEM_ZEROINIT, NET_DOWNLOAD_BUFFER_SIZE ; buffer is 4096 bytes
    .IF eax == NULL
        ; Mem Alloc Error
        mov eax, FALSE
        ret
    .ENDIF
    mov pDataBuffer, eax

    mov eax, TRUE ; set to TRUE initially to trigger the while endw
    .WHILE eax == TRUE && BytesRead != 0 ; continue
        Invoke InternetReadFile, hOpenUrl, pDataBuffer, NET_DOWNLOAD_BUFFER_SIZE, Addr BytesRead
        .IF eax == FALSE
            ; NetReadUrlWriteLocal InternetReadFile Error
            mov eax, FALSE
            ret
        .ENDIF
 
        .IF BytesRead == 0
            mov bLoopAgain, FALSE
        .ELSE
            mov bLoopAgain, TRUE
        .ENDIF
 
        .IF BytesRead != 0
            Invoke WriteFile, hLocalFile, pDataBuffer, BytesRead, Addr BytesWritten, NULL
            .IF eax == 0
                ; Invoke GetLastError
                ; NetReadUrlWriteLocal WriteFile Error
                mov eax, FALSE
                ret
            .ENDIF               
        .ENDIF
        mov eax, BytesRead
        add TotalBytesRead, eax ; cumulative total bytes read
        mov eax, bLoopAgain
    .ENDW
   
    .IF pDataBuffer != 0
        Invoke GlobalFree, pDataBuffer
    .ENDIF
   
    mov eax, TRUE
    ret
NetReadUrlWriteLocal ENDP

Of interest is the remarks of the InternetReadFile:


QuoteInternetReadFile operates much like the base ReadFile function, with a few exceptions. Typically, InternetReadFile retrieves data from an HINTERNET handle as a sequential stream of bytes. The amount of data to be read for each call to InternetReadFile is specified by the dwNumberOfBytesToRead parameter and the data is returned in the lpBuffer parameter. A normal read retrieves the specified dwNumberOfBytesToRead for each call to InternetReadFile until the end of the file is reached. To ensure all data is retrieved, an application must continue to call the InternetReadFile function until the function returns TRUE and the lpdwNumberOfBytesRead parameter equals zero.

I don't think thats the case if your using recv function though.

zedd151

Hi Cyrus,

In addition to what fearless wrote, and since you are using recv function instead, is there any way to query for the size of the file? I am unfamiliar with recv and its counterpart, so I do not know if it is possible, but it may be. Have you looked into that possibility?

cyrus

You guys are all correct. You're only limited by what you know. So now I know I can query the file size before I download it. I will be implementing that into my program so that I never have to use realloc here because it has unpredictable results. Thank you.

zedd151

Quote from: cyrus on September 04, 2023, 07:05:29 AMYou guys are all correct. You're only limited by what you know. So now I know I can query the file size before I download it. I will be implementing that into my program so that I never have to use realloc here because it has unpredictable results. Thank you.
Okay, so you can query for the file size? I only had a suspicion that it might be possible. Good to know. Glad you found the solution.  :smiley:

Will you be posting the finished code? I'd like to see what you have come up with.

jj2007

Quote from: cyrus on September 04, 2023, 07:05:29 AMSo now I know I can query the file size before I download it.

Google Content-Length and/or HTTP_QUERY_CONTENT_LENGTH. You may find out that you cannot query the file size simply because the server decided not to give away that info. Btw the page you are right now looking at does not have the Content-Length header.

Quote from: cyrus on September 04, 2023, 07:05:29 AMYou're only limited by what you know.

So true :thumbsup:

cyrus

Quote from: zedd151 on September 04, 2023, 07:11:22 AM
Quote from: cyrus on September 04, 2023, 07:05:29 AMYou guys are all correct. You're only limited by what you know. So now I know I can query the file size before I download it. I will be implementing that into my program so that I never have to use realloc here because it has unpredictable results. Thank you.
Okay, so you can query for the file size? I only had a suspicion that it might be possible. Good to know. Glad you found the solution.  :smiley:

Will you be posting the finished code? I'd like to see what you have come up with.

Ok I am able to query for the file size and I've been able to parse through it but I am having undefined behavior when I call recv to get the initial data from sending the LIST command. this is an FTP server. When you send the LIST command, the server sends back data in one shot on the command port:

"150 Here comes the directory listing.\r\n226 Directory send OK.\r\n" which is exactly 39 + 24 = 63 bytes

You send these commands on the regular port 21, and receive information using the same socket in the recv call (if you need that information such as pasv port but this isn't the actual data. that is to be sent on the data port). For example, before STOR or RETR, you have to call PASV, and then parse that out. i did a good job parsing that out because that works 100% of the time. Im iterating through the whole thing, counting commas, going back, this that and the other..

So I felt that parsing out the return from the LIST command wouldn't be that difficult. And it isn't for me. The actual data is returned on the pasv port, so using that socket, I do get that data. It looks like this for a file called lines.bmp:

-rw-r--r--    1 1003    1003      3000054 Sep 01 18:28 lines.bmp
gid 1003 is for that user on the server

My trick is, simply go to  the end, then count 5 spaces back, then inc once and read til the next space. Thats the size. And I am seeing that every time in the debugger. I am noticing strange behavior AFTER this. Once I have this information, I then obviously need to send the RETR command on the command port (not data port). So to receive information back, I need to recv on the command port to get the 2nd pasv port. I believe that the server may be sending information for this particular file in an undefined behavior because it seems to work for all my other files except this lines.bmp and another file. I tried my best debugging it but I am not able to get passed it. I am ending up calling recv twice to ensure that the information is being returned and what ends up happening is that it is sending the previous

"150 Here comes the directory listing.\r\n226 Directory send OK.\r\n"

instead of the pasv port and I dont know why. I tried limiting the chars to exactly 63 bytes which is what it is with the CRLF included.

My other program initially allocating with malloc at 16MB and then reallocating twice each time for larger files has proven to be extremely robust every single time and never fails once even for files at 2.5g but of course limited at 4gb because Windows programmers decided to make ReadFile and WriteFile accept a 32-bit DWORD for number of bytes to read (for ReadFile) and number of bytes written does NOT work for WriteFile no matter what. I think it is a bug on their end.

This is my code to parse out the file size. Note, the recv_str is basically the data returned from sending the LIST <file> command

    ; Parse file size from recv_str

    xor rax, rax                ; used for temp to copy byte by byte to make comparison         
    xor rdi, rdi
    lea rsi, recv_str          ; source str; load list output from LIST file into rsi to begin parsing for file sz
    lea rdi, file_sz_str        ; dest str  ; load into rdi to create digit char string from file sz; string_to_int will convert to int

   
    xor r11, r11
    xor r10, r10        ; counter for 5 spaces
    xor rax, rax
    next_chh:
        mov al, [rsi]  ; recv_str
        inc rsi        ; increment next char til the null terminator
        cmp al, 00h
        jne next_chh   
    find_spaces:
        dec rsi          ; go passed the space
        spc_loop:
            cmp r10, 5
            je get_sz
            mov al, [rsi]          ; copy char byte by byte
            cmp al, 20h           
            je count_spaces          ; jmp to count_space if found which just inc r10 and resumes
            dec rsi
            jmp spc_loop

    count_spaces:
        inc r10                ; increments space count
        dec rsi                ; increment next char after space otherwise infinite loop
        jmp spc_loop

    get_sz:                    ; otherwise, we found the beginning; let's start incrementing each byte til the next space
        inc rsi                ; go passed the last space that was decremented
        loop_get_sz:
        inc rsi                ; increment one more to go passed space to reach first digit
        mov al, [rsi]
        cmp al, 20h            ; increment until the last space before month
        je done_sz
        mov [rdi], al            ; otherwise, copy ah char to dest char
        inc rdi                ; increment dest char for write
        inc r11                ; increment num of digits as chars for use in string_to_int
        jmp loop_get_sz

    done_sz:

    ; dest string already in rdi but has been incremented so need to reload it
    xor rcx, rcx
    lea rdi, file_sz_str
    mov rcx, r11                ; number of chars of digits: 176 = 3
    sub rsp, 20h
    call string_to_int
    mov file_sz, rax            ; output will go in file_sz for malloc

zedd151

Well gee, cyrus...
You seem to know a lot more about this network stuff than I do, actually I know zero about it. Sorry if I had given you the impression that I can help with that part of your problem.
Hopefully there is someone on the forum that knows this networking stuff, and can help you much better than I could.



Zedd

fearless

Might have to look at SetFilePointerEx (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointerex). Read and Write file functions update the file pointer automatically, but you probably would want to take manual control yourself and set the file pointer with the SetFilePointerEx which uses a QWORD size for the liDistanceToMove parameter.
LARGE_INTEGER  liDistanceToMove
Otherwise the other option is to use Memory mapped files and let the OS take care of the writing to file when you move memory from the buffer to the mem mapped file offset which you update every time you read your buffer. After its all done you flush the file cache so the OS commits the file contents onto the disk.

Of course memory mapped files have their own quirks and you might have to adjust the map view of file if reading or writing very large files - sort of like a small sliding window that has a view into the memory mapped file, but only sees part of it at a time. For smaller files an entire file can be memory mapped. Anything over 700MB size you prob need to handle the map view (sliding window), otherwise the mapping could fail.

cyrus

Quote from: fearless on September 05, 2023, 08:45:55 AMMight have to look at SetFilePointerEx (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointerex). Read and Write file functions update the file pointer automatically, but you probably would want to take manual control yourself and set the file pointer with the SetFilePointerEx which uses a QWORD size for the liDistanceToMove parameter.
LARGE_INTEGER  liDistanceToMove
Otherwise the other option is to use Memory mapped files and let the OS take care of the writing to file when you move memory from the buffer to the mem mapped file offset which you update every time you read your buffer. After its all done you flush the file cache so the OS commits the file contents onto the disk.

Of course memory mapped files have their own quirks and you might have to adjust the map view of file if reading or writing very large files - sort of like a small sliding window that has a view into the memory mapped file, but only sees part of it at a time. For smaller files an entire file can be memory mapped. Anything over 700MB size you prob need to handle the map view (sliding window), otherwise the mapping could fail.

I did try to go down that path but I was not able to get that working. I also looked at using an OVERLAPPED struct in WriteFile to be able to use the upper 32-bits of the register because I was trying to download a 5GB file but was not able to get that to work either and it wasn't super critical so I just didn't care. Even what I have now allocating with 16MB initially seems to work fine. But I would have liked to get it working without having to realloc but yeah I am dealing with something that isn't particularly related to assembly. Seems to be a network issue that I may have to spend a lot of time debugging and not sure if it is worth it.

jj2007

#11
I admit I am a bit lost. Fearless described succinctly what it needs to transfer stuff from the Internet to a local file, using InternetReadFile:

Quote from: fearless on September 03, 2023, 07:29:43 PMI think if you know the size of the file beforehand and can determine its size then preallocate mem. Multiples of 4k page sizes, 4096bytes would be ideal. (https://stackoverflow.com/questions/7850059/what-is-the-best-memory-buffer-size-to-allocate-to-download-a-file-from-internet)

I wouldnt realloc at all, just use the buffer, read into it, write out the bytes to file, continue and read again until no more bytes, or the function returns that it has finished and read the left over bytes into the buffer and write out to file.

Why do you reflect on GlobalAlloc'ing gigabytes? It just doesn't make sense, sorry.

If you give me the URL of that gigabyte file, I can test it with a little proggie that I just hacked together here.