In case anyone is interested, I updated the code I posted earlier so that it now works with somewhat more general bitmap files. In the earlier version, the number of pixels had to be a multiple of 8. The code now handles bitmap files with an arbitrary number of pixels. I have attached a zip file with the source code, the executable, and a couple of teensy bitmap files - one with 24 pixels, and one with 27 pixels.
My code seems to work with both files, but that has been the extent of my testing.
As before, the code requires the bitmap to be 32 bits/pixel in ARGB, and it zeroes out the blue component of each pixel. This is a toy program that was never meant to be a "real" program, just an excuse for me to try out some AVX instructions and to learn a little about working with bitmap files.
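For comparison, the same masking can be sketched in plain C. This is only an illustration: the names `BLUE_MASK` and `clear_blue_scalar` are made up here and are not part of the attached source, and the sketch assumes the byte layout given in the assembly comments (blue in bits 8-15 of each 32-bit pixel).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: blue occupies bits 8-15 of each 32-bit pixel,
   so this mask keeps every channel except blue. */
#define BLUE_MASK 0xFFFF00FFu

/* Hypothetical helper: zero the blue channel of `count` pixels in place. */
void clear_blue_scalar(uint32_t *pixels, size_t count)
{
    assert(pixels != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        pixels[i] &= BLUE_MASK; /* clears bits 8-15, keeps the rest */
    }
}
```

A scalar version like this works for any pixel count, so it could serve as a reference for checking the output of the AVX routine.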
Here's the assembly code. All the rest of the code is written in C.
; FindBluePix.asm - contains the ChangeBluePixels proc.
.data
; Each element in the BlueMask array is a bitmask that zeroes out
; the blue channel of a pixel.
; byte 0 (LSB) - alpha channel
; byte 1 - blue channel
; byte 2 - green channel
; byte 3 (MSB) - red channel
BlueMask dd 0FFFF00FFh, 0FFFF00FFh, 0FFFF00FFh, 0FFFF00FFh,
0FFFF00FFh, 0FFFF00FFh, 0FFFF00FFh, 0FFFF00FFh
BytesRemaining dq ? ; The number of bytes in the bitmap image that have not been processed. In this code, 1 pixel == 4 bytes.
BytesPerIter dq 32
.code
; ChangeBluePixels processes an array of BI_BITFIELDS (32 bits/pixel) data,
; eight pixels (32 bytes) at a time.
; The blue channel of every pixel is set to 0. No search or test is needed, because
; ANDing with the mask leaves pixels whose blue channel is already 0 unchanged.
; Parameters:
; Address of the array in RCX.
; Number of bytes in RDX (expected to be a multiple of 4).
; Returns: nothing.
; Any bytes left over after the last full 32-byte block are handled one pixel (4 bytes) at a time in TailLoop.
ChangeBluePixels PROC C
mov BytesRemaining, rdx ; Initialize the counter before any branch reads it.
sub r8, r8 ; R8 = byte offset into the array. R8 is volatile in the Windows x64 calling convention; RSI is not, and would have to be saved and restored.
cmp rdx, 32
jb TailLoop ; Fewer than 32 bytes (8 pixels): skip the AVX loop.
vmovups ymm0, ymmword ptr [BlueMask]
Loop1:
vmovups ymm1, ymmword ptr[rcx + r8] ; Copy 32 bytes (8 pixels) to YMM1.
vandps ymm2, ymm1, ymm0 ; AND YMM1 with the array of bitmasks, and store the result in YMM2.
vmovups ymmword ptr[rcx + r8], ymm2 ; Copy the modified pixel data back to the original array.
add r8, 32 ; Advance the offset by 32 bytes to get the next 8 pixels.
sub BytesRemaining, 32
cmp BytesRemaining, 32
jae Loop1 ; At least 32 bytes remain, so iterate again.
TailLoop:
cmp BytesRemaining, 4 ; Fewer than 4 bytes left means no whole pixel remains.
jb Finished
mov eax, dword ptr[rcx + r8] ; Load one pixel.
and eax, dword ptr[BlueMask] ; Zero its blue channel.
mov dword ptr[rcx + r8], eax ; Store the modified pixel back to the array.
add r8, 4 ; Advance the offset to the next pixel.
sub BytesRemaining, 4
jmp TailLoop
Finished:
ret
ChangeBluePixels ENDP
END
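To see how a given byte count divides between the 32-byte AVX loop and the 4-byte tail loop, here is a small C sketch. The struct and function names are hypothetical, and the sketch assumes the byte count is a multiple of 4 (whole pixels only).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of how a pixel buffer is divided up. */
typedef struct {
    size_t vector_iterations; /* passes through the 32-byte loop */
    size_t tail_pixels;       /* pixels left for the 4-byte loop */
} SplitPlan;

SplitPlan plan_split(size_t byte_count)
{
    SplitPlan plan;
    assert(byte_count % 4 == 0); /* assumption: whole pixels only */
    plan.vector_iterations = byte_count / 32;
    plan.tail_pixels = (byte_count % 32) / 4;
    return plan;
}
```

For the two attached bitmaps, 24 pixels is 96 bytes, which gives 3 passes through the AVX loop and no tail; 27 pixels is 108 bytes, which gives 3 passes and 3 tail pixels.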