Quote from: NoCforMe on May 30, 2024, 09:42:36 AM
Steve, saw the images; interesting. Did you post code for that? Or if not, can you? I'm curious to see how that antialiasing works.
Quote from: NoCforMe on May 30, 2024, 09:43:27 AM
Magnus, SIMD would be even more overkill for the programming I do. No thanks.

That's my kind of fun. Others like GP register code, others FPU, others OpenGL/D3D code, and others like making MasmBasic or their own assembler ...
Quote from: NoCforMe on May 30, 2024, 09:41:44 AM
CWD was supported way back from the beginning, on the 8086.
What was the value of AX before CWD? Are you sure it was negative?
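; branchless absolute value
mov ax, deltax
cwd             ; DX = 0FFFFh if AX is negative, else 0
xor ax, dx      ; one's complement of AX if it was negative
sub ax, dx      ; subtracting 0FFFFh adds 1, completing the negation

; same result with a test and a branch
mov ax, deltax
.if ax & 08000h ; sign bit set?
neg ax
.endif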
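For anyone who wants to check the two forms outside an assembler, here is a minimal C sketch of the same arithmetic. The function names `abs16_branchless` and `abs16_branch` are made up for illustration, and the variable `dx` only models what CWD leaves in DX; it is not a translation of anyone's actual routine.

```c
#include <stdint.h>

/* Model of: cwd / xor ax, dx / sub ax, dx */
static uint16_t abs16_branchless(int16_t x)
{
    uint16_t ax = (uint16_t)x;
    uint16_t dx = (x < 0) ? 0xFFFFu : 0u; /* what CWD leaves in DX */
    ax ^= dx;                             /* one's complement if negative */
    ax = (uint16_t)(ax - dx);             /* minus 0FFFFh is plus 1 (mod 2^16) */
    return ax;
}

/* Model of: .if ax & 08000h / neg ax / .endif */
static uint16_t abs16_branch(int16_t x)
{
    uint16_t ax = (uint16_t)x;
    if (ax & 0x8000u)                     /* sign bit set? */
        ax = (uint16_t)(0u - ax);         /* two's complement negate */
    return ax;
}
```

Both forms leave 8000h unchanged for an input of -32768, since +32768 does not fit in a signed word.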
; equates related to pshufd macro
[SSE_ROTATE_RIGHT_96BITS 147]
[SSE_ROTATE_LEFT_32BITS 147]
[SSE_ROTATE_LEFT_96BITS 57]
[SSE_ROTATE_RIGHT_32BITS 57]
[SSE_SWAP_QWORDS 78] ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_LEFT_64BITS 78] ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_RIGHT_64BITS 78] ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_64BITS 78] ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_INVERT_DWORDS 27] ; invert the order of dwords
[SSE_SWAP_DWORDS 177] ; The 2 dwords inside each qword are swapped. Dwords in xmm, written high to low, go from 3210 to 2301 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 2,3,0,1}
[SSE_SWAP_LOQWORD 225] ; Only the 2 dwords in the LowQword are swapped. Dwords in xmm, written high to low, go from 3210 to 3201 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 3,2,0,1}
[SSE_SWAP_HIQWORD 180] ; Only the 2 dwords in the HighQword are swapped. Dwords in xmm, written high to low, go from 3210 to 2310 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 2,3,1,0}
; Macros used
[SHUFFLE | ( (#1 shl 6) or (#2 shl 4) or (#3 shl 2) or #4 )] ; Marinus/Sieekmanski
[pshufd | pshufd #1 #2 #3]
[SSE2_MAX_4FLOATS | maxps #1 #2 | pshufd #2 #1 SSE_SWAP_DWORDS | maxps #1 #2 | pshufd #2 #1 SSE_ROTATE_64BITS | maxps #1 #2]
[SSE_ABS_REAL4 | pslld #1 1 | psrld #1 1]
; Variables used
[NeighboursDivision: F$ (1/8), F$ (1/8), F$ (1/8), F$ (1/8)]
[MaxDensity: F$ 0, 0, 0, 0] ; holds the maximum density found, used later in the scale to density
[DensityRange: F$ 255, 255, 255, 255] ; to be used in the scale to density
Proc EightNeighbourDistance:
Arguments @pInput, @pOutput, @Height, @Width, @Padding, @DensityRatio
Local @CurYPos, @NextRow, @PreviousRow, @NextOutput, @NextInput, @AddPaddingStart
Uses ebx, esi, edi, edx, ecx
xor eax eax
If_Or D@Height = 0, D@Width = 0
ExitP
End_If
movups xmm3 X$NeighboursDivision
movups xmm4 X$MaxDensity
mov eax D@Padding | shl eax 2 | mov D@AddPaddingStart eax
; calculate the address of the next input row
mov eax D@Padding | shl eax 1 | add eax D@Width | shl eax 2 | mov D@NextInput eax
; calculate the address of the next output row
mov eax D@Width | shl eax 2 | mov D@NextOutput eax
mov eax D@Padding | mov ebx eax | add eax D@Width | shl eax 2 | mov D@NextRow eax; 9
shl ebx 1 | shl ebx 2 | add ebx eax | mov D@PreviousRow ebx
; Input start at
mov eax D@Height | mov D@CurYPos eax
mov edx D@pInput | add edx D@NextInput
mov edi D@pOutput
..Do
mov ebx edx | add ebx D@AddPaddingStart
xor esi esi
.Do
; get previous line (Up)
lea eax D$ebx+esi*4 | sub eax D@PreviousRow | movdqu xmm1 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm1 ; convert the integers to floats
pshufd xmm1 xmm1 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)
; get next line (Down)
lea eax D$ebx+esi*4 | add eax D@NextRow | movdqu xmm0 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm0 ; convert the integers to floats
pshufd xmm0 xmm0 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)
SSE2_MAX_4FLOATS xmm0 xmm1
; now get left
movdqu xmm1 X$ebx+esi*4-4 | pshufd xmm1 xmm1 {SHUFFLE 0,0,0,0}
; now get right
movdqu xmm2 X$ebx+esi*4+4 | pshufd xmm2 xmm2 {SHUFFLE 0,0,0,0}
maxps xmm1 xmm2
; and finally get the maximum between up and down and left to right
maxps xmm1 xmm0
; get our core pixel
mov eax D$ebx+esi*4 | movd xmm0 eax | pshufd xmm0 xmm0 {SHUFFLE 0,0,0,0}
; subtract the maximum of the neighbours from the core pixel
subps xmm0 xmm1 ; Subtract corresponding floats
SSE_ABS_REAL4 xmm0 ; and convert them to absolute (clears the sign bit)
mulps xmm0 xmm3
; get the maximum density
maxps xmm4 xmm0
movd D$edi+esi*4 xmm0
inc esi
.Loop_Until esi => D@Width
add edx D@NextInput
add edi D@NextOutput
dec D@CurYPos
..Repeat_Until_Zero
; finally calculate the density ratio to be used in our new scale_density_to_image function
mov eax D@DensityRatio
movups xmm0 X$DensityRange | divss xmm0 xmm4 | movd D$eax xmm0
EndP
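The shuffle equates and the SSE2_MAX_4FLOATS macro are easier to follow with a scalar model. The sketch below is only a plain C model under stated assumptions, not the routine itself: `pshufd_model`, `maxps_model` and `max_4floats_model` are hypothetical names, a register is modelled as an array of 4 floats with index 0 as the lowest dword, and the `SHUFFLE` macro repeats the packing used in the listing so the equate values can be checked against it.

```c
#include <stdint.h>

/* Same packing as the SHUFFLE macro in the listing: the first argument
   selects the source dword for the highest destination dword, the last
   argument selects it for the lowest. */
#define SHUFFLE(d3, d2, d1, d0) \
    (((d3) << 6) | ((d2) << 4) | ((d1) << 2) | (d0))

/* Scalar model of pshufd: dst[i] = src[(imm >> 2*i) & 3], i = 0 is lowest.
   A temporary is used so dst and src may be the same array. */
static void pshufd_model(float dst[4], const float src[4], unsigned imm)
{
    float tmp[4];
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm >> (2 * i)) & 3u];
    for (int i = 0; i < 4; i++)
        dst[i] = tmp[i];
}

/* Scalar model of maxps: each lane keeps dst if dst > src, else takes src. */
static void maxps_model(float dst[4], const float src[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = (dst[i] > src[i]) ? dst[i] : src[i];
}

/* Model of SSE2_MAX_4FLOATS a b: afterwards every lane of a holds the
   largest of the 8 input values; b is used as scratch. */
static void max_4floats_model(float a[4], float b[4])
{
    maxps_model(a, b);
    pshufd_model(b, a, 177); /* SSE_SWAP_DWORDS: swap inside each qword */
    maxps_model(a, b);
    pshufd_model(b, a, 78);  /* SSE_ROTATE_64BITS: swap the two qwords */
    maxps_model(a, b);
}
```

With a = {1, 7, 3, 2} and b = {4, 0, 9, 5}, the three max steps leave 9 in all four lanes of a, which is why the proc can read the result from the low dword.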
Quote from: NoCforMe on May 29, 2024, 04:36:07 AM
I've said it before and I'll say (write) it again:
64-bit programming for me and I think it's safe to say a lot of the other coders around here is total overkill. Completely unnecessary and not worth the additional headaches and hassles.
I exempt the few professionals here who depend on 64-bit coding for their livelihood or for special programming needs (huge datasets, etc.). But for the rest of us, it's just not needed.
Hell, even 32-bit is overkill for a lot of the stuff I do (and I imagine what others here code). Sometimes a little-bitty 8-bit µprocessor is all that's needed to do the job.

It's more worthwhile with 128-bit or wider SIMD to make things go faster. Much older code worked on only 8-bit or 16-bit data, and that is worth investigating: 8-bit code ported to 128-bit SIMD handles 16 elements per instruction, while 32-bit code ported to 128-bit SIMD handles only 4, so at best it runs only 4 times faster.
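The arithmetic behind the 8-bit versus 32-bit SIMD comparison can be sketched in a few lines of C. This is only a lane-counting model with hypothetical helper names (`lanes`, `vector_ops`); it gives the ideal upper bound and says nothing about memory bandwidth or instruction latency.

```c
#include <stddef.h>

/* Number of elements that fit in one SIMD register. */
static unsigned lanes(unsigned register_bits, unsigned element_bits)
{
    return register_bits / element_bits;
}

/* Register-wide operations needed to touch n elements once (rounded up). */
static size_t vector_ops(size_t n, unsigned register_bits,
                         unsigned element_bits)
{
    size_t per_op = lanes(register_bits, element_bits);
    return (n + per_op - 1) / per_op;
}
```

For 1024 elements in 128-bit registers, this gives 64 operations for bytes against 256 for dwords, which is the 16-lane versus 4-lane difference.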