Recent posts

#71
The Workshop / Re: Line-drawing routine
Last post by FORTRANS - May 31, 2024, 12:20:22 AM
Hi,

Quote from: NoCforMe on May 30, 2024, 09:42:36 AM
Steve, saw the images; interesting.

   Thanks.

Quote: Did you post code for that?

   Nobody replied to that post, so I assume not.

Quote: or if not, can you? I'm curious to see how that antialiasing works

   Okay.  The code was copied from a PDF using OCR and editing.
References should be mentioned both in the code and that other
ZIP.  Some errors may exist and I altered some comments.  You
should read the original text by Abrash.  If a moderator is
worried that it's not original code by me: croak it.

   I am also assuming you are not concerned about the gamma
correction mentioned in the prior post.

Cheers,

Steve N.
#72
The Campus / Re: Number confusion
Last post by daydreamer - May 30, 2024, 10:49:08 PM
Quote from: NoCforMe on May 30, 2024, 09:43:27 AM
Magnus, SIMD would be even more overkill for the programming I do. No thanks.
That's my kind of fun. Others like GP register code, others FPU, others OpenGL/D3D code, others like making MasmBasic or their own assembler...
#73
16 bit DOS Programming / Re: Line Drawing
Last post by NoCforMe - May 30, 2024, 11:06:58 AM
Kewl, psychedelic.
#74
16 bit DOS Programming / Re: Line Drawing
Last post by tda0626 - May 30, 2024, 10:35:54 AM
Quote from: NoCforMe on May 30, 2024, 09:41:44 AM
CWD was supported way back from the beginning, on the 8086.

What was the value of AX before CWD? Are you sure it was negative?

I wish I had taken a screenshot of it. Unfortunately I can't remember, but my conditional code fixed it. I will put your bit of code back in and test for that condition again, but this time I will take screenshots and let you know.

Made a loop to test for all cases and it appears to draw everything correctly now, see picture.


#75
The Campus / Re: Number confusion
Last post by NoCforMe - May 30, 2024, 09:43:27 AM
Magnus, SIMD would be even more overkill for the programming I do. No thanks.
#76
The Workshop / Re: Line-drawing routine
Last post by NoCforMe - May 30, 2024, 09:42:36 AM
Steve, saw the images; interesting. Did you post code for that? Or if not, can you? I'm curious to see how that antialiasing works.
#77
16 bit DOS Programming / Re: Line Drawing
Last post by NoCforMe - May 30, 2024, 09:41:44 AM
CWD was supported way back from the beginning, on the 8086.

What was the value of AX before CWD? Are you sure it was negative?
#78
16 bit DOS Programming / Re: Line Drawing
Last post by tda0626 - May 30, 2024, 08:14:23 AM
This is odd. I was having trouble with one case in my line drawing program that didn't make sense to me. If deltay > deltax (steep slope) but deltax was negative, it would not draw the line correctly. Right before the conditional code that decides which kind of line to draw and whether to swap the x and y values, I had NoCforMe's ABS() code, which I used to convert the negative value to a positive one.

mov ax, deltax
cwd
xor ax, dx
sub ax, dx
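
For readers who want to trace the four instructions above by hand, here is a minimal C sketch of the same branchless absolute value. The function name `abs16` is made up for illustration, and the variable `sign` stands in for DX after CWD (all ones when AX is negative, zero otherwise). It is a model of the arithmetic only, not of the poster's program.

```c
#include <stdint.h>

/* Branchless 16-bit absolute value, modelled on MOV / CWD / XOR / SUB.
   abs16 is a hypothetical name; `sign` plays the role of DX after CWD. */
uint16_t abs16(int16_t x)
{
    uint16_t ax   = (uint16_t)x;
    uint16_t sign = (x < 0) ? 0xFFFFu : 0u; /* CWD: DX = sign bit of AX, replicated */

    ax ^= sign;                  /* XOR AX, DX: one's complement if negative */
    ax = (uint16_t)(ax - sign);  /* SUB AX, DX: subtracting 0FFFFh adds 1    */
    return ax;
}
```

If `sign` stays zero for a negative input, both the XOR and the SUB do nothing and the value comes back unchanged, which matches the symptom described in this post.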

I fired up DOS DEBUG and stepped to the part of the code that does the ABS() routine, but it does not sign-extend into DX after the CWD instruction. Anyone know why? As far as I know, that instruction is supported on 386 and higher. The DX register contains zero after the instruction is executed.

In the meantime, I just used this bit of code to convert it, and now it works, but I'm still curious as to why CWD isn't working, because if you ask me that is a pretty good method to get the ABS().

mov ax, deltax
.if ax  & 08000h
neg ax
.endif

Tim
#79
The Laboratory / Re: DBScan, Homogeneity and Co...
Last post by guga - May 30, 2024, 06:14:45 AM
Hi Six_L, thanks.

It was hard to understand and port this thing, but I succeeded. Later I'll try to write a sort of tutorial on how it works (at least the part shown in the article from Burdwan University).

Basically, for every pixel it takes only the maximum difference between that pixel and its 8 neighbors. First it converts the image to grayscale and uses only the single value corresponding to grey; then it creates a map of the grey pixels (size = width*height*4, where the 4 is because each entry is a dword; in fact, I converted the values to Real4 to gain extra speed with SSE2 and to make the computation easier). For example, say that in a certain region of the image the pixels (already converted to grey) are laid out as:

99    140    185
102    63    162
58    113    138

And we are at the center of the area, that is, at the pixel whose value is 63.

The original article (and every other source I have read so far) tells us to get the difference between each neighbor and the pixel we are at: (99-63), (140-63) ... (138-63). Some articles then say we have to take the minimum of the maximum distances, and so on, while the article from Burdwan University says to take only the maximum difference among them and use the resulting value to create a density map.

But all of this consumes a lot of processing time. So all I had to do was go straight to finding the maximum value of the 8 neighbor pixels and simply take the difference between it and the value at the center.

So, instead of calculating all 8 deltas and only later finding the biggest of them, all that is needed is the maximum of those 8 values and one direct subtraction. Then we divide the result by 8 (the number of neighbor pixels).

So, in this example, the maximum value of the neighbor pixels = 185. Therefore, the value of the density map = (185-63)/8.
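
The arithmetic of this example can be checked with a small scalar C sketch. The function name `density_3x3` and the row-major 9-element layout are assumptions made for illustration; the formula itself is the one stated in the post.

```c
/* Density of the centre pixel of a 3x3 grey block stored row-major,
   using the formula from the post: |max of 8 neighbours - centre| / 8.
   density_3x3 is a hypothetical helper name. */
float density_3x3(const float p[9])
{
    float center = p[4];
    float max_nb = p[0];

    for (int i = 1; i < 9; ++i) {
        if (i == 4) continue;            /* skip the centre itself */
        if (p[i] > max_nb) max_nb = p[i];
    }

    float delta = max_nb - center;
    if (delta < 0.0f) delta = -delta;    /* absolute value */
    return delta / 8.0f;
}
```

For the block above, the brightest neighbour is 185 and the centre is 63, so the function returns 122/8 = 15.25.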

Using SSE2 it takes only a few operations to do that. All we need to do is locate the starting addresses of the rows above and below (3 pixels each), load them into xmm registers and calculate their maximum value.
Then we do the same, but only for the pixels to the left and right of the center, and get the max of both.
And finally, we get the maximum of the two resulting values (Top/Bottom and Left/Right).
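
The three steps above can be sketched in C with SSE2 intrinsics. This is an illustration under assumptions rather than a port of the poster's routine: the function names are invented, the buffer is assumed to hold Real4 (float) values with at least one pixel of border around the centre, and `stride` is the row length in floats. The shuffle constants 177 and 78 are the ones the post's equates call SSE_SWAP_DWORDS and SSE_ROTATE_64BITS.

```c
#include <emmintrin.h>  /* SSE2 intrinsics, x86/x86-64 only */

/* Lane-wise max of a and b, then a horizontal max broadcast to all lanes. */
static __m128 max_4floats(__m128 a, __m128 b)
{
    __m128 m = _mm_max_ps(a, b);
    /* 177: swap the two dwords inside each qword */
    __m128 t = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(m), 177));
    m = _mm_max_ps(m, t);
    /* 78: swap the two qwords */
    t = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(m), 78));
    return _mm_max_ps(m, t);
}

/* Maximum of the 8 neighbours of *c. Hypothetical helper name. */
float max_neighbour_sse2(const float *c, int stride)
{
    const float *u = c - stride - 1;   /* up-left pixel   */
    const float *d = c + stride - 1;   /* down-left pixel */

    /* Three pixels per row; the third is duplicated into the unused 4th lane. */
    __m128 up   = _mm_set_ps(u[2], u[2], u[1], u[0]);
    __m128 down = _mm_set_ps(d[2], d[2], d[1], d[0]);
    __m128 tb   = max_4floats(up, down);                 /* top/bottom    */

    __m128 lr = _mm_max_ps(_mm_set1_ps(c[-1]), _mm_set1_ps(c[1])); /* left/right */

    return _mm_cvtss_f32(_mm_max_ps(tb, lr));
}
```

Duplicating the third pixel into the fourth lane keeps the unused lane from influencing the maximum, which is what the `{SHUFFLE 2,2,1,0}` step does in the assembly.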

There's no need to subtract all 8 pixels, collect the results, make them absolute and only then take their maximum. This is because the largest delta normally belongs to the neighbor with the maximum value. (Strictly, that holds when the brightest neighbor is at least as far from the center as the darkest one; otherwise the largest absolute delta comes from the minimum neighbor.)

So, all we need is find the max value of the 8 pixels and subtract it from the center. And then divide the resultant delta by 8 (Because we have 8 neighbor pixels)
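
To see when the shortcut matches the full computation, the following C sketch evaluates both on a 3x3 block. The function names are invented for this comparison, and the second test block mentioned in the note below is made-up data, not taken from the post.

```c
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Full method: largest |neighbour - centre| over the 8 neighbours. */
float max_abs_delta(const float p[9])
{
    float best = 0.0f;
    for (int i = 0; i < 9; ++i) {
        if (i == 4) continue;
        float d = abs_f(p[i] - p[4]);
        if (d > best) best = d;
    }
    return best;
}

/* Shortcut from the post: |max neighbour - centre|. */
float shortcut_delta(const float p[9])
{
    float max_nb = p[0];
    for (int i = 1; i < 9; ++i)
        if (i != 4 && p[i] > max_nb) max_nb = p[i];
    return abs_f(max_nb - p[4]);
}
```

On the block from the post both functions return 122. On a block whose centre is much brighter than every neighbour, for instance a centre of 200 surrounded by values between 10 and 50, the full method returns 190 while the shortcut returns 150, so the two density maps can differ at such pixels.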

Since the algo takes the 8 pixels surrounding each one, the best approach is to create the grey table with an extended size from the start: width+2 and height+2 (one extra pixel on each side), with the image placed at the center of that buffer. The 1-pixel borders carry no image data, but they prevent crashes and avoid special-case border handling all over the code.

This way, if we start the algo at, say, position x = 0 and it tries to get the pixel on the left, it won't crash, and since that value is 0 it won't affect the result too much. The exception is the borders and corners, where only 3 to 5 real pixels go into the max instead of the regular 8.


So, in memory, our grey image map will be like this:

Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding
Padding  148      96       98       98       66       139      142      24       Padding
Padding  120      90       78       149      161      101      112      46       Padding
Padding  97       97       99       140      185      180      105      140      Padding
Padding  106      99       102      63       162      115      115      139      Padding
Padding  95       107      58       113      138      82       134      176      Padding
Padding  97       102      91       53       121      78       156      204      Padding
Padding  126      93       68       57       145      155      135      58       Padding
Padding  137      95       33       86       122      169      66       37       Padding
Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding  Padding


Here Padding is nothing more than a null (zero) value used to handle the borders and corners.
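
A padded grey map like the one above could be built along these lines. This is a C sketch under assumptions: the function name `make_padded_grey` and its parameters are hypothetical, the grey values are stored as floats, and the caller is responsible for freeing the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled (width + 2*pad) x (height + 2*pad) float buffer with
   the grey image copied into the middle, or NULL if allocation fails.
   calloc's all-zero bytes read back as 0.0f on IEEE-754 machines.
   make_padded_grey is a hypothetical name; the caller frees the result. */
float *make_padded_grey(const float *grey, int width, int height, int pad)
{
    size_t stride = (size_t)width + 2u * (size_t)pad;
    size_t rows   = (size_t)height + 2u * (size_t)pad;

    float *buf = calloc(stride * rows, sizeof *buf);
    if (!buf) return NULL;

    for (int y = 0; y < height; ++y) {
        float *dst = buf + ((size_t)y + (size_t)pad) * stride + (size_t)pad;
        memcpy(dst, grey + (size_t)y * (size_t)width, (size_t)width * sizeof *buf);
    }
    return buf;
}
```

With `pad` = 1, an 8x8 image becomes the 10x10 layout shown in the table, and every real pixel has eight readable neighbours.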

The whole function is:


; equates related to pshufd macro
[SSE_ROTATE_RIGHT_96BITS    147]
[SSE_ROTATE_LEFT_32BITS     147]

[SSE_ROTATE_LEFT_96BITS     57]
[SSE_ROTATE_RIGHT_32BITS    57]

[SSE_SWAP_QWORDS 78]         ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_LEFT_64BITS 78]  ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_RIGHT_64BITS 78] ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS
[SSE_ROTATE_64BITS 78]       ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS

[SSE_INVERT_DWORDS 27]      ; invert the order of dwords
[SSE_SWAP_DWORDS 177]       ; Dwords in xmm are copied from 0123 to 1032 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 1,0,3,2}
[SSE_SWAP_LOQWORD 225]      ; Only the 2 dwords in the LowQword are swaped. Dwords in xmm are copied from 0123 to 0132 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 0,1,3,2}
[SSE_SWAP_HIQWORD 180]      ; Only the 2 dwords in the HighQword are swaped. Dwords in xmm are copied from 0123 to 1023 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 1,0,2,3}


; Macros used
[SHUFFLE | ( (#1 shl 6) or (#2 shl 4) or (#3 shl 2) or #4 )] ; Marinus/Sieekmanski
[pshufd | pshufd #1 #2 #3]

[SSE2_MAX_4FLOATS | maxps #1 #2 | pshufd #2 #1 SSE_SWAP_DWORDS | maxps #1 #2 | pshufd #2 #1 SSE_ROTATE_64BITS | maxps #1 #2]

[SSE_ABS_REAL4 | pslld #1 1 | psrld #1 1]


; Variables used

[NeighboursDivision: F$ (1/8), F$ (1/8), F$ (1/8), F$ (1/8)]
[MaxDensity: F$ 0, 0, 0, 0] ; holds the maximum density found, used later to scale the density map
[DensityRange: F$ 255, 255, 255, 255] ; used when scaling the density map

Proc EightNeighbourDistance:
    Arguments @pInput, @pOutput, @Height, @Width, @Padding, @DensityRatio
    Local @CurYPos, @NextRow, @PreviousRow, @NextOutput, @NextInput, @AddPaddingStart
    Uses ebx, esi, edi, edx, ecx

    xor eax eax
    If_Or D@Height = 0, D@Width = 0
        ExitP
    End_If

    movups xmm3 X$NeighboursDivision
    movups xmm4 X$MaxDensity

    mov eax D@Padding | shl eax 2 | mov D@AddPaddingStart eax
    ; calculate the address of the next input row
    mov eax D@Padding | shl eax 1 | add eax D@Width | shl eax 2 | mov D@NextInput eax

    ; calculate the address of the next output row
    mov eax D@Width | shl eax 2 | mov D@NextOutput eax

    mov eax D@Padding | mov ebx eax | add eax D@Width | shl eax 2 | mov D@NextRow eax; 9
    shl ebx 1 | shl ebx 2 | add ebx eax | mov D@PreviousRow ebx
    ; Input  start at

    mov eax D@Height | mov D@CurYPos eax
    mov edx D@pInput | add edx D@NextInput
    mov edi D@pOutput
    ..Do
        mov ebx edx | add ebx D@AddPaddingStart
        xor esi esi
        .Do
            ; get previous line (Up)
            lea eax D$ebx+esi*4 | sub eax D@PreviousRow | movdqu xmm1 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm1 ; convert the integers to floats
            pshufd xmm1 xmm1 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)

            ; get next line (Down)
            lea eax D$ebx+esi*4 | add eax D@NextRow | movdqu xmm0 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm0 ; convert the integers to floats
            pshufd xmm0 xmm0 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)

            SSE2_MAX_4FLOATS xmm0 xmm1

            ; now get left
            movdqu xmm1 X$ebx+esi*4-4 | pshufd xmm1 xmm1 {SHUFFLE 0,0,0,0}

            ; now get right
            movdqu xmm2 X$ebx+esi*4+4 | pshufd xmm2 xmm2 {SHUFFLE 0,0,0,0}
            maxps xmm1 xmm2

            ; and finally get the maximum between up and down and left to right
            maxps xmm1 xmm0

            ; get our core pixel
            mov eax D$ebx+esi*4 | movd xmm0 eax | pshufd xmm0 xmm0 {SHUFFLE 0,0,0,0}

            ; subtract the neighbour maximum from the core pixel
            subps xmm0 xmm1 ; subtract the corresponding floats

            SSE_ABS_REAL4 xmm0 ; and convert them to absolute

            mulps xmm0 xmm3

            ; get the maximum density
            maxps xmm4 xmm0
            movd D$edi+esi*4 xmm0

            inc esi
        .Loop_Until esi => D@Width

        add edx D@NextInput
        add edi D@NextOutput
        dec D@CurYPos
    ..Repeat_Until_Zero

    ; finally calculate the density ratio to be used in our new scale_density_to_image function
    mov eax D@DensityRatio
    movups xmm0 X$DensityRange | divss xmm0 xmm4 | movd D$eax xmm0

EndP
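
As a cross-check for the routine above, here is a plain scalar C sketch of the same idea. It is not a translation of the SSE2 code: the function name, the fixed 1-pixel border and the guard against a completely flat image (which returns 0 instead of dividing by zero) are all assumptions added for illustration.

```c
/* Scalar reference sketch. `in` is a padded buffer of (width+2) x (height+2)
   floats with a 1-pixel zero border; `out` receives width x height densities.
   Returns 255 / max density, or 0 if the image is flat (that guard is an
   addition of this sketch). The function name is hypothetical. */
float eight_neighbour_density(const float *in, float *out, int width, int height)
{
    int stride = width + 2;
    float max_density = 0.0f;

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float *c = in + (y + 1) * stride + (x + 1);

            /* maximum of the 8 neighbours */
            float m = c[-stride - 1];
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;
                    float v = c[dy * stride + dx];
                    if (v > m) m = v;
                }
            }

            float d = c[0] - m;
            if (d < 0.0f) d = -d;      /* absolute delta      */
            d *= 0.125f;               /* divide by 8         */

            out[y * width + x] = d;
            if (d > max_density) max_density = d;
        }
    }
    return (max_density > 0.0f) ? 255.0f / max_density : 0.0f;
}
```

Fed the 3x3 example block surrounded by zeros, the centre output is 15.25, which is also the largest density in that small map, so the returned ratio is 255/15.25.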

I'll clean up the code later, but it only takes 2 functions to do the whole trick. In fact, the real work is done by just one, "EightNeighbourDistance", which creates the density map (size = width*height*4 only). It produces a sort of "PixelDensity" or "DensityPixel", or whatever this should be called. The other function, scale_density_to_image, is only responsible for making that map visible, since the values in the density map are around 10% of the original grey values we fed in. That's why I calculate the maximum density and the ratio inside EightNeighbourDistance in one go, to avoid extra checks in scale_density_to_image.

So, the final value of the pixel (for it to be visible) is basically DensityPixel*DensityRatio

I added a "DensityRatio" parameter to EightNeighbourDistance (the function that creates the density map) to make it easier for scale_density_to_image to make the pixels visible again.

The density ratio is basically this:
Ratio = 255/MaxDensity

MaxDensity is the maximum value found in the density map by EightNeighbourDistance, and 255 is the maximum value of any pixel.

So, if the maximum value found in the density map is, let's say, 28, we simply calculate the ratio r = 255/28.

Then, to make the pixels visible again, another function (scale_density_to_image) takes the value of each "PixelDensity" and multiplies it by our ratio.

To convert the pixel density to visible pixels we simply do:

Final Pixel Value = PixelDensity*(255/28)
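
The scaling step could look roughly like this in C. The post does not show scale_density_to_image, so the function below is a stand-in with an invented name; rounding to the nearest integer and clamping to 0..255 are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Scales each density value by `ratio` (255 / MaxDensity) into an 8-bit
   grey pixel. scale_density_sketch is a hypothetical stand-in for the
   poster's scale_density_to_image; rounding and clamping are assumptions. */
void scale_density_sketch(const float *density, uint8_t *pixels,
                          size_t count, float ratio)
{
    for (size_t i = 0; i < count; ++i) {
        float v = density[i] * ratio + 0.5f;  /* round to nearest */
        if (v > 255.0f) v = 255.0f;
        if (v < 0.0f)   v = 0.0f;
        pixels[i] = (uint8_t)v;
    }
}
```

With MaxDensity = 28, a density of 28 maps to 255, a density of 7 maps to 64 (63.75 rounded) and a density of 0 stays 0.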


Could we use Euclidean distance to calculate the value of the density map? Sure... but the question is why? Using Euclidean distance may have 2 disadvantages:
1 - Speed
2 - Accuracy
The speed problem could be solved using a table of 256 floats (Real4) containing the corresponding Euclidean distance values (since the results range from 0 to 255 anyway). So, the max delta could be used as an index into a table of 256 predefined Euclidean distances.

But the real problem seems to be accuracy. Using Euclidean distances may result in higher density values, which might not be accurate when classifying the clusters. Ex:

If at a given pixel we calculate the density using Euclidean distance:
Maximum value of the 8 neighbors = 90
Center pixel used to get the delta = 40

Euclidean Distance = sqrt(90^2-40^2) = 80.62....

But, using a simple delta, we get:
Density Map (simple delta) = 90-40 = 50

So, it seems to me that a simple delta between the max of the 8 neighbours and the pixel being classified in the density map is enough, and more accurate, not to mention way faster. (This is the same conclusion, by the way, reached in the article from Burdwan University.)
#80
The Campus / Re: Number confusion
Last post by daydreamer - May 30, 2024, 05:28:21 AM
Quote from: NoCforMe on May 29, 2024, 04:36:07 AM
I've said it before and I'll say (write) it again:
64-bit programming for me and I think it's safe to say a lot of the other coders around here is total overkill. Completely unnecessary and not worth the additional headaches and hassles.

I exempt the few professionals here who depend on 64-bit coding for their livelihood or for special programming needs (huge datasets, etc.). But for the rest of us, it's just not needed.

Hell, even 32-bit is overkill for a lot of the stuff I do (and I imagine what others here code). Sometimes a little-bitty 8-bit µprocessor is all that's needed to do the job.
It's more worth it with 128+ bit SIMD to make things go faster. Much older code worked on only 8-bit or 16-bit data, and that is worth investigating, because 8-bit code ported to 128-bit SIMD (16 elements per register) can gain much more than 32-bit code, which only gets about 4 times faster when ported to 128 bits.