News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

Fast median algorithm

Started by guga, July 18, 2020, 02:22:35 AM

Previous topic - Next topic

Siekmanski

Hi guga,

-> I´l try to go further without those Gx and Gy and use only the G to see what happens. :thumbsup:

Yes, I think that's the logic behind it all.

->  W_mod = np.sqrt(np.square(gradx) + np.square(grady)) <---- This is, actually the G. Always positive.
->  W_mod = PlotImage(W_mod) ; What is this ?

Construct the edge detection image with the G values.


As far as I understand so far, correct me if I'm wrong....

- Perform edge detection on all 100 images with the Sobel formula.
- Find the medians, collecting 100 values at all the same x,y locations from 100 edge detected images.
- Use those median values to construct a new image.
- Detect the location of the watermark.
- Perform some magic, to remove the watermark?
Creative coders use backward thinking techniques as a strategy.

guga

Hi marinus

Quote from: Siekmanski on July 22, 2020, 09:39:07 PM
As far as I understand so far, correct me if I'm wrong....

- Perform edge detection on all 100 images with the Sobel formula.
- Find the medians, collecting 100 values at all the same x,y locations from 100 edge detected images.
- Use those median values to construct a new image.
- Detect the location of the watermark.
- Perform some magic, to remove the watermark?

Yes...that´s correct. But it can look for way more then 100 images. It always depends on how similar the images are. On different images, scanning among 100 is enough to estimate the watermark, but, on similar images (like sequences of scenes from a video), then i presume it would be necessary way more then that due to the similarity of the scenes.

The worst part is the magic :mrgreen: :mrgreen: :mrgreen:  I could understand how to find the edges of the watermark but, retrieve the alpha channel and reconstruct is another story. I´m not into that yet. And to things get worst, the guy who succeeded to make it work after reading the google articles, made it in python, instead in C or any other language that people could easily give a try and eventually fix it.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

 :biggrin:

Do you have the links to the Google articles?
Creative coders use backward thinking techniques as a strategy.

guga

Quote from: Siekmanski on July 22, 2020, 11:20:45 PM
:biggrin:

Do you have the links to the Google articles?

Yes...I posted it here in page 2 of the thread, but here are the links and the pdf

All the method is described here from google:
https://watermark-cvpr17.github.io/supplemental/
https://watermark-cvpr17.github.io/
https://ai.googleblog.com/2017/08/making-visible-watermarks-more-effective.html

And the link to the python version in github i posted.

If you want to read the google documentation that describes the algorithm it can be find here:
http://openaccess.thecvf.com/content_cvpr_2017/papers/Dekel_On_the_Effectiveness_CVPR_2017_paper.pdf     <------ Google Article in pdf
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

I found it where VisualStudio generated negative values. It was the way the python version was created. It does not calculate the media of G directly, it calculate the median of Gx and later the median of Gy and only after it, it compute the value of G = sqrt (MedianGx^2 + MedianGy^2)


def estimate_watermark(foldername):
(...)
    # Compute median of grads
    print("Computing median gradients.")
    Wm_x = np.median(np.array(gradx), axis=0)
    Wm_y = np.median(np.array(grady), axis=0)
    return (Wm_x, Wm_y, gradx, grady)


And it only calculated the value Of G after getting the median of Gy and Gy separately at:

def crop_watermark(gradx, grady, threshold=0.4, boundary_size=2):
    """
    Crops the watermark by taking the edge map of magnitude of grad(W)
    Assumes the gradx and grady to be in 3 channels
    @param: threshold - gives the threshold param
    @param: boundary_size - boundary around cropped image
    """
    W_mod = np.sqrt(np.square(gradx) + np.square(grady)) <-------


I have no idea why it is doing like that, rather then calculate the median directly whenever the values of gx and gy are retrieved. Well...i´ll continue working on the crop_watermark function, but, if you could, maybe  doing a algo for calculating negative values of media (so range from -255 to 255), could be good to we see if (or how) it affects the result.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

Quote from: guga on July 23, 2020, 12:01:26 AM
I found it where VisualStudio generated negative values. It was the way the python version was created. It does not calculate the media of G directly, it calculate the median of Gx and later the median of Gy and only after it, it compute the value of G = sqrt (MedianGx^2 + MedianGy^2)

Hmmm... if that's the case, my routine has no use at all, calculating 6 numbers will not be beneficial.
Sorting will be faster.
Creative coders use backward thinking techniques as a strategy.

hutch--

If you are talking about sorting small number sequences, -256 - 256 as FP values, you should be able to do that with one of the simpler exchange sorts but the comparison would have to be done in FP in the sort. With such a low range of numbers, a bucket sort of some type would probably be the fastest, simply an array that you increment each member when it occurs in the numbers being sorted. This better suits integers but if the FP precision is not a big deal, rounding the values to integer may be viable. Once the array if filled, just grab the half way mark.

HSE

To work with range -256 to 256 you can add 256 to values then you range became 0 to 512. Later substract 256 to result.

If you want to make integers from real4 in the range 0-1, you have to multiply previously by 100, 1000 or precision you requiere. Later divide result for that number.
Equations in Assembly: SmplMath

nidud

#83
deleted

guga

Tks a lot Nidud. I´ll give a try.

I suceeded to compile the signed integer version (CalculateMedian ), but when trying to assemble the SSE one (in quickeditor), those are the errors i´ve got.

C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2070:invalid instruction operands
compare(1): Macro Called From
  swapif(9): Macro Called From
   C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2070:invalid instruction operands
compare(2): Macro Called From
  swapif(9): Macro Called From
   C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2008:syntax error : .
swapif(10): Macro Called From
  C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2070:invalid instruction operands
swapif(11): Macro Called From
  C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2070:invalid instruction operands
swapif(12): Macro Called From
  C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : error A2070:invalid instruction operands
swapif(13): Macro Called From
  C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code
C:\masm32\examples\NidudMedian\3dframes2.asm(135) : fatal error A1011:directive must be in control block
swapif(14): Macro Called From
  C:\masm32\examples\NidudMedian\3dframes2.asm(135): Main Line Code


I suceeded to dl your AsmC to make it run, but don´t know how to generate the executable file so i can see what are those macros used in 'median' function all about. Can you make a executable file for the SSE/Float version one (32 bits, pls), so i can test ?
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

Hi Nidud

Can you please make a small test ? 

Check the median (The integer version: CalculateMedian) for these, pls ? -216,-151,136

The correct value is -151.

I´m not sure if i ported correctly your version, because the result i´ve got is wrong

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

Hi guga,

A branchless floating point sort algorithm, only using registers.
To calculate the Median of the 6 numbers from a Sobel Matrix [ 3 * 3 ]

align 4
CalculateMedianFP6 proc
; Fill the first 6 xmm registers with the unsorted 6 values
    movaps  xmm0,oword ptr [GrayPixels]
    movlps  xmm4,qword ptr [GrayPixels+16]
    movaps  xmm1,xmm0
    movaps  xmm2,xmm0
    shufps  xmm1,xmm1,00111001b
    shufps  xmm2,xmm2,01001110b
    movaps  xmm3,xmm0
    movaps  xmm5,xmm4
    shufps  xmm3,xmm3,10010011b
    shufps  xmm5,xmm5,00111001b

; Branchless floating point sort algorithm, only using registers.
; 6 numbers = max 12 swaps

    movss   xmm6,xmm1
    movss   xmm7,xmm4
    minps   xmm1,xmm2
    minps   xmm4,xmm5
    maxps   xmm2,xmm6
    maxps   xmm5,xmm7

    movss   xmm6,xmm0
    movss   xmm7,xmm3
    minps   xmm0,xmm2
    minps   xmm3,xmm5
    maxps   xmm2,xmm6
    maxps   xmm5,xmm7

    movss   xmm6,xmm0
    movss   xmm7,xmm3
    minps   xmm0,xmm1
    minps   xmm3,xmm4
    maxps   xmm1,xmm6
    maxps   xmm4,xmm7

    movss   xmm6,xmm1
    movss   xmm7,xmm0
    minps   xmm1,xmm4
    minps   xmm0,xmm3
    maxps   xmm4,xmm6
    maxps   xmm3,xmm7

    movss   xmm6,xmm2
    movss   xmm7,xmm1
    minps   xmm2,xmm5
    minps   xmm1,xmm3
    maxps   xmm5,xmm6
    maxps   xmm3,xmm7

    movss   xmm6,xmm2
    minps   xmm2,xmm4
    maxps   xmm4,xmm6
    movss   xmm6,xmm2
    minps   xmm2,xmm3
    maxps   xmm3,xmm6
   
    addss   xmm2,xmm3
    mulss   xmm2,real4 ptr [Half] ; Median in xmm2
    ret

CalculateMedianFP6 endp
Creative coders use backward thinking techniques as a strategy.

guga

Thanks a lot, Marinus, i´ll give a try and check the results :thumbsup: :thumbsup: :thumbsup: :thumbsup:
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

#88
Quote from: Siekmanski on July 24, 2020, 12:53:45 AM
Hi guga,

A branchless floating point sort algorithm, only using registers.
To calculate the Median of the 6 numbers from a Sobel Matrix [ 3 * 3 ]

align 4
CalculateMedianFP6 proc
; Fill the first 6 xmm registers with the unsorted 6 values
    movaps  xmm0,oword ptr [GrayPixels]
    movlps  xmm4,qword ptr [GrayPixels+16]
    movaps  xmm1,xmm0
    movaps  xmm2,xmm0
    shufps  xmm1,xmm1,00111001b
    shufps  xmm2,xmm2,01001110b
    movaps  xmm3,xmm0
    movaps  xmm5,xmm4
    shufps  xmm3,xmm3,10010011b
    shufps  xmm5,xmm5,00111001b

; Branchless floating point sort algorithm, only using registers.
; 6 numbers = max 12 swaps

    movss   xmm6,xmm1
    movss   xmm7,xmm4
    minps   xmm1,xmm2
    minps   xmm4,xmm5
    maxps   xmm2,xmm6
    maxps   xmm5,xmm7

    movss   xmm6,xmm0
    movss   xmm7,xmm3
    minps   xmm0,xmm2
    minps   xmm3,xmm5
    maxps   xmm2,xmm6
    maxps   xmm5,xmm7

    movss   xmm6,xmm0
    movss   xmm7,xmm3
    minps   xmm0,xmm1
    minps   xmm3,xmm4
    maxps   xmm1,xmm6
    maxps   xmm4,xmm7

    movss   xmm6,xmm1
    movss   xmm7,xmm0
    minps   xmm1,xmm4
    minps   xmm0,xmm3
    maxps   xmm4,xmm6
    maxps   xmm3,xmm7

    movss   xmm6,xmm2
    movss   xmm7,xmm1
    minps   xmm2,xmm5
    minps   xmm1,xmm3
    maxps   xmm5,xmm6
    maxps   xmm3,xmm7

    movss   xmm6,xmm2
    minps   xmm2,xmm4
    maxps   xmm4,xmm6
    movss   xmm6,xmm2
    minps   xmm2,xmm3
    maxps   xmm3,xmm6
   
    addss   xmm2,xmm3
    mulss   xmm2,real4 ptr [Half] ; Median in xmm2
    ret

CalculateMedianFP6 endp


Hi Marinus...got it. The speed is awesome  :greenclp: :greenclp: :greenclp:! But the results are being switched among xmm2 and xmm3. I mean, sometimes it puts the correct result in xmm2 and others in xmm3. Check this values as input, pls:

[FloatDataSelect12: F$ 136, -151, -216, -40, -13, -215, -24, -104, 230] ; Median = -40. The correct result is in xmm3
[FloatDataSelect13: F$ -36, -51, -16, -10, -213, -115, -214, -14, -230] ; Median = -51. The correct result is in  xmm2
[FloatDataSelect14: F$ 0.565856, 0.9898, 0.1112545, 0.55897941, 0.00214548, 0.4564585, 0.6587978, 0.115545, 0.98989] ; Median = 0.55897941. The correct result is in xmm3
[FloatDataSelect15: F$ -0.565856, 0.9898, -0.1112545, -0.55897941, -0.00214548, 0.4564585, -0.6587978, 0.115545, -0.98989] ; Median = -0.1112545. The correct result is in xmm2

Same thing happens when the input is all negative or all positive:
[FloatDataSelect16: F$ 2, F$ 5, F$ 1, F$ 3, F$ 6, F$ 4, F$ 4, F$ 4, F$ 4] ; Median = 4. Result in xmm3
[FloatDataSelect17: F$ -2, F$ -5, F$ -1, F$ -3, F$ -6, F$ -4, F$ -4, F$ -4, F$ -4] ; Median = -4. Result in xmm2

The problem seems to be at the end of the code. Some comparison maybe needed to identify which of the 2 registers (xmm2 or xmm3) contains the correct value. I´m not sure about what is missing to compare, but when we reach at the last "maxps XMM3 XMM6", i´ve got this conditions from xmm2 and xmm3:

Examples of result

Analysis from Result1
[FloatDataSelect15: F$ -0.565856, 0.9898, -0.1112545, -0.55897941, -0.00214548, 0.4564585, -0.6587978, 0.115545, -0.98989] ; Median = -0.1112545. result in xmm2
xmm2 and xmm3 are negative.
xmm2 = -1.112544e-1
xmm3 = -2.145479e-3

Result should be -1.11254e-1 From xmm2

Analysis from Result2
[FloatDataSelect14: F$ 0.565856, 0.9898, 0.1112545, 0.55897941, 0.00214548, 0.4564585, 0.6587978, 0.115545, 0.98989] ; Median = 0.55897941. result in xmm3
xmm2 and xmm3 are positive.
xmm2 = 4.564585e-1
xmm3 = 5.58979e-1

Result should be 5.58979e-1 From xmm3

Analysis from Result3
[FloatDataSelect13: F$ -36, -51, -16, -10, -213, -115, -214, -14, -230] ; Median = -51. result in xmm2
xmm2 and xmm3 are negative.
xmm2 = -51
xmm3 = -36

Result should be -51 From xmm2

Analysis from Result4
[FloatDataSelect12: F$ 136, -151, -216, -40, -13, -215, -24, -104, 230] ; Median = -40. Result in xmm3
xmm2 and xmm3 are negative.
xmm2 = -151
xmm3 = -40

Result should be -40 From xmm3

Can you please check with those values to see what´s happening for it be switching the results ?


Note:

This new algo can be used to calculate the median of sobelX and SobelY of a matrix on one single image, right ?  This is not for N images where we get the Sobel of each pixel at specific coordinates of each image, i presume. In any case, this new algo we can use to calculate the median on the boarders of each image, after getting the median of all N images at specified positions.

So, i presume the algo is handling the matrix 3*3 of image 1. Then, to fix the border issue, all we need to do is:

1- Compute the median of Sobelx and SobelY on all images (not only limited to 100), with other algo similar to this new algo you did. So, the algo will take the pixel values on coord x/y from image 1 until image N, and get the median of them. Save this median related to posx/y to the output at same location x/y. Then go to the next byte at pos x+1/y on all images from 1 to N, get the median of those values and save the result in x+1/y on output...It loops the routine untill ll N images calculated the median of Sobel on all pixels at x/y coords.
The routine maybe restricted to width and height that are divisible by 3, so we can later fill the border with this new algo you did.

2 - After computing the median of sobelx and sobely on all images, we will end up creating two memory buffers containing the median of Sobels x and Y on all images, except the values at their boarders (only on the cases when the width and height are not divisible by 3), and then we simply use this new algo you create to fix the median of Sobelx and SobelY on those 2 created Buffers, whose boarders needs to be fixed.

So, please can you fix this algo (so we can later use it to fix the boarder issues), and later we give a try on other one to get the median of all N images ?
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

nidud

#89
deleted