Fast median algorithm

guga · July 20, 2020, 04:59:56 AM

Quote from: daydreamer on July 19, 2020, 07:46:10 PM
Quote from: guga on July 19, 2020, 03:36:43 PM
I´m not sure i understood what is wrong with the code as suggested by HSE. I´m testing it here and didn´t found incorrect values so far. What i´m missing ?

For a tiny table of pixels (256 integer values only from 0 to 255) the algorithm seems to work fine, so far. I ported it to RosAsm trying to make it a bit faster (not measured to see what happened), using local variables so i could try to understand what the algo is doing. . Also it only needs to preserve ecx, since the data is stored in local vars, so no extra push/pop chains at the start and end of the function. And i set the data in the table to dwords, to replace routines like this:

Code Select Expand
(...) FindMedianLP: inc eax movzx ecx,byte ptr[edi+eax] add edx,ecx cmp edx,ebx jbe FindMedianLP

To something like:
Code Select Expand
(....) MedianBuffer dd 256 dup (0) (...) FindMedianLP: inc eax add ecx [edi+eax] ; or add ecx [MedianBuffer+eax] to avoid using more registers. cmp edx,ebx jbe FindMedianLP

Can you explain to me how does it is working ? I´m not fully understood why.

And also, does it needs to use a table ? This is because even if the algo can be improved for speed, the table should byte zeroed whenever the function is called and if we are dealing with huge amount of data like pixels in videos or in my case, hundreds of images to be analysed), then, cleaning the table could slow down the algo a little bit.

And another question,. how to make it work with SSE on Float or Real Numbers ? I mean, the algo is working on a array of integers (gray maps), but it won´t work on another table that is a array of floats to used to compute the median of Sobel data.

maybe you should have a big dynamic alloc memory buffer already cleared,so that simple move on to next 256 bytes,next time its called?,your going to use dynamic alloc anyway to process big image data
thats what I mean with SSE MAXPS and MINPS is:
if arraya>arrayb swap(arraya,arrayb) :

arrayb=MAXSS(arraya,arrayb)
arraya=MINSS(arraya,arrayb)
so unroll it MAXPS and MINPS does 4 of these conditional swaps in parallel, which could help with big image data,and MAXSS and MINSS to sort the final few numbers

Hi DayDreamer. yes, it uses a huge dynamic table to work with the data. I commented about this above to JJ. I´m trying to see if the median routine works as expected before go further onto the google documentation and the python version to continue creating this algorithm. The main problem is that i don´t know python language and it is very hard for me to understand exactly what the algorithm is doing precisely. I installed the python version at https://github.com/rohitrango/automatic-watermark-detection but i have no idea how to make it work and neither how to debug it. I tested it on VisualStudio, but the app simply freezes my PC before it reaches here:

Code Select



# W_m is the cropped watermark
W_m = poisson_reconstruct(cropped_gx, cropped_gy)

# Get the number of images in the folder
num_images = len(gxlist)

J, img_paths = get_cropped_images(
    'images/fotolia_processed', num_images, start, end, cropped_gx.shape)

The proper way would be convert this to C, so i could try to see what it is doing and debug it more properly, but i don´t have any idea how to convert python to C. And the worst, the python version uses opencv unnecessarily, because the sobel, canny and even poisson reconstruction can be don on a faster and easier way.

nidud · July 20, 2020, 06:34:20 AM

deleted

jj2007 · July 20, 2020, 07:23:52 AM

Let's make a test case: test1000.zip are 1,000 random DWORDs between 0 and 125,248. The graph shows their distribution when sorted.

HSE · July 20, 2020, 08:48:09 AM

Quote from: jj2007 on July 20, 2020, 07:23:52 AM
Let's make a test case: test1000.zip are 1,000 random DWORDs between 0 and 125,248. The graph shows their distribution when sorted.

Nice. ~~Test fail here~~

My mistake printing result! Perfect here.

Siekmanski · July 20, 2020, 08:52:34 AM

Had the time and looked at the logic again with a fresh mind and rewrote the routine after a bug report from HSE.
I think it has become very fast because of the use of an index buffer, it uses very few iterations.
The speed and iteration count, depend on how many large number values are in the data set.
If there are less large numbers than half of the data set size, they are discarded and not used in the calculation.

Fully commented source code:

Code Select

CalculateMedian proc uses ebx esi edi
    
    mov     esi,offset DataSet
    mov     edi,offset MedianBuffer ; Zero the MedianBuffer before using this routine.

    mov     ecx,PopulationSize
PopulateLP:
    mov     eax,[esi]
    inc     byte ptr [edi+eax]      ; Populate the Median buffer with the values as index numbers  
    add     esi,4                   ; and keep track for duplicates in one go.
    dec     ecx
    jnz     PopulateLP

    xor     esi,esi                 ; Uneven marker.
    mov     ecx,PopulationSize
    mov     ebx,ecx
    shr     ebx,1                   ; The middle of the PopulationSize.
    and     ecx,1                   ; Check even or uneven.
    jnz     UnEven
    dec     ebx                     ; Make even.
    inc     esi                     ; Set marker true = even.
UnEven:
    xor     edx,edx                 ; Median counter.
    mov     eax,-1                  ; Number counter.
FindMedianLP:
    inc     eax
    movzx   ecx,byte ptr[edi+eax]   ; Count of single and/or duplicate numbers. ( zero is not a number )
    add     edx,ecx                 ; Add it to the Median counter.
    cmp     edx,ebx                 ; Compare if we reached the middle of the PopulationSize.
    jbe     FindMedianLP            ; Repeat until we reached the middle.
    dec     esi                     ; Check even or uneven.
    js      Ready                   ; Ready if uneven.
    cmp     ecx,1                   ; Check if next number is a duplicate. ( no averaging needed )
    ja      Ready                   ; Ready if next number is the same value as the previous number.
    mov     edx,eax                 ; Save the first middle number to calculate the average of 2 middle numbers.
FindSecondMiddleNumberLP:
    inc     eax    
    movzx   ecx,byte ptr[edi+eax]
    test    ecx,ecx                 ; If zero, its not a number.
    jz      FindSecondMiddleNumberLP
    add     eax,edx                 ; Second middle number found, add it to the first middle number.
    shr     eax,1                   ; And get the average.
Ready:
    ret
CalculateMedian endp

HSE · July 20, 2020, 09:59:07 AM

Now faill JJ test

Its jumping to Ready when have to obtain the mean.

No problen with CalculateMedian2.

Siekmanski · July 20, 2020, 10:30:51 AM

Hi HSE,
You can not divide a number by 1000, because it is the index number.

It fails, because there are numbers higher than 1023 in the file.
You need an index buffer in byte size as large as the highest number+1 in the file.
And no more than 255 duplicate numbers.
You can increase the duplicates if needed by using a word or dword sized index buffer, but then you have to rewrite the code.

It seems a good idea to use a dword sized indexed buffer.
So guga can use it for his graphics and also get the median of a picture containing only 1 or a few different colors.
Then he needs only an index buffer of 256 * 1 dword = 1024 bytes and can handle 4294967295 duplicate colors.

But I think he is only using 256 color indexed pictures which is smarter and faster.

Siekmanski · July 20, 2020, 11:51:49 AM

Routine for dword sized MedianBuffer ( can handle 4294967295 duplicate numbers/colors )

Waiting for guga how he wants to use this in his graphics routine.....

Code Select

align 4    
CalculateMedian proc uses ebx esi edi
    
    mov     esi,offset DataSet
    mov     edi,offset MedianBuffer ; Zero the MedianBuffer before using this routine.

    mov     ecx,PopulationSize
PopulateLP:
    mov     eax,[esi]
    inc     dword ptr[edi+eax*4]    ; Populate the Median buffer with the values as index numbers  
    add     esi,4                   ; and keep track for duplicates in one go.
    dec     ecx
    jnz     PopulateLP

    xor     esi,esi                 ; Uneven marker.
    mov     ecx,PopulationSize
    mov     ebx,ecx
    shr     ebx,1                   ; The middle of the PopulationSize.
    and     ecx,1                   ; Check even or uneven.
    jnz     UnEven
    dec     ebx                     ; Make even.
    inc     esi                     ; Set marker true = even.
UnEven:
    xor     edx,edx                 ; Median counter.
    mov     eax,-1                  ; Number counter.
FindMedianLP:
    inc     eax
    mov     ecx,[edi+eax*4]         ; Count of single and/or duplicate numbers. ( zero is not a number )
    add     edx,ecx                 ; Add it to the Median counter.
    cmp     edx,ebx                 ; Compare if we reached the middle of the PopulationSize.
    jbe     FindMedianLP            ; Repeat until we reached the middle.
    dec     esi                     ; Check even or uneven.
    js      Ready                   ; Ready if uneven.
    cmp     ecx,1                   ; Check if next number is a duplicate. ( no averaging needed )
    ja      Ready                   ; Ready if next number is the same value as the previous number.
    mov     edx,eax                 ; Save the first middle number to calculate the average of 2 middle numbers.
FindSecondMiddleNumberLP:
    inc     eax    
    mov     ecx,[edi+eax*4]
    test    ecx,ecx                 ; If zero, its not a number.
    jz      FindSecondMiddleNumberLP
    add     eax,edx                 ; Second middle number found, add it to the first middle number.
    shr     eax,1                   ; And get the average.
Ready:
    ret

CalculateMedian endp

HSE · July 20, 2020, 12:44:03 PM

Hi Siekmanski!

Quote from: Siekmanski on July 20, 2020, 10:30:51 AM
You can not divide a number by 1000, because it is the index number.

Yes, I can. Exactly because the index number is the number. I'm scaling down data, and loosing precision, nothing else.

(And fooling JJ's idea of a so big DataBuffer that it's better to sort data

)

Quote from: Siekmanski on July 20, 2020, 10:30:51 AM
It fails, because there are numbers higher than 1023 in the file.

It fails because fails. Maximum number is 125.

Quote from: Siekmanski on July 20, 2020, 10:30:51 AM
Then he needs only an index buffer of 256 * 1 dword = 1024 bytes and can handle 4294967295 duplicate colors.

guga · July 20, 2020, 01:31:27 PM

Tks a lot Siekmanski

I´ll test this new version.

On the previous version, it seems to work fine, but i´ll make more tests on this new one.

So far, yours CalculateMedian i'm using to retrieve the median of a table of grey values on each image. For example, if i have to scan 100 different images, i create 100 tables of gray representing each image. (each one stored in a Dword rather then a single byte)

For example: The routine is at pos (x/y), and the image has a size of 960*720
X = 0, Y = 0 -->It take the 1st dword (That is equal to the gray value) at pos x/y (0:0) on the 1st image and use it in your routine to compare to the x,y of the next image until it reaches the 100th image (Which, btw, are stored on a huge linked table to properly points to the correct address). After it retrieve the median at pos x/y, the routine returns and look at the next dword at pos (x+1), y and to the median calculation again untill it reaches the 100th image. Like this:
X = 1, Y = 0. Calculate the median again on this pos from image 1 to image 100

It does this untill it reaches to the last data of he image, so the last Y = 720.

I´ll see if this new version also works ok.

One thing, can you make it work for Real4 and Real8 as well, using SSE ? The functions in the graphic routines contains tables of arrays that uses Floats representing the Sobel Gradient Values. So, in other table we have things like this:
X = 0, Y = 0. Value = F$ 0.565646 (F$ in RosAsm is Real4 in masm, i suppose. So,m the same size as a dword but used in floating point)
X = 1, Y = 0. Value = F$ 0.7.9898
X = 2, Y = 0. Value = F$ 0.11111

I think that perhaps to make it works with Floats, it would be need to handle it using similar calculations as done in Quartiles. I was reading how a quartile deviation works and found out that the quartile2 (Q2) is equal to the median of any value. So, maybe to work with float all is need is a table with 4 Dwords (Real4, in fact), that represents Quartiles 2 and Quartile 3 (each one of them separated by his low and half part).

ex: We can have a table like this:

Quartile2A______Quartile2B_________Quartile3A_________Quartile3B

And we have a sequence of, let´s say 100 Floating values.
[17.2, 12.5, 19.6, 111.88, 99.41, 55, 88, 63.8, 1544.89, 99.8978........]

To retrieve the median, we can simply fill the 1st 8 Dwords of our table and order them and start scanning for the rest of the dwords/floats using SSE, jumping from 4 to 4. Dwords that don´t match the last one.

So, the 1st step could be ordering the 1st 4 value and putting them on the quartile table:
Quartile2A______Quartile2B_______Quartile3A________Quartile3B
12.5____________17.2____________19.6____________111.88

Then it simply store the next 4 dwords in xmm0 and looks fopr conditions like this:
1 - If all next 4 values are smaller or equal to The 1st quartelion2A, the routine jumps over the next 4 dwords and scans again.
2 - If all next 4 values are bigger then the last quartile3B, it will copy all 4 values onto Quartile2A to Quartile3 in order.

or something like this.

Also, if using a similar way to compute using quartiles (A table containign onlly 2 quartiles - Q2 and Q3 seems to do the trick), perhaps the suggestion of daydreamer to use maxsss and minss could be a good thing

https://www.hackmath.net/en/calculator/quartile-deviation
https://www.mathsisfun.com/data/quartiles.html

Siekmanski · July 20, 2020, 07:56:03 PM

Hi guga,

So, if I did understand it correctly:

1. Convert 100 images to gray values.
2. Get the median of 100 pixels. ( from the same x/y position of all those 100 images )
3. If the image size = 960*720 pixels you'll have at the end 691200 median values for your Sobel function.

Is the above correct?

I asked you this because you use pixel data which is in 4 bytes format per ARGB value.
Every gray pixel value has a range of 0-255 because each color component in the gray RGB value have the same value.
So you can store a gray value as a byte.

My Median routine is based on integer numbers ( used as indexed numbers ) and will not work with Real4 numbers. ( @HSE this is why a data set of Real4 values will fail.

)
Pixel data are integer data, that's why I came up with this Algorithm using indexed numbers.

But you want the Median values to be in Real4 format?
You can convert and save the calculated Median integer value as a real4 value at the end of this Algorithm and use it for the Sobel function. ( and gain 0.5 value precision for double middle numbers )

And from that point you can do your Quartile Sobel Real4 calculations in fast SIMD instructions SSE SSE2 etc.

HSE · July 21, 2020, 12:26:14 AM

Quote from: Siekmanski on July 20, 2020, 07:56:03 PM
My Median routine is based on integer numbers ( used as indexed numbers ) and will not work with Real4 numbers. ( @HSE this is why a data set of Real4 values will fail. )
Pixel data are integer data, that's why I came up with this Algorithm using indexed numbers.

Sorry Siekmanski! I was thinking you know qWord tricks.

Result of fSlv eax = eax/1000 is an integer. Value in eax is moved to a buffer, FPU load integer from buffer, FPU make calculation, FPU store a rounded to integer value in buffer, and finally integer buffer content is moved to eax.

My corrected patch for version that work:

Code Select

        mov ecx, edx
        sub ecx, ebx
        cmp ecx , 1
        jg Done

CalculateMedian2 work; CalculateMedian3 and 4 fail.

Siekmanski · July 21, 2020, 12:41:05 AM

Guga is using pixel data thus that trick is not necessary, the Median routine only reads integers.

guga · July 21, 2020, 03:12:16 AM

Hi Siekmanski.

Quote from: Siekmanski on July 20, 2020, 07:56:03 PM
Hi guga,

So, if I did understand it correctly:

1. Convert 100 images to gray values.
2. Get the median of 100 pixels. ( from the same x/y position of all those 100 images )
3. If the image size = 960*720 pixels you'll have at the end 691200 median values for your Sobel function.

Is the above correct?
I asked you this because you use pixel data which is in 4 bytes format per ARGB value.
Every gray pixel value has a range of 0-255 because each color component in the gray RGB value have the same value.
So you can store a gray value as a byte.

My Median routine is based on integer numbers ( used as indexed numbers ) and will not work with Real4 numbers. ( @HSE this is why a data set of Real4 values will fail. )
Pixel data are integer data, that's why I came up with this Algorithm using indexed numbers.

Yes, that´s correct, but not in that order. The median calculation is done after sobel magnitudes are found. Basically it does:
1 - Scans a directory of images, select the images that fits to the resolution of 960*720
2 - Create a buffer large enough to store those images that wil be converted onto different tables: grey, sobelx, sobelx and sobel.
3 - Convert all chosen images to gray and put the data on the proper buffer location to be used as a map. (For now, all 3 channels are preserved, but it can be fixed later to decrease the size of the GrayTable)
4 - Create the Maps for all 3 types of Sobel (Sobelx, Sobely and Sobel itself) from the grey values found in the grey map. The sobel values are in Real4 because they are the fixed normalization of the luminosity (so, the true value - corrected when needed - and then divided by 255)
5 - Calculate the median of each Sobel Maps (Sobelx, SobelY and Sobel). On this step, since i wanted to see the results, i used your median algo to convert the values to integer to create the corresponding bitmap, so i can see what it is going on.
6 - Generate a poisson reconstruction algorithm somehow....
7 - Read the google papers, but have no idea what to do next, yet.

I realize that those steps can be unnecessary since i could simply take the median of greys (in integer) and therefore generate one single sobel map. (or a maximum of 3: sobelX, sobelY and sobel), but since i don´t know what the next steps are doing (in the python version) i choose to keep all the maps so i could later try to understand if the rest of the watermark functions are using those maps or not. One of the parts of the google algorithm uses a possion reconstruction routine and also convert something to canny (I presume it is converting all 100 gray images to canny as well...but don´t know yet)

Quote from: Siekmanski on July 20, 2020, 07:56:03 PM
But you want the Median values to be in Real4 format?
You can convert and save the calculated Median integer value as a real4 value at the end of this Algorithm and use it for the Sobel function. ( and gain 0.5 value precision for double middle numbers )

And from that point you can do your Quartile Sobel Real4 calculations in fast SIMD instructions SSE SSE2 etc.

Yes, please. If you can, could you try a version to work with Real4 ? On this way i could properly use it in step "5" (and others) that are using Real4 data rather then integers to detect the watermark. This is because i´m not sure if calculate the median previously biased only on the integer values of the grey will also works for the rest of the detect watermark algorithm. So i could try later both methods and see which one is getting closer to a result according to the google article.

I presume it could work, but since i don´t know python i can follow/debug and test the resultant values, so i´m basically reading google article, reading the python source and making a trial and error to find if the routines are according to the article.

I´m currently working only on this part of the python routine:

Code Select



def estimate_watermark(foldername):
    """
    Given a folder, estimate the watermark (grad(W) = median(grad(J))) follow Eq.4
    Also, give the list of gradients, so that further processing can be done on it
    """
    if not os.path.exists(foldername):
        warnings.warn("Folder does not exist.", UserWarning)
        return None

    images = []
    for r, dirs, files in os.walk(foldername):
        # Get all the images
        for file in files:
            img = cv2.imread(os.sep.join([r, file]))
            if img is not None:
                images.append(img)
            else:
                print("%s not found." % (file))

    # Compute gradients
    print("Computing gradients.")
    gradx = list(map(lambda x: cv2.Sobel(
        x, cv2.CV_64F, 1, 0, ksize=KERNEL_SIZE), images))
    grady = list(map(lambda x: cv2.Sobel(
        x, cv2.CV_64F, 0, 1, ksize=KERNEL_SIZE), images))

    # Compute median of grads
    print("Computing median gradients.")
    Wm_x = np.median(np.array(gradx), axis=0)
    Wm_y = np.median(np.array(grady), axis=0)

    return (Wm_x, Wm_y, gradx, grady)

Which is called from:

Code Select


gx, gy, gxlist, gylist = estimate_watermark('images/fotolia_processed')

So, if i understood correctly:
gxlist, gylist are all the 100 sobelmaps. The question is: Are they being used again ??? I don´t know yet
gx, gy are the median maps (2 only) of the sobel ones.

I just don´t know if all gxlist will be used or i can use only the median values previously calculate because the next routine is the poisson reconstruction and i didn´t fully understood yet what it is doing. The possion is build as:

Code Select



def poisson_reconstruct(gradx, grady, kernel_size=KERNEL_SIZE, num_iters=11, h=0.1,
                        boundary_image=None, boundary_zero=True):
    """
    Iterative algorithm for Poisson reconstruction. 
    Given the gradx and grady values, find laplacian, and solve for image
    Also return the squared difference of every step.
    h = convergence rate
    """
    fxx = cv2.Sobel(gradx, cv2.CV_64F, 1, 0, ksize=kernel_size)
    fyy = cv2.Sobel(grady, cv2.CV_64F, 0, 1, ksize=kernel_size)
    laplacian = fxx + fyy
    m, n, p = laplacian.shape

    if boundary_zero == True:
        est = np.zeros(laplacian.shape)
    else:
        assert(boundary_image is not None)
        assert(boundary_image.shape == laplacian.shape)
        est = boundary_image.copy()

    est[1:-1, 1:-1, :] = np.random.random((m-2, n-2, p))
    loss = []

    for i in range(num_iters):
        old_est = est.copy()
        est[1:-1, 1:-1, :] = 0.25*(est[0:-2, 1:-1, :] + est[1:-1, 0:-2, :] +
                                   est[2:, 1:-1, :] + est[1:-1, 2:, :] - h*h*laplacian[1:-1, 1:-1, :])
        error = np.sum(np.square(est-old_est))
        loss.append(error)

    return (est)

See ? The poisson routine is calculating sobel all over again but this time it seems to be on the target file from which the watermark should be removed, and i have no idea if that is really necessary.
The poisson is called from this part of the code:

Code Select



gx, gy, gxlist, gylist = estimate_watermark('images/fotolia_processed') ; <---- Already did previously (or trying to finish it to see if it ok)

# est = poisson_reconstruct(gx, gy, np.zeros(gx.shape)[:,:,0])
cropped_gx, cropped_gy = crop_watermark(gx, gy) ; <---------- Is really necessary crop the watermark here ??? It can b do0ne previously since the very 1st graymap to we work only on the already cropped tables. (Assuming it is necessary to crop btw)

# random photo
img = cv2.imread('images/fotolia_processed/fotolia_137840668.jpg') ; <--- read the image from where the watermark should be removed
im, start, end = watermark_detector(img, cropped_gx, cropped_gy) ; < ---- try to detect the watermark on this image as well. (This routine is where it will use canny)

# # Save result of watermark detection
# plt.subplot(1, 2, 1)
# plt.imshow(img)
# plt.subplot(1, 2, 2)
# plt.imshow(im)
# plt.savefig('images/results/watermark_detect_result.png')
# We are done with watermark estimation

# W_m is the cropped watermark
W_m = poisson_reconstruct(cropped_gx, cropped_gy) ; <--- Use poisson only on the target image i suppose
(...)

The crop watermark function is like this, and i have n idea what it is actually doing

Code Select



def crop_watermark(gradx, grady, threshold=0.4, boundary_size=2):
    """
    Crops the watermark by taking the edge map of magnitude of grad(W)
    Assumes the gradx and grady to be in 3 channels
    @param: threshold - gives the threshold param
    @param: boundary_size - boundary around cropped image
    """
    W_mod = np.sqrt(np.square(gradx) + np.square(grady))
    # Map image values to [0, 1]
    W_mod = PlotImage(W_mod)
    # Threshold the image with threshold=0.4
    W_gray = image_threshold(np.average(W_mod, axis=2), threshold=threshold)
    x, y = np.where(W_gray == 1)

    # Boundary of cropped image (contain watermark)
    xm, xM = np.min(x) - boundary_size - 1, np.max(x) + boundary_size + 1
    ym, yM = np.min(y) - boundary_size - 1, np.max(y) + boundary_size + 1

    return gradx[xm:xM, ym:yM, :], grady[xm:xM, ym:yM, :]
     # return gradx[xm:xM, ym:yM, 0], grady[xm:xM, ym:yM, 0]

basically, the files in python are these ones i attached here (Some 2 or 3 related to numpy, i didn´t included due to their size), but the full python version can be downloaded here https://github.com/rohitrango/automatic-watermark-detection
Also, i included the functions used in the source in one single pdf so i could better try to follow the rotines to see if i can understand what it is doing.

Siekmanski · July 21, 2020, 06:34:19 AM

I thought this is what you wanted to do:
Instead of X and Y convolution matrices you like to use X and Y median matrices to detect the edges in the image?

I do not understand the part, you talk about 100 different pictures.
I'm not really sure what to make of the watermark removal part, or is it only about edge detection?

-> The median calculation is done after sobel magnitudes are found.
Why not directly find the magnitudes from the median matrices. ( don't know if this is a good idea... have to experiment with that thought. )

The MASM Forum

News:

Fast median algorithm

guga

nidud

jj2007

HSE

Siekmanski

HSE

Siekmanski

Siekmanski

HSE

guga

Siekmanski

HSE

Siekmanski

guga

Siekmanski