News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests

Main Menu

Fast median algorithm

Started by guga, July 18, 2020, 02:22:35 AM

Previous topic - Next topic

jj2007

Quote from: Siekmanski on July 22, 2020, 02:33:43 AM
Hi Jochen,

Don't know... could be a bug...

It's a very nice algo, Marinus, similar to a bucket sort. And I found the culprit: I had zeroed the MedianBuffer using the number of elements instead of the sizeof the buffer. Now it works!

For Guga's application, with 0...255 as value range, it's fine. For other data sets with different ranges, especially with negative numbers, it will be difficult.

nidud

#61
deleted

Siekmanski

Quote from: jj2007 on July 22, 2020, 02:41:03 AM
Quote from: Siekmanski on July 22, 2020, 02:33:43 AM
Hi Jochen,

Don't know... could be a bug...

It's a very nice algo, Marinus, similar to a bucket sort. And I found the culprit: I had zeroed the MedianBuffer using the number of elements instead of the sizeof the buffer. Now it works!

For Guga's application, with 0...255 as value range, it's fine. For other data sets with different ranges, especially with negative numbers, it will be difficult.

Phew.. that's a relieve.  :thumbsup:

Yeah, it's not really a substitute for a fast sorting algorithm.
But in guga case, it does very well on gray colored bitmaps and will beat sorting algorithms for this kind of processing.
Creative coders use backward thinking techniques as a strategy.

guga

Quote from: Siekmanski on July 21, 2020, 06:46:22 PM
Just some ideas:

For example, to get rid of the errors at the image borders:
- image = 100 * 100 pixels
- X and Y matrices = 3 * 3

Allocate a work buffer of 102 * 102 pixels with zeros.
Copy the bitmap data exactly in the middle so we have created a 1 pixel space around the image.
Now we can use the 3 * 3 matrix convolution maps without restoring the image border pixels.
The 1 pixel extra boundary space is only read, not written.

To prevent overshoots in Gx and Gy, we could normalize the matrix members.
If I'm correct, we will have clamped G values between 0-255 after sqrt(Gx^2+Gy^2).

This way we can also experiment with other matrix sizes and/or for example Gaussian ( evenly distributed ) convolution maps.

This is all theoretical, of course, and it has to be proven to work in the real world.  :biggrin:

Hi Siekmanski

Well, maybe could work putting the image in the middle, but then you would need to copy the result to another buffer to the image be saved and processed with other graphic libraries such as FreeImage, SDL, etc, right ?

And counting the filling the borders to zero could result on a cascading effect of incorrect values, i presume. I still don´t know what could be the best choice. Let them be zeros as you said, or calculate the average of the sobel (direction and magnitude) of the previous 3 pixels (in x and y pos) and simply using those averages to fill the missing values of the borders. In theory, doing that could keep the structure (direction and magnitude) of the missing data, by simply knowing what are the values and direction of their neighbours (but, it could slow down a bit the computation, because we would need another routine just to calculate those things near the border). IO´m still thinking on what could be better to do.

Btw...i succedded to make it work trhe 1st part of the whole process (estimating the watermark as in estimate_watermark function from python version) :azn:  The only problem is that my version is slow as hell because in order to convert the gray, i kinda exaggerated the result and forced it to get the media of the grays using all different Working Spaces available (I used 16 different matrices for grey :bgrin: :bgrin: :bgrin: :bgrin: ).
I´ll revert this and use only the one to be chosen, like ADOBE_RGB_1998_D65, or CS_MATRIX_NTSC_RGB_C, CS_MATRIX_SRGB_D65_HDTV etc, instead a average of all of them. Afterall, the average don´t seemss to produce a different result as if i used only one work space  :bgrin: :bgrin: :bgrin: :mrgreen: :mrgreen:

I´m fixing it to see the results, but so far here is what it exported after scanning 170 images :thumbsup: :thumbsup: :thumbsup:



Since the im,ages are very similar to each other you can still see the edges of one and another, but, i guess this is how the algorithm works. Later i´ll compare to the results trying to debug at least this 1st step of the python version to compare the resultant values.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

#64
Quote from: nidud on July 22, 2020, 03:26:48 AM
Quote from: Siekmanski on July 22, 2020, 02:33:43 AM
It is not so fast when there are great gaps between the numbers.
But this routine is special designed for gray color bitmap pixel data.
- they are normally close to each other and there are many duplicate values and have a small range 0-255
- Big chance the whole buffer is populated at all offsets.
- it can process many thousands of pixel values very fast.

Think it's the other way around. The sort algo only have to loop half the array if all are equal. Yours have to loop the whole array for the count and (depending on the value) another half to calculate so that's 1:3 in favor of the sort algo.

500 dup(110)
total [0 .. 3], 1++
   282577 cycles 0.asm: median
   619579 cycles 1.asm: CalculateMedian

The sort algo gets slower as more swaps are needed and yours a fixed overhang by clearing the count buffer.

Yes, it has to read the whole array.
But it stops processing when it finds the median, which could be somewhere in the beginning of the small 256 items buffer.
The overhang is only clearing a buffer of 256 items put that against thousands of pixels.

The median could be in the last part of the image.

For example:
9,10,11,12,13,1,2,3,4,5,6,8  -> median is 7 ( the average of 6 and 8 )

But you are right, it also depends on what you feed it.
Unfortunately there is no, one fits all solution.

This routine is also a candidate for multithreading.
1 Median Buffer for each thread, when done adding the buffers together and find the median.
Creative coders use backward thinking techniques as a strategy.

Siekmanski

Hi guga,

COOL!  :cool: :cool: :cool:

We don't need other graphics libraries.
GdiPlus is our friend.
We can load images, manipulate the bitmap data and save the new images with it.
Creative coders use backward thinking techniques as a strategy.

nidud

#66
deleted

guga

Quote from: Siekmanski on July 22, 2020, 04:53:06 AM
Hi guga,

COOL!  :cool: :cool: :cool:

We don't need other graphics libraries.
GdiPlus is our friend.
We can load images, manipulate the bitmap data and save the new images with it.

Great ! I´ll try to find some examples on how to load (and save to other formats as well, not only bitmap) images with gdiplus and also grab their size and pointers to the pixeldata. I didn´t worked with gdiplus yet.

Btw...Take a look at this result :greenclp: :greenclp: :greenclp: :greenclp: :greenclp: :greenclp:

Scanned on 101 images and the edges of the watermark were found almost perfectly :greenclp: :greenclp: :greenclp: :greenclp:



The more different the images are, better is to identify the watermark or imperfections. The other test i made i used a sequence of images that i saved from a movie so, the scene contains images too similar to each other and that´s why this new test gave better results. But...the most important is that, it seems to be working :)

I´ll remove that routine i made to convert all working spaces and stay with only one to make the function work as fast as possible. Now it is 16 times slower because i´m using 16 different matrices to calculate an average gray on each pixel per image. Unnecessary, btw.

But..it works  :greenclp: :greenclp: :greenclp: :greenclp:  The next step is try to understand on the python version what a hell it is doing with the "crop_watermark" function before going through the watermark_detector and possion functions.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Siekmanski

Quote from: nidud on July 22, 2020, 05:03:41 AM
Quote from: Siekmanski on July 22, 2020, 03:59:34 AM
The overhang is only clearing a buffer of 256 items put that against thousands of pixels.

I'm not sure how it's going to be used but I assumed the result was the median value of one specific pixel from each bitmap. If that's the case the input is 100 bytes.


You have a point there, I forgot that part, it's thousands of pixels * ( 100 pixels median calculations )
Then I will use multithreading.
1 Median Buffer for each thread, when done adding the buffers together per index number and find the median, will speed it up.
Creative coders use backward thinking techniques as a strategy.

guga

Hi Siekmanski

I found an issue with the median calculation. When analyzing the routine in the python version, i saw that it also takes the median for negative values as well.

The values on SobelX and SobelY can result in either positive, negative or zeroes. How to make your algo also works for signed integers ? Ex: say we have a sequence of: -100, -20, -5, 0, 36, 158, -12, 14, 66 etc, etc

Or this: -12, 0, 145 . Median = 0. (limited, therefore from -255 to 255)
-100, -20, -5, 0, 36, 158, -12, 14, 66 => Median = 0
-20, -5, 0, 36, 158, -12, 14, 66 => Median = 7
Also, the Real4 version would also be needed, for what i saw in the others routines
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Quote from: guga on July 22, 2020, 11:18:54 AMAlso, the Real4 version would also be needed, for what i saw in the others routines

You can forget the Real4 version, Marinus' algo is based on the number itself being an index. That works fine for a byte (0...255), but a Real4 has a range of 0...FFFFFFFFh, so you would need an index buffer of 4GB. Use a sort algo instead.

Siekmanski

Quote from: guga on July 22, 2020, 11:18:54 AM
Hi Siekmanski

I found an issue with the median calculation. When analyzing the routine in the python version, i saw that it also takes the median for negative values as well.

The values on SobelX and SobelY can result in either positive, negative or zeroes. How to make your algo also works for signed integers ? Ex: say we have a sequence of: -100, -20, -5, 0, 36, 158, -12, 14, 66 etc, etc

Or this: -12, 0, 145 . Median = 0. (limited, therefore from -255 to 255)
-100, -20, -5, 0, 36, 158, -12, 14, 66 => Median = 0
-20, -5, 0, 36, 158, -12, 14, 66 => Median = 7
Also, the Real4 version would also be needed, for what i saw in the others routines

Hi guga,

G = sqrt(Gx^2+Gy^2) -> Sobel result is always positive, no matter if Gx or Gy is negative.

I thought you needed the median of those G numbers?
Creative coders use backward thinking techniques as a strategy.

guga

Hi Marinus, the G will always be positive. But Gx and Gy can have either positive or negative and i´m not sure if they are being used in other parts of the python routines i´m trying to port.

Maybe i misinterpreted the results for the debugger when i was analyzing the results on Visual Studio. The data showed on VS contains negative values, but i´m not sure if they were a new median of the Gx or Gy that are still being used or only buffers that the algorithm didn't emptied (most likely, i hope).

I´l try to go further without those Gx and Gy and use only the G to see what happens. :thumbsup:


I´m a bit confused in what it is actually doing at:



def crop_watermark(gradx, grady, threshold=0.4, boundary_size=2):
    """
    Crops the watermark by taking the edge map of magnitude of grad(W)
    Assumes the gradx and grady to be in 3 channels
    @param: threshold - gives the threshold param
    @param: boundary_size - boundary around cropped image
    """
    W_mod = np.sqrt(np.square(gradx) + np.square(grady)) <---- This is, actually the G. Always positive.
    # Map image values to [0, 1]
    W_mod = PlotImage(W_mod) ; What is this ?
    # Threshold the image with threshold=0.4
    W_gray = image_threshold(np.average(W_mod, axis=2), threshold=threshold)
    x, y = np.where(W_gray == 1)

    # Boundary of cropped image (contain watermark)
    xm, xM = np.min(x) - boundary_size - 1, np.max(x) + boundary_size + 1
    ym, yM = np.min(y) - boundary_size - 1, np.max(y) + boundary_size + 1

    return gradx[xm:xM, ym:yM, :], grady[xm:xM, ym:yM, :] ; < ????????
     # return gradx[xm:xM, ym:yM, 0], grady[xm:xM, ym:yM, 0]


def image_threshold(image, threshold=0.5):
    '''
    Threshold the image to make all its elements greater than threshold*MAX = 1
    '''
    m, M = np.min(image), np.max(image)
    im = PlotImage(image)
    im[im >= threshold] = 1
    im[im < 1] = 0
    return im


I have no idea what  return gradx[xm:xM, ym:yM, :], grady[xm:xM, ym:yM, :] means. What exactly it is returning ? Do you have any idea of what is this ? xm:xM, ym:yM, :
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

Quote from: jj2007 on July 22, 2020, 05:37:46 PM
Quote from: guga on July 22, 2020, 11:18:54 AMAlso, the Real4 version would also be needed, for what i saw in the others routines

You can forget the Real4 version, Marinus' algo is based on the number itself being an index. That works fine for a byte (0...255), but a Real4 has a range of 0...FFFFFFFFh, so you would need an index buffer of 4GB. Use a sort algo instead.

Hi JJ, yeah, i´m aware that. Unfortunately when i come to the part that the algo uses floats to calculate the median, i´ll be forced to use sorting algos. I just hope it don´t becomes too slow, though.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

If your floats are in a given range, convert them to fixed point integers to save time
3571 µs for converting from REAL4 to int32
648 ms for MbMedian      M=249925309
720 ms for ArraySort     M=249925309
#elements=5120000, maxY=767 Mio

xor ecx, ecx
.Repeat
fld REAL4 ptr [esi]
fmul FP4(10.0)
fistp DWORD ptr [edi]
inc ecx
.Until ecx>=eax