DBScan, Homogeneity and Completeness algorithm

Started by guga, May 11, 2024, 11:39:49 PM


guga

New version in C (to be compiled with Pelles), with a bit of optimization and precalculation of the Euclidean distances.

I'll try to optimize it even further before porting it to Assembly.

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

#16
I succeeded in automating the process of identifying Epsilon and Minimum Points. According to the articles I'm reading, the smaller epsilon is, the better.

https://hyperskill.org/learn/step/31529
https://mrinalyadav7.medium.com/dbscan-algorithm-c894701306d5

In the attached version, it generated only 2 epsilon values.

Main:
....
        call GetMinPoints DBSCAN_DIM_1D
        mov D$MinPoints eax
        call GetEpsilon D$points, D$Num_Points, Epsilon, DBSCAN_DIM_1D


        call dbscan D$points, D$Num_Points, Epsilon, D$MinPoints, Homogeneity, HomogeneityLog
....

;;
   Automatically find epsilon
   http://www.sefidian.com/2022/12/18/how-to-determine-epsilon-and-minpts-parameters-of-dbscan-clustering/
   https://mrinalyadav7.medium.com/dbscan-algorithm-c894701306d5
   https://hyperskill.org/learn/step/31529
;;

[DBSCAN_DIM_1D 1]
[DBSCAN_DIM_2D 2]
[DBSCAN_DIM_3D 3]

Proc GetMinPoints:
    Arguments @DimensionFlags

    mov eax D@DimensionFlags
    shl eax 1 ; MinPoints = dimensions * 2

EndP

Proc GetEpsilon:
    Arguments @points, @num_points, @pOutput, @Flag
    Local @k_distances, @kFactor
    Uses ebx, esi, edi, ecx, edx

    xor eax eax
    On D@num_points = 0, ExitP

    C_call 'msvcrt.calloc' D@num_points, 8
    mov D@k_distances eax

    mov eax D@Flag
    .If_Or eax = DBSCAN_DIM_1D, eax = 0
        mov eax 2 ; dimensions*2 = 1*2
    .Else_If eax = DBSCAN_DIM_2D
        mov eax 4 ; dimensions*2 = 2*2
    .Else_If eax = DBSCAN_DIM_3D
        mov eax 6 ; dimensions*2 = 3*2
    .Else_If_And eax > 3, eax < 7
        mov eax 7
    .Else_If eax > 10
        inc eax
    .End_If
    mov D@kFactor eax

    ; Calculate k-distances for each point
    call calculate_k_distances D@points, D@num_points, D@kFactor, D@k_distances
    call find_knee_point D@k_distances, D@num_points

    mov eax D@pOutput | fstp R$eax

    C_call 'msvcrt.free' D@k_distances

EndP

; Function to calculate k-distance for each point

Proc calculate_k_distances:
    Arguments @points, @num_points, @kFactor, @k_distances
    local @Distances, @PointsIPos
    Uses ebx, esi, edi, ecx, edx

    xor eax eax
    On D@num_points = 0, ExitP

    C_call 'msvcrt.calloc' D@num_points, 8
    mov D@Distances eax
    mov ebx eax
    mov esi D@points
    mov D@PointsIPos esi
    xor edi edi

    .Do

        mov esi D@points
        xor ecx ecx
        Do
       
            call euclidean_dist D@PointsIPos, esi
            fstp R$ebx+ecx*8
            add esi Size_Of_point_t
            inc ecx
        Loop_Until ecx >= D@num_points ; jl

        C_call 'msvcrt.qsort' ebx, D@num_points, 8, compare_doubles
        ;  k-th nearest neighbor distance
        mov eax D@kFactor
        mov edx D@k_distances
        fld R$ebx+eax*8 | fstp R$edx+edi*8

        add D@PointsIPos Size_Of_point_t
        inc edi
    .Loop_Until edi >= D@num_points

    C_call 'msvcrt.free' D@Distances

EndP

Proc compare_doubles:
    Arguments @Arg1, @Arg2

    mov eax D@Arg1 | movsd XMM0 X$eax
    mov eax D@Arg2 | movsd XMM1 X$eax

    xor eax eax
    SSE_D_If xmm1 > xmm0
        mov eax 0-1
    SSE_D_Else_If xmm1 < xmm0
        mov eax 1
    SSE_D_End_If

EndSTD

[KneeRate: R$ 0.1]

Proc find_knee_point:
    Arguments @k_Distance, @Num_Points
    Local @KneeIndex
    Uses ebx, esi, ecx, edx

    mov ebx D@k_Distance
    mov esi D@Num_Points
    C_call 'msvcrt.qsort' ebx, esi, 8, compare_doubles
    fild D@Num_Points | fmul R$KneeRate | fistp D@KneeIndex

;fld R$ebx;+eax*8
;mov eax 52 | fld R$ebx+eax*8

    mov eax D@KneeIndex | fld R$ebx+eax*8

EndP



In the function find_knee_point, we build a table of distance values (candidate epsilon values): an array of NumPoints*8 bytes (8 because each entry is a Real8). The generated contents are sorted in ascending order. In the example we have 53 values, of which only 2 are unique epsilon values, from smallest to largest.

The function chooses the one located at 10% into the sorted array. Later I'll make a routine to get rid of the duplicated values and insert the distinct ones into another table, to make it easier for the user to choose what type of data to identify.
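The two steps above can be sketched in plain C. This is only a sketch: `find_knee_point_10` and `unique_epsilons` are hypothetical names, and the 0.1 factor mirrors the KneeRate constant in the assembly (note that the ChatGPT C version quoted later in this thread uses the 90th percentile, 0.9, instead).

```c
#include <stdlib.h>

static int compare_doubles(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);   /* -1, 0, or 1 without overflow */
}

/* Mirror of the assembly routine: sort the k-distances and return
   the value sitting at 10% of the array (KneeRate = 0.1). */
static double find_knee_point_10(double *k_distances, int num_points) {
    qsort(k_distances, num_points, sizeof(double), compare_doubles);
    return k_distances[(int)(0.1 * num_points)];
}

/* Planned follow-up routine: collapse the sorted distances into a
   table of unique candidate epsilons the user can choose from.
   Returns the number of unique values written to `out`. */
static int unique_epsilons(const double *sorted, int n, double *out) {
    int count = 0;
    for (int i = 0; i < n; ++i)
        if (count == 0 || sorted[i] != out[count - 1])
            out[count++] = sorted[i];
    return count;
}
```

With the 53-point example from this post, `unique_epsilons` would return 2, matching the "only 2 unique epsilon values" observed.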

According to the 2nd article, Epsilon can be better described as:
Quoteε: The value for ε can then be chosen by using a k-distance graph, plotting the distance to the k = minPts-1 nearest neighbor ordered from the largest to the smallest value. Good values of ε are where this plot shows an "elbow": if ε is chosen much too small, a large part of the data will not be clustered; whereas for a too high value of ε, clusters will merge and the majority of objects will be in the same cluster. In general, small values of ε are preferable, and as a rule of thumb, only a small fraction of points should be within this distance of each other.


Btw, so far ChatGPT hasn't succeeded in helping me make the algorithm much faster. It is incredibly slow when working on real images. The alternative I proposed was to create a table in memory pointing directly to the addresses used by the get_neighbor routines, but it turns out I'd need an insanely huge table for that (NumPixels*8*NumPixels). I'm trying to overcome this and use only 1 table (size = NumPixels*8) plus some lighter arithmetic to calculate the values of the next indexed tables. Still thinking about how to do it....
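One way to sketch that "lighter arithmetic" idea: instead of a precomputed NumPixels-by-NumPixels table, derive the up-to-8 neighbour indices of a pixel on the fly from its flat index and the image geometry. This is a hypothetical helper (`get_neighbor_indices` is not a name from the original code), just illustrating the replacement:

```c
/* Derive the (up to 8) neighbour indices of pixel `i` arithmetically
   from the image geometry, instead of reading a huge lookup table.
   Returns how many neighbour indices were written to `out`. */
static int get_neighbor_indices(int i, int width, int height, int out[8]) {
    int x = i % width, y = i / width, count = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;   /* skip the pixel itself */
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < width && ny >= 0 && ny < height)
                out[count++] = ny * width + nx; /* flat neighbour index */
        }
    return count;
}
```

Interior pixels get 8 neighbours, corners only 3, so no table storage is needed at all.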


About epsilon and minimum points, ChatGPT answered me like this (which proved to be correct in the end):

QuoteThe choice of k is not strictly fixed and can vary based on the characteristics of the data and the specific requirements of the clustering task. Here's a more detailed explanation of how to determine an appropriate k value for different dimensions:
1D Data

For 1D data, clusters are linear, and local density is easier to determine. Therefore, a small k value is usually sufficient.

    k = 2: Using the nearest neighbor (1st neighbor) can be enough to define local density in 1D data.

2D Data

For 2D data, clusters are more complex, and you generally need a slightly larger k to capture the local density effectively.

    k = 4: This is a commonly used heuristic. A point and its three nearest neighbors can adequately define a local dense region in 2D space.

3D Data

For 3D data, the complexity increases due to the additional dimension, requiring a larger k value to capture the local density accurately.

    k = 6: This value is chosen to include a point and its five nearest neighbors, which helps define the local density in 3D space.
    k = 7-10: Depending on the noise level and the expected cluster structure, you might need a larger k to ensure robustness.

Choosing k Based on Dimensions

The heuristic k = dimensions + 1 is a starting point, but it may need to be adjusted based on empirical observations and the specific dataset. The values mentioned (e.g., k = 6 for 3D) come from practical experience and empirical studies in clustering literature, where a slightly larger k helps to better capture the density in higher dimensions.
Example Adjustments in Code

Here is how you can set k based on the dimensionality of your data and adjust it if necessary:

The piece of code it gave was originally this:
#include <math.h>
#include <stdlib.h>
#include <stdio.h>

// Structure to represent a point in 3D (adjust for higher dimensions if needed)
typedef struct {
    double x, y, z; // Add more dimensions as necessary
} point_t;

// Function to calculate Euclidean distance between two points
double euclidean_distance(point_t *a, point_t *b) {
    return sqrt(pow(a->x - b->x, 2) + pow(a->y - b->y, 2) + pow(a->z - b->z, 2)); // Adjust for dimensions
}

// Comparison function for qsort, defined below; declared here so the
// compiler sees it before its use in calculate_k_distances
int compare_doubles(const void *a, const void *b);

// Function to calculate k-distance for each point
void calculate_k_distances(point_t *points, int num_points, int k, double *k_distances) {
    for (int i = 0; i < num_points; ++i) {
        double distances[num_points];
        for (int j = 0; j < num_points; ++j) {
            distances[j] = euclidean_distance(&points[i], &points[j]);
        }
        // Sort distances to find the k-th nearest neighbor
        qsort(distances, num_points, sizeof(double), compare_doubles);
        k_distances[i] = distances[k]; // k-th nearest neighbor distance
    }
}

// Comparison function for qsort
int compare_doubles(const void *a, const void *b) {
    double arg1 = *(const double*)a;
    double arg2 = *(const double*)b;
    if (arg1 < arg2) return -1;
    if (arg1 > arg2) return 1;
    return 0;
}

// Function to find the "knee" point in k-distances plot
double find_knee_point(double *k_distances, int num_points) {
    // Sort the k-distances to create the k-distances plot
    qsort(k_distances, num_points, sizeof(double), compare_doubles);
    // Simple heuristic: return the distance at the 90th percentile
    return k_distances[(int)(0.9 * num_points)];
}

int main() {
    // Assuming points is already populated
    point_t *points = ...; // Replace with actual points initialization
    int num_points = ...; // Replace with actual number of points
    int dimensions = 3; // Adjust based on the dimensionality of your data
    int k = dimensions + 3; // Adjusted heuristic for k (example: dimensions + 3)
    double k_distances[num_points];

    // Calculate k-distances for each point
    calculate_k_distances(points, num_points, k, k_distances);
    // Find a suitable epsilon using the knee point of the k-distances plot
    double epsilon = find_knee_point(k_distances, num_points);
    // Determine minPts, often set as 2 * dimensions
    int minPts = 2 * dimensions;

    // Now run DBSCAN with the calculated epsilon and minPts
    int num_clusters = dbscan_run(points, epsilon, minPts, num_points);

    // Continue with your process...

    return 0;
}


And then it answered me:
QuoteSummary

    1D Data: k = 2 is typically sufficient.
    2D Data: k = 4 is a commonly used heuristic.
    3D Data: k = 6 or k = 7-10 based on noise and cluster structure.
    Higher Dimensions: k = dimensions + 1 or slightly higher based on empirical observation.

These values are starting points, and you may need to adjust k based on your specific dataset and the results you observe.


The resulting values were the same I got earlier, so it seems the routines I created to automate epsilon and min points were correct:

Epsilon: 1.000000
Minimum points: 2
Homogeneity: 0.000000
Homogeneity(log): 0.877543
Number of points: 53
 x    y    z    cluster_id
-----------------------------
 1.00  3.00  1.00: 0
 1.00  4.00  1.00: 0
 1.00  5.00  1.00: 0
 1.00  6.00  1.00: 0
 2.00  2.00  1.00: 2
 2.00  3.00  0.00: 1
 2.00  4.00  0.00: 1
 2.00  5.00  0.00: 1
 2.00  6.00  0.00: 1
 2.00  7.00  1.00: 3
 3.00  1.00  1.00: 2
 3.00  2.00  1.00: 2
 3.00  3.00  1.00: 2
 3.00  4.00  0.00: 1
 3.00  5.00  0.00: 1
 3.00  6.00  1.00: 3
 3.00  7.00  1.00: 3
 4.00  1.00  1.00: 2
 4.00  2.00  1.00: 2
 4.00  3.00  0.00: 1
 4.00  4.00  0.00: 1
 4.00  5.00  1.00: -2
 4.00  6.00  0.00: 1
 4.00  7.00  1.00: 3
 4.00  8.00  1.00: 3
 5.00  1.00  1.00: 2
 5.00  2.00  0.00: 1
 5.00  3.00  0.00: 1
 5.00  4.00  0.00: 1
 5.00  5.00  0.00: 1
 5.00  6.00  0.00: 1
 5.00  7.00  1.00: 3
 5.00  8.00  1.00: 3
 6.00  1.00  1.00: 2
 6.00  2.00  0.00: 1
 6.00  3.00  1.00: 3
 6.00  4.00  1.00: 3
 6.00  5.00  1.00: 3
 6.00  6.00  1.00: 3
 6.00  7.00  1.00: 3
 7.00  1.00  1.00: 2
 7.00  2.00  0.00: 1
 7.00  3.00  0.00: 1
 7.00  4.00  0.00: 1
 7.00  5.00  1.00: 3
 8.00  1.00  1.00: 2
 8.00  2.00  1.00: 2
 8.00  3.00  0.00: 1
 8.00  4.00  1.00: 3
 8.00  5.00  1.00: 3
 8.00  6.00  1.00: 3
 9.00  2.00  1.00: 2
 9.00  3.00  1.00: 2

Press enter to exit...



And below are some links i found with examples on how to create a faster library for it:
https://github.com/Eleobert/dbscan
https://proceedings.mlr.press/v157/weng21a.html

I haven't tested these new routines yet, nor ported them to plain C or Assembly. Thinking....

six_L

Hi guga,
First of all, congratulations on the progress you have made in your research.
Also, thank you for providing a lot of useful information about DBSCAN. As everyone knows, searching with Bing gets me very little useful information while wasting a lot of time.

Thank you very much!
;-------------------------------------
1) About image analysis, I remember something.
A long time ago, there was a criminal who specialized in robbing and killing people in front of banks. More than ten people had been killed. The police publicly ordered his arrest and released front and side photos of the criminal (he had been to jail before). When the criminal committed another offence, the police and military dispatched more than 5000 armed personnel to surround him, and the criminal died. The on-site photos were very bloody; only one ear could be recognized. Due to the difference between the ears in the published photos, many people doubted that the person killed was the criminal, so the police wrote an article to prove that he was. I ultimately believed the police. I also understood a bit of the principle of projection.

2) The technology of image generation
Light; radar; infrared; ultraviolet; radio telescope; ultrasound; nuclear magnetic resonance. These images have different characteristics.

3) About distance
There are the Euclidean distance, the Riemannian distance, the cosine distance and the Manhattan distance.

4) Training the DBSCAN model.
We need typical image data to train the DBSCAN model and get the optimum epsilon and minPts parameters.

5) Verifying the DBSCAN model with data.
I found that most DBSCAN examples use mathematical data, not real image data. Only the embryo DBSCAN demo used real image data.

regards.
Say you, Say me, Say the codes together for ever.

guga

Hi Six_L

tks you :)


About using it on real images, it is possible; most likely the same way as the embryo image used in the JavaScript example. The main problem seems to be the insane amount of memory that needs to be allocated to make it work fast. I'm thinking about how to overcome this.
If I succeed in making a faster routine for 1-D data, I suppose the same can be used in any other dimension.

Btw... I also found a way to identify the maximum number of points a cluster can have. The function is like this:

;;
    Max Points = ((k+1)/2)^D
;;

[DBSCAN_DIM_1D 1] ; 1 Dimension
[DBSCAN_DIM_2D 2] ; 2 Dimensions
[DBSCAN_DIM_3D 3] ; 3 Dimensions


Proc GetMaxPoints:
    Arguments @DimensionFlags
    Local @kFactorPlus1

    mov eax D@DimensionFlags
    .If_Or eax = DBSCAN_DIM_1D, eax = 0
        mov eax 2 ; dimensions*2 = 1*2
    .Else_If eax = DBSCAN_DIM_2D
        mov eax 4 ; dimensions*2 = 2*2
    .Else_If eax = DBSCAN_DIM_3D
        mov eax 6 ; dimensions*2 = 3*2
    .Else_If_And eax > 3, eax < 7
        mov eax 7
    .Else_If eax > 10
        inc eax
    .End_If
    inc eax
    mov D@kFactorPlus1 eax | imul eax eax | imul eax D@kFactorPlus1 ; (k+1)^3
    shr eax 3 ; divide by 8 = 2^3, i.e. floor(((k+1)/2)^3), the 3D case

EndP


This time, ChatGPT was useful in helping me find the maximum values. It answered:
QuoteTo calculate the maximum number of points in a dataset for a given value of k and a specific dimensionality D, you can use the following formula:

Max points = ⌊((k+1)/2)^D⌋

Where:

⌊x⌋ denotes the floor function, which returns the largest integer less than or equal to x.
k is the number of nearest neighbors used in the density estimation.
D is the dimensionality of the data.

For example, let's say you want to calculate the maximum number of points for k=6 in 3D data:

Max points = ⌊((6+1)/2)^3⌋
           = ⌊(7/2)^3⌋
           = ⌊3.5^3⌋
           = ⌊42.875⌋
           = 42

So, for k=6 in 3D data, the maximum number of points would be 42.

So each cluster is now properly identifiable by the minimum and maximum number of points it can have.

Maybe knowing the maximum number of points can give me more clues on how to optimize this whole thing.

guga

I'm currently reviewing the code for min and max points, and also trying to implement a routine to make this thing faster.

six_L

Hi,guga
I want to select a part of an image for DBSCAN-ing; this creates some test datasets automatically. But the SCROLL of the static control doesn't work, so I am progressing slowly.

regards

guga

That´s pretty good  :greenclp:  :greenclp:  :greenclp:

I'm doing some more tests to see if I can optimize it even further. DBSCAN seems an impressive algorithm for images. For noise removal, DBSCAN may do a wonderful job: perhaps we simply have to run the algorithm a couple of times to remove the noise and thus enhance the image completely, without even needing large models etc.

I was thinking that for noise removal, all we need are a few steps working either on separated channels or on the full RGB contents of the image. Personally, I would try separated channels first.

The strategy may boil down to this:

1 - Run DBSCAN once to identify the noise all over the image. The result is well-formed clusters and the corresponding noise points that may exist near them.
2 - Create a routine to identify where the noise points are, then remove them using the average of the values of the clusters that surround them or are closest to them. For example, say we run the whole algorithm on separated channels.

So we identify the noise only on the Red channel, then check whether each noise point is close to one or more clusters. We then calculate the average (or standard deviation, which may be better) of the colors of the cluster (or clusters) near the noise and apply the resulting value to that specific noise pixel. Then we go through the whole image and do it all over again, until we eliminate the majority of the noise.

We do that step only when we have a single noise pixel surrounded by, or close to, a good cluster. If we have 2 or more noise pixels close to each other (even if they are close to or surrounded by clusters), we skip this routine, because we are mainly focusing on isolated noise pixels.

3 - Now our image/channel either contains no noise (if all noise points were isolated) or may still contain small chunks of noise near each other and surrounded by good clusters. Since we worked on the isolated noise previously and set its new values based on the average (or standard deviation) of the nearby clusters, we then run DBSCAN again.

Why? Because since we applied new values to what was noise, we may have formed a new set of clusters using those new pixels, thus enhancing the image a bit more.

4 - If the result of step 3 still contains noise (or a new set of noise points), we do steps 1-3 all over again.

This is done until we reach no noise in the resulting image.

Also, we can add a routine to bail out in case of infinite loops, which may happen when steps 1-4 always result in small chunks of noise close to each other. In that case, we leave the loop and can do a couple of things:

a) End the routine, since the majority of the noise is now gone.
b) Create another routine to deal with those small chunks of noise close to good clusters, with a new strategy for handling them. Perhaps start by checking for small groups of odd and even noise counts. For example:
1 - If we have 2 noise pixels surrounded by 3 good clusters, we apply the average/standard deviation weighted by which cluster each noise pixel is closest to (see image attached).
Noise1 is closer to Cluster A than to B or C, so the average (or standard deviation) color of Cluster A influences the result more than Cluster B or C. We can apply a weighted average of their colors, perhaps based on the distances between them.
Noise2 is closer to Cluster B than to Cluster A or C, so we do the same as above.
c) Once we do this, we run DBSCAN all over again... and keep doing this until all the noise is removed.
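The replacement step for isolated noise described above might be sketched like this, assuming noise points carry a negative cluster_id (as the -2 in the sample output suggests); `fix_isolated_noise` is a hypothetical name:

```c
/* Replace one isolated noise pixel with the average value of its
   clustered 8-neighbours. Chunks of adjacent noise are skipped, since
   the strategy handles isolated noise pixels only.
   Returns 1 if the pixel was replaced, 0 otherwise. */
static int fix_isolated_noise(double *value, const int *cluster_id,
                              int width, int height, int x, int y) {
    if (cluster_id[y * width + x] >= 0) return 0;   /* not a noise pixel */
    double sum = 0.0;
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;
            int nx = x + dx, ny = y + dy;
            if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
            if (cluster_id[ny * width + nx] < 0) return 0; /* adjacent noise: skip */
            sum += value[ny * width + nx];
            ++n;
        }
    if (n == 0) return 0;
    value[y * width + x] = sum / n;  /* average of the surrounding clusters */
    return 1;
}
```

The outer loop of the strategy would then call this for every pixel, re-run DBSCAN, and repeat until no pixel changes (with an iteration cap as the bail-out).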




guga

New version

A bit faster, but I'm not quite sure about the results. I'll double-check later.

six_L

Hi guga,
The translation of DBSCAN.C is complete. Compilation passed; testing is still in progress.
DBSCAN.C has some mazy data structures. Only "euclidean_dist" and "fcpu_cmp_val" work normally so far.
;¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
;point_s STRUCT        ;this is the struct from dbscan.cpp, we need to analyse, modify
;    x        REAL8 ?
;    y        REAL8 ?
;    z        REAL8 ?
;    cluster_id    DWORD ?
;point_s ENDS
;point_t typedef ptr point_s
point_s STRUCT        ;this is the struct from dbscan.cpp, we need to analyse image, modify
    x        QWORD ?    ; x coordinate
    y        QWORD ?    ; y coordinate
    z        QWORD ?    ; Pixel
    cluster_id    QWORD ?
point_s ENDS
point_t typedef ptr point_s

dist_t typedef ptr dist
dist STRUCT
    a    point_s <>
    b    point_s <>
dist ENDS

node_t typedef ptr node_s
node_s STRUCT
    index        QWORD ?
    next        point_t ?
node_s ENDS

epsilon_neighbours_t typedef ptr epsilon_neighbours_s
epsilon_neighbours_s STRUCT
    num_members    QWORD ?
    head        node_t ?
    tail        node_t ?
epsilon_neighbours_s ENDS
;¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
create_node            PROTO index:QWORD
append_at_end            PROTO index:QWORD,en:epsilon_neighbours_t
get_epsilon_neighbours        PROTO index:QWORD,points:point_t,num_points:QWORD,epsilon:REAL8,@dist:dist_t
print_epsilon_neighbours    PROTO points:point_t,en:epsilon_neighbours_t
destroy_epsilon_neighbours    PROTO en:epsilon_neighbours_t
dbscan                PROTO points:point_t,num_points:QWORD,epsilon:REAL8,minpts:QWORD,@dist:dist_t
expand PROTO index:QWORD,cluster_id:QWORD,points:point_t,num_points:QWORD,epsilon:REAL8,minpts:QWORD,@dist:dist_t
spread PROTO index:QWORD,seeds:epsilon_neighbours_t,cluster_id:QWORD,points:point_t,num_points:QWORD,epsilon:REAL8,minpts:QWORD,@dist:dist_t
euclidean_dist            PROTO a:point_t,b:point_t
adjacent_intensity_dist        PROTO a:point_t,b:point_t
parse_input            PROTO hWndSrc:QWORD
print_points            PROTO points:point_t,num_points:QWORD
fcpu_cmp_val proc Val_1:REAL8, Val_2:REAL8
   
    finit
    fld    Val_1
    ;fild    val_2        ;if val_2 is int.
    fld    Val_2
    fcompp 
    fstsw    ax              ;retrieve exception flags from FPU
    fwait
    shr    al,1            ;test for invalid operation
    jc    @err            ;clean-up and return result
    sahf                    ;transfer to the CPU flags
    jpe    @err            ;error if non comparable
    ja    @Greater
    jc    @F
    mov    rax,3        ;val_1 = val_2
    jmp    @Exit
@@:
    mov    rax,1        ;val_1 < val_2
    jmp    @Exit
@Greater:
    mov    rax,2        ;val_1 > val_2
@Exit:
        ret
@err:
    xor    rax,rax        ;rax = 0 -> Error
    ret

fcpu_cmp_val endp
;¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
euclidean_dist proc uses rsi rdi a:point_t,b:point_t
    Local x1:REAL10
    Local x2:REAL10
   
    finit
    mov    rsi,a
    mov    rdi,b

    ;sqrt[(x2-x1)^2+(y2-y1)^2+(z2-z1)^2]
    fild    [rsi].point_s.x        ;x1->st0
    fild    [rdi].point_s.x        ;x2->st1
    fsub                ;st0-st1
    fst    st(1)            ;st0->st1
    fmul                ;st0*st1
    fstp    x1   

    fild    [rsi].point_s.y        ;x1->st0
    fild    [rdi].point_s.y        ;x2->st1
    fsub                ;st0-st1
    fst    st(1)            ;st0->st1
    fmul                ;st0*st1
    fstp    x2   

    fild    [rsi].point_s.z        ;x1->st0
    fild    [rdi].point_s.z        ;x2->st1
    fsub                ;st0-st1
    fst    st(1)            ;st0->st1
    fmul                ;st0*st1
   
    fld    x2
    fadd
    fld    x1
    fadd

    fsqrt               
    ;result in st0,
    ;Local ResultR8:real8,Local ResultR10:real10   
    ;invoke    euclidean_dist, addr aa,addr bb
    ;fstp    ResultR8 ;fstp    ResultR10

    ;fstp    rVal

        ret

euclidean_dist endp

regards

guga

Hi Six_L

Nice work. Indeed, this algorithm has some strange structures. I succeeded in porting it to RosAsm, but it still requires an insane amount of memory, especially since our final goal is processing images.

But in the meanwhile, I found a very interesting approach that claims to be an improvement over DBSCAN. Take a look. I succeeded in compiling it with Pelles, and the results are great.

It does have some flaws: it can't handle some types of images, and it showed a weird diagonal line in one of my tests. But it seems promising, and I'm trying to port it right now. It uses basically 2 main functions, and it is really fast.


https://github.com/DIP-BurdwanUniversity/DBSCAN-ImageClustering/tree/master/input
https://github.com/DIP-BurdwanUniversity/DBSCAN-ImageClustering
https://biswajit.netlify.app/pages/dip-burdwanuniversity/
https://github.com/DIP-BurdwanUniversity

guga

Ok, gentlemen. 1st version pre-alpha released :biggrin:  :biggrin:  :biggrin:

File attached. It may crash sometimes, I suppose. I'll make further tests later and some small bug fixes. So far, this was mainly to test the speed of this algorithm from Burdwan University.

It's really good, in fact. In the end, their algorithm basically creates a density map rather than the "normal" clusters and noise of a conventional DBSCAN, but that can also be implemented.

The good news is that it does not take tons of gigabytes and is really, really fast.

Note: source code embedded in the main app (as usual with RosAsm files). Later I'll clean up the code (it's a huge, huge mess) and post it here in the next release.


six_L


guga

#27
Hi Six_L, thanks.

It was hard to understand and port this thing, but I succeeded. I'll try later to create a sort of tutorial on how it works (at least the part shown in the article from Burdwan University).

Basically, it takes only the maximum difference between the 8 neighboring pixels of every pixel we are working on. First it converts the image to grayscale and uses only the single generated color value corresponding to grey; then it creates a map of the grey pixels (Size = width*height*4, the 4 because we are dealing with a dword; in fact, I converted it to Real4 to gain extra speed in SSE2 and make the computation easier). For example, say that in a certain region of the image, the pixels (already converted to grey) are laid out as:

99    140    185
102    63    162
58    113    138

And we are at the center of the area, i.e. at the pixel whose value = 63.

The original article (and all the other places I've read so far) tells us to get the difference between each pixel and the one we are on: (99-63), (140-63), ..., (138-63). Some articles then say we have to get the minimum of the maximum distance etc., while the article from Burdwan University says to take only the maximum difference among all of them and use the resulting value to build a density map.

But all of this consumes a lot of processing time. So all I had to do was go straight to finding the maximum value among the 8 neighboring pixels and simply subtract the center value from it.

So, instead of calculating all 8 deltas and only later finding the biggest of them, all that is needed, in fact, is the maximum of those 8 minus the center. Then we divide the value by the 8 pixels.

So, in this example, the maximum value of the neighboring pixels = 185. Therefore, the value in the density map = (185-63)/8.
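A scalar C version of this shortcut, for reference (a sketch only; the real code does the same with maxps on xmm registers, as shown further below):

```c
/* Density-map value for one pixel: the maximum of its 8 neighbours
   minus the centre value, divided by the 8 neighbouring pixels. */
static float density_value(const float neighbours[8], float center) {
    float max = neighbours[0];
    for (int i = 1; i < 8; ++i)
        if (neighbours[i] > max)
            max = neighbours[i];
    return (max - center) / 8.0f;
}
```

With the 3x3 example above (neighbours 99, 140, 185, 102, 162, 58, 113, 138 around center 63), this gives (185-63)/8 = 15.25.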

Using SSE2 it takes only a few operations. All we need is to locate the starting addresses of the top and bottom rows (3 pixels each), put them in an xmm register and calculate their maximum.
Then we do the same for the pixels to the left and right of the center, and get the max of both.
And finally, we take the maximum of the 2 resulting values (Top/Bottom and Left/Right).

There's no need to subtract all those 8 pixels, take the results, make them absolute and only later get their maximum. This is simply because, no matter what subtractions or deltas you compute, the maximum delta will always be the one involving the maximum value among the 8 pixels.

So, all we need is to find the max value of the 8 pixels and subtract the center from it, and then divide the resulting delta by 8 (because we have 8 neighboring pixels).

Since the algorithm takes the 8 pixels surrounding each one, the best approach when starting is to create the grey table with an extended size: width+2 by height+2, with the image at the center of that buffer. This makes the borders (1 pixel on each side) useless, but prevents crashes and repeated computations to adjust for the border all the time.

This way, if we start the algorithm at, say, position x = 0, when it tries to get the pixel on the left it won't crash; and since that value is 0, it won't affect the result too much (except at the borders and corners, which will in fact use 3 to 5 pixels to calculate the max instead of the regular 8).


So, in memory, our grey image map will be like this:

Padding1    Padding2    Padding    Padding    Padding    Padding    Padding    Padding    Padding    Padding
Padding2    148    96    98    98    66    139    142    24    Padding
Padding3    120    90    78    149    161    101    112    46    Padding
Padding    97    97    99    140    185    180    105    140    Padding
Padding    106    99    102    63    162    115    115    139    Padding
Padding    95    107    58    113    138    82    134    176    Padding
Padding    97    102    91    53    121    78    156    204    Padding
Padding    126    93    68    57    145    155    135    58    Padding
Padding    137    95    33    86    122    169    66    37    Padding
Padding    Padding    Padding    Padding    Padding    Padding    Padding    Padding    Padding    Padding


Where padding is nothing more than a null (zero) value used to handle the problem at the borders and corners.

The whole function is :


; equates related to pshufd macro
[SSE_ROTATE_RIGHT_96BITS    147]
[SSE_ROTATE_LEFT_32BITS     147]

[SSE_ROTATE_LEFT_96BITS     57]
[SSE_ROTATE_RIGHT_32BITS    57]

[SSE_SWAP_QWORDS 78]         ; SSE_SWAP_QWORDS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_RIGHT_64BITS and SSE_ROTATE_64BITS are aliases: all swap the two qwords
[SSE_ROTATE_LEFT_64BITS 78]
[SSE_ROTATE_RIGHT_64BITS 78]
[SSE_ROTATE_64BITS 78]

[SSE_INVERT_DWORDS 27]      ; invert the order of dwords
[SSE_SWAP_DWORDS 177]       ; Dwords in xmm are copied from 0123 to 1032 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 1,0,3,2}
[SSE_SWAP_LOQWORD 225]      ; Only the 2 dwords in the LowQword are swapped. Dwords in xmm are copied from 0123 to 0132 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 0,1,3,2}
[SSE_SWAP_HIQWORD 180]      ; Only the 2 dwords in the HighQword are swapped. Dwords in xmm are copied from 0123 to 1023 ordering. The same as: pshufd XMM0 XMM0 {SHUFFLE 1,0,2,3}


; Macros used
[SHUFFLE | ( (#1 shl 6) or (#2 shl 4) or (#3 shl 2) or #4 )] ; Marinus/Sieekmanski
[pshufd | pshufd #1 #2 #3]

[SSE2_MAX_4FLOATS | maxps #1 #2 | pshufd #2 #1 SSE_SWAP_DWORDS | maxps #1 #2 | pshufd #2 #1 SSE_ROTATE_64BITS | maxps #1 #2]

[SSE_ABS_REAL4 | pslld #1 1 | psrld #1 1]


; Variables used

[NeighboursDivision: F$ (1/8), F$ (1/8), F$ (1/8), F$ (1/8)]
[MaxDensity: F$ 0, 0, 0, 0] ; get the maximum density found, to be used later when scaling the density
[DensityRange: F$ 255, 255, 255, 255] ; to be used in scale_density_to_image

Proc EightNeighbourDistance:
    Arguments @pInput, @pOutput, @Height, @Width, @Padding, @DensityRatio
    Local @CurYPos, @NextRow, @PreviousRow, @NextOutput, @NextInput, @AddPaddingStart
    Uses ebx, esi, edi, edx, ecx

    xor eax eax
    If_Or D@Height = 0, D@Width = 0
        ExitP
    End_If

    movups xmm3 X$NeighboursDivision
    movups xmm4 X$MaxDensity

    mov eax D@Padding | shl eax 2 | mov D@AddPaddingStart eax
    ; calculate the address of the next input row
    mov eax D@Padding | shl eax 1 | add eax D@Width | shl eax 2 | mov D@NextInput eax

    ; calculate the address of the next output row
    mov eax D@Width | shl eax 2 | mov D@NextOutput eax

    mov eax D@Padding | mov ebx eax | add eax D@Width | shl eax 2 | mov D@NextRow eax
    shl ebx 1 | shl ebx 2 | add ebx eax | mov D@PreviousRow ebx
    ; Input  start at

    mov eax D@Height | mov D@CurYPos eax
    mov edx D@pInput | add edx D@NextInput
    mov edi D@pOutput
    ..Do
        mov ebx edx | add ebx D@AddPaddingStart
        xor esi esi
        .Do
            ; get previous line (Up)
            lea eax D$ebx+esi*4 | sub eax D@PreviousRow | movdqu xmm1 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm1 ; convert the integers to floats
            pshufd xmm1 xmm1 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)

            ; get next line (Down)
            lea eax D$ebx+esi*4 | add eax D@NextRow | movdqu xmm0 X$eax; | SSE_CONV_4INT_TO_4FLOAT xmm0 ; convert the integers to floats
            pshufd xmm0 xmm0 {SHUFFLE 2,2,1,0} ; the 4th Float is not needed for this comparison, because we want only 3 (padding + width + padding)

            SSE2_MAX_4FLOATS xmm0 xmm1

            ; now get left
            movdqu xmm1 X$ebx+esi*4-4 | pshufd xmm1 xmm1 {SHUFFLE 0,0,0,0}

            ; now get right
            movdqu xmm2 X$ebx+esi*4+4 | pshufd xmm2 xmm2 {SHUFFLE 0,0,0,0}
            maxps xmm1 xmm2

            ; and finally get the maximum between up and down and left to right
            maxps xmm1 xmm0

            ; get our core pixel
            mov eax D$ebx+esi*4 | movd xmm0 eax | pshufd xmm0 xmm0 {SHUFFLE 0,0,0,0}

            ; subtract the neighbour maximum from the core pixel
            subps xmm0 xmm1 ; packed single-precision subtraction

            SSE_ABS_REAL4 xmm0 ; and convert them to absolute

            mulps xmm0 xmm3

            ; get the maximum density
            maxps xmm4 xmm0
            movd D$edi+esi*4 xmm0

            inc esi
        .Loop_Until esi => D@Width

        add edx D@NextInput
        add edi D@NextOutput
        dec D@CurYPos
    ..Repeat_Until_Zero

    ; finally calculate the density ratio to be used in our new scale_density_to_image function
    mov eax D@DensityRatio
    movups xmm0 X$DensityRange | divss xmm0 xmm4 | movd D$eax xmm0

EndP

I´ll clean up the code later, but it only uses 2 functions to do the whole trick. The main one is "EightNeighbourDistance", which is responsible for creating a map of the density (size = width*height*4 only  :azn:  :azn: ) - it creates a sort of "pixel density" map, or "DensityPixel" or whatever we want to call this stuff. The other function, scale_density_to_image, is only responsible for making that map visible, since the values in the density map are around 10% of the original grey values we inputted. That´s why I calculate the maximum density and the ratio inside EightNeighbourDistance at once, to avoid extra checking in the scale_density_to_image function.

So, the final value of the pixel (to make it visible) is basically DensityPixel*DensityRatio.

I added a parameter "DensityRatio" to the function EightNeighbourDistance (responsible for creating the density map) to make it easier to make the pixels visible again from the scale_density_to_image function.

The density ratio is basically this:
Ratio = 255/MaxDensity

MaxDensity is the maximum value found in the density map in the EightNeighbourDistance function, and 255 is the maximum value of any pixel.

So, if we find that the maximum value in the density map is, let´s say, 28, we simply calculate the ratio r = 255/28.

Then, to make the pixels visible again, we do it in another function (scale_density_to_image). There, we take the value of each "PixelDensity" and multiply it by our ratio.

To convert the pixel density to visible pixels we simply do:

Final Pixel Value = PixelDensity*(255/28)


Could we use Euclidean distance to calculate the values of the density map? Sure... but the question is why? Using Euclidean distance has 2 disadvantages:
1 - Speed
2 - Accuracy
The speed problem could be solved with a table of 256 floats (Real4) containing the corresponding Euclidean distance values (since the result falls in the 0 to 255 range anyway). So, the max delta could be used to index a table of 256 precomputed Euclidean distances.

But the real problem seems to be accuracy. Using Euclidean distances may result in higher density values, which could be inaccurate when classifying the clusters. Ex:

if at a given pixel we calculate the density using the Euclidean form:
Maximum value of the 8 neighbours = 90
Center pixel (used to get the delta) = 40

Euclidean Distance = sqrt(90^2-40^2) = 80.62....

But, using a simple delta, we get:
Density Map (simple delta) = 90-40 = 50

So, it seems that using a simple delta between the max of the 8 neighbours and the pixel we are classifying in the density map is enough, and more accurate for me - not to mention way faster. (This is, btw, the same conclusion found in the article from Burdwan University.)
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

The algorithm seems really good to apply in other filters. I made a quick test using the Density Map over each RGB channel of the image, and the result is pretty good for enhancing the contours.

It can be refined, but for now it´s just a quick test. I didn´t check for bugs, and the app will probably crash if you try to open a new image without closing the app (I didn´t create the routine to release the memory yet, since it was just a quick test).

What it does is take the values of each channel separately and subtract the value of the pixel found in the density map (I used the scaled value for this purpose). Then it uses only the delta as the new value of each channel. Like this:

Pixel at a given position (R, G, B):
Pixel = 30 48 150

Density Map value (already scaled):
Density = 48 48 48

Get the delta:
Pixel = Abs(30-48) Abs(48-48) Abs(150-48) = 18 0 102, and voilà!

In some images it may appear to increase noise but, in fact, it is only enhancing noise that was already present in the original image, or that exists due to bad encoding etc.

The best way to use this technique would be to apply the delta only on the luma in the CieLCH colorspace, but I have no time right now to fix my CieLCH routines.

Anyway, as a test, it seems very good. Now I´m proceeding to create the classification (cluster types, homogeneity etc).

Note: use the same dlls I posted in my previous version. I released only the executable this time.


Btw, calculating the delta on all 16 bytes of an xmm register revealed to be very tricky in SSE2. But here is the part of the code that does it:

            movups xmm0 X$ebx+edx*8 ; original image dest
            movups xmm1 X$esi+edx*8 ; density src

;            SSE2_ABS_BYTE_DELTA xmm0 xmm3

    ; calculate the delta on each one of the 16 bytes in the xmm register
   
    ; Backup original xmm0 and xmm1
    movaps xmm7, xmm0   ; xmm7 = xmm0
    movaps xmm6, xmm1    ; xmm6 = xmm1

    ; Calculate xmm0 - xmm1
    psubusb xmm0, xmm1    ; xmm0 = max(xmm0 - xmm1, 0)

    ; Calculate xmm1 - xmm7
    psubusb xmm6, xmm7 ; xmm6 = max(xmm1 - xmm7, 0)

    ; OR the two saturated results: one of them is always zero for
    ; each byte, so this yields the absolute difference
    por xmm0, xmm6      ; xmm0 = |original xmm0 - xmm1|
