Comparing K-Means and Others Algorithms for Data Clustering in Assembly.

Started by HSE, January 30, 2024, 07:36:03 AM

HSE

Hi all!

Some weeks ago I found the article "Comparing K-Means and Others Algorithms for Data Clustering" by Nicolás Descartes, which comes with C# source code.

It looked very interesting to translate to Assembly. It is still a work in progress, but it looks good.

The original somewhat abuses Collections, because they are very easy to write in C#. I removed the most obvious exaggerations (perhaps they were meant to make the algorithm clearer, I'm not sure).

- The K-Means strategy needs random initialization, so you will probably have to run it several times to find the best solution.

- The Hierarchical strategy needs a termination criterion, which in these cases is the number of clusters.

These two don't actually need collections, but I follow the author here. I use a K-Means method I wrote in PowerBasic for DOS three weeks ago :biggrin:

- Density-based spatial clustering of applications with noise (DBSCAN) really benefits from Collections, and also needs a couple of Sorted Vectors.
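The K-Means point above (random start, several runs, keep the best) can be sketched in a few lines of C. This is only an illustration under assumptions, not the code from the article or from the Assembly project: the names `kmeans_once` and `kmeans_best_of`, the fixed `K`, the small built-in random generator and the use of the within-cluster sum of squared distances as the quality measure are all choices made for this sketch.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define K 2            /* number of clusters (fixed for this sketch) */
#define MAX_POINTS 64  /* upper bound on the data set size */
#define MAX_ITER 50    /* safety limit on iterations per run */

typedef struct { double x, y; } Point;

/* Small linear congruential generator, so the runs are reproducible. */
static unsigned next_rand(unsigned *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

static double dist2(Point a, Point b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

/* Index of the centroid closest to p (ties go to the lower index). */
static int nearest(Point p, const Point *cent) {
    int best = 0;
    for (int c = 1; c < K; ++c)
        if (dist2(p, cent[c]) < dist2(p, cent[best])) best = c;
    return best;
}

/* One K-Means run starting from randomly picked data points.
   Returns the within-cluster sum of squared distances (lower is better). */
static double kmeans_once(const Point *pts, size_t n, Point *cent, unsigned *seed) {
    int label[MAX_POINTS];
    assert(n > 0 && n <= MAX_POINTS);

    for (int c = 0; c < K; ++c)                    /* random initialization */
        cent[c] = pts[next_rand(seed) % n];

    for (int iter = 0; iter < MAX_ITER; ++iter) {
        int changed = 0;
        for (size_t i = 0; i < n; ++i) {           /* assignment step */
            int best = nearest(pts[i], cent);
            if (iter == 0 || label[i] != best) { label[i] = best; changed = 1; }
        }
        if (!changed) break;                       /* converged */
        for (int c = 0; c < K; ++c) {              /* update step */
            double sx = 0.0, sy = 0.0;
            size_t cnt = 0;
            for (size_t i = 0; i < n; ++i)
                if (label[i] == c) { sx += pts[i].x; sy += pts[i].y; ++cnt; }
            if (cnt > 0) { cent[c].x = sx / cnt; cent[c].y = sy / cnt; }
        }
    }

    double sse = 0.0;
    for (size_t i = 0; i < n; ++i)
        sse += dist2(pts[i], cent[nearest(pts[i], cent)]);
    return sse;
}

/* Repeat the run several times and keep the centroids with the lowest score. */
static double kmeans_best_of(const Point *pts, size_t n, int runs,
                             unsigned seed, Point *best_cent) {
    double best = DBL_MAX;
    for (int r = 0; r < runs; ++r) {
        Point cent[K];
        double sse = kmeans_once(pts, n, cent, &seed);
        if (sse < best) {
            best = sse;
            for (int c = 0; c < K; ++c) best_cent[c] = cent[c];
        }
    }
    return best;
}
```

Calling `kmeans_best_of(points, count, 10, some_seed, centroids)` runs ten independent starts. A single unlucky start can settle on a worse partition, which is why only the best score is kept.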

Using these vectors, it turns out that .VecForEach/.VecNext have some problems if you use more than one kind of vector.

Anyway, it is a good challenge, and I still have to track down some leaks  :biggrin:

Any suggestion or improvement is welcome!

Regards, HSE

Source code and 64-bit binary updated 9 March 2024 on GitHub.
Equations in Assembly: SmplMath

Biterider

Hi HSE
Very interesting and cool project, I'll read the CP article first to get a deeper understanding.  :thumbsup:
 
Quote: "Using these vectors, it turns out that .VecForEach/.VecNext have some problems if you use more than one kind of vector."
After reading the code, I think I understand the problem. The vectors have different element sizes and the macros don't take this into account, with disastrous results. The good news is that we can do better.  :cool:
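To illustrate the kind of problem described here, a minimal C sketch of a generic vector follows, where the step between elements comes from the vector itself instead of being assumed by the loop. The `Vec` structure and the functions `vec_at`, `sum_real8` and `sum_dword` are hypothetical names for this example only; the real TVector macros may work quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic vector: raw bytes plus a per-vector element size. */
typedef struct {
    const unsigned char *items;  /* start of the element storage */
    size_t count;                /* number of elements */
    size_t elem_size;            /* size of one element in bytes */
} Vec;

/* Address of element i. The stride is taken from the vector, so the same
   loop works for 4-byte and 8-byte elements alike. */
static const void *vec_at(const Vec *v, size_t i) {
    assert(i < v->count);
    return v->items + i * v->elem_size;
}

/* Sum a vector of 8-byte reals. */
static double sum_real8(const Vec *v) {
    assert(v->elem_size == sizeof(double));
    double total = 0.0;
    for (size_t i = 0; i < v->count; ++i) {
        double x;
        memcpy(&x, vec_at(v, i), sizeof x);
        total += x;
    }
    return total;
}

/* Sum a vector of 4-byte integers. */
static int64_t sum_dword(const Vec *v) {
    assert(v->elem_size == sizeof(int32_t));
    int64_t total = 0;
    for (size_t i = 0; i < v->count; ++i) {
        int32_t x;
        memcpy(&x, vec_at(v, i), sizeof x);
        total += x;
    }
    return total;
}
```

If a loop hard-codes one stride (say 4 bytes) and is then used on a vector of 8-byte reals, every second "element" it reads is the upper half of a real, which matches the sort of disastrous result mentioned above.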

Biterider


HSE

Hi Biterider!

Quote from: Biterider on January 30, 2024, 08:33:36 AM: "Very interesting and cool project, I'll read the CP article first to get a deeper understanding.  :thumbsup:"

Thanks  :thumbsup:

Quote from: Biterider on January 30, 2024, 08:33:36 AM: "The good news is that we can do better.  :cool:"

Yes. It's just that my first idea failed  :biggrin:  :biggrin:, and it is not necessary right now.

In comparison, .ColForEach/.ColNext was critical, and it works impeccably.

HSE

Biterider

Hi HSE
I'm trying to compile the "Clusters" application, but I get an error that I can't fix:

forced error: [back end: x86-32/64,FPU][#CG6] function is undefined or not supported by current back end: ceil
I downloaded the latest version of SmplMath from GitHub, but that didn't solve the problem. Also, there seems to be a missing file called IfElseM.inc, which I had to borrow from another previous installation.

Without being able to compile the above application, I tried to solve the TVector problem, assuming that the problem was due to different element sizes, and prepared a modification that might solve it.

Would you give them a try?

Regards, Biterider

HSE

Hi Biterider!

Quote from: Biterider on January 31, 2024, 05:29:14 AM: "Also, there seems to be a missing file called IfElseM.inc, which I had to borrow from another previous installation."

:biggrin:  :biggrin: Thanks

In math.inc you can replace "IfElseM.inc" with "FlowControl.inc", but I think nothing from it is used.


Quote from: Biterider on January 31, 2024, 05:29:14 AM: "forced error: [back end: x86-32/64,FPU][#CG6] function is undefined or not supported by current back end: ceil"

In math_functions.inc you can add:
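fslv_fnc_ceil macro
    ; Preserve eax/rax if it is not flagged as volatile.
    IFE fslv_volatile_gprs AND FSVGPR_EAX
        T_EXPR(<push eax>,<mov [rsp+8],rax>)
    ENDIF
    ; Store the current FPU control word so it can be restored afterwards.
    fstcw WORD ptr T_EXPR([esp-2],[rsp])
    movzx eax,WORD ptr T_EXPR([esp-2],[rsp])
    ; Rounding control is bits 10-11: clear both, then set 10b (round up).
    and eax,NOT 0C00h
    or eax,0800h
    mov T_EXPR([esp-4],[rsp+2]),ax
    fldcw T_EXPR([esp-4],[rsp+2])
    ; Round ST(0) to an integer in round-up mode, which gives ceil.
    frndint
    ; Restore the original control word.
    fldcw WORD ptr T_EXPR([esp-2],[rsp])
    IFE fslv_volatile_gprs AND FSVGPR_EAX
        T_EXPR(<pop eax>,<mov rax,[rsp+8]>)
    ENDIF
endm
default_fnc_dscptr2 <ceil>,nArgs=1,fpu=-1,x64=-1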

Currently I use the complete SmplMath package, with DoubleDouble precision and complex numbers, but the package size and the preprocessing are both bigger. Perhaps I have to post a Full version that nobody will use  :biggrin:  :biggrin:


Previously, .VecForEach didn't know about the first kind of vector; now it doesn't know about the last one.

In .VecNext it apparently must be:
      inc dword ptr @CatStr(<??VecForEach_Index_>, %??VecForEach_ID)
Regards, HSE

Biterider

Thanks HSE
Now I can compile the application.  :thumbsup:
Could you show me the places where you wanted to use the TVector macros?

Biterider

Biterider

Quote from: HSE on January 31, 2024, 07:12:23 AM: "Perhaps I have to post a Full version that nobody will use  :biggrin:  :biggrin:"
At least you can count on one  :thumbsup:

Biterider

HSE

Quote from: Biterider on January 31, 2024, 08:04:51 AM: "Could you show me the places where you wanted to use the TVector macros?"

It was for testing purposes, because for debugging I used the classic Randy loop.

StrategyForClusters.inc, line 597 and line 629. For this sorted vector it is TVectorNameS.

HSE

Quote from: jj2007 on January 30, 2024, 10:37:43 AM: "Use the F1 key :biggrin:"

Updated in the first post, with F1 and Alt+F1 keys.

The help is a PDF from Word, very big. I will have to build it from TE :biggrin:

Biterider

Quote from: HSE on January 31, 2024, 08:13:35 AM: "StrategyForClusters.inc line 597 and line 629."
Looking at these lines, I think I have it now.
The best way to provide the missing information is to use something like this (for vectors only):

.VecForEach [ebx]::Real8VectorS
  ...
.VecNext

If there are no objections, I will have a go at it.  :biggrin:

Biterider



jj2007

Quote from: HSE on January 31, 2024, 08:40:44 AM: "The help is a PDF from Word, very big"

You are right, it's far too big. Add the two attached files to your folder, and when the user presses VK_F1, do an invoke WinExec, chr$("clusters.rtf"), SW_RESTORE :cool:

NoCforMe

Side comment:

PDF from Word? Gag, puke, retch, barf. Have you ever looked at one of those abominations? Boy, Micro$oft really screwed up with that conversion! Yikes!

We now return you to your regularly scheduled thread.
Assembly language programming should be fun. That's why I do it.