Topic: Multiply matrix MxN by NxK real4 any size

RuiLoureiro

« on: August 13, 2018, 02:34:20 AM »
Hi all
       Here are three versions that multiply any MxN matrix by
       any NxK matrix using SSE instructions, plus one that uses the FPU.

Quote
      VERSION 6:
                PROCEDURE:  MultiplyMxN_NxK_v6SSE

                FILE:       multiplySSEMxN_MxK_v6.inc

                MACROS:     multiplyMxN_MxK_v6A.mac   <<-- to solve all cases A
                            multiplyMxN_MxK_v6B.mac   <<-- to solve all cases B
                            basicmulMxN_MxK_v6.mac

      VERSION 5:
                PROCEDURE:  MultiplyMxN_NxK_v5SSE

                FILE:       multiplySSEMxN_MxK_v5.inc

                MACROS:     multiplyMxN_MxK_v5A.mac
                            multiplyMxN_MxK_v5B.mac
                            basicmulMxN_MxK_v5.mac

      VERSION 4:
                PROCEDURE:  MultiplyMxN_NxK_v4SSE

                FILE:       multiplySSEMxN_MxK_v4.inc

                MACROS:     multiplyMxN_MxK_v4A.mac
                            multiplyMxN_MxK_v4B.mac
                            basicmulMxN_MxK_v4.mac

      VERSION FPU:
                PROCEDURE:  MultiplyMxN_NxK_v1FPU

                FILE:       multiplyFPUMxN_MxK_v1.inc


    DOCUMENTATION:          TEXT_ABOUT_MULTIPLY_SSE_REAL4.txt

    MATRIX DEFINITION:      Every matrix (matrixX here) must be defined like this:

                            ALIGN 16
                            dd ?
                            dd ?
                            dd N   ; <<--- number of columns
                            dd M   ; <<--- number of rows
                   matrixX  dd (M*N) dup (?)

                            To allocate the matrix memory dynamically, see the file AllocMemory.inc

    VERIFY SSE PROCEDURES:  Use multiplyMxN_MxK_v6.exe/asm, multiplyMxN_MxK_v5.exe/asm
                            or multiplyMxN_MxK_v4.exe/asm

    Please test it on your CPU (i5/i7/AMD).
    Run ExecuteTestmultiplyMxN_MxK_SSEv6.bat and post the file ResultsmultiplyMxN_MxK_v6.txt.

A particular note: I started this work from an example given by Siekmanski.

Good luck
RuiLoureiro
EDIT: replaced the FPU procedure ...
« Last Edit: August 17, 2018, 06:18:10 AM by RuiLoureiro »

Siekmanski

« Reply #1 on: August 14, 2018, 04:58:54 AM »
Hi Rui,

Here are the results from my machine.

LiaoMi

« Reply #2 on: August 14, 2018, 10:18:21 PM »
Hi RuiLoureiro,

Here are my test results.

jj2007

« Reply #3 on: August 15, 2018, 12:43:50 AM »
Results for my Core i5.

HSE

« Reply #4 on: August 15, 2018, 01:53:38 AM »
AMD A6-3500

RuiLoureiro

« Reply #5 on: August 16, 2018, 02:42:50 AM »
Hi all

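Interesting results for multiplying a 1x6 vector by a 6x6 matrix by lines (rows), version 2: Multiply1x6Real4By6x6v2
Note: "multiply by lines" means computing 1x6 * (6x6)^t, so each result element is the dot product of the vector with one line (row) of the 6x6 matrix.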

Thank you all
Quote
LiaoMi:

AMD Ryzen 7 1700 Eight-Core Processor           (SSE4)

  24  cycles, Multiply1x6Real4By6x6v2,  MatrixX1x6 * MatrixY6x6 <<<<He he.. even better !!!

Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz (SSE4)

  27  cycles, Multiply1x6Real4By6x6v2,  MatrixX1x6 * MatrixY6x6

Siekmanski:

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)

  42  cycles, Multiply1x6Real4By6x6v2,  MatrixX1x6 * MatrixY6x6

Jochen:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

  46  cycles, Multiply1x6Real4By6x6v2,  MatrixX1x6 * MatrixY6x6

HSE:

AMD A6-3500 APU with Radeon(tm) HD Graphics (SSE3)

  59  cycles, Multiply1x6Real4By6x6v2,  MatrixX1x6 * MatrixY6x6
« Last Edit: August 16, 2018, 06:45:05 AM by RuiLoureiro »

LiaoMi

« Reply #6 on: August 16, 2018, 05:28:19 AM »
AMD Ryzen 7 1700 Eight-Core Processor (SSE4)