The MASM Forum

General => The Workshop => Topic started by: jj2007 on August 12, 2015, 07:19:26 AM

Title: Prime numbers
Post by: jj2007 on August 12, 2015, 07:19:26 AM
Just for fun :biggrin:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
It took 29.16 seconds to find 98222287 primes among 2000000000 numbers
2       3       5       7       11      13      17      19      23      29      31      37      41      43      47      53      59      61      67
71      73      79      83      89      97      101     103     107     109     113     127     131     137     139     149     151     157     163
167     173     179     181     191     193     197     199     211     223     227     229     233     239     241     251     257     263     269
271     277     281     283     293     307     311     313     317     331     337     347     349     353     359     367     373     379     383
389     397     401     409     419     421     431     433     439     443     449     457     461     463     467     479     487     491     499

...
1999999553      1999999559      1999999571      1999999609      1999999613      1999999621      1999999643      1999999649      1999999657      199999
9747    1999999763      1999999777      1999999811      1999999817      1999999829      1999999853      1999999861      1999999871      1999999873
1999999913      1999999927      1999999943      1999999973
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 08:46:44 AM
Why is it so slow? 8)
Title: Re: Prime numbers
Post by: jj2007 on August 12, 2015, 08:57:26 AM
It's waiting for your fast 64-bit version :biggrin:
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 09:50:09 AM
I have a 32-bit version somewhere which took a few seconds, but was outclassed by others mentioned on the net. Of course multiple threads helps

I have a question about Italy, might as well ask it now. Long ago Italy produced men like Caesar, the Cato's, Pompey, Brutus, Octavius Augustus .. and Horace, Virgil, Tacitus .. Marcus Aurelius ..: a long list of great men. More recently we have Dante Alighieri, Leonardo da Vinci, Enrico Fermi and other great geniuses. But since  1945 or so, nothing! Well there's jj2007, but that's about it. The question is, what the h**l went wrong??

Title: Re: Prime numbers
Post by: dedndave on August 12, 2015, 10:35:09 AM
Jochen is from Germany   :biggrin:
Title: Re: Prime numbers
Post by: jj2007 on August 12, 2015, 10:37:23 AM
Fellini, Pasolini, Abbado, Gucci, Armani, Ferrari, Maserati, Lamborghini, ... ;-)
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 11:38:21 AM
Quote from: jjFellini, Pasolini, Abbado, Gucci, Armani, Ferrari, Maserati, Lamborghini, ... ;-)

- Right: like I said, since 1945 or so, a bunch of nobodies! Imagine telling Caesar, after he's spent the day slaughtering thousands of Germans and half the Senate, and while taking a quick break is deciding the fate of the entire known world: "We still have great men in Italy! For instance here's a guy Armani who designs clothes. Take off that blood-stained armor and try on this little number - you'll look so cute, darling!" He'd chop your head off and feed it to the dogs in the street.

What went wrong??
Title: Re: Prime numbers
Post by: jj2007 on August 12, 2015, 12:02:45 PM
If King Richard III shouted "My kingdom for a horse", what do you think Caesar would have offered for a Lamborghini???
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 12:19:45 PM
Caesar would respect Henry Ford, or other great inventors, but not a mere mechanic. He'd put Lamborghini to work repairing cart axles and cleaning up after the horses. A car would be useless: where would he get gasoline, oil, brake fluid, tires, spare parts? At least, he'd prefer something very rugged and reliable - Lamborghini's break down constantly. Maybe a Lamborghini tractor would do, but no doubt he'd want something Japanese or Korean-made.

Anyway, anybody these days can have a car; me, you, Lamborghini are all way ahead of Caesar in that regard. But I'm talking about the quality of the man, not his toys.

What the h**l went wrong ??
Title: Re: Prime numbers
Post by: dedndave on August 12, 2015, 12:31:41 PM
(http://i.imgur.com/aVBZoNC.jpg)
Title: Re: Prime numbers
Post by: hutch-- on August 12, 2015, 12:42:25 PM
you forgot Monica Bellucci, a national treasure.

(http://www.effinhot.com/wp-content/uploads/2014/12/6af1da79629498336e7bee2b066d77291.jpg)
Title: Re: Prime numbers
Post by: xanatose on August 12, 2015, 12:56:07 PM
Quote from: rrr314159 on August 12, 2015, 12:19:45 PM
What the h**l went wrong ??

People got domesticated and complacent. And "leaders" got to rage war without putting their own skin on the line. That's what went wrong.

The same question could be asked for the Greeks. Is like they have chosen the worst traits of Athenians and Spartans instead of the best traits.
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 01:22:58 PM
@dedndave and hutch,

Enzo Ferrari was somebody, can't deny that, but his glory days of skill and bravery were all before 1945 (the year I specified, as u remember). After that he was a mere businessman, who managed to win a few races, while killing seven of his drivers and numerous spectators. His dictatorial disregard of other's lives would, actually, be the only quality to win Caesar's respect ...

Likewise, I can't deny Monica Bellucci is somebody! Caesar would definitely have respected her and appointed her to a very important position. In fact, the entire army would have respected her - one after the other.

Come to think of it, the women have, indeed, improved since ancient days. There are extant pictures of Caesar's great passion, Cleopatra: if that's the best they had ... well, I'd hate to see the worst. More recently, look at paintings such as Mona Lisa. Ms. Bellucci is a huge improvement. Then there's Sophia Loren, and Gina Lollobridgida, and a million young ones I've never heard of. No doubt, the quality of the women is way up.

But, I'm talking about the quality of the men.

What went wrong ??
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 01:46:36 PM
xanatose,

you must be a lot of fun at parties! ... I'm just trash-talking, don't really expect an answer. You're right, there are reasons why Italy is so much less than it was; but those same reasons apply to every country. The modern era is real tough on heroes. Italy is a good subject, not because its current crop is worse than anyone else's, but because its surpassing greatness in the past makes such a contrast.

For instance it wouldn't be funny to pick on Poland: the list of notables is a joke in itself, all it needs is the laugh track. Pilsudski? Boleslaw the Brave? Żeromski? Tarski? General Beck, who charged tanks with cavalry? Say "Chopin" - that's it, you're done with great Poles (well, Ulam perhaps also).

Quote from: xanatoseIs like they [Greeks] have chosen the worst traits of Athenians and Spartans instead of the best traits.

- indeed the same thought has occurred to me
Title: Re: Prime numbers
Post by: rrr314159 on August 12, 2015, 02:47:30 PM
Quote from: jj2007Just for fun :biggrin: ... It took 29.16 seconds to find 98222287 primes among 2000000000 numbers

- Actually this is an interesting topic, perhaps the time has come for a 64-bit SIMD multi-threaded prime number generator. Sorry, I accidentally hijacked the thread to complain about Italy. My excuse, I noticed "Just for fun" and, without checking, thought we were in the SoapBox or Colosseum. Like to know how fast the current champion PM generator is, can't seem to find it on net. Undoubtedly beatable, assuming it's not assembler. ... some fast techniques would be very SIMD-izable, for instance "wheeling". And of course cache management is vital
Title: Re: Prime numbers
Post by: hutch-- on August 12, 2015, 04:15:55 PM
I think I will stick with Monica Bellucci being an Italian national treasure. When I want prime numbers I look them up in a massive table I have.
Title: Re: Prime numbers
Post by: jj2007 on August 12, 2015, 06:22:00 PM
Quote from: rrr314159 on August 12, 2015, 01:22:58 PMMonica Bellucci is somebody!

Quote from: hutch-- on August 12, 2015, 04:15:55 PMI think I will stick with Monica Bellucci being an Italian national treasure.

Nice to see that two philosophers can agree on some body :icon_mrgreen:

Will look into the SIMD stuff when I find time. Maybe.
Title: Re: Prime numbers
Post by: herge on August 12, 2015, 06:22:53 PM

Hello jj2007:
C:\masm32\bin\PrimesBitField>PrimesBitField.exe
Intel(R) Core(TM)2 Duo CPU     E4600  @ 2.40GHz
It took 69.97 seconds to find 98222287 primes among 2000000000 numbers
2       3       5       7       11      13      17      19      23      29
31      37      41      43      47      53      59      61      67      71
73      79      83      89      97      101     103     107     109     113
127     131     137     139     149     151     157     163     167     173
179     181     191     193     197     199     211     223     227     229
233     239     241     251     257     263     269     271     277     281
283     293     307     311     313     317     331     337     347     349
353     359     367     373     379     383     389     397     401     409
419     421     431     433     439     443     449     457     461     463
467     479     487     491     499
...
1999999553      1999999559      1999999571      1999999609      1999999613
1999999621      1999999643      1999999649      1999999657      1999999747
1999999763      1999999777      1999999811      1999999817      1999999829
1999999853      1999999861      1999999871      1999999873      1999999913
1999999927      1999999943      1999999973


It took almost 70 seconds.

Regards Herge
Title: Re: Prime numbers
Post by: Zen on August 13, 2015, 05:30:19 AM
I LOVE THREADS LIKE THIS !!!
...As long as we're NOT MAKING ANY SENSE AT ALL,...

(https://scontent.cdninstagram.com/hphotos-xaf1/t51.2885-15/s320x320/e15/11357619_1657219911181892_1112746291_n.jpg)

...Oh,...and, HUTCH,...good point (http://masm32.com/board/index.php?topic=4495.msg48074#msg48074),...
Title: Re: Prime bodies
Post by: jj2007 on August 13, 2015, 06:17:47 AM
Quote from: Zen on August 13, 2015, 05:30:19 AM
I LOVE THREADS LIKE THIS !!!
...As long as we're NOT MAKING ANY SENSE AT ALL,...0

Shovels, dead bodies...?  :dazzled:

You are hijacking this thread! Until here, we were arguing about prime alive bodies 8)
(http://www.wallpixo.com/wp-content/uploads/2015/07/monica-bellucci-1-0.jpg)
Title: Re: Prime numbers
Post by: rrr314159 on August 13, 2015, 06:30:45 AM
Quote from: primesieve.orgprimesieve is a free software program and C/C++ library that generates primes using a highly optimized sieve of Eratosthenes implementation. It counts the primes below 10^10 in just 0.57 seconds on an Intel Core i7-4770 CPU (4 x 3.4GHz) from 2013.

- This is about as fast as they get, AFAIK. It's written in C++ and doesn't use SIMD, so must be beatable. compared to jj2007's, it's only about 20 times faster (taking into account multiple threads, etc) - with maybe 100 times as much code.
Title: Re: Prime numbers
Post by: jj2007 on August 13, 2015, 06:55:54 AM
Quote from: rrr314159 on August 13, 2015, 06:30:45 AMcompared to jj2007's, it's only about 20 times faster

You made my day :bgrin:
Title: Re: Prime numbers
Post by: rrr314159 on August 14, 2015, 10:44:21 AM
Quote from: jj2007 on August 13, 2015, 06:55:54 AM
Quote from: rrr314159 on August 13, 2015, 06:30:45 AMcompared to jj2007's, it's only about 20 times faster

You made my day :bgrin:

Well it might be only factor of 10, hard to say. There are a lot of tricks to speed up priming, such as marking small primes all at once using bit-map of length 210 (leaving out details). If it makes your day better, I found my prime routine and timed it. It's about the same as yours except with threads
Title: Re: Prime numbers
Post by: jj2007 on August 14, 2015, 05:39:46 PM
Hey, just kidding. I had also noticed primesieve & company, there seems to be a big community dedicated to chasing speed records, so we are just amateurs... if I had more time, I'd try the SIMD road, though.
Title: Re: Prime numbers
Post by: rrr314159 on August 16, 2015, 03:59:28 AM
Quote from: jjwe are just amateurs

- Every pro started as an amateur .. ! If we had to figure out their tricks a defeatist attitude might be appropriate, but we don't, it's all public knowledge.

- BTW I've returned to my main thing - math programming - for the last couple months. I'm no longer learning assembler as such; it's a tool to do math with. You may have noticed my answers to questions have been vague and often along the lines of "d**ned if I know!". For 6 months I studied every issue that came up - goal was to get up to speed on assembler and I was interested in every detail. Now however I feel I've accomplished the objective (well enough).

- Therefore I may tackle prime numbers. To you it's a side issue (the main issue is assembler); to me, this sort of math is what it's all about.

- BTW I don't at the moment see much use for SIMD. Seems prime num algo does not lend itself to vector programming; unlike FFT's and graphics. Probably that's why primesieve author doesn't use it
Title: Re: Prime numbers
Post by: rrr314159 on August 17, 2015, 12:51:38 PM
Just for fun I made this Eratosthenes prime generator, does 2 billion in 10.5 seconds, on one thread. With 4 threads it gets under four seconds, but not 2.625, since threading is not very efficient. It has a few tricks (out of dozens one could put in):

- map only has odd numbers, similar to sieve of Sundaram
- uses "wheel" to mark 3 and 5
- starts priming at the square, skips even numbered multiples
- couple other even more minor items

But Chris Wallich's is 10 times faster. Apparently the reason is mainly cache management. With regular Eratosthenes (as I'm using) you mark one prime all the way through the map, then go back for the next one, etc. So the entire map must be moved to cache once for each prime (up to the square root); that's a few thousand times. The "Wallich technique" (AFAIK he invented it) takes just one piece of the map at a time, which fits into cache, and marks all the primes. Then it gets the next one: about 30,000 little segments. It's a lot more trouble, but this way the entire map is brought into cache only once. Apparently that speeds it up an order of magnitude! So I'd like to go ahead and try that, if I don't get bored. The other nice thing about his technique, it's more convenient to multi-thread. However, none of these approaches seems to use SIMD effectively (scatter / gather commands would help).

Anyway FWIW here it is. The main routine, primes.asm:

include primes.inc

SIZEOFMAP = 1000000000              ; this is highest index
MAX_NUMBER = SIZEOFMAP*2 + 1        ; highest odd number is 2*index + 1

indextoprime MACRO                  ; uses eax, converts any odd num not just prime
    shl eax, 1
    add eax, 1
ENDM

primetoindex MACRO                  ; reverse computation
    sub eax, 1
    shr eax, 1
ENDM

.data
    primesmap dd 0
    tenm3 real8 0.001
    timerResults real8 0.0
.code

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
primes:

    call GetSystemInformation
    InitCycles                      ; start timer

; allocate memory
    invoke GlobalAlloc, 0, SIZEOFMAP + 100h    ; buffer in case SIZEOFMAP not div by 4
    mov edi, eax                    ; edi => the primes map

; do the wheel "mask"
    CALL MasktheMap                 ; fill in "wheel" of 60 (30 times 2 for DWORDs)

    dec edi                         ; below here edi always adds 1, edi + 0 is never used,
                                    ; so just back it up 1 because ecx = 1 is index of 3
                                    ; thus edi + 1 is on a dword boundary
    mov primesmap, edi

; main work here
    call MarkPrimes                 ; sieve of eratosthenes
    call CountThem                  ; return count in ebx

; get time, print results
    movflt timerResults, @ElapsedCycles()
    movflt timerResults, @CyclestoTime(timerResults)
    fld timerResults
    fmul tenm3
    fstp timerResults
    pp "number of primes %u to %u\n", ebx, MAX_NUMBER
    pp "prime time seconds %.4g\n", timerResults


; finally, print out last few primes
    mov ecx, SIZEOFMAP
    mov edx, ecx
    sub edx, 100
    xor ebx, ebx
    printloop:
        cmp BYTE PTR [edi + ecx], 0
        je @F
            mov eax, ecx
            indextoprime
            pp "prime %u\n", eax
            inc ebx
        @@:
        dec ecx
        cmp ecx, edx
        jge printloop
    pp "number of primes %i above %i\n", ebx, edx

ret
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
MasktheMap:

    mov esi, edi
    mov ecx, 0                      ; threes/fives mask (nonzero ultimately means it's prime)
    @@:
        mov DWORD PTR [esi + ecx], 00010000h        ; this knocks out (2 and) 3 and 5, 60 bytes at a time
        mov DWORD PTR [esi + ecx + 4], 01000101h    ; don't forget they're backwards
        mov DWORD PTR [esi + ecx + 8], 00010001h

        mov DWORD PTR [esi + ecx + 12], 00010100h
        mov DWORD PTR [esi + ecx + 16], 01000100h
        mov DWORD PTR [esi + ecx + 20], 01010001h

        mov DWORD PTR [esi + ecx + 24], 00000100h
        mov DWORD PTR [esi + ecx + 28], 00000101h
        mov DWORD PTR [esi + ecx + 32], 01010001h

        mov DWORD PTR [esi + ecx + 36], 00010100h
        mov DWORD PTR [esi + ecx + 40], 01000001h
        mov DWORD PTR [esi + ecx + 44], 01000001h

        mov DWORD PTR [esi + ecx + 48], 00010100h
        mov DWORD PTR [esi + ecx + 52], 01000101h
        mov DWORD PTR [esi + ecx + 56], 01010000h

        add ecx, 60
        cmp ecx, SIZEOFMAP
        jle @B

    mov DWORD PTR [edi], 00010101h  ; the first word set 3, 5, 7 to 1
ret
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
MarkPrimes:

    xor esi, esi
    mov edi, primesmap
    mov ecx, 2                      ; below it incs, so it starts with 7

    beginnextprimeloop:
        inc ecx
        cmp BYTE PTR [edi + ecx], 1
        jne beginnextprimeloop
   
        ; found a prime
        mov eax, ecx                ; ecx is the prime's index
        mul ecx
        add eax, ecx                ; now eax has the index ^2 + index
        shl eax, 1                  ; index of the square = (x^2 + x) * 2
        xchg eax, ecx               ; eax again has prime's index
        cmp ecx, SIZEOFMAP          ; square has reached top ?
        jg donepriming
       
        indextoprime                ; double eax, add 1, to check only odd multiples
       
        @@:
            mov BYTE PTR [edi + ecx], 0
            add ecx, eax
            cmp ecx, SIZEOFMAP
            jle @B
       
        primetoindex                ; eax back to the prime's index (eg, for 11 it's 5)
        mov ecx, eax
    jmp beginnextprimeloop

donepriming:
ret
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
CountThem:

    mov ebx, 1                      ; for "2"
    mov esi, primesmap
    mov ecx, 1                      ; ecx = 1 is index of 3

    @@:
        mov edx, [esi + ecx]
        xor eax, eax
        add al, dl
        add al, dh
        bswap edx
        add al, dl
        add al, dh
        add ebx, eax
        add ecx, 4
        cmp ecx, SIZEOFMAP
        jl @B
ret
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end primes
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««


The zip includes:

primes.exe
primes.asm
primes.inc ; support routines like timer, and includes etc
makeit.bat ; batch make file
primesrun.txt ; runs for this routine and also jj2007's on my machine
Title: Re: Prime numbers
Post by: rrr314159 on August 20, 2015, 12:41:47 PM
It works!

Put in a very rough implementation of cache segmentation; cut the time for 2 billion from 10.5 to 2 seconds (!) on one thread. This is NOT optimized, lots of obvious improvements (other than threads) can be made (for instance I've got variables in memory which could be in registers) so it can be a lot faster. That's within striking distance of Chris Wallich, maybe half the speed or so. Then there are more difficult improvements one can steal from his program (like better wheeling) - so it seems likely he can be beat - which is expected: assembler vs. C++ = no contest. Whether anyone feels like doing all that work is another matter! But I think I'll be within a factor of 2 without much trouble.

After a while I'll clean it up and post it. Just wanted to share the good news: this caching technique does, in fact, work :eusa_dance:
Title: Re: Prime numbers
Post by: hutch-- on August 20, 2015, 03:19:17 PM
On an i7 quad, running it with 8 threads should move it along a fair bit faster.
Title: Re: Prime numbers
Post by: jj2007 on August 20, 2015, 04:59:08 PM
Compliments :t
Have no time right now to participate, but I'm watching with awe ;-)
Title: Re: Prime numbers
Post by: rrr314159 on September 15, 2015, 02:00:41 PM
Here is a pretty simple, pretty fast version of the segmented sieve of Eratosthenes prime number algorithm - only single-thread.

This version does 2 billion in about 1.7 seconds. It's hard to say how it compares to Chris Wallich's "primesieve"; not much slower (ignoring threading). It beats his "Simple (single-core) Segmented sieve of Eratosthenes" by 25%. Of course it's much smaller build, 4K (about 350 lines of asm).

This is very simple 32-bit code, uses SSE, compiles under ml.exe or JWasm.

In primes.asm, you may need to set the SEGMENT_SIZE (on my machine it's 32768) and the MAX_INDEX which is half the top number, in this case, 1 billion. It requires a bit more ram than MAX_INDEX, in this case, a gig.

Here are sample runs:

Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz; Windows 8; 4 threads (only 1 used)
prime time seconds 1.715
number of primes 98222287 to 2000000001
1st prime 2
2nd prime 3
3rd prime 5
4th prime 7
5th prime 11
6th prime 13
7th prime 17
8th prime 19
9th prime 23
10th prime 29
11th prime 31
12th prime 37
number of primes 12 to 39
prime 1999999973
prime 1999999943
prime 1999999927
prime 1999999913
prime 1999999873
prime 1999999871
prime 1999999861
prime 1999999853
prime 1999999829
prime 1999999817
prime 1999999811
number of primes 11 above 1999999801

Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz; Windows 8; 4 threads (only 1 used)
prime time seconds 0.7966
number of primes 50847534 to 1000000001
1st prime 2
2nd prime 3
3rd prime 5
4th prime 7
5th prime 11
6th prime 13
7th prime 17
8th prime 19
9th prime 23
10th prime 29
11th prime 31
12th prime 37
number of primes 12 to 39
prime 999999937
prime 999999929
prime 999999893
prime 999999883
number of primes 4 above 999999801


The zip includes:

primes.exe
primes.asm      ; main routine
primes.inc      ;  includes and support routines: print, timer, sys-info
makeit.bat      ; "make" file

[edit] made another version about 15% faster, updated the .zip file, apologies to those who already downloaded it
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 05:23:16 PM
Even on my sloooowww laptop, it does a decent job. :t

Genuine Intel(R) CPU           T2060  @ 1.60GHz; Windows XP; 2 threads (only 1 used)
prime time seconds 2.685
number of primes 98222287 to 2000000001
1st prime 2
2nd prime 3
3rd prime 5
4th prime 7
5th prime 11
6th prime 13
7th prime 17
8th prime 19
9th prime 23
10th prime 29
11th prime 31
12th prime 37
number of primes 12 to 39
prime 1999999973
prime 1999999943
prime 1999999927
prime 1999999913
prime 1999999873
prime 1999999871
prime 1999999861
prime 1999999853
prime 1999999829
prime 1999999817
prime 1999999811
number of primes 11 above 999999900
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 05:39:35 PM
From the updated version:


Genuine Intel(R) CPU           T2060  @ 1.60GHz; Windows XP; 2 threads (only 1 used)
prime time seconds 2.442

~~~~
~~~~
Title: Re: Prime numbers
Post by: TWell on September 15, 2015, 05:52:59 PM
This AMD is so slow :(AMD E-450 APU with Radeon(tm) HD Graphics 1.65GHz; Windows 7; 2 threads (only 1 used)
prime time seconds 5.594
number of primes 98222287 to 2000000001
1st prime 2
2nd prime 3
3rd prime 5
4th prime 7
5th prime 11
6th prime 13
7th prime 17
8th prime 19
9th prime 23
10th prime 29
11th prime 31
12th prime 37
number of primes 12 to 39
prime 1999999973
prime 1999999943
prime 1999999927
prime 1999999913
prime 1999999873
prime 1999999871
prime 1999999861
prime 1999999853
prime 1999999829
prime 1999999817
prime 1999999811
number of primes 11 above 1999999801
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 05:57:39 PM
Quote from: TWell on September 15, 2015, 05:52:59 PM
This AMD is so slow :(

Are you running AV?
If so, maybe thats  a contributing factor?
Else its possible that the code has optimizations that work best on Intel ???
Title: Re: Prime numbers
Post by: rrr314159 on September 15, 2015, 05:57:48 PM
@zedd,

Thanks a lot zedd, sorry for the two versions, BTW you might look at the timer macros in primes.inc, since you've been working on such things in the laboratory

@TWell,

thanks, my AMD A6 is a bit faster than E-450 at 4.2 but still a lot slower than I5

BTW zedd slower AMD's could be partly due to optimizing for Intel, but mainly they're just slower (with some exceptions; for instance fsin and fcos are much faster on AMD's)
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 06:03:36 PM
Quote from: rrr314159 on September 15, 2015, 05:57:48 PM
...BTW you might look at the timer macros in primes.inc

Great, I'll look into that.

And your'e welcome.
Title: Re: Prime numbers
Post by: jj2007 on September 15, 2015, 06:04:07 PM
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz; Windows 7; 4 threads (only 1 used)
prime time seconds 2.084

JWasm and ML 8.0+ accept the source, but ML 6.15 doesn't like it, internal error at the mask0 etc declarations.
Title: Re: Prime numbers
Post by: sinsi on September 15, 2015, 06:14:03 PM
Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz; Windows 8; 8 threads (only 1 used)
prime time seconds 1.218

Windows 10 actually... :biggrin:
Title: Re: Prime numbers
Post by: rrr314159 on September 15, 2015, 06:28:02 PM
@TWell, (and anyone else also, )

I forgot to mention, check the cache size, it should be set to your L1 cache. In the current code it's 32768. Try half that much, or twice, see if it makes a difference

@jj2007,

Thanks, mask0 is an oword in .data section. If anyone cares I can post the (older, slower) masking routine that doesn't use owords

@sinsi,

thanks, your 'puter is the speed king ... wouldn't be surprised if you could increase the cache size also, do even better ... ? My sys-info routine doesn't know about Win 10 yet; I stole it from siekmanski so blame him :biggrin:
Title: Re: Prime numbers
Post by: Siekmanski on September 15, 2015, 06:39:11 PM
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz; Windows 8.1; 12 threads (only 1 used)
prime time seconds 0.3928
number of primes 98222287 to 2000000001
1st prime 2
2nd prime 3
3rd prime 5
4th prime 7
5th prime 11
6th prime 13
7th prime 17
8th prime 19
9th prime 23
10th prime 29
11th prime 31
12th prime 37
number of primes 12 to 39
prime 1999999973
prime 1999999943
prime 1999999927
prime 1999999913
prime 1999999873
prime 1999999871
prime 1999999861
prime 1999999853
prime 1999999829
prime 1999999817
prime 1999999811
number of primes 11 above 1999999801

Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 06:49:59 PM
Quote from: rrr314159 on September 15, 2015, 06:28:02 PM
..should be set to your L1 cache. In the current code it's 32768. Try half that much, or twice, see if it makes a difference

hmmmmm....

ml 6.14 doesn't like mfence


Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997.  All rights reserved.

Assembling: primes.asm
primes.asm(48) : error A2008: syntax error : MFENCE
RDTSCFLOAT(3): Macro Called From
  InitCycles(17): Macro Called From
   primes.asm(48): Main Line Code
primes.asm(54) : error A2008: syntax error : MFENCE
RDTSCFLOAT(3): Macro Called From
  @RDTSCFLOAT(2): Macro Called From
   ElapsedCycles(3): Macro Called From
    @ElapsedCycles(2): Macro Called From
     movflt(1): Macro Called From
      get_time(1): Macro Called From
       primes.asm(54): Main Line Code


lemme grab jwasm...
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 07:07:45 PM
Just for kicks, I have no idea how big my L1 cache is.


Genuine Intel(R) CPU           T2060  @ 1.60GHz; Windows XP; 2 threads (only 1 used)

prime time seconds 4.036  L1 = 65536

prime time seconds 2.46   L1 = 32768 ; seems optimal

prime time seconds 2.972  L1 = 16384

prime time seconds 3.772  L1 = 8192


used jwasm :t

Quote
JWasm v2.11, ~~~
primes.asm: 409 lines, 2 passes, 46 ms, 0 warnings, 0 errors~~
Title: Re: Prime numbers
Post by: rrr314159 on September 15, 2015, 07:20:24 PM
@Siekmanski,

thanks, wow .4 seconds, sinsi eat your heart out! Chris Wallich was using an I7-4770, more comparable to sinsi's I guess ... it seems that this algo is about the same speed as his, if I make a 64-bit version with threads

@zedd151,

apparently your cache is same as mine 32768. This prime number algo is very sensitive to cache size, as u can see
Title: Re: Prime numbers
Post by: jj2007 on September 15, 2015, 07:48:57 PM
Quote from: rrr314159 on September 15, 2015, 06:28:02 PMThanks, mask0 is an oword in .data section.

Workaround is two qwords, but caution with the order ;-)
Title: Re: Prime numbers
Post by: rrr314159 on September 15, 2015, 07:57:28 PM
Yes jj2007, least significant byte is tricky ;-) ... ML 6.15 doesn't like mfence either ... not sure it's worth it to care about a 6.15 version? Is it the case that some people can't (or, won't) upgrade to 8.0?
Title: Re: Prime numbers
Post by: zedd151 on September 15, 2015, 09:11:10 PM
Quote from: rrr314159 on September 15, 2015, 07:57:28 PM
Is it the case that some people can't (or, won't) upgrade to 8.0?

For me at least, I only upgrade anything on an 'as-needed' basis.
I never simply upgrade because of a so-called 'improved' version.

(that's why I still run WinXP, don't need (yet) anything else)  8)
Software compatibility? That's why I am learning to write my own  :biggrin:
Title: Re: Prime numbers
Post by: jj2007 on September 15, 2015, 09:23:47 PM
The upgrade is indeed a clumsy procedure. Not everybody is willing to install that behemoth of Visual Crap.
asm:
  mask0 qword 0001000101000101h, 0100010000010100h
  mask1 qword 0000010000010001h, 0101000100000101h
  mask2 qword 0100000100010000h, 0001010001000000h
  mask3 qword 0100000001000101h, 0100000100010000h
  mask4 qword 0001010000010001h, 0101000100000100h
  mask5 qword 0000010100000100h, 0001010001010001h

inc:
    db 0Fh, 0AEh, 0F0h ; mfence
    db 0fh, 31h ; rdtsc
Title: Re: Prime numbers
Post by: rrr314159 on September 16, 2015, 01:47:50 AM
Thanks jj2007,

next version I'll put these changes in for those 6.15 fans out there. If anyone wants to change it themselves: the place in primes.asm to substitute the mask0, mask1 ... block is obvious. The mfence and rdtsc db statements would replace those instructions in primes.inc
Title: Re: Prime numbers
Post by: TWell on September 16, 2015, 04:37:10 PM
SEGMENT_SIZE = 8192
AMD E-450 APU with Radeon(tm) HD Graphics; Windows 7; 2 threads (only 1 used)
prime time seconds 8.54

SEGMENT_SIZE = 16384
AMD E-450 APU with Radeon(tm) HD Graphics; Windows 7; 2 threads (only 1 used)
prime time seconds 7.385

SEGMENT_SIZE = 32768
AMD E-450 APU with Radeon(tm) HD Graphics; Windows 7; 2 threads (only 1 used)
prime time seconds 6.541