Author Topic: Instr, strstr, find$  (Read 15201 times)

jj2007

  • Member
  • *****
  • Posts: 8501
  • Assembler is fun ;-)
    • MasmBasic
Instr, strstr, find$
« on: July 14, 2014, 09:20:15 PM »
Hi,
Can I have some timings please? Thanks :t

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
40057   cycles for 100 * MbInstr 0
40767   cycles for 100 * MbInstr 1
40017   cycles for 100 * MbInstr 2
40965   cycles for 100 * MbInstr 4
51485   cycles for 100 * crt_strstr
52835   cycles for 100 * M32 find$

nidud

  • Member
  • *****
  • Posts: 1528
    • https://github.com/nidud/asmc
Re: Instr, strstr, find$
« Reply #1 on: July 14, 2014, 09:57:25 PM »
AMD Athlon(tm) II X2 245 Processor (SSE3)

38870   cycles for 100 * MbInstr 0
40286   cycles for 100 * MbInstr 1
38311   cycles for 100 * MbInstr 2
39355   cycles for 100 * MbInstr 4
23442   cycles for 100 * crt_strstr
51846   cycles for 100 * M32 find$

38482   cycles for 100 * MbInstr 0
40338   cycles for 100 * MbInstr 1
38316   cycles for 100 * MbInstr 2
39653   cycles for 100 * MbInstr 4
23432   cycles for 100 * crt_strstr
52037   cycles for 100 * M32 find$

38801   cycles for 100 * MbInstr 0
40392   cycles for 100 * MbInstr 1
39449   cycles for 100 * MbInstr 2
39780   cycles for 100 * MbInstr 4
23438   cycles for 100 * crt_strstr
52430   cycles for 100 * M32 find$

FORTRANS

  • Member
  • *****
  • Posts: 1006
Re: Instr, strstr, find$
« Reply #2 on: July 14, 2014, 10:44:49 PM »
Intel(R) Pentium(R) M processor 1.70GHz (SSE2)

42200   cycles for 100 * MbInstr 0
42483   cycles for 100 * MbInstr 1
42815   cycles for 100 * MbInstr 2
43101   cycles for 100 * MbInstr 4
33541   cycles for 100 * crt_strstr
38404   cycles for 100 * M32 find$

42136   cycles for 100 * MbInstr 0
42894   cycles for 100 * MbInstr 1
42309   cycles for 100 * MbInstr 2
42928   cycles for 100 * MbInstr 4
33481   cycles for 100 * crt_strstr
38438   cycles for 100 * M32 find$

42199   cycles for 100 * MbInstr 0
42577   cycles for 100 * MbInstr 1
42297   cycles for 100 * MbInstr 2
43530   cycles for 100 * MbInstr 4
33490   cycles for 100 * crt_strstr
38416   cycles for 100 * M32 find$

18   bytes for MbInstr 0
18   bytes for MbInstr 1
18   bytes for MbInstr 2
18   bytes for MbInstr 4
22   bytes for crt_strstr
15   bytes for M32 find$

97   = eax MbInstr 0
97   = eax MbInstr 1
97   = eax MbInstr 2
97   = eax MbInstr 4
97   = eax crt_strstr
97   = eax M32 find$

--- ok ---

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Instr, strstr, find$
« Reply #3 on: July 14, 2014, 10:45:22 PM »
Jochen,

your timings:
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)

22112   cycles for 100 * MbInstr 0
22187   cycles for 100 * MbInstr 1
22319   cycles for 100 * MbInstr 2
22243   cycles for 100 * MbInstr 4
28535   cycles for 100 * crt_strstr
30875   cycles for 100 * M32 find$

22176   cycles for 100 * MbInstr 0
22185   cycles for 100 * MbInstr 1
22194   cycles for 100 * MbInstr 2
22304   cycles for 100 * MbInstr 4
28572   cycles for 100 * crt_strstr
30766   cycles for 100 * M32 find$

21962   cycles for 100 * MbInstr 0
22149   cycles for 100 * MbInstr 1
22309   cycles for 100 * MbInstr 2
22278   cycles for 100 * MbInstr 4
28534   cycles for 100 * crt_strstr
30747   cycles for 100 * M32 find$

18      bytes for MbInstr 0
18      bytes for MbInstr 1
18      bytes for MbInstr 2
18      bytes for MbInstr 4
22      bytes for crt_strstr
15      bytes for M32 find$

97      = eax MbInstr 0
97      = eax MbInstr 1
97      = eax MbInstr 2
97      = eax MbInstr 4
97      = eax crt_strstr
97      = eax M32 find$

--- ok ---

Gunther
Get your facts first, and then you can distort them.

jj2007

  • Member
  • *****
  • Posts: 8501
  • Assembler is fun ;-)
    • MasmBasic
Re: Instr, strstr, find$
« Reply #4 on: July 14, 2014, 11:23:32 PM »
Jochen,

your timings:
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
22112   cycles for 100 * MbInstr 0
28535   cycles for 100 * crt_strstr
30875   cycles for 100 * M32 find$

Gunther,
I love your CPU :greensml:

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

41492   cycles for 100 * MbInstr 0
41718   cycles for 100 * MbInstr 1
41553   cycles for 100 * MbInstr 2
42373   cycles for 100 * MbInstr 4
32945   cycles for 100 * crt_strstr
37785   cycles for 100 * M32 find$

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Instr, strstr, find$
« Reply #5 on: July 15, 2014, 02:05:30 AM »
Jochen,

Gunther,
I love your CPU :greensml:

me too.  :lol: :lol: :lol:

Gunther
Get your facts first, and then you can distort them.

dedndave

  • Member
  • *****
  • Posts: 8789
  • Still using Abacus 2.0
    • DednDave
Re: Instr, strstr, find$
« Reply #6 on: July 15, 2014, 02:06:30 AM »
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)

54210   cycles for 100 * MbInstr 0
53810   cycles for 100 * MbInstr 1
54892   cycles for 100 * MbInstr 2
55200   cycles for 100 * MbInstr 4
42894   cycles for 100 * crt_strstr
59477   cycles for 100 * M32 find$

53732   cycles for 100 * MbInstr 0
54695   cycles for 100 * MbInstr 1
54596   cycles for 100 * MbInstr 2
55538   cycles for 100 * MbInstr 4
44184   cycles for 100 * crt_strstr
57831   cycles for 100 * M32 find$

54744   cycles for 100 * MbInstr 0
54076   cycles for 100 * MbInstr 1
54848   cycles for 100 * MbInstr 2
55803   cycles for 100 * MbInstr 4
43372   cycles for 100 * crt_strstr
57899   cycles for 100 * M32 find$

sinsi

  • Member
  • *****
  • Posts: 1054
Re: Instr, strstr, find$
« Reply #7 on: July 15, 2014, 07:17:48 AM »
Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (SSE4)
23621   cycles for 100 * MbInstr 0
23818   cycles for 100 * MbInstr 1
23339   cycles for 100 * MbInstr 2
23404   cycles for 100 * MbInstr 4
22305   cycles for 100 * crt_strstr
31867   cycles for 100 * M32 find$


AMD A10-7850K APU with Radeon(TM) R7 Graphics   (SSE4)
35325   cycles for 100 * MbInstr 0
35340   cycles for 100 * MbInstr 1
35409   cycles for 100 * MbInstr 2
37523   cycles for 100 * MbInstr 4
37294   cycles for 100 * crt_strstr
42007   cycles for 100 * M32 find$

I can walk on water but stagger on beer.

jj2007

  • Member
  • *****
  • Posts: 8501
  • Assembler is fun ;-)
    • MasmBasic
Re: Instr, strstr, find$
« Reply #8 on: July 15, 2014, 11:55:39 AM »
Interesting:

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4) - Gunther
22112   cycles for 100 * MbInstr 0
28535   cycles for 100 * crt_strstr
30875   cycles for 100 * M32 find$

Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (SSE4) - Sinsi
23621   cycles for 100 * MbInstr 0
22305   cycles for 100 * crt_strstr
31867   cycles for 100 * M32 find$

dedndave

  • Member
  • *****
  • Posts: 8789
  • Still using Abacus 2.0
    • DednDave
Re: Instr, strstr, find$
« Reply #9 on: July 15, 2014, 12:22:47 PM »
i don't have to remind you how many different versions of MSVCRT there are   :P
i am a little surprised you compare them

jj2007

  • Member
  • *****
  • Posts: 8501
  • Assembler is fun ;-)
    • MasmBasic
Re: Instr, strstr, find$
« Reply #10 on: July 15, 2014, 12:31:44 PM »
Would be nice to see where they differ :biggrin:

New test:

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

3734    cycles for 10 * MbInstr 0
3302    cycles for 10 * crt_strstr
3780    cycles for 10 * M32 find$
4237    cycles for 10 * MB Instr old

3734    cycles for 10 * MbInstr 0
3290    cycles for 10 * crt_strstr
3792    cycles for 10 * M32 find$
4232    cycles for 10 * MB Instr old

3735    cycles for 10 * MbInstr 0
3292    cycles for 10 * crt_strstr
3785    cycles for 10 * M32 find$
4230    cycles for 10 * MB Instr old

sinsi

  • Member
  • *****
  • Posts: 1054
Re: Instr, strstr, find$
« Reply #11 on: July 15, 2014, 12:56:14 PM »
C:\Windows\SysWOW64\msvcrt.dll  7.0.9600.16384

Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (SSE4)

2547    cycles for 10 * MbInstr 0
2169    cycles for 10 * crt_strstr
3159    cycles for 10 * M32 find$
2195    cycles for 10 * MB Instr old

2552    cycles for 10 * MbInstr 0
2220    cycles for 10 * crt_strstr
3141    cycles for 10 * M32 find$
2204    cycles for 10 * MB Instr old

2564    cycles for 10 * MbInstr 0
2190    cycles for 10 * crt_strstr
3169    cycles for 10 * M32 find$
2175    cycles for 10 * MB Instr old

I can walk on water but stagger on beer.

jcfuller

  • Member
  • **
  • Posts: 174
Re: Instr, strstr, find$
« Reply #12 on: July 15, 2014, 07:36:28 PM »
AMD Athlon(tm) II X2 250 Processor (SSE3)
++++++++++++++++++++
4947    cycles for 10 * MbInstr 0
5109    cycles for 10 * crt_strstr
5292    cycles for 10 * M32 find$
3894    cycles for 10 * MB Instr old

4828    cycles for 10 * MbInstr 0
5109    cycles for 10 * crt_strstr
5298    cycles for 10 * M32 find$
3895    cycles for 10 * MB Instr old

4827    cycles for 10 * MbInstr 0
5106    cycles for 10 * crt_strstr
5353    cycles for 10 * M32 find$
3881    cycles for 10 * MB Instr old


jj2007

  • Member
  • *****
  • Posts: 8501
  • Assembler is fun ;-)
    • MasmBasic
Re: Instr, strstr, find$
« Reply #13 on: July 15, 2014, 08:37:50 PM »
Thanxalot to everybody :icon14:

Won't be easy to reconcile all CPUs.
Background to this exercise: a real-life application where I tried to search a 250MB text file (Thunderbird inbox...) for pattern A near pattern B, where "near" means within ±500 bytes. If pattern A is frequent and pattern B is only present towards the end of the file, the exercise gets incredibly slow.

So I wrote a new version of Instr_() that takes a search limit (in this case 2*500 bytes) as an additional parameter. And voilà, searching the inbox is a factor 20 or so faster. But the additional parameter slows down the simple search a little bit, and this thread aims to investigate that problem.

As a side effect, it will be possible to search non-text files (i.e. with embedded zeros), if the length is known.
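
For readers who want to see the idea outside of assembler, here is a rough C sketch of a length-limited search. It is not the Instr_() code: the names find_bounded and find_near, their parameters, and the memchr/memcmp approach are assumptions made up for illustration only. The point is that the inner search for pattern B never looks further than a small window around each hit of pattern A, and that explicit lengths let it work on buffers with embedded zeros.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the MasmBasic Instr_): look for needle within
   the first `limit` bytes of hay. Lengths are explicit, so embedded zero
   bytes are treated like any other byte. Returns the match or NULL. */
static const unsigned char *find_bounded(const unsigned char *hay, size_t limit,
                                         const unsigned char *needle, size_t nlen)
{
    if (nlen == 0) return hay;
    if (nlen > limit) return NULL;

    const unsigned char *p = hay;
    size_t remaining = limit - nlen + 1;   /* number of valid start positions */

    while (remaining > 0) {
        /* jump to the next candidate: first byte of needle */
        const unsigned char *q = memchr(p, needle[0], remaining);
        if (q == NULL) return NULL;
        if (memcmp(q, needle, nlen) == 0) return q;
        remaining -= (size_t)(q - p) + 1;
        p = q + 1;
    }
    return NULL;
}

/* Hypothetical helper: return the first occurrence of pattern a that has
   pattern b inside a window of at most 2*radius bytes around it
   (radius bytes before the start of a, radius bytes from the start of a). */
static const unsigned char *find_near(const unsigned char *buf, size_t len,
                                      const unsigned char *a, size_t alen,
                                      const unsigned char *b, size_t blen,
                                      size_t radius)
{
    if (alen == 0) return NULL;            /* nothing sensible to anchor on */

    const unsigned char *end = buf + len;
    const unsigned char *p = buf;

    while (p < end) {
        const unsigned char *hit = find_bounded(p, (size_t)(end - p), a, alen);
        if (hit == NULL) return NULL;

        /* clamp the window to the buffer */
        const unsigned char *lo = ((size_t)(hit - buf) > radius) ? hit - radius : buf;
        const unsigned char *hi = ((size_t)(end - hit) > radius) ? hit + radius : end;

        /* the limited search: never scans more than 2*radius bytes */
        if (find_bounded(lo, (size_t)(hi - lo), b, blen) != NULL) return hit;

        p = hit + 1;                       /* try the next occurrence of a */
    }
    return NULL;
}
```

With an unlimited search for B, every hit of A would trigger a scan to the end of the file; with the window, the cost per hit of A is bounded by 2*radius bytes, which is presumably where the large speedup on the inbox comes from.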

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: Instr, strstr, find$
« Reply #14 on: July 15, 2014, 08:46:59 PM »
Jochen,

your timings:
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)

2449    cycles for 10 * MbInstr 0
4062    cycles for 10 * crt_strstr
3051    cycles for 10 * M32 find$
2235    cycles for 10 * MB Instr old

2451    cycles for 10 * MbInstr 0
2822    cycles for 10 * crt_strstr
3059    cycles for 10 * M32 find$
2232    cycles for 10 * MB Instr old

2448    cycles for 10 * MbInstr 0
4063    cycles for 10 * crt_strstr
4311    cycles for 10 * M32 find$
3503    cycles for 10 * MB Instr old

Gunther
Get your facts first, and then you can distort them.