New line tokeniser.

Started by hutch--, October 24, 2014, 03:04:03 PM


hutch--

I wrote this one in PB and it was an easy port to MASM. It differs from the tokeniser in the masm32 library in that it preserves empty lines and does not left-trim tabs and spaces. I could not get a timing on it with a 4.5 meg file and got a 32 ms timing on a 17.5 meg file on my i7, so it's probably fast enough. Its use should be general purpose, but because the array index gives the line number it can also be used to map a text or source file. It builds at 3k so it should not blow out your hard disk.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                    include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    line_tokeniser PROTO :DWORD
    get_lcnt       PROTO :DWORD

    .data
      caesar \
      db "Friends, Romans, countrymen, lend me your ears;",13,10
      db "I come to bury Caesar, not to praise him.",13,10
      db "The evil that men do lives after them;",13,10
      db "The good is oft interred with their bones;",13,10,0

      ptxt dd caesar

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    print chr$(13,10)
    inkey "Thats all folks, press a key to exit..."
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL pMem  :DWORD
    LOCAL lcnt  :DWORD

    mov pMem, rv(line_tokeniser,ptxt)   ; tokenise text
    mov lcnt, ecx                       ; save the line count

    push esi
    push edi

    mov esi, pMem                       ; load array into ESI
    mov edi, lcnt                       ; use EDI as the counter

  @@:
    print [esi],13,10                   ; display each line of text
    add esi, 4                          ; increment to next pointer
    sub edi, 1                          ; dec the counter
    jnz @B                              ; loop back if not zero

    pop edi
    pop esi

    free pMem                           ; release memory from tokeniser

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

line_tokeniser proc src:DWORD

  ; ---------------------------------------------------
  ; tokeniser for CRLF delimited text
  ; ---------------------------------------------------
  ; replaces the ascii 13 with zero and writes a pointer
  ; to the allocated memory as an array of pointers
  ; return value in EAX = pointer array address
  ; return value in ECX = the line count
  ; array address must be de-allocated using
  ; GlobalFree() or the macro "free".
  ; ---------------------------------------------------

    LOCAL lcnt :DWORD
    LOCAL pMem :DWORD
    LOCAL alen :DWORD

    push src
    call get_lcnt               ; get the line count
    mov lcnt, eax               ; store line count in variable
    lea eax, [eax*4]            ; set pointer array length
    mov alen, eax               ; store the array size in alen

    mov pMem, alloc(alen)       ; allocate the pointer array

    mov edx, src                ; source address in EDX
    mov ecx, pMem               ; pointer array address in ECX

    mov [ecx], edx              ; store start of 1st line in 1st member of array
    add ecx, 4
    sub edx, 1

  lbl1:
    add edx, 1
    movzx eax, BYTE PTR [edx]   ; zero extend byte into EAX
    test eax, eax               ; test for zero
    jz lbl2                     ; exit loop on zero
    cmp eax, 13                 ; test for ascii 13
    jne lbl1                    ; short loop back if not 13

    mov BYTE PTR [edx], 0       ; write terminator at ascii 13 location
    add edx, 2                  ; step over ascii 13 and 10
    mov [ecx], edx              ; write the next line start to pointer
    add ecx, 4                  ; increment to next pointer
    jmp lbl1                    ; long loop after writing pointer

  lbl2:
    mov ecx, lcnt               ; return the line count in ECX
    mov eax, pMem               ; the array pointer in EAX

    ret

line_tokeniser endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

get_lcnt proc src:DWORD

  ; --------------------------------------
  ; count ascii 13 to determine line count
  ; --------------------------------------
    mov edx, [esp+4]                    ; the source address
    sub edx, 1
    xor eax, eax
    jmp lbl1

  pre:
    add eax, 1                          ; increment the counter
  lbl1:
  ; -----------
  ; unroll by 4
  ; -----------
    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jz lbl2

    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jz lbl2

    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jz lbl2

    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jnz lbl1

  lbl2:
    ret 4

get_lcnt endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
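
For readers more at home in C, here is a minimal sketch of the two-pass scheme the listing uses: count the ascii 13 bytes, allocate the pointer array, then overwrite each 13 with zero and record where the next line starts. It is an illustration, not a translation of the listing: the names `count_cr` and `tokenise_lines` are made up, `malloc` stands in for the `alloc` macro, the array gets one slot more than the CR count, and the byte right after each CR/LF pair is re-examined so that empty lines are recorded. Like the listing, it assumes every 13 is followed by a 10.

```c
#include <stddef.h>
#include <stdlib.h>

/* Pass 1: count CR (ascii 13) bytes in a zero-terminated buffer. */
size_t count_cr(const char *src)
{
    size_t n = 0;
    for (; *src; ++src)
        if (*src == '\r')
            ++n;
    return n;
}

/* Pass 2: overwrite each CR with 0 in place and return a malloc'ed array
   of line-start pointers. *lines receives the number of CR-terminated
   lines. Assumes every CR is followed by an LF. Returns NULL if the
   allocation fails. */
char **tokenise_lines(char *src, size_t *lines)
{
    size_t n = count_cr(src);
    /* n + 1 slots: the first line plus one pointer written after each CR */
    char **arr = malloc((n + 1) * sizeof *arr);
    char **out = arr;
    char *p = src;

    if (arr == NULL)
        return NULL;

    *out++ = p;                 /* first line starts at the buffer start */
    while (*p) {
        if (*p == '\r') {
            *p = '\0';          /* terminate the line at the CR */
            p += 2;             /* step over CR and LF */
            *out++ = p;         /* next line starts here */
        } else {
            ++p;
        }
    }
    *lines = n;
    return arr;
}
```

The caller releases the array with `free`; the line text itself stays in the source buffer.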

jj2007

#1
Little test:
main proc

    LOCAL pMem  :DWORD
    LOCAL lcnt  :DWORD
    LOCAL pContent
    and pContent, 0
    mov pContent, InputFile("\Masm32\include\Windows.inc")
    mov pMem, rv(line_tokeniser, pContent)   ; tokenise text
    mov lcnt, ecx                       ; save the line count

    push esi
    push edi

    mov esi, pMem                       ; load array into ESI
    mov edi, lcnt                       ; use EDI as the counter
;     mov edi, 9

  @@:
    lodsd
    .if eax>127
    print eax, 13, 10
    .else
    print "ERROR ################", 13, 10
    .endif
    sub edi, 1                          ; dec the counter
    jnz @B                              ; loop back if not zero

    pop edi
    pop esi
    free pContent
    free pMem                           ; release memory from tokeniser

    ret

main endp


The line count is correct - the problem is elsewhere. Check what happens with the pointers to empty lines; at first sight, I can't find any problem in your code, it really looks correct, but ...

Speedwise it looks quite OK. Recall is over twice as fast, but that one requires SSE2, of course.

TouEnMasm


The Intel strchr uses a different method to find a char in a string

        page    ,132
        title   strchr - search string for given character
;***
;strchr.asm - search a string for a given character
;
;       Copyright (c) Microsoft Corporation. All rights reserved.
;
;Purpose:
;       defines strchr() - search a string for a character
;
;*******************************************************************************

        .xlist
        include cruntime.inc
        .list

page
;***
;char *strchr(string, chr) - search a string for a character
;
;Purpose:
;       Searches a string for a given character, which may be the
;       null character '\0'.
;
;       Algorithm:
;       char *
;       strchr (string, chr)
;       char *string, chr;
;       {
;         while (*string && *string != chr)
;             string++;
;         if (*string == chr)
;             return(string);
;         return((char *)0);
;       }
;
;Entry:
;       char *string - string to search in
;       char chr     - character to search for
;
;Exit:
;       returns pointer to the first occurrence of chr in string
;       returns NULL if chr does not occur in string
;
;Uses:
;
;Exceptions:
;
;*******************************************************************************

        CODESEG

found_bx:
        lea     eax,[edx - 1]
        pop     ebx                 ; restore ebx
        ret                         ; _cdecl return

        align   16
        public  strchr, __from_strstr_to_strchr
strchr  proc \
        string:ptr byte, \
        chr:byte

        OPTION PROLOGUE:NONE, EPILOGUE:NONE

        .FPO    ( 0, 2, 0, 0, 0, 0 )

        xor     eax,eax
        mov     al,[esp + 8]        ; al = chr (search char)

__from_strstr_to_strchr label proc

        push    ebx                 ; PRESERVE EBX
        mov     ebx,eax             ; ebx = 0/0/0/chr
        shl     eax,8               ; eax = 0/0/chr/0
        mov     edx,[esp + 8]       ; edx = buffer
        test    edx,3               ; test if string is aligned on 32 bits
        jz      short main_loop_start

str_misaligned:                     ; simple byte loop until string is aligned
        mov     cl,[edx]
        add     edx,1
        cmp     cl,bl
        je      short found_bx
        test    cl,cl
        jz      short retnull_bx
        test    edx,3               ; now aligned ?
        jne     short str_misaligned

main_loop_start:                    ; set all 4 bytes of ebx to [chr]
        or      ebx,eax             ; ebx = 0/0/chr/chr
        push    edi                 ; PRESERVE EDI
        mov     eax,ebx             ; eax = 0/0/chr/chr
        shl     ebx,10h             ; ebx = chr/chr/0/0
        push    esi                 ; PRESERVE ESI
        or      ebx,eax             ; ebx = all 4 bytes = [chr]

; in the main loop (below), we are looking for chr or for EOS (end of string)

main_loop:
        mov     ecx,[edx]           ; read  dword (4 bytes)
        mov     edi,7efefeffh       ; work with edi & ecx for looking for chr

        mov     eax,ecx             ; eax = dword
        mov     esi,edi             ; work with esi & eax for looking for EOS

        xor     ecx,ebx             ; ecx = dword xor chr/chr/chr/chr
        add     esi,eax

        add     edi,ecx
        xor     ecx,-1

        xor     eax,-1
        xor     ecx,edi

        xor     eax,esi
        add     edx,4

        and     ecx,81010100h       ; test for chr
        jnz     short chr_is_found  ; chr probably has been found

        ; chr was not found, check for EOS

        and     eax,81010100h       ; is any flag set ??
        jz      short main_loop     ; EOS was not found, go get another dword

        and     eax,01010100h       ; is it in high byte?
        jnz     short retnull       ; no, definitely found EOS, return failure

        and     esi,80000000h       ; check was high byte 0 or 80h
        jnz     short main_loop     ; it just was 80h in high byte, go get
                                    ; another dword
retnull:
        pop     esi
        pop     edi
retnull_bx:
        pop     ebx
        xor     eax,eax
        ret                         ; _cdecl return

chr_is_found:
        mov     eax,[edx - 4]       ; let's look one more time on this dword
        cmp     al,bl               ; is chr in byte 0?
        je      short byte_0
        test    al,al               ; test if low byte is 0
        je      retnull
        cmp     ah,bl               ; is it byte 1
        je      short byte_1
        test    ah,ah               ; found EOS ?
        je      retnull
        shr     eax,10h             ; is it byte 2
        cmp     al,bl
        je      short byte_2
        test    al,al               ; if in al some bits were set, bl!=bh
        je      retnull
        cmp     ah,bl
        je      short byte_3
        test    ah,ah
        jz      retnull
        jmp     short main_loop     ; neither chr nor EOS found, go get
                                    ; another dword
byte_3:
        pop     esi
        pop     edi
        lea     eax,[edx - 1]
        pop     ebx                 ; restore ebx
        ret                         ; _cdecl return

byte_2:
        lea     eax,[edx - 2]
        pop     esi
        pop     edi
        pop     ebx
        ret                         ; _cdecl return

byte_1:
        lea     eax,[edx - 3]
        pop     esi
        pop     edi
        pop     ebx
        ret                         ; _cdecl return

byte_0:
        lea     eax,[edx - 4]
        pop     esi                 ; restore esi
        pop     edi                 ; restore edi
        pop     ebx                 ; restore ebx
        ret                         ; _cdecl return

strchr  endp
        end
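
The listing above scans four bytes per iteration: it XORs each dword with the search character replicated into all four bytes, then uses add/xor/and with the constants 7efefeffh and 81010100h to detect whether any byte became zero. The sketch below shows the same word-at-a-time idea in C, but with the more widely quoted `(v - 0x01010101) & ~v & 0x80808080` test rather than the constants used in the listing. The function names are made up for the illustration.

```c
#include <stdint.h>

/* Nonzero if any of the four bytes of v is zero. */
int has_zero_byte(uint32_t v)
{
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

/* Nonzero if any of the four bytes of v equals c. */
int has_byte(uint32_t v, unsigned char c)
{
    uint32_t pattern = 0x01010101u * c;   /* c copied into all four bytes */
    return has_zero_byte(v ^ pattern);    /* matching bytes XOR to zero */
}
```

A search loop built on this still has to look at the individual bytes once a hit is reported, to find which byte matched and whether the terminator came first, which is what the `chr_is_found` part of the listing does.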

Fa is a musical note to play with CL

jj2007

Quote from: ToutEnMasm on October 24, 2014, 05:05:06 PM
The Intel strchr uses a different method to find a char in a string

Oh really? Is it faster? Can you post timings?

@Hutch: GOTCHA!

    lea eax, [eax*4+1]            ; set pointer array length
...
  lbl1:
    add edx, 1
  lbl1a:
    movzx eax, BYTE PTR [edx]   ; zero extend byte into EAX
..
    jmp lbl1a                   ; long loop after writing pointer
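
Why the extra label matters: in the first listing, the code jumps back to `lbl1` after writing a pointer, and `lbl1` increments EDX again, so the first byte of every new line is never tested. An empty line (a 13 in that position) is missed, and if that byte is the terminating zero the scan runs past the end of the buffer. The C sketch below models both variants on a length-bounded buffer. The function name `scan_lines` and the `skip_first` flag are made up for the illustration.

```c
#include <stddef.h>

/* Models the tokeniser scan over buf[0..len-1], where buf[len-1] is the
   terminating zero. skip_first != 0 reproduces the extra increment after a
   CR/LF pair; skip_first == 0 models the jump to the added label.
   Returns the number of CRs seen, or -1 if the scan would leave the buffer
   without seeing the terminator. */
long scan_lines(const char *buf, size_t len, int skip_first)
{
    size_t i = 0;
    long found = 0;

    while (i < len) {
        if (buf[i] == '\0')
            return found;           /* terminator reached: normal exit */
        if (buf[i] == '\r') {
            ++found;
            i += 2;                 /* step over CR and LF */
            if (skip_first)
                ++i;                /* the increment at lbl1 runs again */
        } else {
            ++i;
        }
    }
    return -1;                      /* walked past the terminator */
}
```

With text that ends in 13,10 the skipping variant steps over the terminator, which matches a scan that reaches the end of the source and then faults.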

TouEnMasm


Not really interested in that. I have my own routines to work with text. Perhaps in another post.

jj2007

Quote from: ToutEnMasm on October 24, 2014, 06:32:36 PM
I have my own routines to work with text.

Interesting. Maybe we can do a speed contest?

hutch--

The basic version with the identical code does not crash; the MASM port runs to the end of the source and then crashes. I am burdened at the moment because my Win7 does not have a post mortem debugger.

A bit later: the extra label does it; redirecting the output produces an identical file.

TouEnMasm

Quote
Interesting. Maybe we can do a speed contest?
Ok,
The starting point of my routines is here:
http://www.masmforum.com/board/index.php?topic=11061.0

The line counter counts the word 13,10, not only 13.
An HTML file is a text file and uses both 10 and 13,10.
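
A minimal C sketch of the two counting rules mentioned above: one counts only the 13,10 pair, the other counts every 10, which ends a line in both the bare-10 and the 13,10 form. This is an illustration only, not the SSE2 routine from the linked thread, and the function names are made up.

```c
#include <stddef.h>

/* Count 13,10 pairs only. Reading src[1] is safe because src[0] is
   nonzero, so src[1] is at worst the terminating zero. */
size_t count_crlf(const char *src)
{
    size_t n = 0;
    for (; *src; ++src)
        if (src[0] == '\r' && src[1] == '\n')
            ++n;
    return n;
}

/* Count every LF, whether or not a CR precedes it. */
size_t count_lf(const char *src)
{
    size_t n = 0;
    for (; *src; ++src)
        if (*src == '\n')
            ++n;
    return n;
}
```

On a file with mixed line endings the two counts differ, so a tokeniser has to use the same rule for counting and for splitting.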

The SSE2 instructions are in use and it must be difficult to find anything faster.

I have written more code using that, but all the comments are in French; some of it is lost somewhere in the English forum.

hutch--

Here is try 2: I unrolled part of the tokeniser and reduced the unroll in the line counter, and it runs at about 630 meg/sec as 486 compatible code. My test piece was a 315 meg text file and it kept timing at just under 500 ms, so its speed is OK for 486 code. You will need to supply your own test text file and insert the correct name in the example.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                    include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    line_tokeniser PROTO :DWORD
    get_lcnt       PROTO :DWORD

    .data
      caesar \
      db "Friends, Romans, countrymen, lend me your ears;",13,10
      db "I come to bury Caesar, not to praise him.",13,10
      db "The evil that men do lives after them;",13,10
      db "The good is oft interred with their bones;",13,10,0

      ptxt dd caesar

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    print chr$(13,10)
    inkey "Thats all folks, press a key to exit..."
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL hMem  :DWORD
    LOCAL pMem  :DWORD
    LOCAL lcnt  :DWORD

    mov hMem, InputFile("big2.txt")     ; 315 meg test file

  ; ------------------
  ; time the tokeniser
  ; ------------------
    fn GetTickCount
    push eax

    mov pMem, rv(line_tokeniser,hMem)   ; tokenise text
    mov lcnt, ecx                       ; save the line count

    fn GetTickCount
    pop ecx
    sub eax, ecx

  ; -----------------------------------------------------
  ; remove the RET to display the file contents
  ; don't do it on a BIG file or it will never finish. :)
  ; -----------------------------------------------------
    print str$(eax)," ms",13,10
    ret

    push esi
    push edi

    mov esi, pMem                       ; load array into ESI
    mov edi, lcnt                       ; use EDI as the counter

  @@:
    print [esi],13,10                   ; display each line of text
    add esi, 4                          ; increment to next pointer
    sub edi, 1                          ; dec the counter
    jnz @B                              ; loop back if not zero

    pop edi
    pop esi

    free hMem
    free pMem                           ; release memory from tokeniser

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

line_tokeniser proc src:DWORD

  ; ----------------------------------------------------
  ; tokeniser for CRLF delimited text
  ; ----------------------------------------------------
  ; replaces the ascii 13 with zero and writes a pointer
  ; to the allocated memory as an array of pointers
  ; return value in EAX = pointer array address
  ; return value in ECX = the line count
  ; array address must be de-allocated using
  ; GlobalFree() or the macro "free".
  ; ----------------------------------------------------

    LOCAL lcnt :DWORD
    LOCAL pMem :DWORD
    LOCAL alen :DWORD

    push src
    call get_lcnt               ; get the line count
    mov lcnt, eax               ; store line count in variable
    lea eax, [eax*4+4]          ; array length: lcnt+1 pointers get written
    mov alen, eax               ; store the array size in alen

    mov pMem, alloc(alen)       ; allocate the pointer array

    mov edx, src                ; source address in EDX
    mov ecx, pMem               ; pointer array address in ECX

    mov [ecx], edx              ; store start of 1st line in 1st member of array
    add ecx, 4
    sub edx, 1

  lbl1:
    add edx, 1
  nxt:
    movzx eax, BYTE PTR [edx]   ; zero extend byte into EAX
    test eax, eax               ; test for zero
    jz lbl2                     ; exit loop on zero
    cmp eax, 13                 ; test for ascii 13
    je wrtptr                   ; write the pointer if it is 13

    add edx, 1
    movzx eax, BYTE PTR [edx]   ; zero extend byte into EAX
    test eax, eax               ; test for zero
    jz lbl2                     ; exit loop on zero
    cmp eax, 13                 ; test for ascii 13
    jne lbl1                    ; short loop back if not 13

  wrtptr:
    mov BYTE PTR [edx], 0       ; write terminator at ascii 13 location
    add edx, 2                  ; step over ascii 13 and 10
    mov [ecx], edx              ; write the next line start to pointer
    add ecx, 4                  ; increment to next pointer
    jmp nxt                     ; long loop after writing pointer

  lbl2:
    mov ecx, lcnt               ; return the line count in ECX
    mov eax, pMem               ; the array pointer in EAX

    ret

line_tokeniser endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

get_lcnt proc src:DWORD

  ; --------------------------------------
  ; count ascii 13 to determine line count
  ; --------------------------------------
    mov edx, [esp+4]                    ; the source address
    sub edx, 1
    xor eax, eax
    jmp lbl1

  pre:
    add eax, 1                          ; increment the counter
  lbl1:
  ; -----------
  ; unroll by 2
  ; -----------
    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jz lbl2

    add edx, 1
    movzx ecx, BYTE PTR [edx]
    cmp ecx, 13
    je pre
    test ecx, ecx
    jnz lbl1

  lbl2:
    ret 4

get_lcnt endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start

jj2007

All right, folks, here is the speed test, including CompteurLines - although I have a suspicion that it really does just count the lines:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX)
Hutch   Yves    Recall
2177    902     719 us
2184    1054    874 us
2136    894     1052 us
2115    943     895 us
2061    943     844 us
2094    933     856 us
3039    1015    1109 us
2129    945     853 us
2108    945     844 us
2128    954     858 us
3017    977     853 us
1939    865     991 us
2105    875     888 us


Results Hutch:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; It is not recomended that WINDOWS.INC be modified but if you need to add
; equates or structures to WINDOWS.INC, do not write anything after the
; following conditional assembly directive that display the duplicate
; warning or it will be duplicated if the file is included more than once.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

ELSE
echo ------------------------------------------
echo WARNING Duplicate include file windows.inc
echo ------------------------------------------
ENDIF

Thats all folks, press a key to exit...

Gunther

Jochen,

the application doesn't work under Windows 7-64.

Gunther
You have to know the facts before you can distort them.

jj2007

Quote from: Gunther on October 25, 2014, 11:25:29 AM
the application doesn't work under Windows 7-64.
Interesting - it's written under Windows 7-64 ::)

Error messages? Where does it stop, if it starts at all?

Anybody else having problems?

sinsi

No problems here


AMD A10-7850K APU with Radeon(TM) R7 Graphics   (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX)
Windows 7 x64 Ultimate
Hutch   Yves    Recall
2731    1530    1389 us
2473    1437    1684 us
2442    1282    1626 us
2329    1525    1600 us
2608    1778    1682 us
2438    1384    1558 us
2391    1357    1567 us
2390    1393    1538 us
2410    1349    1456 us
2329    1660    1821 us
2413    1374    1563 us
2403    1393    1523 us
2404    1339    1774 us

Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX)
Windows 8.1 x64 Pro
Hutch   Yves    Recall
1839    919     674 us
1716    829     776 us
1684    768     734 us
2048    775     789 us
1621    735     727 us
1644    734     699 us
1656    884     746 us
1603    736     695 us
1678    770     723 us
1647    746     704 us
1799    804     745 us
1623    745     705 us
1597    806     694 us

hutch--

It worked OK on my Win7 64 so it may be a security setting. What I would be interested in is a long linear test rather than cycling through a much smaller file as you end up with cache thrashing rather than an accurate speed reading. I used a 315 meg text file to benchmark against so that cache thrashing was not a factor and it was running at about 630 meg/sec on my i7.

There is a very crude technique where you simply allocate a massive pointer array buffer and do not use the line count code. This will certainly up the speed, as it is a single pass rather than a double pass, but to be safe you would have to cater for the full file length being nothing but 13,10 line delimiters, so it is very memory hungry.
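
A minimal C sketch of the single-pass idea described above, assuming every 13 is followed by a 10 as in the listings. Because each delimiter occupies two bytes, a buffer of `len` text bytes can hold at most `len / 2` delimiters, so `len / 2 + 1` pointers is a safe upper bound. The function name and signature are made up for the illustration, and `malloc` stands in for the `alloc` macro.

```c
#include <stddef.h>
#include <stdlib.h>

/* Single-pass tokeniser: no counting pass, the pointer array is sized for
   the worst case instead. len is the text length without the terminating
   zero. *lines receives the number of CRs found. Returns NULL if the
   allocation fails. */
char **tokenise_single_pass(char *src, size_t len, size_t *lines)
{
    /* worst case: the text is nothing but CRLF pairs */
    char **arr = malloc((len / 2 + 1) * sizeof *arr);
    size_t n = 0;
    char *p = src;

    if (arr == NULL)
        return NULL;

    arr[0] = p;                 /* first line starts at the buffer start */
    while (*p) {
        if (*p == '\r') {
            *p = '\0';          /* terminate the line at the CR */
            p += 2;             /* step over CR and LF */
            arr[++n] = p;       /* next line starts here */
        } else {
            ++p;
        }
    }
    *lines = n;
    return arr;
}
```

With 4-byte pointers that bound is about twice the size of the file, which is the memory cost mentioned above.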

TouEnMasm

Windows XP, access violation, bad address
Quote
(9b4.f5c): Access violation - code c0000005 (!!! second chance !!!)
eax=00000000 ebx=7ffd9000 ecx=00000000 edx=00000000 esi=7c91d96e edi=0099b678
eip=004014ae esp=0012ff8c ebp=0012ffa0 iopl=0         nv up ei pl zr ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000257
*** WARNING: Unable to verify checksum for C:\DOCUME~1\Luce\LOCALS~1\Temp\Répertoire temporaire 1 pour LineTokenisers (1).zip\LineTokeniserHutch.exe
*** ERROR: Module load completed but symbols could not be loaded for C:\DOCUME~1\Luce\LOCALS~1\Temp\Répertoire temporaire 1 pour LineTokenisers (1).zip\LineTokeniserHutch.exe
LineTokeniserHutch+0x14ae:
004014ae 0fb60a          movzx   ecx,byte ptr [edx]         ds:0023:00000000=??
