OK, I am back. The stuff-up was in the memory writes after the sort had finished. This is the batch file I tested it with.
@echo off
@echo Sort ascending 100,000,000 words
ssort big.txt result.txt /a
@echo -------------------------------
@echo Sort descending 100,000,000 words
ssort result.txt next.txt /d
@echo -------------------------------
@echo Sort ascending 100,000,000 words
ssort next.txt result.txt /a
@echo -------------------------------
@echo Sort descending 100,000,000 words
ssort result.txt big.txt /d
@echo -------------------------------
@echo Sort ascending 100,000,000 words
ssort big.txt big.txt /a
rem ----------
rem hex editor
rem ----------
hxd big.txt
pause
After 4 runs back and forth there are no embedded zeros. Thanks for finding this. I was a slob and did not exhaustively test the faster output version because it was producing the same file length. The file IO is still the slowest part of the app. The tokeniser and sort algo are hard to time without a massive source file. I have tested it on 10 million words and it will handle bigger files with no problems.
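Since a matching file length did not catch the embedded zeros, a byte scan is a cheap extra check alongside the hex editor. What follows is only a rough sketch in Python and is not part of ssort. The function name first_zero_byte and the 1 MB chunk size are my own assumptions, picked so that a 1 GB file never has to sit in memory all at once.

```python
import sys


def first_zero_byte(path, chunk_size=1 << 20):
    """Return the offset of the first 0x00 byte in the file, or -1 if none.

    Hypothetical helper: reads the file in fixed-size chunks so very
    large files can be scanned without loading them whole.
    """
    offset = 0  # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1  # reached end of file without finding a zero
            pos = chunk.find(b"\x00")
            if pos != -1:
                return offset + pos
            offset += len(chunk)


if __name__ == "__main__":
    # Scan every file named on the command line.
    for name in sys.argv[1:]:
        found = first_zero_byte(name)
        if found == -1:
            print(name + ": no embedded zeros")
        else:
            print(name + ": first zero byte at offset " + str(found))
```

Saved under a name of your choosing, it would be run against the output files after the batch file finishes, for example with big.txt and result.txt as arguments. It only reports the first zero byte, which is enough to tell a clean file from a damaged one.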
LATER: Here is the output with 100 million words.
Sort ascending 100,000,000 words
big.txt loaded at 1000041101 bytes
100000000 lines tokenised
100000000 lines sorted
Writing result.txt to disk
That's all folks !
-------------------------------
Sort descending 100,000,000 words
result.txt loaded at 1000041101 bytes
100000000 lines tokenised
100000000 lines sorted
Writing next.txt to disk
That's all folks !
-------------------------------
Sort ascending 100,000,000 words
next.txt loaded at 1000041101 bytes
100000000 lines tokenised
100000000 lines sorted
Writing result.txt to disk
That's all folks !
-------------------------------
Sort descending 100,000,000 words
result.txt loaded at 1000041101 bytes
100000000 lines tokenised
100000000 lines sorted
Writing big.txt to disk
That's all folks !
-------------------------------
Sort descending 100,000,000 words
big.txt loaded at 1000041101 bytes
100000000 lines tokenised
100000000 lines sorted
Writing big.txt to disk
That's all folks !
Press any key to continue . . .