I need a program called "FindLine"

Started by learn64bit, November 29, 2022, 04:41:33 AM

learn64bit

Source
a.txt
b.txt
Result
c.txt

e.g.
a.txt file bytes
  00h
   1-byte file, with no line separator "0Dh 0Ah"
b.txt file bytes
  0Dh 0Ah 0Dh 0Ah 00h 0Dh 0Ah 0Ah
   8-byte file
   It has 3 empty lines
    first empty line
     before line separator, and not after line separator
    second empty line
     before line separator, and after line separator
    third empty line
     not before line separator, and after line separator
   It has 2 non-empty lines
    first
     00h
    second
     0Ah
c.txt file bytes should be
  00h

Why "FindTheCommonLines"? Because the line separator here is "0Dh 0Ah".
We don't care about empty lines, so the program just skips them.
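
To make the example above concrete, here is a minimal sketch in Python (not assembly, and not anybody's actual FindLine program). It assumes lines are separated only by the two-byte sequence 0Dh 0Ah, that empty lines are skipped, and that c.txt should list each common line once, in the order it appears in a.txt. The last two points are assumptions the post does not state. The function names are made up for illustration.

```python
CRLF = b"\r\n"  # the 2-byte line separator 0Dh 0Ah


def non_empty_lines(data: bytes) -> list[bytes]:
    """Split on 0Dh 0Ah and drop empty lines."""
    return [line for line in data.split(CRLF) if line]


def find_the_common_lines(a: bytes, b: bytes) -> list[bytes]:
    """Non-empty lines of a that also occur in b.

    Assumption: keep a's order and report each line only once.
    """
    in_b = set(non_empty_lines(b))
    seen = set()
    common = []
    for line in non_empty_lines(a):
        if line in in_b and line not in seen:
            seen.add(line)
            common.append(line)
    return common


# The bytes from the example: a.txt is 00h, b.txt is the 8-byte file.
a = bytes([0x00])
b = bytes([0x0D, 0x0A, 0x0D, 0x0A, 0x00, 0x0D, 0x0A, 0x0A])
c = CRLF.join(find_the_common_lines(a, b))  # content for c.txt
```

A lone 0Ah is not a separator under this rule, which is why the last byte of b.txt counts as a line of its own.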

I wonder if this program already exists?

HSE

Quote from: learn64bit on November 29, 2022, 04:41:33 AM
I wonder if this program already exists?

Looks like the program is already very clear in your mind. Just build it  :thumbsup:
Equations in Assembly: SmplMath

jj2007

Please do yourself a favour: write what you need in your native language, then let DeepL translate it. Plus, zip a.txt and b.txt and post them here. Maybe somebody will understand then what you want :rolleyes:

NoCforMe

Wait, wait: before taking JJ's advice (which is good advice*), let me ask: is what you're really looking for here something that could be called "Find common lines"? Because my reading of your description (what I can figure out of it, anyhow) is that you want a count of lines that are identical between a.txt and b.txt, not including blank lines. Is that right?

* I have to second JJ's kind of annoyed suggestion, because it annoys me too: people who post stuff here in English that's impossible to understand. Before you accuse me of being an Anglocentric snob or something, let me point out two things: one, that this happens to be an English-language forum, and two, that if I were to post such incomprehensible gibberish in your native language, you'd be annoyed too.
Assembly language programming should be fun. That's why I do it.

hutch--

 :biggrin:

If I had a buck$ for every incomprehensible question ever posted, I would own 2 Cadillacs, a house on the Riviera and at least 2 Lear jets.

Perhaps what our friend needs is a little more comprehension and some better explanation of what he is after.

learn64bit

I want c.txt. c.txt is a file.

FindLine is the program's name
FindTheCommonLines is the main function's name

"common" is a CN English word, as in common (ground) wire versus live wire

Keep the questions coming, I will answer them.

NoCforMe

But did I describe accurately what you want the program to do? That wasn't clear in your original post.

As I said, it sounds to me like you want the output file to contain all lines that are common to both input files, excluding blank lines. Is that so?

learn64bit

Yes.

My test a.txt is 242,206 KB; b.txt is 923,293 KB.

(Maybe I will need other options, like checking only the first 40 bytes of each line [sha1sum] or only the first 64 bytes [sha256sum]. Ideas are welcome.)
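
A rough sketch of that prefix option, again in Python and only as an illustration: it assumes each line starts with a hex digest the way sha1sum output does (40 hex characters, or 64 for sha256sum) and treats two lines as matching when their first `prefix_len` bytes are equal. `common_by_prefix` and the sample lines are made up.

```python
def common_by_prefix(a_lines, b_lines, prefix_len=40):
    """Lines of a whose first prefix_len bytes start some line of b.

    Lines shorter than prefix_len are compared whole; empty lines are skipped.
    """
    keys_b = {line[:prefix_len] for line in b_lines if line}
    return [line for line in a_lines if line and line[:prefix_len] in keys_b]


# Made-up sample lines shaped like sha1sum output: 40 characters, two spaces, a name.
a_lines = [b"a" * 40 + b"  one.bin", b"b" * 40 + b"  two.bin"]
b_lines = [b"a" * 40 + b"  renamed.bin", b"c" * 40 + b"  three.bin"]

matches = common_by_prefix(a_lines, b_lines)                 # 40-byte keys (sha1sum)
none_at_64 = common_by_prefix(a_lines, b_lines, prefix_len=64)  # 64-byte keys (sha256sum)
```

With 40-byte keys the first line matches even though the file name differs; with 64-byte keys these sample lines no longer match, because the key then reaches into the name.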

hutch--

Now the next question is, are you comparing only lines or is there multi-line text ?

learn64bit

a.txt and b.txt contain 00h bytes, 0Dh bytes, FFh bytes, etc. If that is OK, they are text files.
The biggest line is under 4 MB.

NoCforMe

An (ASCII) text file normally doesn't have any zero or FF bytes in it. What type of text file are you using here exactly?

The normal end-of-line characters would be either carriage return (0D) or carriage-return/line feed (0D/0A).

A 4 megabyte line? really? That's huge!

Sounds weird to me.

learn64bit

Yes, weird to me too

4 MB is fine; the WinAPI functions below will do

invoke HeapAlloc,eax,0,4*1024*1024  ; eax = heap handle (e.g. from GetProcessHeap), 0 = no flags
HeapSize

LocalAlloc
LocalSize

GlobalAlloc
GlobalSize

NoCforMe

Yes, we know you can easily allocate a 4MB buffer, no problemo. But a 4MB line???? You mean one line of text? or are you talking about the size of the entire file?

You gotta be a lot more clear about what you're trying to do here.

learn64bit

Yes, the biggest line is almost 4 MB in size. The biggest file is almost 1 GB.
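
With files that size, one possible approach is to read in fixed-size chunks and carry the unfinished line over to the next chunk, so only the current line plus one chunk sits in memory. The Python sketch below is an assumption about how that could look, not a tested FindLine implementation; `iter_crlf_lines` and the 1 MB default chunk size are made up for illustration.

```python
import io


def iter_crlf_lines(stream, chunk_size=1 << 20):
    """Yield non-empty lines separated by 0Dh 0Ah, reading chunk_size bytes at a time."""
    pending = b""  # bytes after the last separator seen so far
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last separator is complete; the tail may continue.
        *complete, pending = pending.split(b"\r\n")
        for line in complete:
            if line:
                yield line
    if pending:  # final line without a trailing separator
        yield pending


# The 8-byte b.txt from the first post, read one byte at a time to
# exercise a separator that straddles two chunks.
demo = io.BytesIO(bytes([0x0D, 0x0A, 0x0D, 0x0A, 0x00, 0x0D, 0x0A, 0x0A]))
lines = list(iter_crlf_lines(demo, chunk_size=1))
```

For real files the stream would come from something like `open("a.txt", "rb")`. A separator split across two chunks is still found, because the trailing 0Dh stays in `pending` until the 0Ah arrives.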

NoCforMe

Sounds like records in a database, not any kind of text file. How could you have a 4MB line of text?