Author Topic: moving thread from pb forums to here because of their server problems

jj2007

  • Member
  • *****
  • Posts: 13871
  • Assembly is fun ;-)
    • MasmBasic
Re: moving thread from pb forums to here because of their server problems
« Reply #30 on: August 30, 2015, 11:18:34 PM »
New version attached, with over a dozen Replace() statements. Open MS Excel and drag *.txt over the exe.

Occasionally, you may see #NAME? in Excel. This happens because Excel treats a hyphen or plus sign at the beginning of a tab-delimited field as the start of a number or formula rather than as text.
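
One possible way to sidestep this, shown as a minimal Python sketch rather than what the attached exe does: drop the leading hyphen or plus sign from any tab-delimited field that is not a genuine number. The function names are hypothetical, and whether losing the leading marker is acceptable for these reports is an assumption.

```python
def is_number(text):
    """Return True if text parses as a plain number such as -12.5 or +3."""
    try:
        float(text.replace(",", ""))  # tolerate thousands separators
        return True
    except ValueError:
        return False


def defuse_field(field):
    """Drop a leading '-' or '+' from a field that is not a genuine number."""
    stripped = field.lstrip()
    if stripped[:1] in ("-", "+") and not is_number(stripped):
        # Assumption: the marker is a bullet or dash, not part of the data.
        return stripped[1:].lstrip()
    return field


def defuse_line(line):
    """Apply defuse_field to every tab-delimited field of one line."""
    return "\t".join(defuse_field(f) for f in line.split("\t"))


print(defuse_line("- Revenue\t-12.5\t+ Costs\tTotal"))
```

Genuine negative numbers such as -12.5 are left untouched; only text fields that merely start with a sign are changed.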

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10572
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
« Reply #31 on: August 31, 2015, 12:24:04 AM »
I gather that in the longer term you want to batch process a large number of files that may use at least slightly different notation, so I wonder if it's worth doing a sequence of INSTR searches on each file to see whether it has a known header or footer?

The Line Input code is an old-timer that performs OK, but there is probably a faster way to do it. I envisaged something like a linear search with INSTR to find the leading and trailing strings for each page, then grabbing each page with MID$. Alternatively, if it's only particular pages you require, you can scan the text for the page-number notation, or if you need multiple pages, create an array of page offsets so you can index your way through them.
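
The page-offset idea could look roughly like the following Python sketch, where `str.find` plays the role of INSTR and slicing plays the role of MID$. The marker string "Page " and the function names are assumptions for illustration; real reports will need their own page notation.

```python
def find_page_offsets(text, marker):
    """Return the start offset of every occurrence of marker in text."""
    if not marker:
        raise ValueError("marker must not be empty")
    offsets = []
    pos = text.find(marker)               # like INSTR(text, marker)
    while pos != -1:
        offsets.append(pos)
        pos = text.find(marker, pos + len(marker))
    return offsets


def get_page(text, offsets, index):
    """Return the text from one page marker up to the next (like MID$)."""
    start = offsets[index]
    end = offsets[index + 1] if index + 1 < len(offsets) else len(text)
    return text[start:end]


# Hypothetical sample: the whole file read into one string.
sample = "Page 1\nalpha\nPage 2\nbeta\nPage 3\ngamma\n"
offsets = find_page_offsets(sample, "Page ")
print(offsets)
print(get_page(sample, offsets, 1))
```

Reading the file into one string and scanning it once avoids the line-by-line loop, and the offsets list lets any page be fetched directly by index.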
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

bobl

  • Member
  • **
  • Posts: 72
« Reply #32 on: August 31, 2015, 08:15:41 PM »
JJ
Thanks for your extra work. It's very much appreciated.
Hutch.
That sounds like very good advice and thank you for it.
I've only got the reports for a handful of companies at the moment, but even these few confirm that I'm up against quite a bit of non-uniformity.

hutch--

« Reply #33 on: August 31, 2015, 08:27:10 PM »
OK, I guess the trick is to make a lookup list of easily identifiable keywords or phrases that can identify a particular file layout from a given company. Now if you have multiple similar phrases, you could stack the order to try the longer ones first then the shorter ones after it, since a shorter phrase would otherwise also match files that contain the longer one.
Code:
1. Annual Report and Accounts
2. Annual Report
etc ....
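
As a rough Python sketch of that lookup list (the phrase list and function name are illustrative assumptions, not taken from any posted code), sorting by length before searching guarantees the longer phrase is tried first:

```python
# Hypothetical lookup list; the order here does not matter because it is sorted below.
LAYOUT_PHRASES = ["Annual Report", "Annual Report and Accounts"]


def identify_layout(text, phrases):
    """Return the longest known phrase found in text, or None if none match."""
    for phrase in sorted(phrases, key=len, reverse=True):
        if phrase in text:                # equivalent to INSTR(text, phrase) > 0
            return phrase
    return None


print(identify_layout("XYZ plc Annual Report and Accounts 2014", LAYOUT_PHRASES))
print(identify_layout("XYZ plc Annual Report 2014", LAYOUT_PHRASES))
```

Without the sort, "Annual Report" would match both sample headers and the two layouts could not be told apart.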

bobl

« Reply #34 on: August 31, 2015, 10:03:46 PM »
>you could stack the order to try the longer ones first then the shorter ones after it
That's a very good point.
Yes, I'll do that, and once again thanks for the advice.