News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests

Main Menu

still not sure how to take care of data for my text editor

Started by gelatine1, July 02, 2014, 04:02:41 AM

Previous topic - Next topic

gelatine1

I am still not quite sure how to store the data for my text editor.

Before I simply had a big array with all characters stored in it. I realised that would be very bad in case someone inserts a character. I would have to reallocate all the data and that could be very time consuming for large files.

So that is why I decided to use something else. Now I have this big array but they are divided into parts of 128 characters for each line. And then I also had another array containing the amount of characters for the current line.
Inserting a character could be done quite easily by only reallocating the data in the current line.
But, once the user removes a line for example, the whole array would have to be reallocated.

That's why I thought of a third way to store my data. I still have this big array divided in sections of 128 characters. I still have the array that contains the amount of characters for each line.
I added a third array now. This array contains a pointer to all the lines. Once the user removes a line I could just remove the pointer and reallocate all those pointers. The data would remain in the big array but it wouldn't have to be reallocated since there is no pointer pointing to that part left.
The disadvantage of that though is that in the end I would have a lot of bits and pieces in the data that should be scrambled back together somehow to store it in a file for example.

I am not sure if there are any better ways to store the data ? Or should I use one of the methods I mentioned before ? Or some kind of combination ?

If I was not clear enough then please ask me.

Thanks in advance,
Jannes


dedndave

i am trying to think of a really slick idea for this...

consider that a single line might be very large (no CR/LF)
if it's displayed with line-wrapping, it might be a paragraph that adjusts to the window width
but, in the file, it is one big line

so, maybe, there is one format for the file data
and another format for the displayed data - just thinking, here   :P

dedndave

so - you might make a rather large buffer for the file text
you could even use file-mapping for that   :biggrin:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa366556%28v=vs.85%29.aspx

but, fortunately, only so much text can be displayed at any given time
so - you create an array of pointers into the mapped file text
it wouldn't have to be a very large array
i would think it could be in the .DATA? section, rather than allocated memory
when they scroll or turn off/on line-wrap, you re-organize the array of pointers instead of moving all that data around

qWord

You can use (at least) a linked list rather than (string) pointer arrays. This would speed up insertion an deletion of lines enormously but reduce speed for random access. More complicated, but probably efficient for large files, is to use some kind of tree structure...
MREAL macros - when you need floating point arithmetic while assembling!

gelatine1

Dave,
I assume you meant I should create a file view of the current data to be displayed and that I should just edit that? But how exactly does this solve any inserting or deletion reallocation?

qWord,
How would you maintain the string's order in the tree ?

dedndave

i don't know - i'm just spit-balling ideas   :biggrin:

you would probably create a temporary file, anyways, and work on that
if the size of a block changes (which is very likely), you could swap between 2 temporary files
the exact logistics are something that you'd have to work out
but - a good design starts with ideas   :P

of course, we could google around or look at other source code to see how others do it
but - it's fun to come up with you're own method first, then look at others for comparison

gelatine1

Quote from: dedndave on July 02, 2014, 05:50:55 AM
of course, we could google around or look at other source code to see how others do it
but - it's fun to come up with you're own method first, then look at others for comparison

I totally agree on that :D

Quote from: dedndave on July 02, 2014, 05:50:55 AM
if the size of a block changes (which is very likely), you could swap between 2 temporary files

Well that would be linear in file size and that's not really what we want huh? :p

jj2007

Quote from: qWord on July 02, 2014, 04:50:31 AMYou can use (at least) a linked list rather than (string) pointer arrays. This would speed up insertion an deletion of lines enormously but reduce speed for random access. More complicated, but probably efficient for large files, is to use some kind of tree structure...

Inserting a string into an array is a fairly trivial act: 2803 ms for 14172 inserts = 0.198 ms/insert (testbed attached)
0.2 milliseconds for inserting a line into an array of 113,000 lines that holds the full content of \Masm32\include - will anybody notice that delay??

Jannes, what is your reason for not using a RichEdit control (or a normal edit control), as everybody else does? You want to learn it the hard way?


gelatine1

Quote from: jj2007 on July 02, 2014, 06:03:08 AM
Inserting a string into an array is a fairly trivial act: 2803 ms for 14172 inserts = 0.198 ms/insert (testbed attached)
0.2 milliseconds for inserting a line into an array of 113,000 lines that holds the full content of \Masm32\include - will anybody notice that delay??

Oh well, I didn't realise it was so fast... But anyway I just like to have things to be optimized. I got a training for the IOI last year (unfornately I'm not selected) and now I simply have to optimise everything I get now :)

Quote from: jj2007 on July 02, 2014, 06:03:08 AM
Jannes, what is your reason for not using a RichEdit control (or a normal edit control), as everybody else does? You want to learn it the hard way?

What's so special, useful about this ? How could it be used ? Or just what is it ?
I think the main reason that I wouldn't use something like that is education :) I'm here to learn not really to actually create something (although it would be nice if I did) and I believe I learn more by avoiding things like that.

jj2007

Quote from: gelatine1 on July 02, 2014, 06:43:03 AMOr just what is it ?

A RichEdit control is like a normal edit control but with many more features (MSDN). For example, qEditor and RichMasm use it, also the attached TinyIDE. Doing all that by hand is Sisyphus' work...

Tedd

Allocate one array for line pointers.
Allocate another character array for each single line -- whose pointers go in the lines-array.

The arrays for each line will necessarily be different sizes - this is fine - but you may as well round them up to 16, so a few random insertions don't require a full reallocation. Larger modifications may require new allocations, but only for the given line - just replace its line pointer and it's reinserted.

The line-pointers array can easily start with space for a few 1000 line pointers. And grow if necessary - just copy the pointers to the new one and off you go.
Insertion/removal of entire lines is simply a matter of shifting up/down the line pointers after that point.
You also get line line-numbering for free, since you know which line pointer you're currently using. And the total number of lines.

Saving the text to file should be pretty obvious; and you can choose your own line-ending.
Potato2

gelatine1

Yes Tedd, that was my third idea I think I am just going to use that :)

qWord

Quote from: gelatine1 on July 02, 2014, 05:31:20 AMHow would you maintain the string's order in the tree ?
By definition? My thought was very similar to a B+ tree. Because the line numbers change, they can't used as key. Instead the quantity of lines that each node represent is used as key. Only the root element saves a line number. For the leafs I would use not-too-large pointer arrays.

However, as said, its complicated and you better keep it simple and just use a plain pointer array.
MREAL macros - when you need floating point arithmetic while assembling!

qWord

Quote from: jj2007 on July 02, 2014, 06:03:08 AMwill anybody notice that delay??
Interesting statement from a known clock cycle hunter  :biggrin:
MREAL macros - when you need floating point arithmetic while assembling!

gelatine1

Quote from: qWord on July 03, 2014, 12:00:05 AM
Quote from: gelatine1 on July 02, 2014, 05:31:20 AMHow would you maintain the string's order in the tree ?
By definition? My thought was very similar to a B+ tree. Because the line numbers change, they can't used as key. Instead the quantity of lines that each node represent is used as key. Only the root element saves a line number. For the leafs I would use not-too-large pointer arrays.

However, as said, its complicated and you better keep it simple and just use a plain pointer array.

The only tree I knew was this one (http://en.wikipedia.org/wiki/Tree_(data_structure)) But Ill take a closer look in that B+ tree :) thanks!