News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests

Main Menu

Small database application with source and test DB generator.

Started by hutch--, December 21, 2014, 12:35:48 AM

Previous topic - Next topic

hutch--

The attached application is a "Cardfile" style small database written as a test piece for a database file that is stored as plain text with each record being written to a single line of text. The purpose of the test was to produce a data storage technique that has very high data density which in turn reduces the memory demand and increases the number of records that can be stored in an "in memory" database. This is contrasted against a fixed record database which generally contains a reasonable amount of wasted space. As far as capacity, the app will handle a 1 million record databases OK, it will load 5 million but the PB sort runs out of memory.

The advantage of the technique is it can be loaded directly into memory then parsed on a line by line basis into a dynamic string array so that record access is based on an array index and this makes the load and save reasonably fast. The logic was that parsing a single line into its component parts both in and out of the display made using a single line of text viable. Navigation is an area that I consider the app has shortcomings, if it was an app that I was going to use for my own purpose it would have a listview control in report mode as I am used to wading through massive lists to find things but in terms of general purpose software, it would be classed as bad design that would scare most users away.

It has forward and reverse, select by record number, 2 stage name search, family name with the option of given name, first and last and an accelerating fast forward and reverse which work OK but it really does need a list of some sort and this would require an application redesign. New records are added to the end of the database and to keep the data in sorted order, there is a menu option to sort, save and reload the database. With the name search, if you search for "smith" you will get the first entry for that name, if you search for "smith" with the optional given name IE: "Bill" you increase the accuracy of the search so that if there is no "Bill Smith" in the database, the search will tell you that the record cannot be found. The search is case insensitive.

The source code has the option of using US American states where the default is that states in Australia. It is done with an equate at the start of the main source file. To build the app you will need to edit the "makepb.bat" file and change the paths to your own. The header file are Jose Roca's so you will need to have Jose's includes to build the app. The code design is primarily Windows API code and it has used PowerBASIC's dynamic string arrays and the intrinsic parsing functions built into PB which are well suited for working with dynamic string arrays. There are 2 assembler functions, one to format text to clean up any user mess ups, the other is a case insensitive string compare to add some pace to the array scan. A single function is simply faster that 2 case conversions and a string compare.

There is an attached directory that contains an app to make test database files of different sizes that are driven by a set of batch files for different database record counts. The small ones are reasonable in time waiting but the 1 million record test db is very slow and its "go have a cigarette and make a cup of coffee" before it finishes. It is slow to make sure the random algo is seeded regularly to avoid duplicates.

i would like to thank all of the PB guys who have made comments on what is expected in a modern database as I have not written a database since the middle 1990s. I looked at MSACCESS and thought "YUKKKK" and have not bothered since.

Gunther

Hi Hutch,

Quote from: hutch-- on December 21, 2014, 12:35:48 AM
i would like to thank all of the PB guys who have made comments on what is expected in a modern database as I have not written a database since the middle 1990s. I looked at MSACCESS and thought "YUKKKK" and have not bothered since.

is your data base relational? Could it be normalized?

Gunther
You have to know the facts before you can distort them.

hutch--

Gunther,

It probably does not qualify on either count, it is a database of a fixed number of fields stored in a plain text file with each entry being on a single line. Each database record is parsed both in and out of the display. The design consideration was to have a high level of data density so that more records could be fitted into the same amount of memory without the wasted space of fixed length records. It is effectively variable length fields divided by a separator so that you don't need extra allocated space to store records of variable length.

It means that constructing queries on different fields requires more parsing of a full record scan but the gain in memory efficiency makes up for the extra work. The sort criteria is based of the first field and it would take a specialised sort to do it any other way.

jj2007

Quote from: hutch-- on December 21, 2014, 09:06:54 AMThe sort criteria is based of the first field and it would take a specialised sort to do it any other way.

Try the attached table viewer:
- drag a db*.txt file over ConvertDb2ExcelTab.exe (replaces 7fh with tab, making it compatible to Excel)
- hit Enter when the proggie invites you to do so
- click on "Name", then on "City" etc to sort the records.

Result will look as shown below. It uses a stable sort so that you can sort by several columns. The sort may take a while, on my i5 the one Million records database needs almost one second (but try the same in Excel :bgrin:).

Ctrl F launches a rudimentary FIND function - the match will be in the second line from the top.

hutch--

Seems to work OK, one of the next tasks is to create a listview control in report mode but I only want the sir name and given names for navigation purposes. I may have a play with the sort, save and reload to get the capacity up, it was running out of memory when the list count was over 5 million.

I don't like the include file I have available for the listview control so I have been working on another one, got all of the structures from MSDN and pulled the messages out of the masm32 include file. A listview control will be useful for doing queries if I can design enough filters to make it useful.

Gunther

You have to know the facts before you can distort them.