
Data Degradation

Started by Nyatta, October 13, 2015, 01:06:33 AM


Nyatta

Quick question: my algorithm will require data stored on the hard drive that remains perfectly intact, say 1 GB without any data degradation, otherwise the entire structure will topple over. Is this feasible? Say one bit loses its charge and throws everything off. I'm uncertain how relevant data degradation is and what I can expect from it.

hutch--

If your hard disk is sound, you should have no problems. If your application is genuinely high security in terms of content, then placing it in a RAID array may be safer, but multiple copies on different disks are also viable in security terms.

KeepingRealBusy

The hard drive only has a single two-way pipe: data can go onto the disk, or come off the disk. You have no control over what happens on the disk itself. Let us say that you write a sector of data. This sector is internally protected with a CRC (Hamming code) which not only allows the drive to detect whether the data is intact, it also allows any single-bit failure to be detected AND CORRECTED. What you see when you read the sector is the corrected data. The hard drive can also flag the sector in its internal directory so that, if the file is deleted, the space will not be made available for subsequent use, where a second failure (now a multiple-bit error) could not be corrected.

Memory chips can and do use the same method to protect data.
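
To make that concrete, here is a rough sketch in C of a Hamming(7,4) code, the textbook form of this kind of single-bit correction; real drives use much heavier ECC internally, so take it purely as an illustration of the principle:

#include <stdio.h>

/* Hypothetical illustration, not a drive's actual ECC: encode 4 data bits
   into a 7-bit codeword (bit i of the return value = code position i+1). */
static unsigned encode(unsigned d)
{
    unsigned d0 = (d >> 0) & 1, d1 = (d >> 1) & 1,
             d2 = (d >> 2) & 1, d3 = (d >> 3) & 1;
    unsigned p1 = d0 ^ d1 ^ d3;   /* parity over code positions 3,5,7 */
    unsigned p2 = d0 ^ d2 ^ d3;   /* parity over code positions 3,6,7 */
    unsigned p3 = d1 ^ d2 ^ d3;   /* parity over code positions 5,6,7 */
    /* layout: pos1=p1 pos2=p2 pos3=d0 pos4=p3 pos5=d1 pos6=d2 pos7=d3 */
    return (p1 << 0) | (p2 << 1) | (d0 << 2) | (p3 << 3) |
           (d1 << 4) | (d2 << 5) | (d3 << 6);
}

/* Recompute the parity checks; a non-zero syndrome is the 1-based position
   of the flipped bit, which we simply flip back. */
static unsigned correct(unsigned c)
{
    unsigned s1 = ((c >> 0) ^ (c >> 2) ^ (c >> 4) ^ (c >> 6)) & 1;
    unsigned s2 = ((c >> 1) ^ (c >> 2) ^ (c >> 5) ^ (c >> 6)) & 1;
    unsigned s3 = ((c >> 3) ^ (c >> 4) ^ (c >> 5) ^ (c >> 6)) & 1;
    unsigned syndrome = s1 | (s2 << 1) | (s3 << 2);
    if (syndrome)
        c ^= 1u << (syndrome - 1);
    return c;
}

int main(void)
{
    unsigned stored = encode(0xB);            /* write 4 data bits */
    unsigned bad    = stored ^ (1u << 4);     /* one stored bit loses its value */
    printf("stored %02X, corrupted %02X, corrected %02X\n",
           stored, bad, correct(bad));
    return 0;
}

Flip any single one of the seven stored bits and the syndrome points right at it, so the corrected value comes back; flip two and the code can no longer recover, which is why a second failure is the real danger.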

Now the problem, in your case, is to determine what "remains perfectly intact" means. Things get fuzzy when automatic error correction comes into play.

Please explain this requirement more completely.

Dave.

rrr314159

Quick answer: such data fails very rarely. As hutch says, use multiple RAID disks if it's worth the trouble. Operationally, roll your own CRC-type checks to ensure the copy is good. I think you mean the data needs to be intact for current use, right? Not as months and years go by. If the latter, of course, back it up religiously.

Bottom line: proceed with your project; this issue is not a show-stopper.
I am NaN ;)

Nyatta

Quote from: KeepingRealBusy on October 13, 2015, 01:26:05 AM
Please explain this requirement more completely.
I mean to say that I'll be loading data from the drive, and a single bit's value changing over time on an imperfect drive will ruin everything that comes after it. It's a hardware question, not a MASM32 question; I was uncertain which section to ask this in.

Quote from: hutch-- on October 13, 2015, 01:12:38 AM
If your hard disk is sound, you should have no problems.
Okay, I was just uncertain as to how capable modern drives are at holding on to data over long periods of time.

Quote from: rrr314159 on October 13, 2015, 04:48:33 AM
As hutch says, use multiple RAID disks if it's worth the trouble.
I'm hoping to be able to transfer files and have them used on a wide range of Windows systems; I am only now accounting for the possibility that data may change without my say-so.

Using a non-faulty, average drive, can I expect to lose bits from my file over the course of a year?

rrr314159

Quote from: Nyatta
Using a non-faulty, average drive, can I expect to lose bits from my file over the course of a year?

No. Consider how long the lib and inc files in the masm32 directory (to use one example) last; we don't expect them to drop bits here and there, even as years go by. However, I thought you were talking seconds; like, write the gig of data to disk, then access it during this run of the program. After a year you still don't expect trouble, but... if bad things will happen with corrupt data (like customers' accounts losing dollars, or weapons being launched inadvertently) you definitely want to use CRC-type checks to ensure data validity, and have another copy (or copies) to use instead. Make sure you shut down the program with an appropriate message before making mission-critical mistakes.
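
Something like this is all a roll-your-own check needs to be; a rough C sketch (the buffer contents and sizes are made up) of the write-time / load-time comparison:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bitwise CRC-32 (polynomial 0xEDB88320), small and table-free. */
static uint32_t crc32_buf(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1)));
    }
    return crc ^ 0xFFFFFFFFu;
}

int main(void)
{
    uint8_t buf[1024];
    memset(buf, 0x5A, sizeof buf);                 /* stand-in for the real gig of data */

    uint32_t saved = crc32_buf(buf, sizeof buf);   /* compute and store at write time */

    buf[512] ^= 0x01;                              /* simulate one bit going bad */

    if (crc32_buf(buf, sizeof buf) != saved)       /* recompute at load time */
        puts("data changed - switch to the backup copy, don't proceed");
    return 0;
}

If the recomputed value doesn't match the stored one, don't use the buffer; fall back to the other copy or shut down with that appropriate message.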

OTOH if it's just a video game or something, I wouldn't worry about it. Someday, when the whole thing is up and running, look at implementing such data-validity checks.

BTW, if data does get corrupted, it's far more likely to be due to human error. For instance, someone opened the file with a text editor (trying to reverse-engineer your prog, perhaps). So, if that's a concern, data validation would be a good idea.
I am NaN ;)

Nyatta

Quote from: rrr314159 on October 13, 2015, 05:57:04 AM
OTOH if it's just a video game or something, I wouldn't worry about it. Someday, when the whole thing is up and running, look at implementing such data-validity checks.
I'll eventually play around with game engines; however, I'm currently just playing with data compression and I want to know my limits. I'll certainly implement data-validity checks when the time comes, though.

Thanks for the information.

dedndave

what i suggest is to compress small blocks, say 4 KB or something (32 KB or 64 KB are nice sizes)
if a bit goes haywire, you lose only that 4 KB
still, a file missing 4 KB is a bad file - lol
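
something along these lines - a rough C sketch where the file name and the simple checksum are just placeholders, and each frame would really hold an independently compressed block:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 4096                        /* 4 KB frames; pick 32 KB or 64 KB if you like */

/* Trivial placeholder checksum; a real CRC works the same way per block. */
static uint32_t checksum(const uint8_t *p, uint32_t len)
{
    uint32_t sum = 0;
    while (len--) sum = sum * 31 + *p++;
    return sum;
}

/* Write the data as independent frames: [length][checksum][payload].
   A flipped bit only invalidates the frame it lands in. */
static void write_blocks(FILE *out, const uint8_t *data, size_t total)
{
    while (total) {
        uint32_t len = total > BLOCK_SIZE ? BLOCK_SIZE : (uint32_t)total;
        uint32_t sum = checksum(data, len);
        fwrite(&len, sizeof len, 1, out);
        fwrite(&sum, sizeof sum, 1, out);
        fwrite(data, 1, len, out);
        data  += len;
        total -= len;
    }
}

int main(void)
{
    uint8_t sample[10000];
    memset(sample, 0xAA, sizeof sample);       /* stand-in for the real data */
    FILE *out = fopen("blocks.dat", "wb");     /* illustrative output file */
    if (!out) return 1;
    write_blocks(out, sample, sizeof sample);
    fclose(out);
    return 0;
}

the loader checks each frame's checksum and only throws away the frames that fail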

as to data degradation
there is no such thing as a perfect storage medium
some type of data loss is inevitable
the only way to hedge is to back up your stuff - that's the user's responsibility, not yours

we zip files all the time
make a zip file of something visual, like a jpeg
alter 1 bit in the middle of the file someplace and see what happens
try unzipping it with windows extract and with 7-zip
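
a quick way to run that experiment - rough C sketch, the file name is just an example, use a copy you can afford to ruin:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("test.zip", "rb+");        /* a throwaway copy of the zip */
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long mid = ftell(f) / 2;                   /* pick a byte near the middle */
    fseek(f, mid, SEEK_SET);
    int b = fgetc(f);                          /* read it ... */
    fseek(f, mid, SEEK_SET);
    fputc(b ^ 0x01, f);                        /* ... and write it back with one bit flipped */
    fclose(f);
    return 0;
}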

Nyatta

Quote from: dedndave on October 13, 2015, 08:34:03 AM
we zip files all the time
make a zip file of something visual, like a jpeg
alter 1 bit in the middle of the file someplace and see what happens
try unzipping it with windows extract and with 7-zip
That is a great example of what I was hoping to sort out. I opened up a zip file and changed 1 bit with a hex editor, and ta-dah, corrupt file! I'm glad my structure makes the compression only as vulnerable as any other compression, and not extra so; this is great news. Yet another technological wonder that I need to wrap my head around...
I hope to have a working test within a month or so; from there it's further optimization and compression.

GoneFishing

No need for so much time for a simple CRC check ;)

jj2007

If your data is valuable (sources, private documents, your Ph.D. thesis), make regular backups, and keep them in separate places. Not 3 USB sticks in the same drawer (thieves will take all three, or they may burn down with the house), but rather one at home, one in the car, one at work. Use 6 sticks, and alternate between them: set A on Monday, Wednesday, Friday, set B on Tuesday, Thursday, Saturday. One bad scenario is to save bad data to a backup without knowing it...

Modern hard disks are not "lossy" per se. They may fail, ultimately. But if you have a bloated Windows OS in the gigabyte range, imagine what happens if, somewhere inside those gigabytes, one byte in Kernel32.dll decides to become push eax instead of pop eax (the opcodes, 50h and 58h, are only one bit apart)... the OS simply can't allow that to happen.