Difference: FightEntropy (23 vs. 24)

Revision 24 - 16 Feb 2013 - TobyCabot

Line: 1 to 1
Deleted:
<
<

Introduction

  Information that's stored in computers "rusts" at least as fast as information stored in real-world media. I'm reminded of this every time I walk through the cemetery down the street from my house: many headstones carved in the 17th century still convey the information that they did when they were new. On the other hand, when my Mom "upgraded" to Windows 95 a few years back she found that all of the documents she had written in her low-end Microsoft word processor were completely illegible in the new version of the high-end office suite, even though only 5 years had passed. She was lucky to be computer-naive - she lost nothing, since she had printed all of the documents on paper. If she had used the computer the way that the manufacturers wanted her to, she would have either lost the documents or I would have spent a lot of time scraping the data out of them.

What can you do? To start with, be aware of information entropy, and decide whether or not you care about it. If you don't care then you're wasting your time reading this document; if you do then I hope that you'll learn something from it.

Changed:
<
<

Discussion

In my (admittedly brief) experience with computers, data becomes inaccessible for two reasons: the physical media that it's stored on becomes unreadable (e.g. http://www.informationweek.com/story/IWK20010719S0003 ), or the format that the data is stored in becomes indecipherable. While both reasons produce the same result, they are different in terms of the actions that you need to take to prevent them.

>
>
In my experience with computers, data becomes inaccessible for two reasons: the physical media that it's stored on becomes unreadable (e.g. http://www.informationweek.com/story/IWK20010719S0003 ), or the format that the data is stored in becomes indecipherable. While both reasons produce the same result, they are different in terms of the actions that you need to take to prevent them.
  Media I've used:
  • cassette tape
Line: 21 to 18
 
  • CDROM
  • DVD-ROM
  • USB Thumb drive
Added:
>
>
  • CompactFlash
  • SecureDigital cards
  Media I haven't used, but know about:
  • Hollerith punchcards
Line: 30 to 29
 
  • Zip drives (100MB, 250MB)
  • Jaz drives
Changed:
<
<
Media can become unreadable if the media itself fails (magnets, scratches) or if the reader breaks and can't be fixed. This happens frequently because the media and the machines that read it are physical devices and therefore age and degrade over time. A failure in either the media or the reader is enough to render the data lost forever, but if one floppy fails you lose only the data on that floppy but you lose it permanently. If the floppy reader fails then you lose access to all of the data on all of that type of media, but you can potentially get it back by finding someone else that has that type of reader and borrowing it from her.
>
>
Media can become unreadable if the media itself fails (magnets, scratches, click of death) or if the reader breaks and can't be fixed. The media and the machines that read it are physical devices, so they age and degrade over time. A failure in either the media or the reader is enough to lose the data forever: if one floppy fails you lose only the data on that floppy, but if your floppy reader fails then you lose access to all of the data on all of that type of media. You can potentially get it back by finding someone else who has that type of reader and borrowing it from her, but you don't want to count on that.
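One way to notice that media has started to fail, while you still have a good copy somewhere else, is to record a checksum of every file and re-check it from time to time. The sketch below is only an illustration of that idea, not something from the original text: the function names (sha256_of, build_manifest, changed_files) are made up for this example, and it assumes Python 3.9 or later with only the standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents no longer match."""
    bad = []
    for name, expected in manifest.items():
        p = root / name
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(name)
    return bad
```

A checksum can only tell you that a file has changed or disappeared; it can't repair anything, so it is useful only alongside a second copy on different media.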
 
Changed:
<
<
The contents of the media can become unreadable, even if the media is readable, if there is no software that can decipher the format that the data was stored in. You can think of storing data in terms of encrypting it using an encryption algorithm. Some encryption algorithms are stronger (i.e. harder to decipher) than others, and some algorithms are more widely understood than others. The software that you use to read and write the data is the encryption key. If you understand the "encryption algorithm" that you're using to write the data then you will not have to worry about deciphering it, but if you don't understand the algorithm then you are dependent on that software to read it for you. My Mom was dependent on software that could read data encrypted in a certain Microsoft algorithm that neither she nor I understood, and over time Microsoft themselves "lost the key" to that algorithm so her data was permanently locked up and Mom didn't have the key.
>
>
Data can become unreadable, even if the media is readable, if there is no software that can decipher the format that the data was stored in. Think of storing data in terms of encrypting it using an encryption algorithm. Some encryption algorithms are stronger (i.e. harder to decipher) than others, and some algorithms are more widely understood than others. The software that you use to read and write the data is the encryption key. If you understand the "encryption algorithm" that you're using to write the data then you will not have to worry about deciphering it, but if you don't understand the algorithm then you are dependent on that software to read it for you. My Mom was dependent on software that could read data encrypted in a certain Microsoft algorithm that neither she nor I understood, and over time Microsoft themselves "lost the key" to that algorithm so her data was permanently locked up and Mom didn't have the key.
  This is the most important reason (among many) why you should never store any data in a format that's not well documented. If the only person who understands the format is the person or company that produced it, they can decide at any time that they don't want to support it anymore and you have very little recourse. You could try to figure out how the format works by looking at your documents, but the process (known as "reverse-engineering") is time-consuming and boring and may be illegal in some cases. Note that the key distinction is not whether the format is proprietary or non-proprietary; it's whether the designer of the format has provided enough documentation of it that other people can read and write it. Some proprietary formats, for example Adobe Portable Document Format, are very well documented so many tools can read and write them.
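As a small illustration of what "well documented" buys you, here is a sketch that saves some notes as UTF-8 encoded JSON, a format with a public specification (ECMA-404, RFC 8259) that many independent tools can read and write. The function names (export_notes, import_notes) and the shape of the notes are assumptions made up for this example; nothing here comes from the original text.

```python
import json
from pathlib import Path


def export_notes(notes: list[dict], path: Path) -> None:
    """Write notes as indented, human-readable UTF-8 JSON."""
    text = json.dumps(notes, indent=2, ensure_ascii=False, sort_keys=True)
    path.write_text(text + "\n", encoding="utf-8")


def import_notes(path: Path) -> list[dict]:
    """Read the notes back using nothing but a standard JSON parser."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the file is plain text in a documented format, it can still be opened in any text editor or parsed by a different program if this particular script is long gone.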
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.