walkitout (walkitout) wrote,

Boringly More About File Formats, Character Encoding and Physical Media

I'm running up against a persistent idea that file formats that involve computers are or will be longer lasting than physical formats. So I'm going to think about that for a while (in my written voice) and try to figure out why that idea is kicking around and whether it is plausible.

I suppose I should start with Hollerith cards. While no file format before 1980 can make any claim to be a mass consumer format, and it would be at least another decade before a file format could make a not-immediately-laughable claim to be a mass consumer format, we have had computer file formats around for a long while. Herman Hollerith devised an electric tabulator in 1889 that was used to tabulate the 1890 census (and really, thinking about this makes the genealogist in me start crying. I am Not Joking. I am pretty callous when it comes to data loss, but the loss of the 1890 census was a tragedy.). His company would go on to become (part of) IBM, and punch cards would continue into ... if not right now, then damn close. Wikipedia seems to think some voting machines still use punch cards.

But "punch cards" are like "magnetic tape". That's a physical medium that a file format is saved onto, and it's important for my purposes to distinguish between them. For consumer mass media like LPs, CDs, DVDs, etc., you can merge the two, because the mass consumption format doesn't have two flavors of vinyl with incompatible formats stored atop (Do Not Start With Me. I don't want to hear about it. _Mass consumption_ is an important qualifier), for example. But there were a lot of different formats on punch cards, magnetic tape, etc.

There was one Rabidly Successful physical format "punch card", 80 columns wide: the IBM punch card, which lasted about 50 years starting in 1928. However, the character format used did not stay stable over the 50 years of the active life of the card (I'm assuming the punch card died as an active format sometime in the early 1980s with the arrival of the microcomputer; obvs, it continued as an increasingly niche product, just like LPs at Urban Outfitters, etc.).

EBCDIC, for example, got its start in 1963, on punch cards, for the 360. And people _still use EBCDIC_, but not typically on punch cards, altho often on modern hardware still running within the constraints of the 360 system. EBCDIC is an overlapping 50 year format: its physical media has changed but the character encoding is roughly constant. (ASCII, similar, natch.)
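To make the medium-versus-encoding split a little more concrete, here's a tiny Python sketch. It is my illustration, not anything official: I'm assuming Python's built-in `cp037` codec (one of many EBCDIC code pages) is a fair stand-in for "EBCDIC", and the sample text and the `encode_both` helper are made up for the example. The same characters turn into different bytes under each encoding, and nothing in the code knows or cares whether those bytes end up on a card, a tape or a disk.

```python
# Same text, two character encodings; neither depends on the physical medium.
# Assumption: cp037 (EBCDIC US/Canada) stands in for "EBCDIC" here;
# real mainframe shops use a variety of EBCDIC code pages.
TEXT = "HELLO 1890"  # hypothetical sample text


def encode_both(text):
    """Return (EBCDIC cp037 bytes, ASCII bytes) for the same text."""
    return text.encode("cp037"), text.encode("ascii")


ebcdic_bytes, ascii_bytes = encode_both(TEXT)

# Show the raw bytes in hex: same characters, different byte values.
print("EBCDIC:", ebcdic_bytes.hex(" "))
print("ASCII: ", ascii_bytes.hex(" "))

# Decoding each byte string with its own codec recovers the same text.
print(ebcdic_bytes.decode("cp037") == ascii_bytes.decode("ascii"))
```

Read the EBCDIC bytes back with the ASCII codec (or vice versa) and you get garbage or an error, which is the whole migration problem in miniature.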

At this point in the analysis, I resort to, Really? Why? Why 360? Why VMS (altho I think that one is finally dead)? Why, Why, Why in 2013 Fortran? Really, Why? (Insert existential angsty scream here.) That's simple: the people and organizations who most desperately needed something better/other than paper in the 1950s adopted these systems as soon as they became available, and it's been really fucking hard to migrate off of them. The IRS. The FBI. Insurance Companies. Etc.

If you want to understand the lifetimes of formats (whether physical, character or file), you need to give some thought to the people and organizations who use them. If a group of people and organizations refuses to migrate off a format, It Will Continue. When the people who use a format migrate to another, It Will End.

It only _looks_ like a technical problem. It's not. And that's why I generally don't really care when a format dies. I am neither an archivist nor a collector; I'm an active user and I bring along with me the stuff I care about and leave behind what I am done with. Libraries, and other individuals and organizations, have a different set of priorities that can be much more difficult to reconcile to lived reality.

ETA: As computer file formats increasingly reflect the desires of the mass consumption market, they will quit acting like the seemingly persistent and somewhat stable formats of whatever the computer version of yesteryear is. As the mass market demands AND GETS the ability to migrate from one format to another (witness iTunes Match, but before that, we insisted on products that let us record stuff off the radio and off of TV and so forth), we'll see format lifetimes evolve in concert.