Organizing Computer Resources

Or, How I Learned to Stop Worrying and Love the DDC

If you are like me, you love order but drift towards chaos. Ever since I have owned a computer I have struggled with the problem of organizing my computer materials: principally the files on my hard discs, but also the floppies, the manuals, the folders of print-outs, and the documentation that computing generates. I went through four main periods in this quest for order.
  1. Application-oriented organization. In the very beginning, I made the embarrassing mistake of classifying documents according to the application that made them: all the SuperPaint documents in one folder, all the WriteNow documents in another, all the Excel spreadsheets in a third. This application-oriented system has nothing to recommend it, because the application that generates a document says little about that document's essence: a SuperPaint picture of a Blue Jay and a WriteNow description of a Blue Jay are more closely related than, say, SuperPaint pictures of a Blue Jay and a cucumber. I quickly abandoned this approach.

  2. Ad hoc project-oriented organization. My second scheme was an ad hoc hierarchical classification that attempted to group materials by related projects: birding materials in one folder, cookbooks in another, addresses in another. This seems to be the system most people use in practice. While certainly better than the application-oriented scheme, it leaves much to be desired. For one thing, it merely displaces the organizational problem from documents to projects. It imposes on the user the constant burden of making classification decisions with no guidance as to how they are to be made. Should cookbooks and diets be lumped in a "food" project? Do "figures of speech" go under linguistics or rhetoric? The same problem arises with applications: should word processors and layout programs be grouped together? Should SCSI drivers go with scanners in a peripherals category, or should they be in an operating systems category, or in a group by themselves? And so on, and so on.

    A second problem with the ad hoc system is that it does not lend itself to the linearization needed for organizing floppy disks and hard-copy folders. "Biology:birds:bird lists:DAM:USA:1992" may be a reasonable path in a hierarchical file system, but it is too long to write on the spine of a floppy or on a file folder label.

  3. Home-brewed formalized organization. To solve the problems of the ad hoc system, I moved a couple of years ago to a four-letter system. Inspired by the OSType used so widely on the Macintosh, I devised a system of 4-letter mnemonics to use for linearization: 'RECR' for 'recreation', 'ARMS' for 'arts, musical', 'GRAP' for 'graphics applications', and so on. I was quite pleased with this system for a long time. It gave me short labels to use in organizing floppies, manuals, and hard-copy folders, as well as the files on my desktop. It was extensible, and it was fairly easy to learn.

    Eventually, however, I grew to hate this system. Like the ad hoc system, the four-letter scheme imposed classification decisions that had to be made in the dark. Even the decision as to whether the system was what librarians call 'faceted', i.e. composed of meaningful atoms, or enumerative, i.e. an unanalysed whole, was never made: 'GRAP' is faceted (GRaphics + APplication) while 'RECR' is enumerative. I came to feel that I was constantly making decisions that others almost certainly had made before, and had thought about harder than I could afford to.

  4. Dewey! It finally occurred to me to try using library call numbers as an organizing principle. It is an obvious idea, but I had always thought that library classification systems would be useless for that purpose because I had made the same mistake I had made in the application-oriented system: I had assumed that all my files would be jammed into one tiny category for "computer software", and that the library classifications would offer no guidance as to how to subdivide that category. But this is wrong. It is like assuming that a book on taxation will be classified under "bookbinding", since it is itself a bound book. It violates what library scientists call the "rule of application", to the effect that works should be placed at their point of application; computer programs for birdwatching go with birds, not with computers, and so on.

    Once I grasped that principle, it became clear that library classifications were ideally suited to my organizational needs. They have evolved over a long period of time to solve exactly the sorts of problems that confronted me, and they embody a tremendous amount of collective wisdom. Importantly, they address the linearization issue that was a large part of my problem: libraries need short codes on books so that they can be reshelved quickly and accurately.

I devoted some thought to the question of whether to use the Library of Congress or the Dewey Decimal classification system. Thanks to many years of misguided government subsidies during the sixties and seventies, the LC system is more widely used in the libraries I am most familiar with. On the other hand, the DDC is more international, being by far the most widely used system in the world. Support for it inside the Library of Congress is much stronger now than it used to be, and nearly all books processed by the Library of Congress these days are given both LC and DDC call numbers. Because of its decimal nature, the DDC lends itself better than the LC codes to the "broad" classification which best suits my needs: as long as I do not have too many music programs, I can simply label them "780" and not worry about making finer distinctions. Furthermore, the DDC's uniform decimal notation has always seemed to me more user-friendly, easier to remember, and just generally more aesthetically pleasing than the LC system, with its mixture of one- and two-letter bases and its odd elevation of naval and military science to highest-level categories. When I discovered the Abridged DDC, which is tailored for small collections like mine and contains everything I needed to start classifying in one concise volume, my choice was clear.

Classifying the files on my Macintosh using the DDC was a joy. All the decisions I had found so onerous had already been made for me - all I had to do was look them up. Cookbooks and diets are separate, as are SCSI drivers and scanners; figures of speech belong under rhetoric, not linguistics; and so on.

It would have been nice, of course, to be able to add the DDC codes as new Finder attributes on the files, but alas, that is not a facility the Mac OS supports, so I was stuck with embedding them in file names. One question I had to resolve was whether the "virtual field" containing the DDC code in the file names would be fixed-length. If the field is variable, the file names do not line up and are hard to scan:

005.7 Databases
332.024 Accounting
740 Graphics
793.93 Adventure games

On the other hand, if the fields are fixed-length, then the call numbers themselves cannot exceed that fixed limit, and shorter numbers have to be padded with blanks, thus wasting valuable space. The suitability of the DDC for broad classification is very useful here, since it allows one to broaden the classification until the call number is short enough. I finally compromised on fixed fields limited to one place after the decimal; in practice this works pretty well. Here is what my highest-level "Applications" directory looks like now:

004.5 Disks & Backup
004.6 Communications
004.7 Peripherals
005.1 Programming
005.4 Utilities
005.7 Databases
006.4 Character Recognition
006.5 Sounds
332   Accounting
370   Education
400   Linguistics
526   Cartography
530.8 Color Tools
650.1 Time Management Tools
652.5 Word Processing
658.1 Spreadsheets
658.4 Presentations
686.2 Desktop Publishing
740   Graphics
780   Music
794   Games
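
The fixed-field convention can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ddc_label` and the five-character field width are hypothetical, and "broadening" is approximated by truncating the number to one place after the decimal, which moves it to a more general class rather than rounding it to a neighbouring one. A real reclassification would still be checked against the schedules.

```python
def ddc_label(code: str, name: str, width: int = 5) -> str:
    """Build a file name whose DDC prefix occupies a fixed-width field.

    Hypothetical helper: the name and the default width are assumptions.
    """
    whole, _, fraction = code.partition(".")
    # Keep at most one digit after the decimal point. Truncating (not
    # rounding) drops trailing DDC digits, which broadens the class.
    digit = fraction[:1]
    # A lone trailing zero carries no meaning, so "332.0" becomes "332".
    short = f"{whole}.{digit}" if digit not in ("", "0") else whole
    # Pad shorter numbers with blanks so the descriptive names line up.
    return f"{short:<{width}} {name}"


labels = [
    ddc_label("740", "Graphics"),
    ddc_label("332.024", "Accounting"),
    ddc_label("005.7", "Databases"),
    ddc_label("793.93", "Adventure games"),
]

# Plain string sorting now yields shelf order, because every prefix
# starts with three digits and has the same width.
for label in sorted(labels):
    print(label)
```

Because the padded prefixes all have the same length, an ordinary alphabetical listing of the folder doubles as a listing in call-number order.
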

The hierarchical nature of the DDC fits in nicely with the hierarchical file system on the Macintosh: I could have combined all the 658's into a "General Management" folder, with or without subfolders for 658.1 and 658.4. Or I could have put all the 650's into a "Management and auxiliary services" folder, or even all the 600's into a "Technology" folder. This approach works with any modern file system (which DOS and Windows, unfortunately, are not; their limit of eight characters per file name, plus a three-character extension, means that one cannot mix DDC codes and descriptive names).
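
To make the nesting concrete, here is a small Python sketch that derives the chain of enclosing classes from a DDC number. The function name `ddc_ancestors` and the `CAPTIONS` table are hypothetical; the captions are the ones mentioned in this essay, not a copy of the schedules, and the colon-separated path simply imitates a Macintosh path.

```python
def ddc_ancestors(code: str) -> list[str]:
    """Return the chain of progressively narrower classes ending at `code`."""
    whole, _, fraction = code.partition(".")
    chain = []
    # Main class, division, section: e.g. 600, 650, 658.
    for keep in (1, 2, 3):
        candidate = whole[:keep].ljust(3, "0")
        if candidate not in chain:
            chain.append(candidate)
    # Each further digit after the decimal point narrows the class again.
    for i in range(1, len(fraction) + 1):
        if fraction[:i].endswith("0"):
            continue  # a number never ends in 0 after the decimal point
        chain.append(f"{whole}.{fraction[:i]}")
    return chain


# Captions taken from this essay; an illustrative subset only.
CAPTIONS = {
    "600": "Technology",
    "650": "Management and auxiliary services",
    "658": "General Management",
    "658.1": "Spreadsheets",
}

path = ":".join(f"{c} {CAPTIONS[c]}" for c in ddc_ancestors("658.1"))
print(path)
```

How many of these levels become actual folders is a matter of taste; the chain only shows where the natural break points are.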

I was pleasantly surprised at how easy the abridged DDC was to use. In many cases assigning a category was simply a matter of looking up a term like "time management" in the index and using the number it indicated, although I was always careful to verify that I had understood the term correctly by consulting the schedules.

The major payoff of using the DDC in this way is the peace of mind that comes with knowing that I will never ever again have to make a classification decision on my own. In effect, I now have a vast army of librarians at the Library of Congress making those decisions for me. When I buy a new program, say Willmaker, I won't have to puzzle over where to put it; I will simply flip to the Relative Index, look up the code for "Wills", and create a new folder named "346.05". When someone comes out with software for managing soccer tournaments, I will not have to concoct some new label like "SMST"; instead I can just file it under "796.334 068".

There are two principles I discovered along the way which make the entire system more workable. The first is that "book numbers" are not necessary for small collections such as mine. A book number is a code which, when combined with the classification code, ensures a unique call number; it is usually based on the title or on the author's name. It does not bother me if more than one item has the same code. For example, I am quite content simply to label all my word processing floppies "652.5", and to flip through them to find the one I want. Generating unique book numbers would of course be simple, but I have not found it necessary.
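
For anyone who does want unique call numbers, a deliberately simple scheme might look like the Python sketch below. It is an assumption made for illustration, not the Cutter tables that libraries actually use: the mark is just the first three letters of the title, with a running number appended when two items in the same class would otherwise collide. The function name `call_numbers` and the sample titles other than WriteNow are hypothetical.

```python
from collections import Counter


def call_numbers(items):
    """Pair each (class code, title) with a call number unique in the list."""
    seen = Counter()
    result = []
    for code, title in items:
        # Hypothetical book number: first three letters of the title.
        letters = "".join(ch for ch in title if ch.isalpha())[:3].upper()
        key = (code, letters)
        seen[key] += 1
        # Append a running number only when this mark was already used.
        suffix = "" if seen[key] == 1 else str(seen[key])
        result.append(f"{code} {letters}{suffix}")
    return result


floppies = [("652.5", "WriteNow"), ("652.5", "Word"), ("652.5", "WordPerfect")]
for number in call_numbers(floppies):
    print(number)
```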

The second principle is that I am not being graded on how accurately I follow the DDC system. I am sure that professional librarians would be shocked at some of my assignments, but who cares? I try to follow the DDC guidelines because doing so maximizes the rewards for using the system, but if I slip up or deliberately contravene the recommendations, the earth will not tremble.

I have gone on to use the DDC to categorize my personal files, my computer manuals, my books, my road maps, my hand tools, my mail-order catalogues, my medicine cabinet, and my record collection. Even my son's toys are labeled: the box with his car collection is "388.3 Automobiles", his coin collection "737 Numismatics", and his stuffed animals "591.074 Zoological Collection". I have not yet labeled my spice rack "633.8" nor my vegetable bin "635", but it may come to that.

The consistency gained by having a single classification scheme for all these different collections is a powerful convenience. The Cataloging-in-Publication (CIP) data inside most recent books greatly simplifies things, and I look forward to the day when CIP spreads to software publishing, and every floppy disk has DDC and LC numbers printed on it by the manufacturer. I believe the world urgently needs a comprehensive, broadly applicable classification scheme for knowledge, and I am increasingly convinced that the DDC fulfills that need.



Internet Cataloguing-in-Publication Data
Mundie, David A.
    Organizing Computer Resources / David A. Mundie
    Pittsburgh, PA : Polymath Systems  1995
    025.4 dc-20

© 1995 by David A. Mundie