The History of the DeCSS Haiku

by Seth Schoen

Works like Hesiod's Theogony are not just spoken poetic entertainment; they delineate the world view of their culture. In the same way, the DeCSS epic instructs the "listener" in the world view and cultural values of those opposing [censorship of] DeCSS.

Leigh Ann Hildebrand, Slashdot comment, February 25, 2001

we have only words against

John Dos Passos, "The Camera Eye (51)", in The Big Money

I wrote the poem known as the "DeCSS Haiku" three years ago, in 2001. (The poem's full title is "How to decrypt a / DVD, in haiku form / Thanks, Prof. D. S. T.") The 456-stanza work, sometimes described as an "epic", was an anonymous contribution to Prof. David S. Touretzky's "Gallery of CSS Descramblers", which collects a variety of ways of expressing technical information about the decryption of DVDs. My poem has now become a part of the folklore of the Internet.

The poem includes a traditional opening invocation to the Muse:

Now help me, Muse, for
I wish to tell a piece of
controversial math.

It proceeds to describe, using only haiku-like verses with lines of five, seven, and five syllables, all the mathematical steps required to convert an encrypted DVD into a usable form.

Prof. Touretzky created his Gallery shortly after U.S. movie studios began their quest to suppress the publication of such information. The studios had filed a lawsuit captioned Universal v. Reimerdes (later known as Universal v. Corley). Touretzky was concerned about the free speech implications of the case, and the purported distinction between computer software and other forms of expression. As Touretzky explains:

If code that can be directly compiled and executed may be suppressed under the DMCA, as Judge Kaplan asserts in his preliminary ruling, but a textual description of the same algorithm may not be suppressed, then where exactly should the line be drawn? [The Gallery of CSS Descramblers] was created to explore this issue, and point out the absurdity of Judge Kaplan's position that source code can be legally differentiated from other forms of written expression.

Touretzky set about collecting a remarkably wide variety of descriptions of the DVD decryption process, with the aim of promoting critical thought about what expression people are prepared to censor, and why. This process resulted in an outpouring of creativity from the Internet community, with the DVD CSS algorithm described and redescribed from an assortment of scientific and artistic angles. Most contributors seemed to view the creation of each new adaptation of DeCSS as a form of political protest. As Touretzky's correspondence with the Motion Picture Association of America made clear, each adaptation was also a thorny new legal question: could this version be called a "circumvention device"? Could the courts suppress its publication? Nobody seemed able to offer a clear answer; a studio lawyer was later willing to opine to the Wall Street Journal only that there were practical limits to the industry's willingness to spend money fighting these works. So when the studios asked Touretzky to take his Gallery off the Internet, he put the question to them directly: which versions did they object to? They told him that they would consider the question "and respond appropriately at the proper time". Professor Touretzky is still waiting.

Impressed by other people's contributions to the rapidly-growing gallery, I decided I had to make some kind of effort of my own. I had particularly admired Joe Wecker's song "Descramble (This Function Is Void)", and I imagined that my contribution would have to be in the realm of literature rather than of visual art. I toyed with translating a description of the algorithm into Latin (on the theory that this might appeal to lawyers, who readily recognize that Latin is expressive and meaningful even though the vast majority of people can't understand it). I quickly abandoned this project when I ran into trouble with technical terminology such as "array", "bit", "shift", "pointer", etc. ("Algorithm" could probably be "ratio" and "loop" might be "iteratio", but there were still dozens of outstanding terminological difficulties.)

Instead, I settled on the idea of putting the algorithm into haiku form. A strange tradition current among programmers calls for the use of the 5-7-5 pattern -- preferably cleverly -- to express technology, or jokes about technology, or really anything at all, just for the fun or the challenge of writing within the constraint. I remember particularly that the UC Berkeley Computer Science Undergraduate Association has a mysterious tradition of writing haiku poems about the chemical element zinc. The tradition seemed to start with a 1995 transcript of a conversation in which CS students began to write poems about zinc, but it continued within and without the Berkeley CSUA, and I know that I personally helped spread the tradition to other forums and communities. (This being geek humor, it also expanded to include horrible puns. I perpetrated "Healthful supplement / present here where food's prepared: / it's the kitchen zinc", "Sensations coming / from metallic reactions: / a zinking feeling", "Gossips and costly / seagoing vessels conjoined / thus: loose lips, zinc ships", and "Rene Descartes finds / his elemental self: I / zinc, therefore I am". Evan Prodromou countered with "Enough! This is the / worst form of humor. I won't / zinc to your level", but nonetheless offered "Shipwreck survivor / either saves his load or flees. / Which one? Zinc or swim?".) It is probably fair to say that almost all uses of haiku in the technical community are silly. Nonetheless, considerable effort often goes into their creation. In a better-known example, the on-line magazine Salon asked its readers in 1998 to submit computer error messages in the form of haiku; the results became an instant Internet classic. ("A file that big? / It might be very useful. / But now it is gone.")

It's hard to say, then, how this particular (and rather odd) haiku tradition started, or why it seemed to hold particular appeal for computer programmers. But it was obvious to me that a description of DVD decryption in haiku stanzas would fit perfectly into the tradition, and I decided to create one.

Writing the bulk of the poem itself took me around 15 hours over the course of several days, excluding the CSS tables (whose construction I describe below). As Leigh Ann Hildebrand observed, there were classical influences at work in my efforts; I realized immediately that I would need to begin with an invocation of the Muse. (My poetic skills were not up to constructing dactylic hexameters, and I had already settled on the haiku form.) I used Prof. Touretzky's article "The CSS Decryption Algorithm" as my main source for the technical details, but I set myself a strict rule against using hexadecimal constants, because they seemed unpoetic. Everything had to come back into decimal form, because a number is a number. I also felt that it was important to include passages honoring and praising heroes (even using an epithet in the traditional epic way: "wise Andreas Bogk", only partly metri causa) together with a substantial amount of context. After all, one of the ways long poetry maintains interest is by telling several stories at once, and by painting scenery. Finally, I felt that expressing the fear of censorship directly and repeatedly within the poem itself created an interesting tension. It emphasized that the poem had really been written by a human author with a human voice and his own interests and passions. Aware of the prospect of censorship, the poem confronts would-be censors directly and takes them to task. By contrast, most source code is relatively defenseless: it can't fight against its own suppression, and it gives less direct evidence of being in a human voice, leading some people to accept its stigmatization as "merely mechanical" or "merely functional". I feel that it is essential that the poem constantly pleads for its own life -- an effect accomplished comically yet powerfully by Joe Wecker in "Descramble (This Function Is Void)":

I hate the DMCA,
It makes this song illegal.

This says: if you censor this, you are not censoring a "machine": you are censoring a person -- a singer, a poet, an author, who knows you and knows what you are doing.

At first, I told nobody about my project. I submitted the poem to Prof. Touretzky through an anonymous remailer, which concealed my identity even from him. Later, I revealed the secret to a few friends. In light of the DVD CCA's willingness to sue even a t-shirt vendor for printing the algorithm on clothing, I thought it might be prudent not to be publicly identified as the haiku author. Meanwhile, Touretzky (who still didn't know my name) was very pleased with my work and immediately included it in his Gallery. (Touretzky responded to the lawsuit against the t-shirt publisher by saying, memorably, that "if you can put it on a t-shirt, it's speech".)

A few months after I finished the DeCSS Haiku, Paul H. Henry, a writer in Seattle, complained about Western "joke haiku" forms, arguing that the practice I discussed above typically lacks literary merit, that it does not accord with the traditional Japanese cultural meaning of haiku, and that the verses produced in this way technically are not haiku but senryu. I think this critique should probably receive more attention than it has. It's clear that the practice of writing 5-7-5 verses and calling them "haiku" seizes on only one aspect of the haiku form and entirely removes it from its original cultural context. I freely admit that my poem has no cultural continuity with the ancient Japanese haiku artform, although I think it has its own sort of literary merit. There is in some sense more art in other forms of constrained writing such as lipography, in which a letter or group of letters, typically "e", is omitted. I have in fact extensively practiced writing and speaking without the letter "e", and have been able to do it reasonably naturally for hours at a time. (You can see some of my very first attempts; I've gotten a lot better since then.) I considered writing the "DeCSS Lipogram" -- presumably it would have been called "Unscrambling Hollywood's DVD CSS Algorithm: A Lipographic How-To for Linux DVD Support, or Satisfying Your Curiosity" -- but I never did so.

A greater challenge might have been making the verses more authentic as haiku by adding kigo, or seasonal references. "Plane to CPTWG / lands in L.A. summer heat / time to meet John Hoy." In fact, the absence of seasonal references or other lyrical elements, and the primary focus on factual matters, are clues that my poem properly falls into the ancient tradition of didactic poetry. Poems that teach skills or argue points of view have existed for thousands of years; Vergil's Georgics, which offers practical advice to farmers, and Lucretius's De Rerum Natura, which makes an extended philosophical argument, are well-known Roman models. Didactic poetry has rarely been as popular as more imaginative literature, but it has had an uninterrupted presence in art up to the present day. (Students are still constructing their own mnemonic didactic poetry to assist in memorization.)

After my poem was published, it got a lot of attention: it was mentioned in articles in The Wall Street Journal, The San Francisco Chronicle, The New York Times Magazine, Wired, and elsewhere. (The New York Times Magazine featured and excerpted it as an exemplar of one of the great ideas of the year 2001 -- that code is speech -- in an end-of-year piece by Prof. Siva Vaidhyanathan. The Wall Street Journal piece, "Banned Code Lives in Poetry and Song" by David P. Hamilton, even quoted in full the stanzas containing the alleged Xing player key. Hamilton had, through an attorney, arranged an interview with me in San Francisco's Financial District. As the only journalist who knew the identity of the haiku's author, he was careful to avoid giving any clues that might have helped movie studio lawyers find me.)

The poem also appeared as required reading in several law schools. It turned up on syllabi for courses taught by Jessica Litman, Dennis Karjala, and Barbara Simons, among others. It may have been the first time certain professors found themselves teaching poetry in their copyright classes. The poem was also featured as an artwork in the catalogue of Carrie McLaren's exhibition Illegal Art: Freedom of Expression in the Corporate Age, which opened in galleries in New York, San Francisco, and Chicago.

Most readers are not aware that the DeCSS Haiku contains an easter egg. There is a mnemonic for pi in the two stanzas

Now I want a drink
(mnemonics in crypto poems
are great!); exercise

from singing so long
makes me thirst for a glass of
soda, slice of pie.

This mnemonic is partly inspired by the mnemonic for pi devised by James Jeans (given by Martin Gardner in chapter 11 of The Scientific American Book of Mathematical Puzzles and Diversions), and partly by the reported saying of medieval scribes: "Nunc scripsi totum pro Christo, da mihi potum". (As Dave Barry says, I am not making this up.)

If you count the number of letters in each word in the stanzas given ("now" = 3, "I" = 1, "want" = 4, etc.), you obtain the first twelve digits of the decimal expansion of pi. The poem offers two separate hints to their presence: the reference to "mnemonics in crypto poems", and the mention of a "slice of pie". This is a crypto poem, these lines are a mnemonic, and they offer their reader a tiny slice of pi. (It also bears mentioning that, at the time I wrote the poem, I had an account on a Unix machine called soda and another account on a second machine called pie.)
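The letter-counting rule described above can be checked mechanically. The following is a minimal Python sketch, not part of the original poem or its tooling; the helper names word_lengths and matching_prefix are hypothetical and chosen only for illustration. It counts the letters in each word of the two quoted stanzas and reports how many leading word lengths agree with the digits of pi.

```python
import re

# The two stanzas quoted above, joined into one string.
STANZAS = (
    "Now I want a drink (mnemonics in crypto poems are great!); exercise "
    "from singing so long makes me thirst for a glass of soda, slice of pie."
)

# Leading digits of pi, written without the decimal point.
PI_DIGITS = "314159265358979"


def word_lengths(text):
    """Return the number of letters in each word, ignoring punctuation."""
    return [len(word) for word in re.findall(r"[A-Za-z]+", text)]


def matching_prefix(lengths, digits):
    """Count how many leading word lengths agree with the given digits."""
    count = 0
    for length, digit in zip(lengths, digits):
        if length != int(digit):
            break
        count += 1
    return count


lengths = word_lengths(STANZAS)
print("".join(str(n) for n in lengths[:12]))   # letter counts of the first twelve words
print(matching_prefix(lengths, PI_DIGITS))     # number of leading digits of pi matched
```

Under this simple rule the match covers the twelve words of the first stanza; the second stanza supplies the "slice of pie" hint rather than further digits.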

The poem contains a list of over seven hundred numbers (the "CSS tables" in the original version of DeCSS, which subsequent research revealed are not actually required for a CSS implementation). These numbers are listed in English in 5-7-5 syllabic form, and many people have wondered whether I constructed these stanzas by hand or automatically. The answer is that I used the NetBSD number(6) program together with some shell scripts to convert the "csstab" from hexadecimal C arrays into lists of numbers in English. Then I wrote a program of my own in Python to attempt to place long lists of English numbers into 5-7-5 form, using a trial-and-error algorithm I developed. (It isn't possible to scan arbitrary lists of numbers in a 5-7-5 syllabic pattern without some padding. For example, "one hundred seventy-seven" can't ever be the first line of a haiku. My program was able to add appropriate padding syllables to avoid this and other difficulties. In this case, it would have generated something like "and then one hundred / seventy-seven ..." to avoid a line break in the middle of a word. It could also attempt to write the number as "one hundred and seventy-seven", although that wouldn't help in the particular situation just described.)
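The padding idea can be illustrated with a short Python sketch. This is not the original program, which is not reproduced here; it is a simplified greedy stand-in for the trial-and-error approach described above. It assumes byte values from 0 to 255, a hand-written syllable table, hyphenated compounds treated as single unbreakable words, and two hypothetical one-syllable padding words ("and", "then"). All function names are invented for illustration.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hand-written syllable counts (an assumption, not a pronouncing dictionary).
SYLLABLES = {word: 1 for word in ONES}
SYLLABLES.update({word: 2 for word in TENS if word})
SYLLABLES.update({"zero": 2, "seven": 2, "eleven": 3, "thirteen": 2,
                  "fourteen": 2, "fifteen": 2, "sixteen": 2, "seventeen": 3,
                  "eighteen": 2, "nineteen": 2, "seventy": 3, "hundred": 2,
                  "and": 1, "then": 1})

PATTERN = (5, 7, 5)
PADDING = ("and", "then")  # hypothetical filler words, cycled as needed


def number_tokens(n):
    """Spell out 0..255 as a list of words; compounds stay hyphenated."""
    if not 0 <= n <= 255:
        raise ValueError("only byte values are handled in this sketch")
    tokens = []
    if n >= 100:
        tokens += [ONES[n // 100], "hundred"]
        n %= 100
        if n == 0:
            return tokens
    if n < 20:
        tokens.append(ONES[n])
    else:
        tens, ones = divmod(n, 10)
        tokens.append(TENS[tens] + ("-" + ONES[ones] if ones else ""))
    return tokens


def syllables(token):
    """Syllable count of a word, summing the parts of a hyphenated compound."""
    return sum(SYLLABLES[part] for part in token.split("-"))


def try_place(lines, tokens):
    """Append tokens to a copy of lines; return None if a word would straddle a line."""
    lines = [list(line) for line in lines]
    for token in tokens:
        target = PATTERN[(len(lines) - 1) % 3]
        used = sum(syllables(t) for t in lines[-1])
        if used == target:          # current line is full, start the next one
            lines.append([])
            target = PATTERN[(len(lines) - 1) % 3]
            used = 0
        if used + syllables(token) > target:
            return None
        lines[-1].append(token)
    return lines


def versify(numbers):
    """Lay numbers out in 5-7-5 lines, trying the least padding that works."""
    lines = [[]]
    for n in numbers:
        for pad in range(15):
            padding = [PADDING[i % len(PADDING)] for i in range(pad)]
            attempt = try_place(lines, padding + number_tokens(n))
            if attempt is not None:
                lines = attempt
                break
        else:
            raise ValueError("no padding found for %d" % n)
    return [" ".join(line) for line in lines]


print(" / ".join(versify([177])))  # and then one hundred / seventy-seven
```

The final line of the output may be incomplete, since the sketch stops when the numbers run out; a fuller version would pad the last stanza as well, and could also try alternative spellings such as "one hundred and seventy-seven".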

My friends Don Marti and Danny O'Brien have each written related haiku-finding programs that locate 5-7-5 syllabic patterns in existing English text, using the Carnegie Mellon phonetic dictionary to determine each word's syllabic value. Don's program is lost to posterity, but Danny's is available on-line; it makes use of the Python generator feature.

Before it was lost, Don's program found the following haiku hidden in an early interview with Richard Stallman (as Stallman lamented the social costs of the present copyright system):

One person gains one
dollar by destroying two
dollars' worth of wealth.
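A haiku finder of the kind described above can be sketched in a few lines of Python. This is neither Don's nor Danny's program; it is a minimal illustration that replaces the Carnegie Mellon phonetic dictionary with a tiny hand-written table (an assumption made to keep the example self-contained) and uses a generator to yield each 5-7-5 run it finds. The names find_haiku and fit are hypothetical.

```python
import re

# Toy stand-in for a pronouncing dictionary: word -> syllable count.
SYLLABLE_COUNTS = {
    "one": 1, "person": 2, "gains": 1, "dollar": 2, "by": 1,
    "destroying": 3, "two": 1, "dollars": 2, "worth": 1, "of": 1,
    "wealth": 1,
}


def syllables(word):
    """Look up a word's syllable count; return None if the word is unknown."""
    key = re.sub(r"[^a-z]", "", word.lower())
    return SYLLABLE_COUNTS.get(key)


def fit(words, start, pattern):
    """Try to read one haiku beginning at words[start]; return its lines or None."""
    lines, pos = [], start
    for target in pattern:
        line, count = [], 0
        while count < target:
            if pos >= len(words):
                return None
            n = syllables(words[pos])
            if n is None:
                return None
            line.append(words[pos])
            count += n
            pos += 1
        if count != target:     # a word overshot the end of the line
            return None
        lines.append(" ".join(line))
    return lines


def find_haiku(text, pattern=(5, 7, 5)):
    """Generate every run of words in text that scans as the given pattern."""
    words = text.split()
    for start in range(len(words)):
        lines = fit(words, start, pattern)
        if lines is not None:
            yield lines


TEXT = "One person gains one dollar by destroying two dollars' worth of wealth."
for haiku in find_haiku(TEXT):
    print("\n".join(haiku))
```

Because find_haiku is a generator, a caller can stop after the first match or keep scanning a long text without building the whole list of results in memory.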

I wrote the DeCSS Haiku because I was angry at the attempts by lawyers for the entertainment industry and the government to trivialize the free expression rights of programmers and other people engaged in technical communication. I believe, and I said repeatedly in my poem, that one of the reasons courts and others have countenanced the censorship of software is that they do not understand it. (This isn't the only reason. Dorothy Denning and Michael Shamos, who can read programs perfectly well, have argued seriously against their own First Amendment rights to do so. Another CMU computer scientist by the name of Seth Goldstein even said that "Computer code is not speech because it does something. It's an artifact. It's closer to a machine than it is a poem.") So some people do consider code different (although they frequently make the mistake of ignoring the human volition involved in making machines use software to do things). The legal questions that would face a practitioner or a judge are significantly more involved, on account of the epicycles that have accumulated in First Amendment doctrine over time (strict scrutiny, intermediate scrutiny, content-neutrality, the O'Brien test...). Yet ultimately much comes down to cultural framing of a free speech controversy as a free speech controversy: if a court accepts that something is "a First Amendment case", the speaker is extremely likely to win, especially if the party on the other side is a government. It's hard to avoid the inherent sympathy Judge Patel bears toward Professor Bernstein (a speaker whose expression is crushed by the awesome might of government bureaucracy) or the equally apparent suspicion with which Judge Kaplan regards Emmanuel Goldstein (a self-avowed hacker seemingly hell-bent on trouble). These attitudes seem to me to be visible behind all the doctrinal questions; without committing myself for all time to a position in a contentious area of legal theory, I would say that Judge Patel fought to show why her case was a free speech case and that Judge Kaplan fought to show why his was not. The question of which approach seems natural would then be not primarily a question of legal doctrines, standards, or precedents. It would instead be a conceptual, cultural battle: shall programs be compared to epidemics of disease (evil, menacing, worthy only of quarantine) or to books in libraries (the cornerstones of our culture and our civilization)?

I wish programmers got more worked up about that metaphor. It ought to offend them. Your work, creating new and useful technology, is like an outbreak of cholera or botulism?

Despite the fact that some prominent computer scientists have taken a different view, it is clear in practice that support for the idea that "code is speech" has some amount of connection or correlation to programming experience. In my experience, programmers are more likely than non-programmers to sympathize readily with, for example, the plaintiff in the Bernstein crypto-export case. One simple explanation is that people most strongly or most easily value freedoms for which they have some use, or of which they have some experience. People who like chocolate are probably denser at the front lines of the movement to legalize chocolate than are those who are allergic to it. (I do not disparage the ranks of altruistic activists; they are real, and they are heroes.) We can go beyond this observation, though, by considering what people intuitively believe about what kind of thing software is. Is code something appliance-like you get in a box in a store, or is it something you read and write and talk about? Have you been to a party where people were talking about creating software? Their fundamentally different personal experiences of software color people's intuitions about the kind of thing software is (and also about what they ought to be able to do with it). (Neal Stephenson discusses these contrasts at some length in his In the Beginning Was the Command Line, where he writes about the metamorphosis of software from cultural activity into packaged consumer product. As Stephenson recognizes, the tension between these two experiences of software is very much in play today. But selling software with a fixed feature set as a product at retail is an idea that, Stephenson points out, had to be invented.)

I think this chasm of experience is real and difficult to bridge. I know that most of my family members have never experienced software as a text they wrote. Almost all of them have only thought of software in terms of what it did, not how it worked, how it was written, what it taught them, or how it could be made to do something else. (In a vastly more trivial poem on the freedom to tinker, popularly called "Fair Seuss", I took all these for granted. I take this to be one of Ms. Hildebrand's observations about world-view in my poetry.)

But telling people that there is more to software than they have experienced will not do much. Nothing will go further to add depth and realism to the larger culture's view of computer programming than making computer programming itself a part of the larger culture. I believe that a future in which the basic experience of programming is considered an ordinary part of education is a future that will better appreciate (and make use of) the nature of software than our present can. We have spoken idly of "computer literacy" without defining it (or engaging with its dozens of critics). We programmers might forget easily that literacy itself is still a major struggle in the world, or that even recently it was not assumed that everyone would know how to read and write. We might forget easily how powerful the consequences have been of changing that assumption and making, or trying to make, literacy an everyday skill.

And making a basic knowledge of programming ubiquitous would also have profound political consequences: for one thing, it would undermine the reasoning that emphasizes that only a very few people know how to make practical use of technical information. Those who reason this way say that it is only natural to try to frighten or coerce those few people to induce them not to share the fruits of their skill with the public; this is the whole rationale for the DMCA device trafficking ban, founded on a distinction between software toolmakers (the few) and software tool users (the many). It is also the rationale for the DMCA definition of "effective", which seems bizarre to programmers with its reference to "ordinary course of operation", as if the use of computers didn't routinely involve altering their function.

The larger culture today assumes -- and many people rely on the assumption -- that "normal" people don't know how to program, that only hackers do that. In many ways, programming is a socially marginal activity, viewed with considerable suspicion. (I rarely experience this suspicion myself since so many of my friends are part of programming communities, but I have no doubt of its existence.) But reading books and periodicals -- traditional literacy -- is not a marginal activity. Teaching reading and writing requires a decision that these skills are valuable and teachable (since, unlike spoken language, some conscious effort is required for children to acquire these skills). Can we make programming become respected and ubiquitous in the way that reading, writing, and arithmetic are?

[Boondocks image]
(Credit: The Boondocks, by Aaron McGruder, March 3, 2001, panel 2. Copyright 2001 Aaron McGruder. Distributed by Universal Press Syndicate. Used pursuant to 17 USC 107.)

Aaron McGruder's "The Boondocks" comic strip for March 3, 2001 shows a student asking his teacher "Why is it perfectly legal to post a diagram of how to build a bomb on the net, but you can't post a code that descrambles DVDs?". It is a striking question. It is a concrete question; real and possibly lethal explosives techniques are lawfully available on-line.

So somehow in our world the law allows us to say how to kill people, but not how to decrypt DVDs. We can also depict things that many people find offensive -- including graphic acts of violence -- and we can even advocate violence, as long as it isn't sufficiently particularized to threaten identifiable people. (Famously, you can apparently under the law tell people how to make thermonuclear weapons, though no precedent addresses that question head-on.) And just recently, even where other countries suppressed practical directions for sabotaging railways, they continue to be available in the United States. I hope I am clear that I do not want any of these other things suppressed. It's just bizarre to see that information of concrete use to the commission of direct physical violence is better protected in our country today than valuable technical information that was, in fact, used, as we predicted, to create innovative and useful DVD players for Linux and other platforms. An observer might be shocked to compare the Bernstein and Corley cases. She would perceive that the courts have concluded that it's wrong to censor software in the name of preventing terrorism, but it's all right when what's at stake is the ability to copy movies.

As some First Amendment historians read the story, it was difficult in every age to make people recognize that various things were speech, even though later ages grew comfortable with that recognition. (Today's contested forms of speech include not only software but also body art and erotic dance; dance in general was also once considered non-speech, perhaps for some of the reasons I describe below.)

The highest irony in light of the position of movie studios on the legal protection of software expression is that movies were originally not recognized as speech. Mutual Film Corp. v. Industrial Commission of Ohio, 236 U.S. 230 (1915), addressed the question head-on and held that movies were unprotected by the First Amendment. Other courts of the same era took a similar line. It's hard to give a single reason for this that would make any sense to us today. We might say that movies were unprotected because they are not textual (and the roots of the First Amendment center on protection of text, especially of text that argues a position, because of the important place of argument in our traditions). We might say that they were unprotected because they are mediated by machinery (and, indeed, motion pictures were regulated extensively by judges unafraid to say that they were merely machines). What's more, they were seen as creating a risk of adverse effects, and so they were unprotected for fear of the consequences they might produce. But it might be best of all to say that movies were not recognized as protected mainly because they were unfamiliar.

Movies were eventually recognized as enjoying practically as much First Amendment protection as a book -- but this took time. In fact, it took decades for the current understanding of filmmaking to develop. Joseph Burstyn, Inc., v. Wilson, 343 U.S. 495 (1952), concluded that films did have First Amendment protection, and Freedman v. Maryland, 380 U.S. 51 (1965), came closer yet to extending protection generally equal to that of books. (Thanks to Robert Corn-Revere's "Internet and First Amendment in Speech" for a concise summary of this history. The particular regulations and rationales for regulation of film in the Mutual Film era can be discussed in more detail, but this should suffice for the point: "common sense" 90 years ago was that new media were importantly different and importantly less protected; the new media had to fight both in the courts and popular culture to come to eventual parity with more traditional media.)

We are just now with software roughly where Mutual Film Corp. and other decisions of its era were with film.

Because of this, we have an odd problem today; we are constantly trying to show the literary merit of software. We make reference to the fact that some people do read software as literature (in the Perl poetry competitions, the IOCCC, etc.), or that they discuss matters of software style and readability. We must emphasize the extent to which people read and discuss software, and especially the extent to which they study the form and technique of the composition of the software, its literary merit, etc. But nobody today holds other textual works, like books, to such standards. (Do we worry about showing the extent to which people read cookbooks for literary merit, or home repair manuals for literary merit, or textbooks for literary merit, or tables of logarithms for literary merit? Are these books not speech because of their functionality?)

It's sad to see much of the self-identified "creative community" fail to support intellectual freedom here. Particularly galling is the short-sighted proprietary attitude toward the First Amendment expressed by some people in the culture industries, who suggested in Corley and many other contexts that the First Amendment smiles on the culture industries' expression and overlooks harm to other people's expression. They refer to themselves, but not their critics (and not Jon Johansen, not me, not any Gallery of CSS Descramblers contributor, not any programmer) as creative. We know free speech, the industries seem to say, because we practice it, and free speech is our speech, the First Amendment is our amendment. After all, how could artists possibly be censors?

As Cindy Cohn says, we don't mind if the movie studios use the protection of the First Amendment, "we'd just like them to share".

Looking back at the Corley case, I am frustrated. I am frustrated not only that we lost, not only that the censorship continues, and not only that allies of the studio plaintiffs keep on trivializing programmers' speech rights. More than anything, I'm frustrated that public opinion mainly dismisses what happened as a matter of pursuing hackers. Public opinion says the hackers got what was coming to them, because they were hackers. The court of public opinion, with some exceptions, seems to be affirming the Second Circuit.

Meanwhile, serious problems with the DMCA go unaddressed, and the traditional legal status of reverse engineering is under attack. Public opinion is not rising to defend it; since I wrote the DeCSS Haiku, property rhetoric has continued its success in making people "brand / tinkerers as thieves". We are not communicating effectively, even though so many of our best cultural traditions are on our side. As I told the Wall Street Journal, "a computer program is a literary work [and it is] strange and difficult [...] to classify computer programs and technical information as something other than speech". Yet having people take this seriously, escaping from marginalization, and even capturing the public's attention, will be a major task. It calls for cultural changes that will not be accomplished without creativity.

If, reader, however, you are at a loss for what political act to take; if you are no designer of network architectures, no writer of legal briefs or op-eds, no poet and unacknowledged legislator -- then I have a suggestion for you. Learn to program, perhaps in the Python language, which was created specifically to be a first programming language for new programmers. Then teach one other person to be a programmer, too.

Thanks to James S. Tyre, David P. Hamilton, David S. Touretzky, Cindy Cohn, Wendy Seltzer, Joseph Wecker, and Jon Lech Johansen. Thanks also to the following people mentioned in the poem: Ian Goldberg, David Wagner, Frank Stevenson, wise Andreas Bogk, Bruce Schneier, Eben Moglen, and the 2600 amici. (Thanks, finally, to Eric Temple Bell, Albert Einstein, Pythagoras, and Calliope.)

In the hope of a world with more speech and more speakers, with more poets and more storytellers:

sit mihi fas audita loqui, sit numine vestro
pandere res alta terra et caligine mersas.

Aeneid VI, 266-7