This is one of the most interesting things I’ve seen in a long time, not just for the project itself, but for the author’s long explanation as well.

The gist is this: we allow for the copyright of a particular waveform, the product of an artist’s recording. We even allow for the copyright of a digital representation of that waveform, even though the constituent bits can’t be copyrighted in and of themselves. This is enough of a quandary, but the author’s program, Monolith, takes a basis file and an element file (which, theoretically, we could say is a copyrighted work) and, by its particular algorithm, produces a file that contains no data from either file. It is then, however, possible to get back the original copyrighted work by applying the algorithm in reverse (as it were).
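To make the round trip concrete, here is a minimal sketch in Python. It assumes the munging is nothing more than a bytewise XOR of two equal-length files, which is my reading of the project rather than the author's actual code. The function name `munge` and the sample bytes are made up for illustration.

```python
def munge(basis: bytes, element: bytes) -> bytes:
    """XOR two equal-length byte strings.

    Assumed stand-in for Monolith's munging step, not the real implementation.
    """
    if len(basis) != len(element):
        raise ValueError("basis and element must be the same length")
    return bytes(b ^ e for b, e in zip(basis, element))


# Hypothetical inputs: any bytes would do.
element = b"pretend this is a copyrighted MP3"
basis = bytes((i * 37 + 11) % 256 for i in range(len(element)))

mono = munge(basis, element)      # the "Mono" file
recovered = munge(basis, mono)    # XOR with the same basis undoes the munging

print(recovered == element)
```

Because XOR is its own inverse, the same function serves for both directions, provided the basis is still available.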

Consider this simple fact: for a given Element file and any other file of the same length (call it fileA), it is possible to choose a Basis file that, when munged with the Element, will produce fileA as the resulting Mono file. Therefore, if a copyright holder claims that she owns the information in all Mono files that are munged from her work, she is also claiming copyright over all possible binary files that are the same length as her work. For example, suppose that fileA is an MP3 of a Beatles song, and the Element file is an MP3 of a Britney Spears song copyrighted by Jive Records. It is possible to find a Basis file that, when munged with the Spears song, will produce the Beatles song as the Mono file. Jive Records certainly cannot claim copyright over the Beatles song (which is copyrighted by Apple Records), nor can they claim copyright over any other Mono files munged from MP3s of their songs.
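The "choose a Basis" step is easy to sketch if, again, we assume the munging is a plain bytewise XOR (an assumption on my part; `munge` and the placeholder byte strings are hypothetical). Under that assumption, the Basis that turns the Element into fileA is simply the XOR of the two.

```python
def munge(basis: bytes, element: bytes) -> bytes:
    """XOR two equal-length byte strings (assumed stand-in for Monolith's munging)."""
    if len(basis) != len(element):
        raise ValueError("inputs must be the same length")
    return bytes(b ^ e for b, e in zip(basis, element))


# Placeholder contents standing in for the two MP3s, padded to the same length.
file_a = b"stand-in for the Beatles MP3"
element = b"stand-in for the Spears MP3".ljust(len(file_a), b"\x00")

# Pick the Basis so that munging it with the Element yields fileA.
basis = munge(file_a, element)

print(munge(basis, element) == file_a)
```

Nothing about the Element constrains what fileA can be, which is the whole point of the argument: for any target of the same length, a suitable Basis exists.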

It’s a sticky situation. My immediate reaction is to think of this as encryption, and I suppose the analogy holds: the basis file is something like a cryptographic key (either public or private), and it alters the file in such a way that it becomes unreadable without unlocking it. I’ve no doubt that a Mono file contains no recognizable data from either input file, but much of what he talks about on the page is self-defeating.

Think of it this way:

  1. A recognizable waveform is copyrightable (recognizable being a qualitative judgment)
  2. A digital representation of the waveform is also copyrightable insofar as its interpretation reproduces a close enough facsimile of the original work.
  3. If constituent bits or groups of bits themselves are not unique to a particular song or encoding of a song, then surely what’s being applied is the spirit of the law: if it sounds like Britney’s single, regardless of its encoding, then it’s Britney’s single.
  4. In the spirit of the law, then, encrypting or munging a copyrighted digital representation of an analog waveform does not make the resulting file irreconcilable with the original, insofar as we are concerned. The bits may be distinguishable, but as far as the interpretation of the law goes, I’m not sure how much of a future Monolith has as the savior of P2P.

But, then, the author himself claims it to be a thought experiment (with some proof-of-concept code to back it up), and it’s certainly fascinating. Reading the explanation really got my mind churning over the problem. Check it out.

§1063 · March 29, 2006

10 Comments to “Monolith”

  1. ffanatic says:

    So correct me if I’m incorrect on this: Monolith is capable of taking a digital representation of something copyrighted (a song from a CD, for example) and creating a new file that contains none of the data from the original file, and yet is still the same song? Or am I missing something? Either way, it’s quite interesting.

  2. Ben says:

    When you say “same song,” you touch upon the crux of the issue. What is a “song” in the digital world but a binary reinterpretation of a unique analog waveform?

    For the sake of illustration, I would call Monolith analogous to traditional encryption: you see, the algorithm uses the XOR operator on each bit of the (in this example) copyrighted song. Therefore, the resulting binary file is a completely altered (you’re not likely to find any significantly long strings of bits common to the Before and After) version of an arbitrary digital representation of an analog waveform. You see where this is going?

  3. ffanatic says:

    I think I do. It makes my brain numb.

  4. I didn’t have much time to read most of the second link, but I am not quite sure how he is taking care of the fact that the digital 1s and 0s are sampled from the analog signals, and represent their magnitudes at the sampled points. I didn’t see how this complication was addressed.

    I’d definitely agree that there are ways to mask and recover an analog signal, though.

  5. To add to the previous post: I now realize that what I mentioned about the analog signals at the end is very obvious, and sounds kind of redundant.

  6. Ben says:

    DB, to address your question would, I think, do little more than reiterate what the author explains on the project page. However, in brief:

    A commonsense interpretation of the law lends itself to the notion that even an arbitrary binary representation of an analog signal can come under copyright law, regardless of whether it is recognizable as bits or only when a sound card converts it back to analog.

    However, the point he addresses is: at what length does a string of bits become copyrighted? If you looked at 1101001001001110100101010101, would you be able to tell what song it was? What about something twice the length? At the very least, it would be meaningless without the correct headers/metadata that tells the computer how to interpret it.

    In that much of this is speculation and theory, its practical application may differ significantly, but I found it to be a fascinating concept.

  7. I didn’t pay too much attention to the legal argument, and mainly focused on the mathematical one.

    It is, indeed, an interesting concept, but binary numbers in general don’t really offer that much room for data manipulation, so I don’t think it would be practically possible to alter the digital representation of something as dynamic as a sound signal, and then recover the original wave through any form of algorithm.

    I also think that putting a copyright on the sequences of 0s and 1s is taking it a bit too far.

  8. Ben says:

    Well, you’re not recovering the original wave, as such. If song is the original analog waveform (allowing for my misuse of mathematical notation), then song′ is an arbitrary digital sampling of that waveform. If we were to accept that a .mono file is in fact a derivative work and can be represented by song″, then it wouldn’t be at all difficult to simply reverse the process, provided that the constituent data is still available.

    Indeed, there’s some cross-platform proof-of-concept code there, and since it’s little more than XOR encryption, I don’t see the technical hurdles being of as much interest as the legal debate that it provides.

  9. […] As Ben noted a few days ago with the advent of software such as Monolith (see his entry here), the legality of this matter can only become more complicated. […]

  10. […] This sounds awfully familiar to me. In fact, it’s more or less the exact same concept as Monolith, which I blogged about just this past March. Essentially, both authors/developers claim that enough encryption and/or tying files to other files make these arbitrary digital representations indistinguishable from one another and therefore uncopyrightable. […]
