The Analysis Software That Wrecked J.K. Rowling’s Anonymity
Earlier this week, the literary world was shocked to find out that Robert Galbraith, the “first-time” author of the critically acclaimed novel The Cuckoo’s Calling, was none other than the UK’s best-selling author ever, Harry Potter creator J.K. Rowling.
A writer at the British Sunday Times newspaper received an anonymous tip via Twitter that Rowling was the true author of the crime mystery. (The tipster has since been identified as a family friend of a partner at the law firm representing Rowling.) After the newspaper’s staff investigated the claim and gathered what they felt was enough evidence to be reasonably sure who was behind the pseudonym, they asked Rowling’s people outright. The suspect confessed, ending a saga itself worthy of a detective novel.
It’s the investigative work in the middle that’s really interesting. How could it be confirmed beyond a reasonable doubt that Rowling wrote the novel? Well, it turns out that an author’s writing style is as unique as a fingerprint, and with the right information, you can make a very educated guess about whether two manuscripts match.
The science is called stylometry, the analysis of a person’s writing style. Stylometry examines word choice; the frequency, sequence, and length of words; and other telling tendencies. (For example, Rowling’s cover was blown in part by her penchant for using Latin phrases and certain pairs of adjacent words.)
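To make the idea concrete, here is a minimal Python sketch of the kind of features described above: how often each word appears, how long the words are, and which pairs of words sit next to each other. It is a toy illustration only, not the method used in the Rowling analysis or by any particular tool; the function names (`tokenize`, `style_profile`, `cosine_similarity`) and the choice of cosine similarity as the comparison measure are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def style_profile(text):
    """Return relative frequencies of words, word lengths, and adjacent word pairs."""
    words = tokenize(text)
    counts = Counter()
    counts.update(("word", w) for w in words)                        # word choice
    counts.update(("len", len(w)) for w in words)                    # word length
    counts.update(("pair", a, b) for a, b in zip(words, words[1:]))  # adjacent pairs
    total = sum(counts.values())
    if total == 0:
        return {}
    return {feature: n / total for feature, n in counts.items()}


def cosine_similarity(p, q):
    """Compare two profiles: near 1.0 means very similar, 0.0 means nothing shared."""
    dot = sum(value * q.get(feature, 0.0) for feature, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Hypothetical sample sentences, invented for this example.
    known = "The detective walked slowly into the quiet office and sat down."
    disputed = "The detective sat down slowly in the quiet office."
    score = cosine_similarity(style_profile(known), style_profile(disputed))
    print(f"similarity: {score:.3f}")
```

Real stylometric software works from far larger samples and many more feature types, and it compares a disputed text against several candidate authors rather than just one. A single similarity score from a sketch like this should not be read as evidence of authorship.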
Experts in the field, known as forensic linguists, maintain that a trained eye can pick out these similarities between texts. But computer algorithms are much faster and just as accurate, if not more so. Patrick Juola, a professor of computer science at Duquesne University, was one of the people asked to examine The Cuckoo’s Calling. Juola used software he designed, the Java Graphical Authorship Attribution Program, which, incidentally, is a free download available for anyone to play around with.
Software like this has been used to identify the true author of a will and even to analyze Shakespeare’s plays. But another use coming to light is in seeking out hackers, malware writers, and anyone else who wouldn’t want their online identities traced back to them. Graduate students at Drexel University studied leaked conversations and texts from hundreds of anonymous users in underground online forums. Using stylometry, the students were able to identify 80 percent of the authors.
Because stylometry traces a person’s language rather than their network footprint, the typical methods of evading digital detection, such as keeping several anonymous accounts or using different IP addresses, offer no protection.
However, the Drexel researchers recognized the security implications of there being nowhere left on the Internet to hide. “Authorship recognition can be a legitimate threat to privacy and anonymity,” they said. They released two open source tools: JStylo, which they used in identifying all those secret forum contributors, and Anonymouth, which works in tandem with JStylo and helps authors evade detection by providing suggestions for disguising their writing.
While the anonymizing software will likely be deployed mostly by trolls, maybe there’s another Rowling type out there who will use it for the liberation of publishing under a nonfamous nom de plume.