Can an Algorithm Predict the Next Bestselling Novel?


It’s not always easy to identify what will be hip and trendy years from now, but big data is attempting to close that gap. Because human behavior is unpredictable, cultural trends are hard to spot before they happen. Despite these challenges, algorithms are being applied to all kinds of problems, in the business world and beyond. One innovative application is predicting the next bestselling novel.

Jodie Archer, author of an upcoming book called The Bestseller Code: Anatomy of the Blockbuster Novel, claims to have found an interesting way to use algorithms and big data to discover what’s hot in the literature department. Her algorithm, called the “bestseller-ometer,” looks at which qualities make for the most successful fiction. As reported by The Atlantic, the algorithm can identify a bestseller more than 80 percent of the time.

This success rate reflects the algorithm’s ability to pick out fiction that made the New York Times bestseller list. It is one of many attempts to use computing to predict human behavior, and it could change the way publishing companies evaluate and accept manuscripts. After all, if a book doesn’t sell, why publish it?

The biggest question that this algorithm attempts to answer is: “Why do we all read the same book?” It’s a compelling question, because everyone has different tastes in literature. The academic who carries a pocket thesaurus around in his suitcase might find an escape in a good science-fiction short story or another piece of genre fiction. On the other hand, a book that’s panned by critics might be surprisingly successful. In short, different readers value different traits.

Aided by English professor Matthew L. Jockers, Archer built the algorithm to find out what draws readers to a certain piece of literature. The Bestseller Code looks at the processes and strategies the algorithm uses to identify the traits that matter in popular fiction. The full list is long, but it includes qualities commonly found in successful novels, such as:

  • Authoritative voice
  • Colloquial (everyday) language
  • Action-oriented characters
  • Cohesion
  • Human closeness
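To make the idea of measuring a novel’s traits a little more concrete, here is a minimal Python sketch of how two of the qualities above, colloquial language and action-oriented characters, might be turned into numbers. This is not the bestseller-ometer or anything taken from the book: the function name `style_features`, the use of contractions as a stand-in for everyday language, and the short verb list are all assumptions made purely for illustration.

```python
import re

# Hypothetical list of "action" verbs, chosen only for illustration.
ACTION_VERBS = {"need", "want", "grab", "do", "make", "tell", "ask", "look", "run"}


def style_features(text):
    """Return crude, illustrative style scores for a passage of text.

    "colloquial" is the share of words containing an apostrophe
    (a rough proxy for contractions, and it also counts possessives),
    and "action" is the share of words found in ACTION_VERBS.
    """
    # Split into lowercase word tokens, keeping straight and curly apostrophes.
    words = re.findall(r"[A-Za-z’']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty input
    contractions = sum(1 for w in words if "'" in w or "’" in w)
    actions = sum(1 for w in words if w in ACTION_VERBS)
    return {"colloquial": contractions / total, "action": actions / total}


if __name__ == "__main__":
    print(style_features("I don't want to run."))
```

A real system would presumably weigh far more signals than this and learn from thousands of novels rather than a hand-picked word list, but the basic move is the same: turn qualities of the writing into figures that can be compared across books.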

One other major idea that needs to be taken into account is that of the “zeitgeist,” or the spirit of the times. Basically, what’s contemporary is what sells. This adds an element of the unknown and makes it difficult to predict what will be popular in the near future. There’s also the human element: it is difficult, if not impossible, to foresee how readers will act in the future. In a way, it makes sense that a human should be picking the next bestseller, as the algorithm cannot empathize with characters or be moved by a good story. After all, a computer can analyze semantics as much as it wants, but it’s not the one reading the book. That’s the job of readers all over the globe.

While big data may well improve our understanding of how humans think, it’s important to remember that humans are unpredictable by nature. Any attempt to predict the future based on statistics or metrics, however helpful it seems, could mean nothing, as people often behave irrationally. Technology can help narrow the gap between what we know and what people will do next, but people are people, not machines.

What are your thoughts on using big data to find new audiences and better understand your own market? Let us know in the comments.