How better metadata would help the New York Times

by Evan Goris, on May 23, 2014 8:00:00 AM

The New York Times spent 15 years on a recipe database without ever tagging ingredients or cooking time. The lack of useful metadata associated with its content is symptomatic of the paper's failure to adapt to the digital age. Here is how the paper and its online readers could profit from proper metadata.

The New York Times' newsroom innovation team was given six months to ask big questions about the newspaper's digital strategy. The report was leaked to the press and has been the talk of the digital town for some time now.

Transformation to digital-first

I won't spend too much time capturing the highlights. There are some good summaries like this one from Nieman Journalism Lab questioning the capability of a print-based organization, founded in the 19th century, and led by people who don't truly understand digital, to successfully transform into a digital-first operation.

Because at Xillio we care about quality content and powerful metadata, I'd like to focus on the New York Times' attempts at structuring and tagging their immense collection of content dating back to the newspaper's founding in 1851.

Metadata team have failed

You'd expect a content-centric organization with such a massive archive to have adopted structuring and tagging information ages ago. How else can such a huge knowledge database be put to use effectively? Granted, there is an “archive, metadata, and search team” in place within the newspaper’s organization. But it looks like they have done a terrible job so far. Probably because they haven't been given the room they deserve.

“The Times is woefully behind in its tagging and structured data practices,” writes Nieman Lab. This harsh conclusion can easily be supported by looking at just one article. A news story titled ‘Full Recovery Still Years Away for Many in Euro Zone’, serving as a random example, has only three visible attributes: category (International Business), author (Liz Alderman), and publication date (May 15th, 2014).

Missing metadata and their benefits

It didn’t take the innovation team much imagination to think of additional metadata that could have been added to stories and the way these tags could be put to use. Applying their suggestions to the above article:

  • Add the geographic location of the story, allowing content to surface based on a user’s location.
  • Assess the timeliness of the piece: will it be used tomorrow to wrap the fish, will it stay in the news for a couple of weeks, or is it a story with ‘eternal’ value? This will make it easier to feature and recommend older (yet still relevant) stories.
  • Indicate the type of story. Is it breaking news, a profile, a news analysis? This type of metadata not only helps present content better, it also allows for a much better analysis of users’ reading behavior. One reader could be more interested in analyses, while another only comes for breaking news stories.
  • Add a story thread like ‘Economic crisis in the Euro Zone’. This helps readers to follow ongoing stories and is helpful in organizing the archives. By the way: it took seven years for the Times to begin to tag stories “September 11”!
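
To make these suggestions concrete, here is a minimal sketch in Python of what a tagged story record could look like. It is not the Times' actual data model: every field name and helper function is hypothetical, the values assigned to the Euro Zone article are illustrative guesses, and the second story in the sample archive is invented purely to show the thread lookup.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional, Set


class Timeliness(Enum):
    DAILY = "daily"          # wraps the fish tomorrow
    WEEKS = "weeks"          # stays in the news for a couple of weeks
    EVERGREEN = "evergreen"  # 'eternal' value


@dataclass
class Story:
    # The three attributes visible on the example article today.
    title: str
    category: str
    author: str
    published: date
    # Suggested additions (hypothetical field names).
    location: Optional[str] = None
    timeliness: Optional[Timeliness] = None
    story_type: Optional[str] = None
    threads: Set[str] = field(default_factory=set)


def stories_in_thread(stories: List[Story], thread: str) -> List[Story]:
    """Return all stories tagged with a thread, oldest first."""
    tagged = [s for s in stories if thread in s.threads]
    return sorted(tagged, key=lambda s: s.published)


def evergreen(stories: List[Story]) -> List[Story]:
    """Return older stories that are still worth recommending."""
    return [s for s in stories if s.timeliness is Timeliness.EVERGREEN]


euro_story = Story(
    title="Full Recovery Still Years Away for Many in Euro Zone",
    category="International Business",
    author="Liz Alderman",
    published=date(2014, 5, 15),
    location="Euro Zone",              # assumed value
    timeliness=Timeliness.WEEKS,       # assumed value
    story_type="news analysis",        # assumed value
    threads={"Economic crisis in the Euro Zone"},
)

# Invented placeholder story, only here to demonstrate the lookups.
background = Story(
    title="Hypothetical background piece",
    category="International Business",
    author="Example Author",
    published=date(2014, 5, 1),
    timeliness=Timeliness.EVERGREEN,
    story_type="profile",
    threads={"Economic crisis in the Euro Zone"},
)

archive = [euro_story, background]
dossier = stories_in_thread(archive, "Economic crisis in the Euro Zone")
```

With only four extra fields per story, assembling a dossier or surfacing older but still relevant pieces becomes a simple filter rather than an editorial research project.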

(Compare this to The Guardian, which can not only quickly assemble and publish a dossier, but even allows users of the website to put together their own dossiers by combining self-chosen tags. The British newspaper has published a series well worth reading about its metadata strategy, aptly labeled ‘Tags are magic’.)

Poor tagging is costing them

The New York Times decided against better tagging in 2010, the internal report indicates. A fatal decision. Poor metadata means the paper is losing money by the truckload. Fifteen years (!) have been spent on putting together a useful recipe database, with all efforts failing for a trivial reason: recipes were never tagged by ingredients and cooking time.
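
To show how little structure a useful recipe search needs, here is a minimal sketch in Python. The `Recipe` record, the `find_recipes` helper, and the sample recipes are all invented for illustration; nothing here reflects the Times' actual recipe data.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Recipe:
    title: str
    ingredients: FrozenSet[str]  # the tag the recipes never had
    cooking_minutes: int         # the other missing tag


def find_recipes(recipes: Iterable[Recipe], ingredient: str,
                 max_minutes: int) -> List[Recipe]:
    """Return recipes that use an ingredient and fit a time budget."""
    return [
        r for r in recipes
        if ingredient in r.ingredients and r.cooking_minutes <= max_minutes
    ]


# Invented sample data.
recipes = [
    Recipe("Tomato soup", frozenset({"tomato", "onion"}), 30),
    Recipe("Slow-roasted tomatoes", frozenset({"tomato", "olive oil"}), 180),
    Recipe("Onion tart", frozenset({"onion", "flour", "butter"}), 60),
]

quick_tomato = find_recipes(recipes, "tomato", max_minutes=45)
```

Without the two tagged fields, the same question ("what can I cook with tomatoes in under 45 minutes?") can only be answered by reading every recipe.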

Nor can the money lost be made up by additional income from, for example, photography licenses: lacking structured data, the sale of photos cannot be automated.

Much work to be done then at the New York Times. Structuring data and adding metadata to content are only a small portion of it. Small, but essential to survival in the digital age.

Call in the specialists

The innovation team also calls for hiring digital talent. One of the much-needed roles is a content specialist to teach all newsroom reporters how to summarize and tag their stories in a meaningful way. Talented information architects wouldn’t hurt either. The content management system needs to be redesigned to allow for easy tagging of content in the first place. This could be a tough assignment, as the report states that the NY Times’ leaders would be ‘horrified’ if they understood the content management situation.

And if the New York Times need help quickly adding metadata to their huge archive – we’re here to help.


