The World Bank have investigated the popularity of their policy reports and found almost one-third is never read! I’m not surprised. The format in which the reports are presented and the quality of the metadata associated with them tell a large part of the story.
The World Bank conducted internal research to investigate the exposure of their policy reports. From a database of 130,000 publicly available documents, they selected 1,600 reports produced between 2008 and 2012, all of them published on the external website as PDF documents. The World Bank wanted to know how many times each report had actually been downloaded and how often it had been cited.
The results are shocking, though hardly surprising. I'll go into that later. It turns out:
To put these numbers into perspective: worldbank.org attracts around 3 million monthly visits.
Source: World Bank report
The Washington Post concludes that solutions to world problems might be buried in PDFs nobody reads. They stress the omnipresence of PDF as the single document format in the think tank industry, government, and universities. Data journalist Christopher Ingraham invites all of them to perform the kind of research the World Bank have now bravely done.
I find it baffling, by the way, that the World Bank should receive praise for researching the effectiveness of their publications. Their goal is not just to make information available; it is to have as many people as possible actively use the info to make the world less poor. So why, after throwing out 300 reports a year since 2008, should the first Omniture report be created 6 years later? I would have expected them to monitor this from the get-go.
There are two main reasons why these figures do not surprise me. The first has to do with the nature of the portable document format, the second with the metadata associated with the reports.
PDF is in many respects the easy way out for people who haven't truly adopted online. You write up your document in a word processor, as if you intend to print it, and save it as a PDF to throw it online. The contents of the PDF are completely locked inside. That is a pity, particularly in the case of the World Bank, who could unleash a lot of useful data for data journalists and others to build upon.
PDF is a mobile-unfriendly document format, although this has improved considerably in recent years, for instance because Chrome can display a PDF in a browser window. Nevertheless, with more and more people accessing the internet only through their mobile phones, PDF is not the best format. HTML is the way to go, not only for reasons of accessibility, but also because search engines can index it more easily.
Which brings me to the second point. One main reason documents are not downloaded is that people can't find them. Or even if they can, it is unclear what to expect from their contents, so users won't bother. Ironically, the research into their reports not being read is only available as a PDF. Let's take a look at its metadata. For starters, the report has a totally uninterpretable file name: WPS6851.pdf. The document properties aren't much to look at either, with 'World Bank Document' as title and no subject or keywords.
Source: World Bank report (PDF)
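The kind of check I did by hand here can be sketched in a few lines of Python. This is only an illustration: the function name, the list of generic titles, and the rule for spotting an opaque file name are my own assumptions, and the properties are typed in as a plain dictionary rather than read from the PDF itself.

```python
import re

# Assumption: titles we treat as placeholders that say nothing about the content.
GENERIC_TITLES = {"", "untitled", "world bank document"}


def audit_metadata(filename, properties):
    """Return a list of findings for one document (hypothetical helper)."""
    findings = []

    # A name such as "WPS6851" is a series code, not a description.
    stem = filename.rsplit(".", 1)[0]
    if re.fullmatch(r"[A-Za-z]{0,5}\d+", stem):
        findings.append("file name is an opaque code")

    # The title property should describe the report, not the publisher.
    title = (properties.get("title") or "").strip()
    if title.lower() in GENERIC_TITLES:
        findings.append("title is missing or generic")

    # Subject and keywords help both search engines and readers.
    for field in ("subject", "keywords"):
        if not (properties.get(field) or "").strip():
            findings.append(f"{field} field is empty")

    return findings


# The values below are the ones described in the text.
for finding in audit_metadata(
    "WPS6851.pdf",
    {"title": "World Bank Document", "subject": "", "keywords": ""},
):
    print(finding)
```

Run over a whole repository listing, a check like this would quickly show how many documents share the same problems.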
Admittedly, the page on the website where the PDF is published does provide some meta information. There we see – for what it’s worth – that this is what the World Bank calls a Policy Research Working Paper. It provides author profiles, an abstract, and a document name: "Which World Bank reports are widely read?" The meta description provided to Google is merely the repeated title.
My question: why aren't these metadata included in the actual document? And I'm wondering whether an abstract alone is enough to trigger one's interest. At 1,162 characters, it's on the longer side. A decent meta description of 156 characters at most should suffice and can be reused in many useful ways. A couple of keywords would be really helpful, too.
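To show how little effort a usable meta description takes, here is a minimal Python sketch that shortens an abstract to the 156-character limit mentioned above and wraps it in an HTML meta tag. The function names and the sample abstract are made up for illustration; a description written by an editor will always beat a truncated abstract.

```python
import html

MAX_LENGTH = 156  # the limit suggested in the text above


def meta_description(abstract, max_length=MAX_LENGTH):
    """Shorten an abstract to at most max_length characters (hypothetical helper)."""
    text = " ".join(abstract.split())  # collapse line breaks and double spaces
    if len(text) <= max_length:
        return text
    # Reserve three characters for "...", then cut at the last full word.
    cut = text[: max_length - 3].rsplit(" ", 1)[0]
    return cut.rstrip(".,;:") + "..."


def meta_tag(description):
    """Render the description as an HTML meta tag, escaping quotes."""
    escaped = html.escape(description, quote=True)
    return f'<meta name="description" content="{escaped}">'


# Invented sample text, not the real abstract of the report.
sample = (
    "This is a placeholder abstract that runs on for quite a while. " * 5
)
print(meta_tag(meta_description(sample)))
```

The same short description could also be written into the subject field of the PDF, so the document carries it along wherever it ends up.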
In conclusion, it’s a good thing the World Bank are analyzing the popularity of their reports. Many other organizations that produce information on a large scale should follow their lead. But two fundamental issues with the way their policy reports are published are left unaddressed. A closed document format like PDF and the lack of meaningful metadata associated with the documents do nothing to help their contents be found and shared, freely and easily.