I don't know the size of my site. Can you help?
by Corné van Leuveren, on Nov 13, 2013 1:58:00 PM
How can you control your brand image and the communications to your target audiences if you don't know what's on your site? You need insight into the size of your website to make better decisions.
It's often difficult to know the size of a site that has been running for quite a while, especially if it has been ported from legacy platforms and administered by a succession of groups across your organization.
The business pressures of redesigns and site structure changes often leave large sections of the site orphaned: no longer reachable through the menu structure, yet still squirrelled away and publicly available via search engines.
The old editors have moved on and nobody really knows what's published or how large the site is. It's also hard to know which pages are ever visited. Are these pages still relevant and important and, if so, to which of your target audiences?
Escaping from this situation is not so difficult if you know where you're going and have solid information on hand to make the right decisions to get you there.
Your Own Crawler
Gathering this information doesn't have to be a tedious manual process, since tools exist to automate it. Much as a search engine crawls your site for information, you can configure your own crawler to iterate over the entire site and gather statistics about your pages, their contents, and the relationships between them.
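To make the idea concrete, here is a minimal sketch of such a crawler in Python, using only the standard library. It is an illustration under stated assumptions, not a production tool: the `crawl` function, the `LinkCollector` class and the `fetch` callable are hypothetical names invented for this example, and a real crawler would also need to handle robots.txt, redirects, rate limiting and error pages. The `fetch` callable is passed in so the sketch can be tried against a small in-memory site before pointing it at a real one.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href/src targets found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.targets.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl of every internal URL reachable from start_url.

    fetch(url) is a hypothetical callable that returns a
    (content_type, body_bytes) tuple, or None if the URL cannot be retrieved.
    Returns {url: {"type": ..., "bytes": ..., "links": [internal urls]}}.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    site = {}
    while queue:
        url = queue.popleft()
        result = fetch(url)
        if result is None:
            continue
        content_type, body = result
        links = []
        # Only HTML pages are parsed for further links.
        if content_type.startswith("text/html"):
            parser = LinkCollector()
            parser.feed(body.decode("utf-8", errors="replace"))
            for target in parser.targets:
                absolute, _fragment = urldefrag(urljoin(url, target))
                # Keep internal links only: same host as the start URL.
                if urlparse(absolute).netloc == host:
                    links.append(absolute)
                    if absolute not in seen:
                        seen.add(absolute)
                        queue.append(absolute)
        site[url] = {"type": content_type, "bytes": len(body), "links": links}
    return site


# A tiny in-memory "site" (made-up URLs) standing in for real HTTP requests.
demo_pages = {
    "http://example.com/": (
        "text/html",
        b'<a href="/about">About</a> <a href="http://other.org/">Elsewhere</a>',
    ),
    "http://example.com/about": (
        "text/html",
        b'<a href="/">Home</a> <a href="/brochure.pdf">Brochure</a>',
    ),
    "http://example.com/brochure.pdf": ("application/pdf", b"%PDF-1.4"),
}

demo_site = crawl("http://example.com/", demo_pages.get)
for page_url, info in demo_site.items():
    print(page_url, info["type"], info["bytes"], len(info["links"]))
```

The result records, for each URL, its content type, its size in bytes and the internal links it contains, which is the raw material for the statistics and relationships described above. Note that a crawler like this only finds what is linked; truly orphaned pages have to be discovered from another source, such as server logs or a search engine's index.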
Site Size Revealed
By following all the internal links in your site navigation and within the pages, you'll reveal how large your site really is and what content it contains. Size is a function not only of editorial content such as pages, but also of downloads such as Office documents, PDFs, and images. All of these contribute to site size and must be accounted for.
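As a rough illustration of that accounting, the sketch below groups crawl results by content type and totals the item count and byte size for each group. The function names, the record format of (URL, content type, size in bytes) and the mapping from MIME types to categories are all assumptions made for this example; adjust them to whatever your own crawler reports.

```python
from collections import defaultdict

# Assumed grouping of MIME types into the kinds of content named above.
CATEGORIES = {
    "text/html": "pages",
    "application/pdf": "PDFs",
    "application/msword": "Office documents",
    "application/vnd.ms-excel": "Office documents",
    "application/vnd.ms-powerpoint": "Office documents",
}


def categorize(content_type):
    """Map a Content-Type header value to a coarse content category."""
    mime = content_type.split(";")[0].strip().lower()
    if mime.startswith("image/"):
        return "images"
    if mime.startswith("application/vnd.openxmlformats-officedocument"):
        return "Office documents"
    return CATEGORIES.get(mime, "other")


def summarize(records):
    """Total the item count and byte size per category.

    records is an iterable of (url, content_type, size_in_bytes) tuples.
    """
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for _url, content_type, size in records:
        bucket = summary[categorize(content_type)]
        bucket["count"] += 1
        bucket["bytes"] += size
    return dict(summary)


# Made-up sample records, purely for illustration.
sample = [
    ("http://example.com/", "text/html; charset=utf-8", 100),
    ("http://example.com/about", "text/html", 50),
    ("http://example.com/brochure.pdf", "application/pdf", 2000),
    ("http://example.com/logo.png", "image/png", 300),
]

for category, totals in summarize(sample).items():
    print(category, totals["count"], totals["bytes"])
```

Counting items and bytes separately is a deliberate choice: a site can be large in page count yet small in bytes, or the other way round when a few heavy downloads dominate.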