Content Auditing: A 60-Minute Challenge

The challenge:

A colleague stated “I want to keep an up to date content audit with our group of diffuse content publishers (10 or so) across a few sites.”

The initial approach:

My colleague's suggestion was to keep a register of content that people update whenever they publish to a site. In other words, “as soon as they publish a page, or update a page, they find our shared library of content tracker spreadsheet, and log it into the tracker”. As a practitioner, my view is that this becomes an overhead for each publisher that has no real value to them, and they’ll stop doing it as soon as no one is watching.

I asserted that there must be an automatic content audit / sitemapper tool available for a use case like this. So I went looking for one as a 60-minute project.

Attempt 1: What does Google say?

The main hits for the term “sitemapper” included the following.

I got to a successful sitemap in three minutes with https://visualsitemaps.com/. It is a good service if you need to visualise all the content in your site and, like many others, it has a freemium model with paid tiers for bigger sites. XML Sitemaps was also super simple, and great if you only need text / XML.

Unfortunately, I didn’t find API connections that I could use to build this into an alerting or workflow tool. This might be a symptom of how few minutes I spent searching, but it might also be a hole in the generally available product offerings.

Attempt 2: Build-it-yourself

I didn’t attempt this, but the next approach would be to script it:

  • Use wget with the appropriate options to map a full site.
  • Check the sitemap into a Git repo
  • Push that repo and its changes up to GitHub
  • Put alerts around the GitHub repo and plug them into Slack or another notifier channel
  • If need be, automate using GitHub Actions
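
As a rough illustration of the "map, then track changes" part of these steps, here is a minimal Python sketch. I have not run it against a real site. It assumes the map is already available as XML in the sitemaps.org format (for example, exported from one of the services above), and the function names and example.com URLs are hypothetical. It turns a sitemap into a sorted, one-URL-per-line snapshot that diffs cleanly in Git, and reports which pages were added or removed between two snapshots.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format (assumed input format).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return the sorted, de-duplicated page URLs found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    found = {
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    }
    return sorted(found)


def snapshot_text(urls):
    """Render URLs one per line, so a Git diff shows one changed line per page."""
    return "".join(url + "\n" for url in sorted(urls))


def diff_urls(old_urls, new_urls):
    """Compare two URL lists and report which pages appeared or disappeared."""
    old_set, new_set = set(old_urls), set(new_urls)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }
```

The snapshot text is what would be committed to the repo. The output of diff_urls is what a notifier message could be built from. This only detects pages that were added or removed, not pages whose content was edited; catching edits would need something extra, such as the optional lastmod field in the sitemap, if the site populates it.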

A bit overkill for a kicking-the-tires project, but reasonable if I had a live issue to solve for a client. I’ll be sure to update this post if I ever get there.

Attempt 3: Build-it-from-opensource

I don’t think this approach is really needed. The KISS principle applies, absolutely.

Summary

There are a few good services out there that could form the bones of this utility, and they are worth a further look if this is a serious operating issue. Personally, I think this is a missing product in a lot of site-monitoring toolsets, but not quite big enough to warrant building a full business around it.

Time Commitment

This entire exercise took about 45 minutes. That includes roughly 40% of the time lost to rewriting this post twice more due to Squarespace's unstable in-app markdown renderer.
