Content Auditing: A 60 Minute Challenge
The challenge:
A colleague stated “I want to keep an up to date content audit with our group of diffuse content publishers (10 or so) across a few sites.”
The initial approach:
The approach my colleague suggested: keep a register of content that people update whenever they publish to a site. In other words, “as soon as they publish a page, or update a page, they find our shared library of content tracker spreadsheet, and log it into the tracker”. As a practitioner, my view is that this becomes an overhead for each user that has no real value to them, and they’ll stop doing it as soon as no one is watching.
I asserted that there must be an automatic content audit / sitemapper tool available for a use case like this. So I went looking for it as part of a 60 minute project.
Attempt 1: What does Google say?
The main hits for the term “sitemapper” included:
- Site map generator services, including Visual Sitemaps, XML Sitemaps, Slickplan
- Browser extensions like Sitemapper
- Site map open source libraries, like Sitemapper
I got to a successful sitemap in three minutes with https://visualsitemaps.com/ – a good service, if you need to visualise all the content in your site, and like many, they have a freemium software model that tiers for bigger sites. XML Sitemaps was also super simple, and great if you need only text / xml.
Unfortunately, I didn’t find API connections that I could use to build this into an alerting or workflow tool. This might be a symptom of the few minutes I spent searching, but it might also be a hole in the generally available product offerings.
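Even without an API, an exported XML sitemap can be turned into tracker rows locally. Here is a minimal sketch in Python, assuming the export follows the standard sitemaps.org format. The function name sitemap_rows is my own invention and not part of any of the services above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol (assumed format of the export).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_rows(xml_text):
    """Return sorted (url, lastmod) pairs from a sitemap XML string.

    lastmod is an empty string when the entry does not provide one.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if loc:
            rows.append((loc, lastmod))
    return sorted(rows)
```

Each pair could be pasted into the shared tracker spreadsheet, which removes the manual logging step for publishers.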
Attempt 2: Build-it-yourself
I didn’t attempt this, but the next approach would be to script it:
- Use wget with the appropriate options to map a full site.
- Check the sitemap into a Git repo
- Push that repo and changes up to GitHub
- Put alerts around the GitHub repo and plug into Slack or another notifier channel
- If need be, automate using GitHub Actions
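The core of those steps is comparing one sitemap snapshot against the previous one, which is what the Git diff would surface. A rough sketch in Python, assuming each snapshot is a plain text file with one URL per line. The function names are hypothetical, and the crawl and the Slack hook are left out.

```python
def read_urls(text):
    """Parse a snapshot: one URL per line, blank lines ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def diff_snapshots(old_text, new_text):
    """Return the URLs added to and removed from the site between snapshots."""
    old, new = read_urls(old_text), read_urls(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def format_alert(changes):
    """Build a short message suitable for posting to a notifier channel."""
    lines = [f"+ {url}" for url in changes["added"]]
    lines += [f"- {url}" for url in changes["removed"]]
    return "\n".join(lines) if lines else "No page changes."
```

This only catches pages that appear or disappear. Catching edits to existing pages would need a content hash or last-modified date per URL, which I haven't sketched here.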
A bit overkill for a tire-kicking project, but reasonable if I had a live issue to solve for a client. I’ll be sure to update this post if I ever get there.
Attempt 3: Build-it-from-opensource
I don’t think this approach is really needed. The KISS principle applies, absolutely.
Summary
There are a few good services out there that could form the bones of this utility – worth a further look if this is a serious operating issue. Personally, I think this is a missing product in a lot of site-monitoring toolsets, but not quite big enough to warrant a full-business build.
Time Commitment
This entire exercise took about 45 minutes. This includes about 40% time loss rewriting this post twice more due to Squarespace's unstable in-app markdown renderer.