December brought us the latest piece of algorithm update fun. Google rolled out an update that was quickly named the Maccabees update, and the articles began rolling in (SEJ, SER).

Webmaster complaints came in thick and fast, and I began my normal plan of action: sit back, relax, and laugh at all the people who had built bad links, spun out low-quality content, or picked a business model that Google has a grudge against (hello, affiliates).

Then I checked one of my sites and saw I’d been hit by it.

Hmm.

Time to check the obvious

I don't have access to a lot of sites that were hit by the Maccabees update, but I do have access to a relatively large number of sites overall, which let me look for patterns and try to work out what was going on. Full disclaimer: this is a relatively in-depth investigation of a single site; it might not generalize to your own.

My first port of call was to verify there weren't any really obvious issues, the kind Google hasn't looked kindly on in the past. This isn't any sort of official list; it's more the internal set of things I go and check when things go badly wrong.

Dodgy links & thin content

I know the site well, so I could rule out dodgy links and serious thin content problems pretty quickly.

(For those of you who’d like some pointers on the kinds of things to check for, follow this link down to the appendix! There’ll be one for each section.)

Index bloat

Index bloat is where a website has accidentally gotten a large number of non-valuable pages into Google's index. It can be a sign of crawling issues, cannibalization issues, or thin content problems.
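If you want a rough first pass at spotting it, here's a minimal sketch (assuming a single flat XML sitemap; the sitemap URL and indexed count below are placeholders for your own) that compares the URLs you actually want indexed against the indexed count Search Console reports:

```python
import xml.etree.ElementTree as ET

import requests

# Placeholders: substitute your own sitemap URL and the indexed-URL
# count reported in Search Console's Index Status report.
SITEMAP_URL = "https://www.example.com/sitemap.xml"
REPORTED_INDEXED = 48000

def count_sitemap_urls(sitemap_url):
    """Count <loc> entries in a standard (non-index) XML sitemap."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(requests.get(sitemap_url, timeout=30).content)
    return len(root.findall("sm:url/sm:loc", ns))

wanted = count_sitemap_urls(SITEMAP_URL)
print(f"{wanted} URLs in sitemap vs {REPORTED_INDEXED} indexed")
if REPORTED_INDEXED > 2 * wanted:
    print("Indexed count far exceeds the sitemap: possible index bloat.")
```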

Did I call the thin content problem too soon? I did actually have some pretty severe index bloat. The site which had been hit worst by this had the following indexed URLs graph:

However, I'd actually seen that step-function-esque index bloat on a couple of other client sites, which hadn't been hit by this update.

In both cases, we'd spent a reasonable amount of time trying to work out why and where it was happening, but after a lot of log file analysis and Google site: searches, nothing insightful came out of it.
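For a sense of what that log file analysis involves: at its core, it's just tallying where Googlebot spends its time. A minimal sketch, assuming combined-format access logs (the filename is illustrative, and for anything serious you'd verify hits against Google's published IP ranges rather than trusting the user agent string):

```python
import re
from collections import Counter

# Combined-format access log line, e.g.:
# 66.249.66.1 - - [12/Dec/2017:10:00:00 +0000] "GET /blog/post HTTP/1.1" 200 ...
LINE_RE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP[^"]*" (?P<status>\d{3})')

hits = Counter()
with open("access.log") as f:  # illustrative filename
    for line in f:
        if "Googlebot" not in line:
            continue
        m = LINE_RE.search(line)
        if m:
            # Bucket URLs by top-level section, e.g. /blog/post -> /blog
            section = "/" + m.group("path").lstrip("/").split("/", 1)[0]
            hits[(section, m.group("status"))] += 1

for (section, status), count in hits.most_common(20):
    print(f"{count:>8}  {status}  {section}")
```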

The best guess we ended up with was that Google had changed how it measured indexed URLs. Perhaps it now includes URLs with a non-200 status until it stops checking them? Perhaps it now includes images and other static files that it wasn't counting previously?

I haven’t seen any evidence that it’s related to m. URLs or actual index bloat — I’m interested to hear people’s experiences, but in this case I chalked it up as not relevant.

Poor user experience/slow site

Nope, not the case either. Could the site be faster or more user-friendly? Absolutely. Most sites could be, but I'd still rate this one as good.

Overbearing ads or monetization?

Nope, no ads at all.

The immediate sanity checklist turned up nothing useful, so where to turn next for clues?

Internet theories

Time to plow through various theories on the Internet:

  1. The Maccabees update is mobile-first related
    • Nope, nothing here; it's a mobile-friendly, responsive site. (The first two theories are summarized here.)
  2. E-commerce/affiliate related
    • I've seen this one batted around as well, but it didn't apply in this case: the site is neither an e-commerce store nor an affiliate site.
  3. Sites targeting keyword permutations
    • I saw this one from Barry Schwartz; it's the theory that comes closest to applying. The site doesn't have a vast number of combination landing pages (for example, one for every single combination of dress size and color), but it does have a lot of user-generated content.

Nothing conclusive here either; time to look at some more data.

Working through Search Console data

We've been storing all our Search Console data in Google's cloud-based data analytics tool BigQuery for some time, which gives me the luxury of being able to immediately pull out a table and see all the keywords that have dropped.
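For the curious, that's a single query. Here's a sketch of the sort of thing I mean, with an illustrative project, dataset, schema, and set of date windows rather than our actual setup:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Illustrative table of daily Search Console exports with columns
# date, query, page, position. Swap in your own project/dataset/schema.
sql = """
SELECT
  query,
  AVG(IF(date BETWEEN '2017-11-14' AND '2017-12-11', position, NULL)) AS pos_before,
  AVG(IF(date BETWEEN '2017-12-12' AND '2018-01-08', position, NULL)) AS pos_after
FROM `my-project.search_console.daily`
GROUP BY query
HAVING pos_after - pos_before > 5  -- dropped more than five positions
ORDER BY pos_after - pos_before DESC
"""

for row in client.query(sql).result():
    print(f"{row.query}: {row.pos_before:.1f} -> {row.pos_after:.1f}")
```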

There were a couple of keyword permutations/themes that were particularly badly hit, and I started digging into them. One of the joys of having all the data in a table is that you can do things like plot the rank of each page that ranks for a single keyword over time.
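Here's a sketch of that plot, assuming the day-level data has been pulled into a pandas DataFrame with date, query, page, and position columns (names illustrative):

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_keyword_ranks(df: pd.DataFrame, keyword: str) -> None:
    """Plot the daily rank of every page that ranks for one keyword.

    Expects one row per (date, query, page) with an average position
    for that day.
    """
    subset = df[df["query"] == keyword]
    for page, grp in subset.groupby("page"):
        grp = grp.sort_values("date")
        plt.plot(grp["date"], grp["position"], label=page)
    plt.gca().invert_yaxis()  # rank 1 belongs at the top
    plt.title(f"Pages ranking for: {keyword}")
    plt.legend(fontsize=7)
    plt.show()
```

Inverting the y-axis puts rank 1 at the top, so ranking drops actually read as drops.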

And this finally got me something useful.

The yellow line is the page I want to rank and the page I've seen the best user results from (lower bounce rates, more pages per session, etc.):

Another example: again, the yellow line represents the page that should be ranking correctly.

In all the cases I found, my primary landing…