What is Google’s SMITH algorithm?

About 10 years ago, Google went through a phase of naming its algorithm updates after animals. You might remember the Panda, Penguin and Hummingbird updates.

The search engine has since moved away from this practice, giving its recent updates more factual and, frankly, boring names like ‘May 2020 Core Update’. One notable exception is BERT (Bidirectional Encoder Representations from Transformers), an algorithm designed to help Google better understand human language, which came with an appropriately human-sounding name. Now, Google looks set to add a common surname to the list, with SMITH reportedly the newest kid on the algorithm update block.

Several sources, including Search Engine Journal (SEJ), have mentioned a new Google paper about the Siamese Multi-depth Transformer-based Hierarchical (SMITH) algorithm. This paper doesn’t appear to be available to the general public, but Barry Schwartz from SEJ notes that Google claims it outperforms BERT in certain areas.

Write-ups suggest that SMITH is, in many ways, a development of BERT, rather than a replacement for it. Schwartz’s article is worth a read for a thorough dive into what SMITH is all about, but what interests me is the idea that SMITH is better at understanding longer documents, and what this means for content production.

Currently, BERT is used to highlight sections of a text that relate to a user’s search query, and this can form the basis for Google’s featured snippets, which pull out a section of a piece of content so it is viewable to the searcher without clicking through to the site. However, BERT rarely understands how the passage it has identified fits into the rest of the piece. It might pull out a small throwaway comment, or highlight an argument that is countered or refuted later on. The idea is that SMITH would better comprehend the full document and how well it relates to a search query.
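To make the distinction concrete, here is a minimal, purely illustrative sketch. SMITH’s actual architecture is not public, so this toy uses simple bag-of-words vectors in place of learned transformer embeddings; the example sentences and the query are invented. It contrasts sentence-level matching (the snippet-style behaviour described above, which can latch onto a throwaway comment) with whole-document matching, where the rest of the piece also influences the score.

```python
# Illustrative sketch only: SMITH's real architecture is not public.
# Toy "embeddings" are bag-of-words term-frequency vectors.
import math
from collections import Counter

def embed(text):
    """Toy embedding: a term-frequency vector of lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

query = "is coffee good for you"

document = [
    "coffee is good for you some studies claim",   # throwaway comment
    "however larger reviews refute that claim",    # counter-argument
    "overall the evidence on coffee remains mixed",
]

# Sentence-level matching: score each sentence independently and pick
# the best one, ignoring how the rest of the document qualifies it.
best = max(document, key=lambda s: cosine(embed(query), embed(s)))

# Document-level matching: pool all sentence vectors into one document
# vector, so the counter-argument also influences the overall score.
doc_vector = sum((embed(s) for s in document), Counter())
doc_score = cosine(embed(query), doc_vector)

print("best sentence:", best)
print("document score: %.2f" % doc_score)
```

The sentence-level pass confidently returns the first sentence, even though the document as a whole walks that claim back; the pooled document score reflects all three sentences at once, which is the intuition behind matching a query against a full document rather than a single passage.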

That reopens a long-running debate about the ideal length for a blog post – something content developers have never really agreed upon. A rough consensus of 400–600 words has emerged, but studies suggest the average blog post now runs to over 1,000 words. A move towards SMITH could place a higher value on longer content, meaning that structured, well-balanced pieces are favoured in search results ahead of punchy short articles that provide definite, but not necessarily well-argued, answers and advice.

As for whether and when Google will start using SMITH, that’s not entirely clear. The suggestion is that SMITH is currently being discussed speculatively, but as we’ve seen just this week, Google has a habit of tweaking features and not telling us until much later. With Search Engine Roundtable reporting two possible Google algorithm updates within the space of five days, it would be no major surprise if SMITH is already in effect.

We’ll certainly keep an eye out for ranking fluctuations, and for advice on how to keep your content in tune with Google algorithm developments, we’re here to help at Engage Web.

John Murray
Content Team Leader at Engage Web
John works for Engage Web as a Content Team Leader and regularly contributes to the website and programmes of his beloved Chester F.C.
