Joshua Paling

The bare minimum developers should know about SEO

Slides and transcript of my recent talk given at both #rorosyd and Alt.net meetups this November:

Hi, I'm Joss, and I am not an SEO. However, I know enough to be able to call out really bad SEO advice when I see it, and that's something I feel every developer could benefit from, because over the span of your career, there's a non-trivial chance you'll come across it. I should warn you that I've compromised flow and ease of following along in favour of cramming a lot of info into 15 minutes, so I'll move quickly, but I'll post this stuff on my blog afterwards.

A quick bit of prerequisite knowledge before I start: nowadays, when Googlebot sees your site, it basically sees it through a headless browser. It's aware of CSS and some JavaScript - so things like tabs and expandable sections, it's aware of those.

The nature of SEO makes it a potentially very smoke-and-mirrors industry. Business owners are desperate to rank on the first page of Google, SEO is relatively mysterious to most people, and it's very hard to accurately track results all the way from implementing website copy changes and other SEO advice, through to improved search rankings, then on to improved conversion rates, and ultimately to cash in the bank and an improvement to the business's bottom line. Because that whole pipeline is so tricky to track, there are good SEOs, but there are certainly others who are basically just good talkers (which, by the way, I'm becoming increasingly convinced is the single most important skill to have if you want to do well in any business!). Anyway… I aim to give you a sort of top-level, common-sense view of SEO from 1000 ft. And we'll start with a very brief and drastically over-simplified example of iteratively building a search engine.

So, world domination and conspiracy theories aside, Google has one agenda. When you search for pizza, it wants to give you the world's best resources for pizza. So imagine you're building a search engine from scratch, way back when the internet was a baby. And you think "this is simple. I'll just count the number of times a page mentions pizza, and the one that mentions it the most comes up first." But Tony, from Tony's Pizza, catches on to this. So he puts pizza pizza pizza hundreds of times, in white text against a white background, at the bottom of the page, just to get to the top. And get to the top he does. He ranks number 1 for pizza. But you catch on to him and realise: it's not a genuinely great resource, he's just managed to find a loophole in your algorithm. So you go back to the drawing board, and say "I'm actually going to penalise sites that do that spamming trick." So Tony's Pizza drops off the first page. And you say "what I'm actually looking for is an optimal keyword density.

"On the best sites, pizza will make up 3% of the words on the page." Of course, people catch on and cheat that too, making sure their sites hit the optimal keyword density, even if it means making the reading experience clunkier for the end user. So you go back to the drawing board, and say "well, there's nothing magic about 3% as a keyword density. It's probably more of a bell curve situation. And if a keyword appears in the main heading of the page, or in the first paragraph, that should count for a bit more than a word down the bottom. And if a word appears in the domain name itself, like tonyspizza.com.au, that counts for even more", and so on. Of course, people catch on to these changes too, and the cycle continues.
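To make that toy example a bit more concrete, here's a rough sketch in Python of the kind of naive scoring our imaginary engine might do - a bell-curve-ish density score plus bonuses for the keyword appearing in the title or the domain. It's purely illustrative: the 3% target, the bonus values and everything else are made up, and it's nothing like Google's actual algorithm.

```python
import re

def keyword_score(keyword, title, body, domain):
    """Toy relevance score for our imaginary search engine.
    Purely illustrative - the weights are invented, not Google's."""
    kw = keyword.lower()
    words = re.findall(r"[a-z']+", body.lower())
    if not words:
        return 0.0
    density = words.count(kw) / len(words)

    # Bell-curve idea: reward densities near a target instead of
    # "more is always better", so keyword stuffing stops paying off.
    target = 0.03
    density_score = max(0.0, 1.0 - abs(density - target) / target)

    score = density_score
    if kw in title.lower():
        score += 0.5   # keyword in the main heading counts for a bit more
    if kw in domain.lower():
        score += 1.0   # keyword in the domain itself counts for even more
    return score

# Keyword stuffing tanks the density score; the honest page does better.
stuffed = keyword_score("pizza", "Tony's Pizza", "pizza " * 500, "tonyspizza.com.au")
honest = keyword_score(
    "pizza", "Tony's Pizza",
    "Our wood-fired oven arrived from Naples in 1987 and we have been "
    "hand-stretching dough every morning since then, using slow-fermented "
    "starters, local flour and simple toppings on every pizza we serve.",
    "tonyspizza.com.au")
print(stuffed, honest)  # roughly 1.5 vs 2.49
```

The exact numbers don't matter; the point is just that "more keywords" stops being the winning move once density is scored against a target rather than maximised.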

The specifics I mentioned are drastically over-simplified examples, but there is this constant cat-and-mouse game in SEO, where Google's constantly trying to make their algorithm more bullet-proof, and SEOs are constantly trying to figure out how the algorithm works, both to gain a better understanding of what Google sees as a good resource, and potentially also to look for loopholes - shortcuts to the top.

Practices falling under the former umbrella - understanding what Google sees as a good resource and trying to make your site one - are recommended by Google, and are sometimes referred to as "white hat SEO". The latter - cheating the algorithm - is referred to as "black hat SEO". It can often work very well short term, but then lead to bad results long term - in some rare cases even a ban from Google's results.

So if you only remember one thing from this talk, make it this: Google's algorithm is really good now, and getting better all the time. It takes into account over 200 factors when ranking a page, and there's no silver bullet and no easy cheats. The simplest and most future-proof approach to SEO is to actually make your site a genuinely great resource for the terms you want to rank on. If it's pizza, that might mean providing guides on how to make your own pizza dough, reviews of the best at-home pizza ovens, reviews of the best local pizza restaurants, and so on. Make your site the kind of site that users would WANT to see at the top of their search results.

Now, back to building our simplified search engine.

One of the things that has been used to rank sites for a long time, even prior to Google, is links. Links are like votes. If someone links to a site, it's like they're saying "hey, this is a good resource, check it out". So you've got the same cat-and-mouse game with links, of course. In our over-simplified example, you'd start by simply counting the number of links. Most links wins. But then people open "link farm" sites, where you can pay a small fee and get hundreds of links from these link farms and online directories. So you develop a means to rank links on quality - higher quality links count for more votes. Links are still a very central part of SEO, and Google's constantly refining how it uses links to determine good quality sites. At one point, the anchor text of links was very important to Google.
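As a rough illustration of the difference between "count the links" and "weight the links by quality" - the URLs and quality numbers here are invented for the example, and the weighted version is just a stand-in for something PageRank-like, which is recursive and far more involved:

```python
def raw_link_count(inbound):
    """Earliest, naive approach: every link is one vote."""
    return len(inbound)

def weighted_link_score(inbound):
    """Each link weighted by a rough quality score for the linking page
    (near 0.0 for a link farm, up to 1.0 for a trusted editorial site).
    The numbers are invented for illustration."""
    return sum(quality for _source, quality in inbound)

link_farm_links = [(f"directory{i}.example", 0.001) for i in range(300)]
editorial_links = [("smh.com.au/goodfood/best-pizza-in-sydney", 0.9),
                   ("a-popular-food-blog.example/tonys-pizza-review", 0.6)]

print(raw_link_count(link_farm_links), raw_link_count(editorial_links))            # 300 vs 2
print(weighted_link_score(link_farm_links), weighted_link_score(editorial_links))  # ~0.3 vs 1.5
```

Three hundred paid directory links win under the naive count, but two editorial links win once quality is taken into account.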

So in our pizza example, we'd be trying to get people to link to our pizza site with the anchor text "best pizza", or something like that. People had some fun with this: they link-bombed George W Bush's site with the anchor text "miserable failure", so in 2004 if you googled that term, Bush came up #1. In response to link pranks like this, in 2007 Google released an update to its algorithm that basically stopped them from working, which is a shame because I'm sure we could have had some great fun with Tony Abbott. Anyway, links are very important, like I said, so I'll spend some time now talking about links.

The #1 rule: you are not allowed to pay for links in an attempt to manipulate your ranking. People do do this, and it does work, but if Google's algorithm suspects you've paid for links, it'll raise a flag, someone will look into the situation manually, and you can potentially be banned from search results. Because essentially, you're paying to look like a good resource when you're not actually one. Of course, paying for ads is fine - what's not OK is paying for links which are strategically placed and designed to manipulate your ranking rather than simply have people click on them. Soliciting unpaid links in a similar manner is absolutely fine - that's just called marketing. All businesses should do it, and Google has no problem with it.

Next point: not all links are created equal. What if Tony's Pizza has an inbound link coming from cheapviagraonline.com.au? Should that rank his site any higher? Does that lend credibility, or take it away? If anything it should rank his site lower. What about a link from something like Yellow Pages online, just a big directory of businesses with links to each one? Well, Google will count that for something, but not much.

What if a very popular and highly respected site, like the SMH's food and lifestyle section, did an article on the best pizza in Sydney and linked to Tony's Pizza? Well, that's huge, right? That's hundreds of times more of a vote than the Yellow Pages link. What if that same article also linked to 4 other highly popular pizza sites? Google looks at stuff like that. And that'd put Tony's in the same company as those other great resources, so that adds credibility and boosts rankings too.

There are two things to be aware of from this example. First, links from high-ranking sites are much more influential. Second, links from sites on topics related to your site hold more weight than others. This idea of related sites is sometimes described as your "online neighbourhood". The "neighbourhood" of a pizza site is other pizza sites, primarily, but also other food sites, other Italian sites perhaps, and so on. So if a food blogger reviewed your site, that'd be a good link too.

One thing I hope is clear to everyone is just how common-sensical all the points I've mentioned so far are. Of course sites in your neighbourhood should hold more weight. Of course a link from SMH should be better than one from cheap viagra.

Now, for lack of time I'll move through some other points on links quite quickly:

Varied anchor text in inbound links is good, because it looks natural. If 80% of inbound links to Tony's Pizza had the exact same anchor text, say "best pizza", then that may look like a bit of a spammy, artificial attempt to manipulate ranking (there's a small sketch of this idea a little further down). And just expanding on that point, all good SEO looks natural. If some SEO advice seems awkward or makes for a clumsier user experience in an attempt to boost rankings, it's likely wrong, or at least outdated.

Inbound links from .gov and .edu top level domains are particularly good - those TLDs are considered particularly trustworthy (because the government never lies).

Old links are also good. A link to Tony's Pizza that's been around for four years is better than one created last month. And domain age is a factor in determining credibility - both the age of the domain itself (eg tonyspizza.com.au) and the age of the domains that link to it.

Next, reciprocal links. You might've heard they can be an issue. Let's pretend we're Googlebot, who's very pragmatic, and consider reciprocal links. If SMH linked to Tony's Pizza from an article, and Tony's site linked back to that article, saying "check it out", is that OK? Of course! Should that devalue the SMH link in any way? No way. You'd absolutely expect sites to have a certain percentage of reciprocal links.

But what about this situation… imagine if 80% of all inbound links to Tony's were reciprocal links. Does that look natural? No. That looks like Tony's gone around saying "I'll scratch your back if you scratch mine". So a high percentage of reciprocal links can be detrimental to SEO.
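Pulling those last couple of points together, here's a tiny "does this look natural?" check. The thresholds and link data are invented for the example - nobody outside Google knows the real signals or cut-offs - but it captures the intuition that an 80% identical-anchor or 80% reciprocal link profile doesn't look like something that happened organically.

```python
from collections import Counter

def looks_unnatural(inbound_links, max_anchor_share=0.5, max_reciprocal_share=0.5):
    """Toy 'naturalness' check on a link profile.
    inbound_links: list of (anchor_text, is_reciprocal) tuples.
    The thresholds are made up for illustration."""
    total = len(inbound_links)
    if total == 0:
        return False
    anchors = Counter(anchor.lower() for anchor, _ in inbound_links)
    top_anchor_share = anchors.most_common(1)[0][1] / total
    reciprocal_share = sum(1 for _, reciprocal in inbound_links if reciprocal) / total
    return top_anchor_share > max_anchor_share or reciprocal_share > max_reciprocal_share

links = [("best pizza", False)] * 80 + [("Tony's Pizza", True)] * 20
print(looks_unnatural(links))  # True: 80% identical anchor text looks manipulated
```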

I'll move on from links now for lack of time, but the main thing is: quality over quantity, and it should all look natural.

I'll mention a few other gotchas and common SEO questions:

Hidden text. Can google see it? Does it think there's anything wrong with it?

There are certainly legitimate uses for hiding text, such as tabs, expandable sections, etc. And as long as the hidden content isn't pulled in dynamically via AJAX (and nowadays possibly sometimes even if it is), Google can see the content, and it will index it.

Let's say I google "how to make pizza dough". And imagine Tony's Pizza has content about that, hidden by default on tab number 2. Google can see that content. But is that the resource I want to see first in my search results? Probably not. I'd probably rather see a page that has the pizza dough content visible and in plain view by default. So because hidden content is significantly less important in the user experience, Google places significantly less weight on it accordingly. In terms of ranking well for "pizza dough" searches, opting for separate pages rather than tabs may well be the better option.
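A toy way to picture that weighting - the 0.3 multiplier is invented for the example, since Google doesn't publish how much it discounts hidden content, but the shape of the idea is "indexed, just worth less":

```python
def weighted_keyword_count(keyword, visible_text, hidden_text, hidden_weight=0.3):
    """Toy illustration: content hidden by default (tabs, accordions) still
    gets indexed, but counts for less than content in plain view.
    The hidden_weight value is made up for the example."""
    kw = keyword.lower()
    visible_hits = visible_text.lower().count(kw)
    hidden_hits = hidden_text.lower().count(kw)
    return visible_hits + hidden_weight * hidden_hits

# A page with the dough recipe on a hidden tab scores lower than a page
# that shows the same recipe up front.
tabbed = weighted_keyword_count("pizza dough", "Tony's Pizza - menu",
                                "Our pizza dough recipe: mix flour...")
plain = weighted_keyword_count("pizza dough", "Our pizza dough recipe: mix flour...", "")
print(tabbed, plain)  # 0.3 vs 1.0
```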

Likewise, in-page links, like you might use on single page sites. Is Google aware of them? Yes. It'll be able to tell that the bits you link to are important bits of the page. But is the best resource on a topic likely to be a single page site? Probably not. If SEO is a critical part of your strategy, you're better off having a substantial amount of content, spread across several linked pages.

And note again that this is all stuff that very much appeals to common sense. So SEO isn't nearly as mystical as it may first seem, and if you do get SEO advice that seems mystical, or that just seems a bit spammy, deceptive or low class, that's a big red flag. Being a genuinely great resource is key.

Writing copy. When writing content for your site, you should think about the types of terms people will be searching for when you want them to find your content. And, you should make an effort to include those terms in your copy. Of course, making sure to do so in a natural, non-spammy, non-awkward way.

Furthermore, once you think you know the terms people will search for, you should use tools like Google Trends to check, and try to find more popular synonyms or related search terms. For example, if you're an airline, optimising your site for the term "discount airfares" is a big mistake. As the blue line shows, no one searches for that. They all search for "cheap flights" instead, which is shown in red.

Coming back to our pizza dough example, Google Trends tells us that "pizza dough recipe" is a far more popular search term than "homemade pizza dough" or "how to make pizza dough", and we should keep that in mind when writing our copy.
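If you'd rather script that kind of comparison than click around the Google Trends UI, one option is the unofficial pytrends library. This is a sketch assuming its documented API (pip install pytrends), so treat it as a starting point and check it against the library's docs before relying on it:

```python
from pytrends.request import TrendReq

# Compare candidate phrases for our pizza dough page, restricted to Australia.
pytrends = TrendReq(hl="en-US", tz=360)
terms = ["pizza dough recipe", "homemade pizza dough", "how to make pizza dough"]
pytrends.build_payload(terms, timeframe="today 12-m", geo="AU")

interest = pytrends.interest_over_time()  # pandas DataFrame, one column per term
print(interest[terms].mean().sort_values(ascending=False))
```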

While I'm talking about search terms, sometimes people get unnecessarily caught up in a single "trophy phrase". For example, "I don't care about any other terms, I just want to rank #1 for best pizza in Sydney". That strategy is almost always a mistake. People search for lots of different terms, and of the billions of queries Google receives every day, about 15% are completely new terms, never searched before. There's this idea in SEO of optimising for "long tail keywords", that is, longer, more specific, less frequently searched for terms. The "long tail" idea is a visual metaphor for the graph of search terms vs frequency. If you could imagine every search term ever lined up along the x-axis, and the number of times it's been used plotted on the y-axis, the graph would look like this

It looks like most searches come in that green area, of very frequently searched terms. But actually, that tail goes on and on and on. The majority of searches come from the long tail, from the humungous number of infrequently searched for terms. And that's why putting too much attention on a single trophy phrase can be a bad idea.
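As a rough back-of-the-envelope illustration: if you assume search-term frequency follows something Zipf-like (a modelling assumption for the example, not a measured fact about Google's query logs), even a generous definition of the "head" still loses to the tail:

```python
# Zipf-like model: the k-th most popular term is searched proportionally to 1/k.
N_TERMS = 10_000_000  # pretend universe of distinct search terms
HEAD = 1_000          # the "trophy phrase" end of the curve

total = sum(1.0 / rank for rank in range(1, N_TERMS + 1))
head = sum(1.0 / rank for rank in range(1, HEAD + 1))
head_share = head / total

print(f"Top {HEAD} terms: {head_share:.0%} of all searches")
print(f"Everything else (the long tail): {1 - head_share:.0%}")
```

With these (made-up) numbers the top thousand terms account for well under half of all searches, which is the whole argument against betting everything on one trophy phrase.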

I'm probably over 15 minutes at this point, so I'll wrap up. The cool word to use these days, so you can all feel hip, rather than SEO, is inbound marketing. Kinda like how the term UX has replaced design. Inbound marketing is basically an acknowledgement of the fact that your site gets inbound traffic from lots of sources, not just Google searches, so it's important to have a holistic strategy that thinks about social media, paid ads, natural search, driving recurring traffic, email marketing, even driving traffic from offline sources, and so on. Depending on your business model, you may apportion your time and resources differently across those different areas.

So that's all I've got, happy to take any questions.