So when I set out to design my most recent site, I made sure that I validated each and every page. But then I got to thinking: while valid HTML may make my site easier to index, does that mean it will improve my search engine rankings? How many of the top sites have valid HTML?
To get a feel for how much value the search engines place on valid HTML, I decided to do a little experiment. I started by downloading the handy Firefox HTML Validator Extension (https://addons.mozilla.org/en-US/firefox/addon/html-5-validator/), which shows in the corner of the browser whether or not the page you are on is valid HTML. It shows a green check when the page is valid, an exclamation point when there are warnings, and a red x when there are serious errors.
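For anyone who would rather script this check than click through pages, here is a rough sketch of the same three-way classification. It does not reproduce how the extension works internally (I don't know that); it assumes you have a report in the JSON format produced by the W3C Nu HTML Checker, where each entry in `messages` has a `type` and warnings are marked with `subType` set to `warning`. The function name `classify` and the sample report are made up for illustration.

```python
def classify(report: dict) -> str:
    """Map a Nu HTML Checker style JSON report to one of three statuses.

    Assumption: `report` looks like {"messages": [{"type": ..., "subType": ...}]}.
    """
    messages = report.get("messages", [])

    # Any entry of type "error" (or "non-document-error") counts as a red x.
    if any(m.get("type") in ("error", "non-document-error") for m in messages):
        return "errors"

    # Warnings are reported as type "info" with subType "warning".
    if any(m.get("type") == "info" and m.get("subType") == "warning"
           for m in messages):
        return "warnings"

    # No errors and no warnings: the green check.
    return "valid"


# Hypothetical report, written by hand for this example.
sample = {
    "messages": [
        {"type": "info", "subType": "warning", "message": "Example warning."},
    ]
}
print(classify(sample))
```

Errors are checked first so that a page with both errors and warnings is reported as the worse of the two, which matches how I counted pages in this experiment.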
I decided to use Google Trends to determine the top 5 most searched terms for the day. I then searched each term in the big three search engines (Google, Bing, and Yahoo!) and checked the top 10 results for each with the validator. That gave me 150 data points (5 terms, 3 engines, 10 results each) from some of the most visible pages on the web for that day.
The results were particularly shocking to me: only 7 of the 150 resulting pages had valid HTML (4.7%). Another 97 of the 150 had warnings (64.7%), while 46 of the 150 received the red x (30.7%). The results were pretty much independent of the search engine or term. Google had only 4 out of 50 results validate (8%), Bing had 3 of 50 (6%), and Yahoo! had none. Now I realize that this isn’t a completely exhaustive study, but it at least shows that valid HTML doesn’t seem to be much of a factor for the top searches on the top search engines.
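If you want to double-check the arithmetic, the percentages above can be recomputed from the raw counts. This is only a small sketch using the numbers reported in this post; the variable names and the `share` helper are my own.

```python
# Counts taken from the experiment described above.
counts = {"valid": 7, "warnings": 97, "errors": 46}
valid_by_engine = {"Google": 4, "Bing": 3, "Yahoo!": 0}

# 5 search terms x 3 search engines x top 10 results each.
total = 5 * 3 * 10
per_engine = total // 3  # 50 results per search engine


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


for status, n in counts.items():
    print(f"{status}: {n} of {total} ({share(n, total)}%)")

for engine, n in valid_by_engine.items():
    print(f"{engine}: {n} of {per_engine} valid ({share(n, per_engine)}%)")
```

The three categories add up to the full 150 pages, and the per-engine valid counts add up to the 7 valid pages overall.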
Even more surprising was that none of the three search engines’ home pages validated! How important is valid HTML if Google, Yahoo!, and Bing don’t even practice it themselves? It should be noted, however, that MSN’s results page was valid HTML. Yahoo!’s homepage had 154 warnings, MSN’s had 65, and Google’s had 22. Google’s search results page not only didn’t validate, it had 6 errors!
In perusing the web I also noticed that immensely popular sites like ESPN.com, IMDB, and Facebook don’t validate. So what is one to conclude from all of this?
It’s reasonable to conclude that at this time valid HTML isn’t going to help you improve your search position. If it has any impact on results, it is minimal compared to other factors. The other reasons to use valid HTML are strong and I would still recommend all developers begin validating their sites; just don’t expect that doing it will catapult you up the search rankings right now.
Also published on Medium.