Matt Cutts’ Secret Google Sauce Tasting

Matt Cutts generously shares useful SEO insights in a special series of SEO question-and-answer sessions on Google Video:

Session 1 includes: Qualities of a good site
Pageviews are not really a factor in when things are updated in Sitemaps.
General guidelines: the site has to be crawlable (test it with a text browser like Lynx; if you can get through your entire site, you’re in good shape). Use Sitemaps. Create reasons why people would want to link to you, and make sure that people who are relevant to your niche know about you. Think “hook” and viral things; content is a good way to get links.

Session 2 includes: SEO myths, large site launches and Google Images
Having a few sites on the same IP is not something to worry about too much. If you have 1 to 5 sites with different themes on the same IP or server and have enough unique content to support them, it is not really a problem.
If you are launching new sites with millions of new pages it is better to launch a few thousand pages at a time. Otherwise you might attract scrutiny. Make sure you have good content.
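The staged launch described above can be sketched as a simple batching helper. This is only an illustration: the batch size of 5000 and the `batches` function name are assumptions, since the session says only "a few thousand pages at a time".

```python
def batches(pages, batch_size=5000):
    """Yield successive slices of `pages`, each at most `batch_size` long.

    batch_size=5000 is an assumed value standing in for
    "a few thousand pages at a time".
    """
    for start in range(0, len(pages), batch_size):
        yield pages[start:start + batch_size]


# Hypothetical list of page paths waiting to be published.
pending_pages = [f"/articles/{n}" for n in range(12000)]
launch_sizes = [len(batch) for batch in batches(pending_pages)]
```

Each batch would then be published on its own schedule rather than all at once.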
Google did an update on the Google Images index last week.

Session 3 includes: search engine optimization vs end user optimization, spam detection tools, cleanliness of code
Both SEO and user-friendliness are important (findability vs conversion). The trick is to make both factors the same thing. If you do that, you’re in very good shape: compelling content that is easy for users and search engines to get around.
Spam tools at Google are not available to the outside world. One thing you can look at is Yahoo Site Explorer, which shows you backlinks. There are also tools that will show you everything on one IP address. To make sure your site is clean, use Sitemaps, which will tell you of any problems the crawler encountered.
Cleanliness of code (W3C): writing code with errors just happens, so don’t put it at the top of your list, but in general it’s a great idea to validate your code.

Session 4 includes: static vs dynamic URLs, geotargeting
Static and dynamic pages are treated in a similar way when it comes to ranking (PR flow). Opt for 2 or 3 parameters at the most, avoid long numbers, and get rid of extra parameters where you can. Use a little bit of mod_rewrite to make a dynamic URL look like a static page.
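As a rough way to audit URLs against the advice above, the following sketch counts query parameters and flags long numbers. The threshold of six digits for a "long number" and the function name `url_warnings` are assumptions made for illustration; the session does not define them.

```python
import re
from urllib.parse import parse_qsl, urlsplit

MAX_PARAMS = 3  # the session suggests 2 or 3 parameters at the most
LONG_NUMBER = re.compile(r"\d{6,}")  # assumption: 6+ digits is a "long number"


def url_warnings(url):
    """Return a list of human-readable warnings for a dynamic URL."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > MAX_PARAMS:
        warnings.append(f"{len(params)} parameters (more than {MAX_PARAMS})")
    for name, value in params:
        if LONG_NUMBER.search(value):
            warnings.append(f"parameter '{name}' contains a long number")
    return warnings
```

A URL that returns an empty list passes both checks; anything else is a candidate for trimming parameters or rewriting.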
The way that Google defines cloaking is showing different content to Googlebot than to users. Geotargeting is not cloaking in Google guidelines. What will get you into trouble is treating Googlebot in some special way (e.g. if you are geotargeting per country, don’t make a special country for Googlebot).
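The distinction between geotargeting and cloaking can be illustrated with a small sketch. The country table, the default edition and the `choose_content` function are all hypothetical; the point is only that the decision depends on the visitor's country and never on the user agent.

```python
# Hypothetical mapping from country code to the edition served there.
CONTENT_BY_COUNTRY = {"US": "US edition", "FR": "French edition"}
DEFAULT_CONTENT = "International edition"


def choose_content(country_code, user_agent):
    """Pick content from the visitor's country only.

    `user_agent` is accepted but deliberately ignored: Googlebot gets
    whatever any other visitor from the same country would get.
    """
    return CONTENT_BY_COUNTRY.get(country_code, DEFAULT_CONTENT)
```

Adding a branch such as "if the user agent is Googlebot, serve something else" is the kind of special treatment the session warns against.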

Session 5 includes: merging domains using 301s, theme a site using directories, split testing
Anytime there is a merger, doing a 301 is no problem, as long as it’s on topic.
Tree-like architecture and a topic break-down is a good idea. Your keywords will end up organized in a directory-like structure.
Split-testing: if you can, split-test in an area that Googlebot cannot get to, using an .htaccess file or robots.txt to prevent it from being indexed.
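One way to check that a split-test area really is closed to crawlers is to run the robots.txt rules through Python's standard `urllib.robotparser`. The `/split-test/` directory, the example.com URLs and the `is_blocked` helper below are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that closes off the split-test directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /split-test/
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if `robots_txt` disallows `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

Calling `is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/split-test/variant-b.html")` should report that the variant page is off limits, while ordinary pages stay crawlable.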

Session 6 includes: all about supplemental results
The beaten path: an issue with a one-word search in Google would be a real problem, but a 20-word search is so far off the beaten path that you shouldn’t worry about it.
The results estimates returned with the “site:” operator are becoming more accurate.
Redirects in supplemental results: there is a main web results Googlebot and a supplemental results Googlebot. The next time the supplemental results Googlebot visits the page and sees the 301, it will index it accordingly and refresh. Supplemental results are getting fresher and fresher due to more frequent updates of the supplemental index.

Session 7 includes: Google Analytics and SERPs, duplicate content detection, porn marking, hyperlinks in option elements
Analytics data is not used at all by the Webspam team and to the best of Matt’s knowledge it is not used anywhere else in Google either.
Google does a lot of duplicate content detection: all the way from the crawl through the indexing and the scoring. There are different types of duplicate detected: exact duplicates and near-duplicates. Make sure your pages are quite different from each other.
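To get a feel for what "near-duplicate" can mean, here is a minimal sketch using word shingles and Jaccard similarity. This is a common textbook technique and an assumption on my part; the session does not say how Google detects duplicates, and the shingle size of 3 is arbitrary.

```python
def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = text.lower().split()
    if len(words) < size:
        # Texts shorter than one shingle are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0  # two empty texts are considered identical
    return len(sa & sb) / len(sa | sb)
```

A score of 1.0 corresponds to an exact duplicate under this measure, and scores close to 1.0 suggest near-duplicates that may be worth differentiating.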
How can you exclude a site from SafeSearch? At the moment there is no tag for this. The best option would be to put it in your meta tags, because SafeSearch actually looks at the raw content and will pick it up as being adult content.
Making a box spiderable in an option element is not recommended. It would make eyebrows go up. It’s better for users and search engines to take those links out and put them at the bottom of the page.

Session 8 includes: the difference between a data refresh, an algorithm update and an index update
Matt uses the metaphor of a car to explain differences between the different types of updates and refreshes Google does.

Session 9 includes: Google datacenters
Matt talks about the different datacenters and how results from the same class C IP block are roughly the same, but not always. He explains how, even by going to the exact IP address of one datacenter, you could get bounced over to another datacenter. Google uses datacenters to try out new features or algorithm changes.

Session 10 includes: all kinds of stuff
Matt answers a question about the possibility of searching only for home pages and says he may suggest that as a new feature.
Strong vs bold and emphasis vs italic are scored in exactly the same way.
Blogs and web sites are not ranked differently, unless you are using Blog Search.
