
Google Best Practices

Follow these best practices to get the most out of the Google search engine. If the results are still not what you expect, let us know through the Submit Requests link, and we can work with you to adjust the indexer settings for your site.

Make sure that the Google crawler can read your content

Validate all HTML content to ensure that the HTML is well-formed. Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If extra features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine crawlers may have trouble crawling your site.

Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in multiple copies of the same document being indexed for your site, as crawl robots will see each unique URL (including session ID) as a unique document.
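The duplication problem described above can be illustrated with a short Python sketch. The host name, the URLs, and the parameter names in SESSION_PARAMS are assumptions made up for this example; your site may use different names.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; replace with the ones your site uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}

def strip_session_params(url: str) -> str:
    """Return the URL with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

# Three URLs a crawler might collect for what is really one document.
urls = [
    "http://example.ohio.gov/report.html?sessionid=abc123",
    "http://example.ohio.gov/report.html?sessionid=xyz789",
    "http://example.ohio.gov/report.html",
]

unique_as_crawled = set(urls)
unique_documents = {strip_session_params(u) for u in urls}
print(len(unique_as_crawled), len(unique_documents))  # 3 1
```

A crawler that treats every distinct URL as a distinct document sees three pages here, although only one document exists.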

Ensure that your site's internal link structure provides a hypertext link path to all of your pages. The Google search engine follows hypertext links from one page to the next, so pages that are not linked to by others may be missed.
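One way to check for pages that no link path reaches is to walk the site's link graph from the home page. The sketch below assumes you already have a map of each page to the pages it links to; the page names in site_links are invented for illustration.

```python
from collections import deque

def reachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Return every page reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical site: each key is a page, each value lists the pages it links to.
site_links = {
    "/index.html": ["/about.html", "/services.html"],
    "/about.html": ["/index.html"],
    "/services.html": ["/forms.html"],
    "/forms.html": [],
    "/archive.html": ["/index.html"],  # links out, but no page links to it
}

orphans = sorted(set(site_links) - reachable_pages(site_links, "/index.html"))
print(orphans)  # ['/archive.html']
```

Any page reported as an orphan would likely be missed by a crawler that starts at the home page and only follows links.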

Use robots standards to control search engine interaction with your content

Place a robots.txt file in the root directory of your site. This file tells crawlers which files and directories can or cannot be crawled, including various file types. The robots.txt file will be checked on a regular basis, but changes may not have immediate results.
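As a rough illustration, the Python sketch below parses a small robots.txt file with the standard library's urllib.robotparser and asks which paths a crawler may fetch. The directory names and the host name are assumptions for this example, not settings used by this site.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the directory names are hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

# Ask whether any crawler ("*") may fetch each path.
for path in ("/index.html", "/private/report.html", "/drafts/memo.pdf"):
    allowed = robots.can_fetch("*", "http://example.ohio.gov" + path)
    print(path, allowed)
```

Running a check like this before publishing a robots.txt change can help confirm that the rules block only what you intend.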

Use robots meta tags to control whether individual documents are indexed, whether the links on a document should be crawled, and whether the document should be cached. The "NOARCHIVE" value for robots meta tags is supported by the Google search engine to block cached content, even though it is not mentioned in the robots standard.
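A robots meta tag sits in the HEAD of the page it controls. The Python sketch below embeds a sample page as a string and reads the directives back out, roughly as a crawler might; the page content and the class name RobotsMetaParser are made up for this example.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().upper())

# Hypothetical page that should not be indexed, followed, or cached.
PAGE = """<html><head>
<title>Internal memo</title>
<meta name="robots" content="noindex, nofollow, noarchive">
</head><body>Memo text</body></html>"""

meta_parser = RobotsMetaParser()
meta_parser.feed(PAGE)
print(sorted(meta_parser.directives))  # ['NOARCHIVE', 'NOFOLLOW', 'NOINDEX']
```

Here NOINDEX keeps the document out of the index, NOFOLLOW keeps its links from being crawled, and NOARCHIVE blocks the cached copy.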

Avoid using frames

The Google search engine supports frames to the extent that it can. Frames tend to cause problems with search engines, bookmarks, e-mail links and so on, because frames don't fit the conceptual model of the web (where every document corresponds to a single URL).

Searches that return framed pages will most likely only produce hits against the "body" HTML page and present it back without the original framed "Menu" or "Header" pages. Google recommends that you use tables or dynamically generate content into a single page (using ASP, JSP, PHP, etc.), instead of using FRAME tags.

Avoid placing content and links in script code

Most search engines do not read any information found in SCRIPT tags within an HTML document. This means that content within script code will not be indexed, and hypertext links within script code will not be followed when crawling. When using a scripting language, make sure that your content and links are outside SCRIPT tags. Investigate alternative HTML technologies for building dynamic web pages, such as HTML layers.
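The effect can be seen with a simple link extractor. The Python sketch below uses the standard library's html.parser, which, like the crawlers described above, does not look for tags inside SCRIPT content. It is only an approximation of crawler behavior, and the sample page is invented for this example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from anchor tags found in the HTML markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Hypothetical page: one plain HTML link, one link written out by script.
PAGE = """<html><body>
<a href="/services.html">Services</a>
<script>
document.write('<a href="/forms.html">Forms</a>');
</script>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)  # ['/services.html']
```

Only the plain HTML link is found; the link to /forms.html exists only inside the script and is never seen.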