Submit Requests
  Add URLs
  Remove URLs
  Add KeyMatch
  Add DropDown
  Add Synonym
  Ask Question
Best Practices
Meta Tag Generator

Search Services

A few quick links to the information available on this site are listed below. Use the request options in the left navigation section to request changes.

Google Search Results

If the search results for your search terms are satisfactory, you need do nothing more. If you would like to improve the results, please follow the steps below and consult the Google Best Practices page.

Please remember that the measure of a good results set is not the number of pages returned. Rather, it is the appropriateness and relevance of the links in the few pages in the set.


Google Index

Frequent visits by the Google indexer to your websites will keep the Google state-wide index and all the associated subcollections up to date.

To check the pages for your site in the Google Index follow these steps:

  1. Go to the Advanced Search selection page.
  2. Enter your DOMAIN name in the Domain section and click on Search.

You can also enter site:[domain] as the search criteria on the main search screen. For example, site:das.ohio.gov returns all the pages on the das.ohio.gov website. Where multiple domain names apply to the same set of pages, Google has been configured to display just one lowercase domain. This configuration is a manual process that may need to be updated from time to time. Please report any errors using the Request link to the left.

Does Google contain the correct web pages? If not, request additional URLs as starting points.


Page Titles

Review the page titles to ensure they are descriptive and informative. Having one of the search terms in the title makes a page move up in the rankings of a results set.

The title is what appears when a page is bookmarked, and it is what search engines display in their results. Title tags are part of the HTML content of a webpage and are the responsibility of the agency.
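
As a rough aid for reviewing titles, the sketch below pulls the title out of a page and checks whether a given search term appears in it. It uses only the Python standard library. The class name, function name, and the sample page in the usage note are illustrative assumptions and are not part of the Ohio search service.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TITLE> and <title> both match.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace so the title reads as one line.
        return " ".join("".join(self._parts).split())


def title_contains(html, term):
    """Return True if the page title contains the term (case-insensitive)."""
    parser = TitleParser()
    parser.feed(html)
    return term.lower() in parser.title.lower()
```

For example, with a hypothetical page whose head contains `<title>Ohio Driver License Renewal</title>`, `title_contains(page, "license")` is true, while a term that is absent from the title returns false.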


Important Search Terms and Results Pages

Google presents us with the opportunity to fine tune the search results so that each agency can specify the terms that most represent their service set and identify the primary page for each term.

Identify your Top 10 service terms and acronyms and check the Google results. Results will be ranked using Google's internal algorithms.

Google administrators can customize KeyMatch terms to influence the results rankings. However, the KeyMatch terms apply to the entire index as well as to any collections that the pages might exist within. Each KeyMatch can have at most 3 returns.

For more details, please see the Using KeyMatch to Control Results page.


Pages that Should Be EXCLUDED from the Index

With Google there are four methods available to exclude webpages from the state-wide index. Two are controlled by the Search Administrators and two by the Website Owners.

Website Owners can use Robots.txt files to control indexing at the directory level and the Robots Meta Tag to control indexing at the page level.


Robots.txt Files

Robots.txt files are simple text files that control access to a directory and its associated subdirectories. Many search engines, including Google, honor and follow the rules set up in the Robots.txt file.

To keep all robots and spiders out of a directory place a file named robots.txt in the root of the target directory. The file should contain the following code:

To apply to all robots and spiders, use...

User-agent: *

To disallow indexing of all pages, use...

Disallow: /

The User-agent: line identifies which spiders and robots the rules that follow apply to. The asterisk (*) is a wildcard that matches any spider or robot. Disallow: / blocks the entire site, while a Disallow: line with no path tells the robots and spiders that they can index the entire site. If the Robots.txt file does not exist in a directory, robots and spiders crawl all links and pages.
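
One way to check how a set of rules is likely to be read is Python's standard urllib.robotparser module. The sketch below compares a Disallow: / rule with an empty Disallow: rule. The helper name, the robot name "AnyBot", and the page path are assumptions made for illustration, and the module's interpretation may differ in details from that of the Ohio Google crawler.

```python
from urllib.robotparser import RobotFileParser

# Rules from the text above: every robot, nothing may be indexed.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Same wildcard, but Disallow has no path, so everything may be indexed.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://das.ohio.gov/index.html"  # hypothetical page
    print(allowed(BLOCK_ALL, "AnyBot", url))
    print(allowed(ALLOW_ALL, "AnyBot", url))
```

Under these assumptions the first call reports that the page is blocked and the second that it is allowed.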

To keep certain robots and spiders out of a directory or folder, place a file named robots.txt in the root of the target directory. You can also disallow individual files.

To disallow indexing a particular folder or directory named agencies, use...

User-agent: OhioSearchIndexer
Disallow: /agencies/

Note: the line "User-agent: OhioSearchIndexer" applies the rules that follow it to the Ohio Google Crawler.

To disallow indexing of a particular file named news.html in the folder or directory named agencies, use...

User-agent: OhioSearchIndexer
Disallow: /agencies/news.html
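
The difference between the directory rule and the file rule can be checked with a short sketch using Python's standard urllib.robotparser module. The helper name, the robot name "SomeOtherBot", and the paths other than /agencies/news.html are hypothetical. The module matches rules by simple path prefix, which is assumed here to agree with the crawler's behavior for these two rules.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets shown above, as they would appear in robots.txt.
DIRECTORY_RULE = "User-agent: OhioSearchIndexer\nDisallow: /agencies/\n"
FILE_RULE = "User-agent: OhioSearchIndexer\nDisallow: /agencies/news.html\n"


def can_crawl(rules, agent, path):
    """Return True if the named robot may fetch the path under the rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for label, rules in (("directory", DIRECTORY_RULE), ("file", FILE_RULE)):
        for path in ("/agencies/news.html", "/agencies/index.html"):
            # Show which paths each rule set closes to OhioSearchIndexer.
            print(label, path, can_crawl(rules, "OhioSearchIndexer", path))
```

The directory rule closes everything under /agencies/ to OhioSearchIndexer, while the file rule closes only news.html. Robots with other names are not affected by either rule set because no User-agent line names them.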

To allow the state-wide index to crawl and index a directory, add this text to the top of the robots.txt file...

User-agent: OhioSearchIndexer
Allow: *


Robots Meta Tags

Robots Meta Tags are simple HTML tags that control access to the content and links on a webpage. Robots Meta Tags are inserted into the head section of the webpage. Many search engines, including Google, honor and follow the rules set up in the Robots Meta Tags. If the page does not have a Robots Meta Tag, the content is indexed and the links are followed.

Robots Meta Tag options are:

ALL = INDEX, FOLLOW
NONE = NOINDEX, NOFOLLOW
INDEX = Index this page.
FOLLOW = Follow links from this page.
NOINDEX = Don't index this page.
NOFOLLOW = Don't follow links from this page.
NOIMAGEINDEX = Don't index the images on this page.
KEYWORDS = Words used to index your document (set in a separate keywords meta tag rather than as a robots value).

To allow the links on a page to be followed but to prevent the content from being indexed use...

<META name="robots" content="noindex">

To allow the content to be indexed but to prevent the links from being followed use...

<META name="robots" content="nofollow">

To prevent both indexing and following use...

<META name="robots" content="nofollow,noindex">

To allow the content to be searched by keywords, use...

<META name="keywords" content="oranges, lemons, limes">
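
To confirm what a page's Robots Meta Tag asks for, the sketch below reads the tag with Python's standard html.parser module and reports whether the page may be indexed and whether its links may be followed. The class and function names are illustrative assumptions. The defaults follow the rule stated above that a page without a Robots Meta Tag is indexed and its links are followed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, so <META> matches too.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_policy(html):
    """Return (index_allowed, follow_allowed) for the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    # NONE is shorthand for NOINDEX, NOFOLLOW.
    index_allowed = not ({"noindex", "none"} & found)
    follow_allowed = not ({"nofollow", "none"} & found)
    return index_allowed, follow_allowed
```

For the three robots examples above, `robots_policy` returns `(False, True)` for noindex, `(True, False)` for nofollow, and `(False, False)` for nofollow,noindex. The keywords tag does not affect the result.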


Domain and URL Exclusion Rules

Domain Exclusion and URL Exclusion are the two options available to the Search Administrators. To request that a specific page or domain be excluded from the search index, please use the Request form to the left. From the outset, the following rules have been set up for exclusion:


regexp:http://*.mail.*/
regexp:http://www.mail.*/
regexp:http://*.innerweb.*/
regexp:http://www.innerweb.*/
regexp:http://www\\.dogpile\\.com.*/
regexp:http://www.webtest.*/
regexp:http://*webtest.*/
regexp:http://*.webstats.*/
regexp:http://www.webeditors.*/
regexp:http://*.webeditors.*/
regexp:http://www.webstats.*/
regexp:http://www.wiggum.*/
regexp:http://*.wiggum.*/
regexp:http://www.test.*/
regexp:http://*.*.test.*/
regexp:http://*.test.*/
regexp:http://*.test.ohio.gov/
regexp:http://*test.*/
regexp:http://*.secure.*/
regexp:http://www.secure.*/
regexp:http://devcd1.portal.state.oh.us*/
regexp:http://testcd1.portal.state.oh.us*/
regexp:http://testcd2.portal.state.oh.us*/
regexp:http://cd1.portal.state.oh.us*/
regexp:http://cd2.portal.state.oh.us*/
regexp:http://ftp.*/
regexp:http://www.atistwim1.*/
regexp:http://atistwim1.*/
regexp:http://www\\.statejobs.ohio.gov/applicant/results*.*
regexp:http://gsdintranet.das.ohio.gov/
regexp:http://www.gsdintranet.das.ohio.gov/
regexp:http://www.gsdintranet.ohio.gov/
regexp:http://mail.regents.state.oh.us
regexp:http://odhlogin.odh.ohio.gov*/
regexp:http://www\\.odjfs.state.oh.us/
regexp:http://search.ohio.gov/
regexp:http://search.state.oh.us/
regexp:http://*.odnext01.*/
regexp:http://www\\.odnext01.das\\.ohio.gov/
regexp:http://www\\.state\\.oh.us/
regexp:http://www2\\.state\\.oh.us/
regexp:http://www4\\.state\\.oh.us/
regexp:http://www5\\.state\\.oh.us/
regexp:http://state\\.oh.us/
regexp:http://www\\.dw\\.ohio.gov/
regexp:http://www\\.msxml.excite\\.com.*/
regexp:http://www\\.msxml.infospace\\.com.*/
regexp:http://www\\.metacrawler\\.com.*/
regexp:http://www\\.dpxml.webcrawler\\.com.*/
regexp:http://www\\.msxml.webcrawler\\.com.*/
regexp:http://www\\.search.agency\\.ohio.gov/
regexp:http://www\\.search.ohio.gov/
regexp:http://*.*.intranet.*/
regexp:http://itcww*.*/
regexp:http://*.localhost.*/
regexp:http://www.localhost.*/
