Web Crawlability (for Forms and Websites) Guide

Overview

This document explains the visibility of events to search engine web crawlers, also known as robots or “bots.”

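Three options exist for crawlability: no events are crawlable (the default), all events on the domain are crawlable, or crawling is enabled for the domain while selected events are excluded with ROBOTS META tags.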

Note: How to structure the contents of a site for search engines is outside the scope of this document.

Default Behavior in Certain

The default behavior is that events created within Certain are not crawlable.

If the goal is for all events to be public, Certain can enable web crawlers across the domain and allow the events to be indexed.

Contact your Customer Success Manager to facilitate this request.

Once domain-wide crawling is enabled, additional HTML META tags can be added to the display shell to improve crawlability. That optimization is outside the scope of this document.

META Tags

META tags are HTML elements that tell robots how to treat a page.

This section applies if Certain has been asked to enable web crawlers for the domain.

A ROBOTS META tag can instruct web crawlers to index or not index content, and to follow or not follow links.

Adding ROBOTS META tags to the head of the event display shell lets Certain’s wrapper exclude the event website(s) and form(s) from being indexed.

Private Events

To exclude Certain events from crawling, add the ROBOTS META tag to the custom display shell of the events’ display configuration.

How to write a Robots META Tag

When to include it

If the domain is enabled for crawling, include the robots META tag on pages you want excluded.

What to put into it

The ROBOTS META tag uses the NAME attribute with the value "ROBOTS" and a CONTENT attribute with one or more directives. The possible directives include: "INDEX", "NOINDEX", "FOLLOW", and "NOFOLLOW". If there is no robots META tag, the default is "INDEX, FOLLOW".
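As an illustration of the syntax described above, the four basic directives can be combined as follows. These are alternatives, and a page would normally carry only one of them. Major crawlers such as Google document the NAME and CONTENT values as case-insensitive, so "ROBOTS" and "robots" should behave the same; other crawlers are assumed, not guaranteed, to follow suit.

```html
<!-- Alternatives: a page would normally carry only one of these. -->

<!-- Same effect as having no tag at all: index the page and follow its links. -->
<meta name="robots" content="index, follow">

<!-- Keep the page out of the index but still follow its links. -->
<meta name="robots" content="noindex, follow">

<!-- Index the page but do not follow its links. -->
<meta name="robots" content="index, nofollow">

<!-- Neither index the page nor follow its links. -->
<meta name="robots" content="noindex, nofollow">
```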

Where to put it

Place the robots META tag in the HEAD section of an HTML page.

Include the tag on every page by adding it to the advanced display shell in Certain’s display configuration.
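The placement can be sketched with a generic HTML page. This is only an assumed skeleton: the actual markup of a Certain display shell is not shown in this guide, and the title and body placeholder below are hypothetical.

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <!-- Hypothetical title, for illustration only. -->
    <title>Example Event</title>
    <!-- The robots tag belongs here, inside HEAD, not in BODY. -->
    <meta name="robots" content="noindex, nofollow">
  </head>
  <body>
    <!-- Placeholder: event website or form content rendered inside the shell. -->
  </body>
</html>
```

Because the shell wraps every page of the event, a tag placed in its HEAD would be expected to appear on each event page and form that uses that shell.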

Values for the Content Attribute

This is a list of common values for the CONTENT attribute and their effects.

| Value | Description | Used By |
|---|---|---|
| index | Allows the robot to index the page (default). | All |
| noindex | Requests the robot to not index the page. | All |
| follow | Allows the robot to follow the links on the page (default). | All |
| nofollow | Requests the robot to not follow the links on the page. | All |
| none | Equivalent to noindex, nofollow | Google |
| noodp | Prevents using the Open Directory Project description as the page description in engine results. | Google, Yahoo, Bing |
| noarchive | Requests the engine not to cache the page content. | Google, Yahoo, Bing |
| nosnippet | Prevents displaying any description of the page in search engine results. | Google, Bing |
| noimageindex | Requests this page not to appear as the referring page of an indexed image. | Google |
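Values from the table can be combined in a single CONTENT attribute, separated by commas. The following is a small sketch of two such tags, shown as alternatives. Support for the less common values varies by search engine, as the "Used By" column indicates, so treat the effects as requests rather than guarantees.

```html
<!-- Index the page and follow its links, but ask engines not to cache it
     or show a description snippet in results. -->
<meta name="robots" content="index, follow, noarchive, nosnippet">

<!-- "none" is shorthand for "noindex, nofollow"; the table lists it as Google-specific. -->
<meta name="robots" content="none">
```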

See the Document-level Metadata element reference for additional values.