Web Crawlability (for Forms and Websites) Guide

Overview

This guide explains how to control whether your events are visible to search engine crawlers, also known as robots or bots. There are three options:

1. All Private: no events are crawlable. This is the default, so no action is required.
2. All Public: all events are crawlable. Ask Certain to enable this for your domain.
3. Some Private: only some events are crawlable. Follow step 2, and then use the Robots META tag described below to make an event private (not crawlable).

How to structure the contents of your site to optimize it for search engines (SEO) is not within the scope of this document.

Default Behavior in Certain

The default behavior is that events created within Certain are not crawlable. If the intention is to make all events public, Certain can enable web crawlers across your domain and allow your events to be indexed. A Customer Success Manager can facilitate this request. Once this capability is enabled for the domain, additional HTML META tags can be added to the display shell to improve crawlability. However, this enhancement is outside the scope of this document.

META Tags

This section applies if Certain has been asked to enable web crawlers in your domain. A special HTML META tag can tell robots to index or not index the content of a page, and/or not scan it for links to follow. By adding this extra HTML tag into the head of the Certain event display shell, you can instruct web crawlers to exclude your event’s website(s) and form(s) from being indexed.

Private Events

To exclude Certain events from crawling, add the robots META tag described below to the custom display shell of the events’ display configuration.

How to write a Robots META Tag

When to include it

Once web crawlers are enabled for your domain, pages that omit the robots META tag are found and indexed by web crawlers by default. To keep an event from being found by web crawlers, add the robots META tag to the event display shell with the values described below.

What to put into it

The NAME attribute must be ROBOTS. Valid values for the CONTENT attribute are INDEX, NOINDEX, FOLLOW, and NOFOLLOW. Multiple comma-separated values are allowed, but only some combinations are sensible. If there is no robots tag, the default is INDEX, FOLLOW.

Examples:

<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">
<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

Where to put it

Like any META tag, the robots tag should be placed in the HEAD section of an HTML page, and it should be included on every page of your site. You can place it in the advanced display shell in Certain’s display configuration (Plan > Configure > Display). Because the display shell is the HTML “wrapper” for all the websites and forms in the event, a tag placed there is included on every one of those pages.
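As an illustration of the placement described above, here is a minimal sketch of a page whose HEAD carries the robots tag. The surrounding markup (the title text and the body placeholder) is hypothetical; the actual structure of a Certain display shell may differ, so treat this only as a guide to where the tag goes.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Hypothetical title, for illustration only -->
  <title>Example Event</title>
  <!-- Ask crawlers not to index this page and not to follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
<body>
  <!-- Placeholder: the event's website and form content would appear here -->
</body>
</html>
```

The tag sits inside HEAD rather than BODY, alongside any other META tags the shell already contains.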

Values for ‘Content’ Attribute

The following table lists the values of the CONTENT attribute and the behavior that web crawlers exhibit when each value is included in your ROBOTS META tag.

| Value | Description | Used By |
|-------|-------------|---------|
| index | Allows the robot to index the page (default). | All |
| noindex | Requests the robot to not index the page. | All |
| follow | Allows the robot to follow the links on the page (default). | All |
| nofollow | Requests the robot to not follow the links on the page. | All |
| none | Equivalent to noindex, nofollow. | Google |
| noodp | Prevents using the Open Directory Project description as the page description in search engine results. | Google, Yahoo, Bing |
| noarchive | Requests the search engine not to cache the page content. | Google, Yahoo, Bing |
| nosnippet | Prevents displaying any description of the page in search engine results. | Google, Bing |
| noimageindex | Requests this page not to appear as the referring page of an indexed image. | Google |

List from “<meta>: The Document-level Metadata element” is provided by Mozilla Contributors. See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta for details.
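Because multiple comma-separated values are allowed, values from the table can be combined in a single tag. The following is a sketch only, assuming you want an event kept out of search results and also not cached; which crawlers honor each value depends on the "Used By" column above.

```html
<!-- Not indexed, links not followed, and no cached copy requested -->
<meta name="robots" content="noindex, nofollow, noarchive">
```

Crawlers that do not recognize a given value, such as noarchive, are not listed as using it in the table, so do not rely on that value alone to keep an event private.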

What this means for your site

If you omit the robots META tag and web crawlers are enabled for your domain, your pages are found and indexed by web crawlers by default. To keep an event from being found by web crawlers, add the robots META tag to the event display shell with the appropriate CONTENT value, such as NOINDEX, NOFOLLOW.

Additional notes

This page references related topics and provides a technical example of the Robots META tag. The information is intended to help you manage crawlability for events and forms in the Certain display environment.
