Filtering and blacklist configuration is performed according to groups. The default group is called
default. Each Cipafilter additionally comes configured with two further groups,
which block all and no sites, respectively. Groups can be managed via the Group Management tab. When a user is authenticated, the LDAP tree is queried to determine whether the user is part of any group
matching the names of the groups configured here. If so, the first matching group applies to the
user. If not, the
default group applies.
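The group-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cipafilter's actual implementation: the function name, the list-based inputs, and the sample group names are hypothetical, and group names are assumed to be compared exactly.

```python
def resolve_group(configured_groups, ldap_groups, default="default"):
    """Return the first configured group the user belongs to, else the default.

    configured_groups: group names in the order they are configured (hypothetical input).
    ldap_groups: names of the LDAP groups the authenticated user is a member of.
    """
    memberships = set(ldap_groups)
    for name in configured_groups:
        if name in memberships:
            return name  # the first match wins; later matches are ignored
    return default


# Hypothetical group names, for illustration only.
print(resolve_group(["staff", "students"], ["students", "staff"]))
print(resolve_group(["staff", "students"], ["visitors"]))
```

Because the first matching group wins, the order in which groups are configured matters when a user belongs to more than one of them.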
There is also a global blacklist and a global whitelist which may apply to all groups.
Permissions for individual groups are managed via the Group Configuration tab. The group to work
with is set using the Manage group drop-down at the top of the page.
Multiple filtering technologies are available which can be applied to the users in any particular
group. These technologies will not be applied to Web sites on the whitelist.
Safe Search Enforcement — This technology detects when a user accesses a popular search engine
(as well as certain other sites with search features, such as Flickr) and enables the site's
built-in "safe search" feature. This technology will reduce the amount of pornography and other
objectionable content returned by these sites. All groups with this feature turned on will be
subject to Enhanced Safe Search , if enabled.
Image License Restriction — This technology restricts image search results in Google, Yahoo!,
and Bing to only those that are specifically licensed for non-commercial reuse. While behavior
among different search engines may vary, this typically means that images returned will be free
to copy or redistribute if their contents remain unchanged. This feature substantially reduces
the number of image results, thereby also reducing the amount of pornography and other
objectionable content returned.
YouTube Restricted Mode Enforcement — This technology enforces Restricted Mode (aka Safety
Mode) on YouTube. This is a YouTube feature which provides much stricter filtering than the basic
safety mode enforced by the Safe Search Enforcement technology. For more information on
Restricted Mode, see YouTube Help. This feature is also compatible with, and highly recommended for use alongside, Google Apps for Education
(GAFE) YouTube settings. When a user whose account is on a GAFE domain logs into YouTube, they are
subject to the content settings defined by the GAFE domain administrator; however, if the user
were to log out, these settings would no longer apply. YouTube Restricted Mode Enforcement prevents users from bypassing restrictions using this method by forcing YouTube to apply
Restricted Mode to all visitors, even ones who are not logged in. YouTube will advise visitors to
log in with their GAFE account when a video is blocked. For more information on GAFE YouTube
settings, see Google Apps Administrator Help .
Proxy/Anonymizer Detection — This technology is designed to detect and block Web-based proxy
services. This system uses fingerprints of popular Web sites along with a built-in pre-compiled
blacklist of known proxy servers to offer a high success rate in eliminating this problem. Proxies
detected by this system are shared with the rest of the Cipafilters in the cloud, which enables
quick discovery of new proxy servers that are made available on the Internet.
Send Prefer: safe —
Prefer: safe is an HTTP header which communicates to Web sites that
the client is requesting only non-objectionable ("safe") content. Sites that support this header
should, upon detecting it, enable any safe-search or parental-control features they support. This
functionality is very similar in concept to Safe Search Enforcement , but, since it is
implemented by the sites themselves, it does not require Cipafilter firmware updates to stay
current. The header is enabled by default for existing groups with Safe Search Enforcement or Pornography Detection turned on, but it is an independent feature and can be used in any
combination with the other group technologies.
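For reference, the header itself is a single name/value pair. The Python sketch below builds (but does not send) a request carrying it, using only the standard library. The helper name and the example.com URL are placeholders. In practice the Cipafilter adds the header on behalf of the group's users, so no client-side code is needed.

```python
import urllib.request


def with_prefer_safe(url):
    """Build a request object that carries the Prefer: safe header."""
    request = urllib.request.Request(url)
    request.add_header("Prefer", "safe")
    return request


# Placeholder URL; nothing is sent over the network here.
request = with_prefer_safe("http://www.example.com/search?q=test")
print(request.get_header("Prefer"))
```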
Pornography Detection — Cipafilter's content-aware Pornography Detection feature analyzes
the contents of Web pages and dynamically ascertains whether they are pornographic, primarily by
measuring the presence or absence of certain words on the page. Each time a Web site is blocked,
the administrator may be e-mailed, and the block will be recorded in the Web monitoring system (if
active). To filter sites which may not be detected as pornographic through their content alone (including
very lightly moderated sites which are filled with user-generated content), this technology
additionally employs a pre-compiled blacklist of Web sites which are known to contain pornography.
Lingerie/Undergarment Detection — The Lingerie/Undergarment Detection technology is an
extension of Pornography Detection which blocks pages containing references to women's
undergarments and lingerie.
Extreme Language Detection — The Extreme Language Detection technology is another
extension of Pornography Detection which blocks pages containing offensive language. This
feature is designed for organizations with a zero-tolerance language policy; as such, it is quite
strict in its filtering — a single instance of profanity will trigger a block. At the same time,
however, the set of filtered words is quite small; specifically, it is limited to the most severe
profanities (those which would not be permitted on, for example, broadcast television).
Games Detection — This is another content-detection technology which blocks pages related to
the playing or discussion of games. The primary block target of this feature is browser-based
games, but potentially many related subjects (including PC and console games, card games, board
games, and occasionally sports) may also be affected. Unlike Pornography Detection , this
technology does not currently incorporate a pre-compiled blacklist — it is designed to be used
instead of or in addition to the gaming-related Automatic Blacklists described later in this document.
Download Blocking — This feature prevents users from downloading restricted files. Files that
are installable or executable, or may otherwise contain viruses, spyware, or trojans, and are not
required by the average user on a daily basis, are considered restricted. The following extensions
are currently filtered by this feature:
Google Apps Domain Restriction — This feature requires users in the selected group to access
Google properties using only accounts under those domains specified under Google Apps Domain Restriction on the Advanced Configuration tab of the Web Filtering page. This feature requires
google.com to be subject to decryption.
Device Authorization — This feature allows users in the selected group to authenticate
arbitrary devices for network access via the captive portal. Members of groups which do not have
this option enabled will receive a "not authorized" message when they try to authenticate.
Override Portal Time-out — This option allows for per-group portal authentication time-out
values. For example, a group for students may have a 1-hour time-out (forcing them to log back in
after the hour has elapsed), while a group for teachers might have a 12-hour time-out. If not
otherwise set, the default value (as set on the Web Filtering page) will be used.
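The fallback behavior can be pictured as a simple lookup. In the sketch below, the group names, the override values, and the 8-hour default are all hypothetical, chosen only to mirror the example in the text.

```python
DEFAULT_TIMEOUT_HOURS = 8  # hypothetical global default from the Web Filtering page

# Hypothetical per-group overrides; groups not listed here use the default.
GROUP_TIMEOUT_HOURS = {"students": 1, "teachers": 12}


def portal_timeout_hours(group):
    """Return the override for this group, or the global default if none is set."""
    return GROUP_TIMEOUT_HOURS.get(group, DEFAULT_TIMEOUT_HOURS)


print(portal_timeout_hours("students"))
print(portal_timeout_hours("office"))
```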
Temporary Whitelist Management — When this feature is enabled, users in the selected group
will be able to access the Whitelist Management feature by clicking the Add to Whitelist link on the filter block page and logging in with their credentials.
Automatic Blacklists are lists of Web sites compiled by Cipafilter and are updated twice daily
from our corporate office. For ease of use, these blacklists are organized into categories; each
category may have any number of constituent lists enabled. To toggle the display of lists, use the
disclosure triangles next to the category names, or the Expand All and Collapse All
buttons at the top.
Gaming and Gambling — Sites devoted to playing, downloading, or discussing non-educational
Gambling — Sites which encourage gambling, such as betting sites and online casinos.
Games — Sites related to computer and electronic games, particularly sites where users can play games. This blacklist also includes game download sites, game review sites, and Web sites devoted to the discussion of games. This category does not include games sites devoted to
educational use only.
Games (DMOZ Enhanced) — Additional game sites listed in the Open Directory Project's Games category .
Crime, Violence, and Ethical Issues — Sites related to violence or questionable activity.
Hate — Sites which promote hatred or bigotry via the use of vulgarity, slurs, calls for
violence, and other clearly hate-filled language. This does not include sites whose content is
mostly political in nature, but does include those with a great deal of bigoted user content.
The line between these types of sites is somewhat controversial, but this selection is designed
to be as objective and legally defensible as possible.
Violence — Sites related to gore, street fighting, and other violent imagery or activity
(death videos, shock sites, backyard wrestling, etc.).
Combat Sports — Sites related to professional and authoritatively sanctioned martial arts
and fighting sports (including boxing, wrestling, and MMA).
Weapons — Sites dedicated to the discussion or sale of guns, explosives, and other weapons
(potentially including sport- and hunting-related sites).
Drugs and Alcohol — Sites promoting the use, manufacture, or sale of illegal substances,
tobacco, and alcohol, as well as sites related to the intentional abuse of legal drugs.
Academic Cheating — Sites devoted to plagiarism, essay writing, and similar academic
Piracy — Sites devoted to copyright infringement or piracy of software, music, videos, and
other intellectual property.
Media — Sites related to online multimedia, including images, videos, and music.
Images — Sites dedicated to sharing, viewing, or uploading images (including photo sharing
sites, "meme" images, etc.). This list does not include image sites handled by the Safe Search Enforcement technology.
Video — Sites dedicated to sharing, viewing, or uploading videos (including Web cam
galleries, online television services, etc.). This list does not include sites that are
Music/Radio — Sites that offer online music/radio streaming, such as Pandora, Last.fm,
SHOUTcast, and Live365, as well as terrestrial radio broadcasting sites. Also included are
services such as iTunes, Napster, Rhapsody, and SoundCloud.
Commerce — Sites related to online purchases.
Shopping — Sites dedicated to online shopping and auctions, such as Amazon and eBay.
Travel Purchases — Sites dedicated to travel-related purchases and exchanges, including
plane/bus/train tickets, hotel reservations, car rentals, and "couch-surfing".
Social Networking — Sites dedicated to personal profiles, messaging, blogs, and other
Blogs — Sites dedicated to blogging, including both popular individual blogs and general
blog-hosting services (such as Tumblr and WordPress).
Dating and Personals — Sites related to dating, marriage, and sexual encounters.
Chat — Sites that provide chat, instant messaging, and texting services.
Social Media/Networking — Sites providing any other social components, including online
communities, networking/sharing sites, and social news sites. Also included are sites which fall
under another list but contain secondary social components, such as Last.fm and Flickr.
Filter Circumvention — Sites which may allow users to circumvent the Web filter. The lists in
this category are particularly complementary to the Proxy/Anonymizer Detection technology.
Remote Access — Sites related to remote-control and screen-sharing tools, such as LogMeIn
and TeamViewer, as well as downloadable software such as VNC clients.
VPN — Sites related to VPN and tunnelling services, including Hamachi and Tor, as well as
downloadable VPN clients.
Web Translation — Language translation services such as Google Translate. These sites are
not designed for circumvention, but a side effect of the way they work is that they can be used
to bypass Web filters.
Alternative Search Engines — Search-engine sites that may bypass the Safe Search Enforcement
feature. This includes sites with no safe search to enforce at all and sites which simply aren't
supported yet, as well as alternative domains for major search engines (such as google.de and
yahoo.co.jp). The list explicitly excludes those sites which are affected by Safe Search
Enforcement, including the US English versions of Google, Bing, Yahoo!, and DuckDuckGo.
Web Mail — Sites which offer Web-based e-mail services, such as Gmail and Yahoo! Mail.
Malware/Phishing — Sites which host viruses, worms, spamware, ransomware, and other malware,
as well as sites associated with phishing schemes. These sites may pose a threat to the security
of users' computers and/or identities.
Advertising — Sites providing advertising content and tracking services. Please note that URL
blacklisting is not a completely effective means of blocking ad content (browser-based ad-blocking
extensions use more advanced detection methods); however, this list can be useful for blocking
many of the more common advertisement sources.
The whitelist/blacklist system allows you to control Web access by using a basic domain-oriented
syntax or a sophisticated regular-expression URL-matching technology to either allow or reject Web
sites based on their URL.
Whitelist entries always override blacklist entries. Therefore, allowing a single sub-domain while
blocking the rest of the site can be done by whitelisting
subdomain.domain.com and blacklisting
To apply Global Lists to the selected group, check Apply Global Lists to this group .
For information on the syntax of list entries, see the List entry syntax section.
After generating a root certificate via the Web Filtering page, SSL decryption functionality can
be configured on a per-group basis via this section.
Each group can be configured for one of four decryption behaviors:
Never — HTTPS connections made by members of this group will never be decrypted. This option
prevents the filter from applying advanced technologies, such as pornography detection, to secure
Web sites accessed by this group.
For only the following domains — HTTPS connections made by members of this group will not be
decrypted, except for connections to domains listed in the text field below. This option can be
thought of as a decryption whitelist, enabling administrators to limit decryption and its related
features to a small number of specific domains.
For all but the following domains — This option is the reverse of the above. HTTPS connections
made by members of this group will be decrypted, except for connections to domains listed in the
text field below. This option can be thought of as a decryption blacklist, enabling administrators
to bypass decryption of problematic domains (such as those that use certificate pinning).
Always — All HTTPS connections made by members of this group will be decrypted, and any
appropriate lists or technologies will be applied according to the data within the connection.
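The four behaviors above amount to a small decision function. The following Python sketch is an approximation under assumptions: the mode strings are hypothetical stand-ins for the four options, and listed domains are assumed to cover their sub-domains. It does not model global SSL decryption exceptions or REGEX: entries.

```python
def domain_matches(host, domain):
    """True if host is the domain itself or one of its sub-domains."""
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def should_decrypt(mode, host, listed_domains=()):
    """Decide whether an HTTPS connection to host would be decrypted.

    mode is one of the hypothetical strings "never", "only", "all_but", or
    "always", standing in for the four options described above.
    """
    listed = any(domain_matches(host, d) for d in listed_domains)
    if mode == "never":
        return False
    if mode == "only":      # For only the following domains
        return listed
    if mode == "all_but":   # For all but the following domains
        return not listed
    if mode == "always":
        return True
    raise ValueError(f"unknown decryption mode: {mode!r}")


print(should_decrypt("only", "www.google.com", ["google.com"]))
print(should_decrypt("all_but", "www.google.com", ["google.com"]))
```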
Decryption exemptions may be provided in the form of a URL or
REGEX: entry — however, all entries
must match against a request to the root of the domain (for example, the URL entry
http://www.domain.com/ will behave as expected, but
http://www.domain.com/page.html will have no
effect, because the Web filter is not able to see the full URL at the time of decision-making).
The Group Management tab allows for the addition, removal, and renaming of groups. Groups can be re-ordered by dragging and dropping the left side of the entry. The copy button on the far right of the entry area will "clone" the specified group, preserving all of its permissions and list entries.
The Global Configuration tab contains configuration options which are not group-specific.
The Super Whitelist is a special whitelist that applies to all groups and bypasses all filtering
operations. By default this list is automatically updated from Cipafilter's corporate office, but
this option can be disabled. Administrators can also specify their own Super Whitelist entries,
either in place of or in addition to the automatic ones.
This feature is designed primarily to allow software updates from trusted sources, such as Microsoft
and Apple, to pass through without causing heavy load to the Cipafilter or being slowed down by the
anti-virus system. This feature can also be used to prevent issues with Web sites which are
incompatible with the SSL decryption function.
The SNI Super Whitelist works just like the normal one, except that it only has an effect when
the client supports SNI (or when SNI isn't relevant, as with unsecured HTTP requests and requests
using browser proxy settings). This is useful to work around issues that might occur when clients
that don't support SNI (such as older mobile devices, Internet Explorer on Windows XP, and others)
attempt to access a site with many different domain names. Without the SNI Super Whitelist ,
these clients might be able to access domains they shouldn't, because the Web filter is unable to
know for certain what site they are trying to access.
An additional component to the SNI Super Whitelist is the Google Chromebook compatibility list, which activates SNI Super Whitelist entries designed to enhance the filter's compatibility
with Google Chromebooks. Please note that, due to limitations imposed by Google, this option is
fundamentally incompatible with Google Apps domain restriction.
A Google OAuth compatibility list is also provided; however, most customers will not need to
enable this feature manually, as the OAuth list additions are automatically enabled when the Google
portal front-end is in use.
Domains and URLs placed on either Super Whitelist will bypass nearly all functionality of
the filter, including virus scanning, SSL decryption, authentication, and portal interception. It is
important therefore to be careful when modifying these entries, as even a small misconfiguration can
open a large hole in the filter. Please consult with tech support if you have any questions about
When the Safe Search Enforcement technology is enabled, the Cipafilter will enforce the safe
search option on major search sites such as Google and Bing. However, because different search
providers have different concepts of "safe search", there may exist gaps in their filtering
capabilities. In particular, most search engines only catch explicit pornography terms.
The Enhanced Safe Search (ESS) feature (and its image-only sub-component, Enhanced Safe Image Search (ESIS)) is designed to address this problem — by entering key words
here, an administrator can designate additional terms which will trigger a block when used in a
search. A number of common terms are provided by default in the automatic lists.
The entries on the All Searches (ESS) tab apply to all types of searches (with very few
exceptions, such as map directions) — text, news, videos, images, etc. The Image Searches (ESIS)
entries apply only to image searches. This distinction is maintained because some terms which may
be inappropriate for image searches are valid for other types; for example, schools may wish to
limit image searches for the word breast without affecting textual searches for terms like breast cancer .
ESS key words must be literal strings (no regular expressions or wildcards are supported) of one or
more words to be blocked. Each key word will be looked for in the URL when a search is detected,
and, if it appears to exist as a search term (not part of the actual domain or path), the request
will be blocked. Note that key words are matched only against the URL and only as whole words, so
the feature may not catch searches for auto-corrected misspellings, plurals, etc. For example, the
key word breast will catch searches for
breast cancer, but not
braest. It is recommended that plurals and common misspellings be entered as separate key words,
if it is desired to catch them.
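The whole-word, search-terms-only matching described above can be approximated as follows. This is a sketch, not the filter's actual logic: the query parameter name q, the example.com URLs, and the case-insensitive comparison are assumptions.

```python
import re
from urllib.parse import parse_qs, urlsplit


def search_terms(url, param="q"):
    """Return the decoded search string from the URL's query, or an empty string."""
    values = parse_qs(urlsplit(url).query).get(param, [])
    return " ".join(values)


def ess_blocks(url, key_words, param="q"):
    """True if any key word appears as a whole word (or phrase) in the search terms."""
    terms = search_terms(url, param)
    for key_word in key_words:
        # re.escape keeps the key word literal; \b enforces whole-word matching.
        if re.search(r"\b" + re.escape(key_word) + r"\b", terms, re.IGNORECASE):
            return True
    return False


print(ess_blocks("https://www.example.com/search?q=breast+cancer", ["breast"]))
print(ess_blocks("https://www.example.com/search?q=braest", ["breast"]))
```

Only the query string is inspected, so a key word that appears in the domain or path of the URL does not trigger a block in this sketch.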
All ESS (and ESIS) entries apply to all groups which have Safe Search Enforcement turned on.
Suspicious Search Terms (SST) is a similar feature to Enhanced Safe Search , with two major
differences: it applies to all groups (regardless of whether Safe Search Enforcement is
enabled), and it does not produce a block page — it only logs the search for later review. This is
useful for monitoring ambiguously appropriate search terms.
Reports on ESS and SST searches can be viewed via Web Reports .
Note: Whitelisting a search engine will not bypass Enhanced Safe Search or Suspicious Search Terms (adding it to the Super Whitelist will, however).
The Global Whitelist and Global Blacklist work the same as their corresponding manual lists
(described above), except that they apply to all groups which have the Global Lists option
enabled. This feature allows an administrator to define one entry that applies to all groups,
without having to edit each group manually.
As previously mentioned, whitelist entries always override any overlapping blacklist entries.
Therefore, one can add a site to the Global Blacklist , and then allow it for a single group by
also adding it to that group's Manual Whitelist .
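The override rule can be sketched as a two-step check: consult the whitelist first, then the blacklist. The helper names below are hypothetical, and the sketch handles plain domain entries only (no URLs, wildcards, or REGEX: entries).

```python
def domain_matches(host, entry):
    """True if host equals the entry's domain or is one of its sub-domains."""
    host, entry = host.lower(), entry.lower()
    return host == entry or host.endswith("." + entry)


def is_allowed(host, whitelist, blacklist):
    """Whitelist entries win over any overlapping blacklist entries."""
    if any(domain_matches(host, entry) for entry in whitelist):
        return True
    return not any(domain_matches(host, entry) for entry in blacklist)


global_blacklist = ["foo.com"]
print(is_allowed("www.foo.com", [], global_blacklist))           # group without a whitelist entry
print(is_allowed("www.foo.com", ["foo.com"], global_blacklist))  # group that whitelists the site
```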
For information on the syntax of list entries, see the List entry syntax section.
Administrators occasionally wish to exclude particular sites from the Automatic Blacklists , but
continue to apply content detection to those sites. For instance, an organization may want to allow
students access to non-pornographic Tumblr blogs. In this case, the domain can be added as an Automatic Blacklist Exemption here.
Any entries added here will be removed from all Automatic Blacklists for all groups; manual
lists will be unaffected.
Administrators sometimes need to bypass SSL decryption for problematic domains. While this
ordinarily can be achieved on a per-group basis using the For all but the following domains decryption setting, it may be appropriate to exempt a domain from decryption for all groups. This
can be accomplished by entering the domain as an SSL Decryption Exception here.
Any entries added here will be exempted from SSL decryption for all groups, overriding both the Always and For only the following domains decryption options.
Options for the temporary whitelist system (Whitelist Management) are set via the Temporary Whitelist tab.
When a user logs into Whitelist Management, they are provided a list of time durations (15 minutes,
1 hour, etc.) for which the site should be whitelisted. The table on this tab allows the
administrator to configure the available durations to their precise requirements; durations may be
added/removed, disabled, and set as the default. A duration is defined as a time period specified in
hh:mm format; for example,
01:00 is 1 hour. The minimum duration is 1 minute (
00:01 ); the
maximum is quite high to accommodate special needs, but it is not recommended to use Whitelist
Management as a semi-permanent custom whitelist by specifying very high values.
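As an illustration of the hh:mm format, the Python sketch below converts a duration to minutes and enforces the 1-minute minimum. The function name is hypothetical, and rejecting minute values of 60 or more is an assumption; the filter's own validation may differ.

```python
def parse_duration(text):
    """Convert an hh:mm duration such as "01:00" into a number of minutes."""
    hours_text, minutes_text = text.strip().split(":")
    hours, minutes = int(hours_text), int(minutes_text)
    if hours < 0 or not 0 <= minutes < 60:
        raise ValueError(f"invalid duration: {text!r}")
    total = hours * 60 + minutes
    if total < 1:
        raise ValueError("the minimum duration is 1 minute (00:01)")
    return total


print(parse_duration("01:00"))
print(parse_duration("00:15"))
```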
This section displays any temporary whitelist entries that are currently active. The Manage Temporary Whitelist Entries link allows an administrative user to view the history of temporary whitelist entries and remove active items.
This section describes the syntax used for adding manual entries to group lists, global lists, the Super Whitelist , etc.
Any line beginning with a
# (hash) is a comment and is not treated as a list entry. Comments do,
however, appear in the Comment field on the filter-reject page and are applied to all following
entries (until the next comment). Comments can also be placed on the same line after a list entry,
as in foo.com # Comment; these are not used by the filter (they are simply discarded).
Empty lines and lines containing only whitespace are ignored as non-entries.
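The comment rules above can be summarized with a small parser. This Python sketch is illustrative only. It treats the first # on an entry line as the start of a trailing comment, which may not be how the filter handles URLs or patterns that themselves contain a #, and it tolerates leading whitespace before a comment, which is also an assumption.

```python
def parse_list(text):
    """Return (entry, comment) pairs from manual list text.

    A full-line comment is attached to every entry that follows it, until the
    next full-line comment. Trailing comments and blank lines are dropped.
    """
    entries = []
    current_comment = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # empty or whitespace-only line
        if line.startswith("#"):
            current_comment = line[1:].strip()
            continue
        entry = line.split("#", 1)[0].strip()  # discard any trailing comment
        if entry:
            entries.append((entry, current_comment))
    return entries


sample = """# Social networking
reddit.com

foo.com # Added 12/01
"""
for entry, comment in parse_list(sample):
    print(entry, "|", comment)
```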
To match an entire Web site, simply enter its domain. For example, the entry
images.google.com , etc. It is also possible to match
individual sub-domains; for example,
To block a single page or directory on a Web site, enter the URL up to the point at which the filter
should stop matching. For example, to block all pages under
http://www.foo.com/directory , simply
enter that into the list. The list parsing is intelligent enough to handle both complete URLs as
well as partial URLs (such as
Basic wildcards (
* ) matching zero or more characters are supported in simple list entries. For
example, the entry
foo.com/bar*qux matches the paths
/barbazqux , etc., on
Wildcards can also be used in the domain, though there may be performance implications in some cases
(see below); for example,
video.foo.bar.com , etc.
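One way to picture the basic wildcard is as a translation into a regular expression in which every * becomes "zero or more of any character" and everything else is taken literally. The sketch below uses Python's re module for this. It is not the filter's own matching code, and it tests whole strings only, without modeling how the filter treats URL prefixes or case.

```python
import re


def wildcard_to_regex(entry):
    """Compile a simple list entry, treating * as "zero or more characters"."""
    parts = [re.escape(part) for part in entry.split("*")]
    return re.compile(".*".join(parts))


pattern = wildcard_to_regex("foo.com/bar*qux")
print(bool(pattern.fullmatch("foo.com/barqux")))     # True
print(bool(pattern.fullmatch("foo.com/barbazqux")))  # True
print(bool(pattern.fullmatch("foo.com/barbaz")))     # False
```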
Specific parts of a Web site or even ranges of sites can be blocked using a regular-expression
(REGEX:) entry. These entries are more complicated, but also much more powerful. They employ the Perl Compatible Regular Expressions (PCRE) engine and its pattern syntax.
A regular-expression entry takes the form of three colon-separated fields: the
REGEX prefix, the
host or domain name, and the pattern itself. All three fields are case-insensitive by default, and
whitespace surrounding the second and third fields is ignored. A trivial example, somewhat similar
to the foo.com/bar*qux wildcard entry mentioned above, is shown below:
The specified pattern (the third field) is matched against the full URL for each request to the
specified domain (the second field).
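The three-field structure can be illustrated with a short parser. The Python sketch below makes several assumptions: it uses Python's re module rather than PCRE (the two agree for simple patterns like these), the function name is hypothetical, and the foo.com entry is a made-up example rather than one taken from this manual. Splitting on only the first two colons matters because the pattern itself may contain colons.

```python
import re


def parse_regex_entry(line):
    """Split a REGEX entry into (domain, compiled pattern)."""
    prefix, domain, pattern = line.split(":", 2)  # the pattern may itself contain colons
    if prefix.strip().upper() != "REGEX":
        raise ValueError("not a REGEX entry")
    # Whitespace around the second and third fields is ignored; matching is case-insensitive.
    return domain.strip().lower(), re.compile(pattern.strip(), re.IGNORECASE)


# Hypothetical entry, for illustration only.
domain, pattern = parse_regex_entry("regex: Foo.com : /bar.*qux")
print(domain)
print(bool(pattern.search("http://www.foo.com/BARbazqux")))
```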
For performance reasons, pattern matching is performed against "normalized" domains. As an example,
the normalized domain for the entry
youtube.com ; therefore, the entry is
matched against any YouTube sub-domain, not just
Alternatively, a single wildcard (*, or asterisk) can be used to apply a match to all domains
(for example, REGEX:*:porn matches any URL containing the text porn on any domain). This can also be
used to match partial domains (as in REGEX:*:^\w+://[^:/]*porn, which matches the text porn in
the domain portion of the URL, but not in the path). Please note, however, that matching an entry
against all domains does incur a performance penalty. The extent of this penalty depends on several
factors, but on filters with many clients or many global wildcard entries, the effect can be
significant. For this reason, entries of this type should be considered a last resort.
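The difference between the two global patterns mentioned above can be checked quickly with Python's re module, used here as a stand-in for PCRE (the constructs involved behave the same way in both). The sample URLs are hypothetical.

```python
import re

anywhere = re.compile(r"porn", re.IGNORECASE)                  # pattern from REGEX:*:porn
domain_only = re.compile(r"^\w+://[^:/]*porn", re.IGNORECASE)  # pattern from REGEX:*:^\w+://[^:/]*porn

urls = [
    "http://www.pornsite.example/",        # text appears in the domain
    "http://www.foo.com/porn/index.html",  # text appears only in the path
]
for url in urls:
    print(url, bool(anywhere.search(url)), bool(domain_only.search(url)))
```

The first pattern matches both URLs, while the second matches only the URL whose domain contains the text, because [^:/]* cannot cross the slash that ends the domain.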
The syntax of PCRE patterns is described fully in the PCRE documentation ; however, the following
can be used as a quick reference:
^ and $   match the beginning and end of a URL, respectively
( and )   treat a series of characters as a single group
*         matches 0 or more of the preceding group/character
+         matches 1 or more of the preceding group/character
?         matches exactly 0 or 1 of the preceding group/character
.         matches any single character
[^:/]     matches any single character except for : or /
\d        matches any single digit (0 through 9)
\w        matches any single word character (a letter, digit, or underscore)
\b        matches the start or end of a word (the boundary between a word character and a non-word character)
\         can be used in front of any special character to treat it literally
This is a basic domain entry; it matches all pages on all Web sites which are part of the
youtube.com domain — not only video pages, but also, for example,
This is a basic sub-domain entry; it matches all pages on all Web sites which are part of the
mail.google.com domain. This also matches sub-domains further down; for example, it matches
chatenabled.mail.google.com . It does not match any other Google sub-domain — for example,
images.google.com is unaffected.
# Social networking
reddit.com
This is another basic domain entry; this time it is preceded by a comment. The
# Social networking line is not interpreted as a list entry; however, it appears in the Comment field on the
filter-reject page. This is useful for explaining why a page has been blocked; it can also be used
to (for example) give the name of the person who added the entry and/or the date they added it.
reddit.com # Added 12/01
This entry uses an alternative comment style (called a "trailing" or "line-ending" comment). This
type of comment is simply discarded by the filter; it does not appear on the filter-reject page.
This is a regular-expression entry. The
reddit.com after the first colon indicates that this entry
is matched only against Web sites under the
reddit.com domain (including
old.reddit.com , and so on).
The \b(cat|dog)s?\b after the second colon indicates that the entry matches any reddit URL
containing one of the following whole words: cat, cats, dog, or dogs. (\b matches the start
or end of a whole word; the (x|y) structure means "either x or y"; and the s? means that the
s may or may not appear.)
Because of the "whole word" requirement, this rule does not match, for example, the word
bulldog. However, it does match dog-catcher, since the hyphen makes two separate words.
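The whole-word behavior of this pattern is easy to verify with Python's re module, which treats \b, alternation, and ? the same way PCRE does. The reddit paths below are made up for the test, and the helper name is hypothetical.

```python
import re

pattern = re.compile(r"\b(cat|dog)s?\b", re.IGNORECASE)


def matches(url):
    """True if the URL contains cat, cats, dog, or dogs as a whole word."""
    return pattern.search(url) is not None


for path in ["cats", "dog-catcher", "bulldog", "category"]:
    print(path, matches("https://www.reddit.com/r/" + path))
```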
This is another regular-expression entry — this time a "global wildcard" entry. The Web filter tries
to match the pattern against every single URL that passes through. As mentioned above, this does
incur a certain performance hit, so it is important to use this type of rule only when absolutely
necessary.
The porn after the second colon indicates that the entry matches the text porn if it appears
anywhere in the URL (whether the domain or the path). Note that this entry also matches the text
anti-pornography, for example, since it still contains porn.
This is another "global wildcard" entry which is tested against every URL passing through the
filter. Once again, this type of entry does incur a certain performance hit.
^ after the second colon is an anchor that means the expression matches only at the very
beginning of the URL (without this, the expression would match anywhere).
? means "zero or one of the preceding character"
— in this case, the preceding character is an
[^/:]+ matches one or more (
+ ) of any character that is not a colon or slash (
[^:/] ). Matching
only non-colon, non-slash characters ensures that the pattern is only tested against the first part
of the URL (the domain).
\.edu matches the text
.edu . The backslash is necessary because, in regular expressions, a dot
by itself means "any character."
[:/] matches either a
: or a
/ . This is useful to ensure that the pattern matches
only at the very end of the domain name (otherwise, it might match a domain like
The pattern in this case is much more specific than the
porn example above — it only matches Web
sites with domains ending in
.edu (such as
mit.edu ). It does not match URLs like
http://edu.foo.com/ , nor
This is a simple wildcard entry which is effectively identical to the
example shown above. Although it is much easier to read and write, it has all of the same