4.7.2.12. URL filtering in HTTP

Starting with version 3.3FR1, Zorp supports category-based URL filtering using a regularly updated database.

Configuring URL-filtering in HTTP

The URLs and domains in the database are organized into thematic categories like adult, news, jobsearch, etc.

To enable URL filtering, set the enable_url_filter and enable_url_filter_dns options to TRUE. The enable_url_filter_dns option is needed only to ensure that a domain or URL is categorized correctly even when the database lists it by domain name but the client accesses it by IP address (or vice versa).

Note

URL filtering is handled by the Zorp Http proxy itself; ZCV is not required. The URL-filtering capability of Zorp is available only after purchasing the url-filter license option.

Updates to the URL database are automatically downloaded daily from the BalaSys website using the zavupdate utility.

Access to specific categories can be set using the url_category option, which is a hash indexed by the name of the category. The following actions are possible:

Action              Description
HTTP_URL_ACCEPT     Permit access to the URL.
HTTP_URL_REJECT     Reject the request. The error code and reason for the rejection are specified in the second and third arguments. See Section Configuring URL-filtering in HTTP for details.
HTTP_URL_REDIRECT   Redirect the connection to the URL specified in the second argument.

Table 4.14.  Action codes for URL filtering

Example 4.14. URL-filtering example

The following example blocks several categories and accepts the rest. For a complete list of categories, see Section List of URL-filtering categories.

class MyHTTPUrlFilter(HttpProxy):
    def config(self):
        HttpProxy.config(self)
        self.enable_url_filter=TRUE
        self.enable_url_filter_dns=TRUE
        self.url_category['adult']=(HTTP_URL_REJECT, (403, "Adult website",))
        self.url_category['porn']=(HTTP_URL_REJECT, (403, "Porn website",))
        self.url_category['malware']=(HTTP_URL_REJECT, (403, "Site contains malware",))
        self.url_category['phishing']=(HTTP_URL_REJECT, (403, "Phishing site",))
        self.url_category['warez']=(HTTP_URL_REJECT, (403, "Warez site",))
        self.url_category['*']=(HTTP_URL_ACCEPT,)
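
The fallback role of the '*' entry can be illustrated outside Zorp with plain Python. This is only a sketch of the lookup semantics suggested by the example above (an explicit category entry wins, otherwise the '*' entry applies). The string constants and the resolve_action helper are hypothetical stand-ins, not part of the Zorp API, and returning None when no '*' entry exists is an assumption of the sketch, not documented Zorp behavior.

```python
# Hypothetical stand-ins for the Zorp constants; real values come from Zorp itself.
HTTP_URL_ACCEPT = "HTTP_URL_ACCEPT"
HTTP_URL_REJECT = "HTTP_URL_REJECT"
HTTP_URL_REDIRECT = "HTTP_URL_REDIRECT"

# A policy shaped like the url_category hash used in the examples.
url_category = {
    "adult": (HTTP_URL_REJECT, (403, "Adult website")),
    "onlinegames": (HTTP_URL_REDIRECT, "http://example.com"),
    "*": (HTTP_URL_ACCEPT,),
}


def resolve_action(policy, category):
    """Return the entry for the category, falling back to the '*' entry."""
    if category in policy:
        return policy[category]
    # Assumption: without a '*' entry the sketch returns None instead of
    # guessing what Zorp would do.
    return policy.get("*")
```

With this policy, resolve_action(url_category, "news") yields the '*' entry, while "adult" yields its explicit reject entry.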

The following example redirects access to online gaming sites to a dummy website.

class MyHTTPUrlFilter(HttpProxy):
    def config(self):
        HttpProxy.config(self)
        self.enable_url_filter=TRUE
        self.enable_url_filter_dns=TRUE
        self.url_category['onlinegames']=(HTTP_URL_REDIRECT, "http://example.com")
        self.url_category['*']=(HTTP_URL_ACCEPT,)

List of URL-filtering categories

The Zorp URL database contains the following thematic categories by default.

  • abortion: Abortion information excluding when related to religion

  • ads: Advert servers and banned URLs

  • adult: Sites containing adult material such as swearing but not porn

  • aggressive: Similar to violence but more promoting than depicting

  • antispyware: Sites that remove spyware

  • artnudes: Art sites containing artistic nudity

  • astrology: Astrology websites

  • audio-video: Sites with audio or video downloads

  • banking: Banking websites

  • beerliquorinfo: Sites with information only on beer or liquors

  • beerliquorsale: Sites with beer or liquors for sale

  • blog: Journal/Diary websites

  • cellphones: Content for mobile/cell phones

  • chat: Sites with chat rooms, etc.

  • childcare: Sites to do with childcare

  • cleaning: Sites to do with cleaning

  • clothing: Sites about and selling clothing

  • contraception: Information about contraception

  • culinary: Sites about cooking and related topics

  • dating: Sites about dating

  • desktopsillies: Sites containing screen savers, backgrounds, cursors, pointers, desktop themes, and similar time-wasting and potentially dangerous content

  • dialers: Sites with dialers such as those for pornography or trojans

  • drugs: Drug related sites

  • ecommerce: Sites that provide online shopping

  • entertainment: Sites that promote movies, books, magazines, humor

  • filehosting: Sites to do with filehosting

  • frencheducation: Sites to do with French education

  • gambling: Gambling sites including stocks and shares

  • games: Game related sites

  • gardening: Gardening sites

  • government: Military and schools, etc.

  • guns: Sites with guns

  • hacking: Hacking/cracking information

  • homerepair: Sites about home repair

  • hygiene: Sites about hygiene and other personal grooming related stuff

  • instantmessaging: Sites that offer messenger client downloads, and web-based messaging sites

  • jewelry: Sites about and selling jewelry

  • jobsearch: Sites for finding jobs

  • kidstimewasting: Sites kids often waste time on

  • mail: Webmail and email sites

  • marketingware: Sites about marketing products

  • medical: Medical websites

  • mixed_adult: Mixed adult content sites

  • mobile-phone: Sites to do with mobile phones

  • naturism: Sites that contain nude pictures and/or promote a nude lifestyle

  • news: News sites

  • onlineauctions: Online auctions

  • onlinegames: Online gaming sites

  • onlinepayment: Online payment sites

  • personalfinance: Personal finance sites

  • pets: Pet sites

  • phishing: Sites attempting to trick people into giving out private information

  • porn: Pornography

  • proxy: Sites with proxies to bypass filters

  • radio: Non-news-related radio and television

  • religion: Sites promoting religion

  • ringtones: Sites containing ring tones, games, pictures and other

  • searchengines: Search engines such as Google

  • sect: Sites about religious groups

  • sexuality: Sites dedicated to sexuality, possibly including adult material

  • shopping: Shopping sites

  • socialnetworking: Social networking websites

  • sportnews: Sport news sites

  • sports: All sport sites

  • spyware: Sites that run spyware or offer spyware software for download

  • updatesites: Sites from which software updates are downloaded, including virus signatures

  • vacation: Sites about going on holiday

  • violence: Sites containing violence

  • virusinfected: Sites that host virus-infected files

  • warez: Sites with illegal pirate software

  • weather: Weather news sites and weather related

  • weapons: Sites detailing or selling weapons

  • webmail: Just webmail sites

  • whitelist: Contains sites suitable for kids

Customizing the URL database

To customize the database, you have to manually edit the relevant files of the database. The URL database is located on the Zorp hosts under the /etc/zorp/urlfilter/ directory. Every thematic category is a subdirectory containing two files called domains and urls. These files contain the list of domains (e.g., example.com) and URLs (e.g., example.com/news/) that fall into the specific category. Optionally, the subdirectory may contain a third file called expressions, where more complex rules can be defined using regular expressions.
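
The directory layout described above can be inspected with a few lines of Python. The following is a minimal sketch, not a Zorp tool: the helper names load_category and categories_for_domain are hypothetical, and the file format (one entry per line, UTF-8 text, blank lines ignored) is an assumption, since the text above does not specify it. The lookup is an exact string match and does not try to reproduce how Zorp itself matches subdomains or URL prefixes.

```python
from pathlib import Path


def load_category(base_dir, category):
    """Read the domains and urls files of one category subdirectory."""
    entries = {}
    for name in ("domains", "urls"):
        path = Path(base_dir) / category / name
        if path.is_file():
            lines = path.read_text(encoding="utf-8").splitlines()
            # Assumption: one entry per line; blank lines are skipped.
            entries[name] = {line.strip() for line in lines if line.strip()}
        else:
            entries[name] = set()
    return entries


def categories_for_domain(base_dir, domain):
    """List the categories whose domains file contains exactly this domain."""
    base = Path(base_dir)
    found = []
    for sub in sorted(p for p in base.iterdir() if p.is_dir()):
        if domain in load_category(base, sub.name)["domains"]:
            found.append(sub.name)
    return found
```

On a Zorp host, base_dir would be /etc/zorp/urlfilter/; for experiments, any scratch directory with the same layout works.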

  • To allow access to (whitelist) a domain or URL, add it to the domains or urls file of the whitelist category. Do not forget to configure your Http proxies to permit access to the domains of the whitelist category.

    Warning

    Deleting a domain from a category is not equivalent to whitelisting. Deleted domains will be re-added to their original category after the next database update.

  • To add a new URL or domain to an existing category, create a new subdirectory under /etc/zorp/urlfilter/, create the domains and urls files for this new category, and add the domain or URL (without the http://www. prefix) to the domains or urls file. Zorp will automatically add these sites to the specific category after the next daily database update, or when the zufupdate command is executed.

  • To create a new category, create a new subdirectory under /etc/zorp/urlfilter/, create the domains and urls files for this new category, and add domains and URLs to these files. Do not forget to configure your Http proxies to actually use the new category.
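
The manual steps in the list above (create the subdirectory, create the domains and urls files, add entries without the http://www. prefix) can be scripted. The following is a sketch under stated assumptions rather than a supported tool: normalize_entry and create_category are hypothetical helpers, the files are assumed to hold one entry per line as UTF-8 text, and the base directory is passed in explicitly so the code can be tried against a scratch directory before touching /etc/zorp/urlfilter/.

```python
import re
from pathlib import Path


def normalize_entry(entry):
    """Strip a leading scheme (such as http://) and a leading 'www.'."""
    entry = entry.strip()
    entry = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", entry)
    if entry.lower().startswith("www."):
        entry = entry[4:]
    return entry


def create_category(base_dir, category, domains=(), urls=()):
    """Create base_dir/category with domains and urls files.

    Existing entries are kept; new entries are appended without duplicates.
    """
    cat_dir = Path(base_dir) / category
    cat_dir.mkdir(parents=True, exist_ok=True)
    for name, items in (("domains", domains), ("urls", urls)):
        path = cat_dir / name
        if path.exists():
            merged = path.read_text(encoding="utf-8").splitlines()
        else:
            merged = []
        for item in map(normalize_entry, items):
            if item and item not in merged:
                merged.append(item)
        # Assumption: one entry per line, each terminated by a newline.
        path.write_text("".join(line + "\n" for line in merged), encoding="utf-8")
    return cat_dir
```

For example, create_category("/tmp/urlfilter-test", "intranet", domains=["http://www.example.com"]) writes example.com into the domains file of a new intranet category in the scratch directory. As with any manual change, files written under /etc/zorp/urlfilter/ take effect only after the next daily database update or after the zufupdate command is executed, and the Http proxies still have to be configured to use the category.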

Warning

Manual changes to the URL database are not applied automatically; they become effective only after the next daily database update, or when the zufupdate command is executed.

Note

Manual changes are automatically merged with the original database during database updates.

If you are using the URL-filter database on several Zorp hosts and modify the database manually, make sure to copy your changes to the other hosts as well.