zcv.cfg

Description

The zcv.cfg file controls the operation of ZCV, the Zorp Content Vectoring Server.

Structure

zcv.cfg uses an XML-like format to describe various configuration settings. The exact structure is configuration/section/<setting>, where the "name" attribute of the configuration block identifies the ZCV subsystem described by the nested tags.

The main configuration blocks of the file are the following:

zcv: Global options of zcv.
scanpaths: Definitions and settings of the scanpaths.
nod32, html, etc.: Definitions and instance-specific settings of the modules.
module-options: Global settings of the modules that apply to every instance of the module.

The example below sets the global options used by ZCV, broken down into three sections: log for log-related settings, router for the path to the router.cfg file, and misc for miscellaneous parameters.

    <configuration name="zcv">
      <section name="log">
        <loglevel>3</loglevel>
        <use_syslog>true</use_syslog>
        <logtags>true</logtags>
      </section>
      <section name="router">
        <router>/etc/zcv/router.cfg</router>
      </section>
      <section name="misc">
        <magic_length>2048</magic_length>
      </section>
    </configuration>

The ZCV modules have a slightly different structure. The name attribute of the configuration tag identifies the module, and the name attribute of each section identifies an instance of that module. Each instance can be run with a different parameter set. The example below shows a complete configuration block for the clamav module with two instances, internet and intranet, which differ in their disinfect and scan_suspicious settings.

    <configuration name="clamav">
      <section name="internet">
        <mode>file</mode>
        <scan_packed>1</scan_packed>
        <disinfect>0</disinfect>
        <scan_suspicious>1</scan_suspicious>
        <heuristic_level>normal</heuristic_level>
      </section>
      <section name="intranet">
        <mode>file</mode>
        <scan_packed>1</scan_packed>
        <disinfect>1</disinfect>
        <scan_suspicious>0</scan_suspicious>
        <heuristic_level>normal</heuristic_level>
      </section>
    </configuration>

The router.cfg file

The router.cfg file controls the scanpath selection in ZCV. ZCV selects the scanpath based on the meta-information that Zorp supplies. Each line in router.cfg consists of a condition and an action, separated by whitespace. When an incoming request matches a condition, the corresponding action identifies the scanpath and its instance to be used.

The condition is a comma-separated list of constraints, each constraint identifying a variable and an expected value in the header=match,header=match,... format. Wildcard characters like '*' and '?' can be included in the matches. The following variables are currently defined:

zcv_rule_group: The name of the rule group that the peer requests. Its value is specified by the -R command-line option in Zorp mode, or is supplied by the peer during the handshake.
content_type_detected: MIME type detected based on the first bytes of the file.
content_type_uncompressed: MIME type detected based on the first bytes of the file, after inspecting the header of a compressed file and decompressing the file if necessary.
content_type: MIME type as specified by the peer.
file_name: File name or URL.
file_extension: File extension. Note that this information might not be accurate: some URLs do not contain a file extension, in which case this variable is empty. For example, HTTP URLs commonly reference directories that implicitly map to server-defined content, as in http://domain.com/directory/. For content-specific scanning, it is better to use content_type or content_type_detected.
file_xfer_direction: File transfer direction, either "upload" or "download".
zorp_protocol: Protocol that was used to transfer the checked file.
zorp_session_id: Zorp session id that requested content vectoring.
zorp_proxy_class: The name of the proxy class that requested content vectoring.
zorp_auth_user: The authenticated username.
zorp_client_address: Client address in AF_INET(<ipaddr>, <port>) format.
zorp_client_address.ip: Client IP address.
zorp_client_address.port: Client TCP/UDP port.
zorp_client_zone: The name of the client zone.
zorp_server_address: Server address in AF_INET(<ipaddr>, <port>) format.
zorp_server_address.ip: Server IP address.
zorp_server_address.port: Server TCP/UDP port.
zorp_server_zone: The name of the server zone.
smtp_envelope_sender: The envelope sender address in SMTP.
smtp_envelope_recipients: Space separated list of envelope recipient addresses in SMTP.
nntp_group_name: The name of the NNTP newsgroup.
http_request_method: The type of the HTTP request.
http_request_url: The HTTP request URL.
http_request_version: The version of the HTTP request (e.g. 1.1).
http_request_host: The Host header included in the HTTP request.

Furthermore, virtually all defined Zorp variables can be used with the 'zorp.' prefix, which denotes the 'session' object of the stacking proxy, for example zorp.session_id or zorp.client_address.ip_s.

The action identifies the ZCV scanpath to use.

The example below selects the html scanpath for all files which are recognized as "text/html" files, and rejects everything else. An object is scanned only by the scanpath of the first matching condition.

    content_type="text/html" html
    content_type_detected="text/html" html
    REJECT

Global Options

Global options are stored in the configuration block named zcv. Related options are grouped into sections.

Section log

use_syslog: Use syslog for logging.
logtags: Enable the logging of message tags.
loglevel: Level of verbosity for logging messages. Default value: 3.
logspec: Set verbosity mask on a per category basis. The format of this value is described in zorp(8).

Section misc

magic_length: The amount of data (in bytes) read from MIME objects to detect their MIME type. A higher value increases the precision of MIME type detection. Default value: 0.
tempdir: Location of the temporary directory (used for swap files, etc.). Default value: /var/lib/zorp/tmp

Section router

router: Location of the router.cfg file. Default value: /etc/zcv/router.cfg

Section bind

ip: IP address to which ZCV binds. Default value: 0.0.0.0.
port: Port to which ZCV binds. Default value: 1318.
unix: Bind to a unix domain socket. If only the empty tag is present, the default socket (/var/run/zcv/zcv.sock) is used.

When binding to a unix domain socket, the owner and the permissions of the socket can be set using the following parameters:

owner: The owner of the socket. By default its value is NULL, meaning that the owner of the socket is the user running ZCV.
group: The owner group of the socket. Default value: zorpstate.
perm: The permission settings of the socket in Unix-style. Default value: 770.
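
As an illustration, a bind section that restricts ZCV to the loopback interface on the default port could look like the sketch below. It follows the configuration/section/<setting> layout used by the zcv block earlier in this page; the address 127.0.0.1 is only an example value.

```xml
<configuration name="zcv">
  <section name="bind">
    <!-- Example value: accept connections from the local host only -->
    <ip>127.0.0.1</ip>
    <!-- 1318 is the documented default port -->
    <port>1318</port>
  </section>
</configuration>
```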

Section blob

hiwat: ZCV tries to store everything in memory if possible. If the memory usage of ZCV reaches hiwat, it starts to swap data to the hard disk until the memory usage drops to lowat. Default value: 960 * 1024 * 1024 (960 MB).
lowat: Lower threshold of data swapping. Default value: 640 * 1024 * 1024 (640 MB).
max_disk_usage: The maximum amount of hard disk space that ZCV is allowed to use. Default value: 64 * 1024 * 1024 * 1024 (64 GB).
max_mem_usage: The maximum amount of memory that ZCV is allowed to use. Default value: 1024 * 1024 * 1024 (1 GB).
noswap_max: Objects smaller than this value (in bytes) are never swapped to hard disk. Default value: 16384.
deadlock_check_period: The interval (in seconds) between deadlock checks, which resolve deadlocks that occur when the storage is full. Default value: 5.
allocation_timeout: The time (in seconds) to wait for blob allocation when the storage is full. Default value: 10.
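
The sketch below shows how the swapping thresholds might be lowered on a host with less memory. It assumes that the values are written as plain byte counts (the defaults above are given as products of 1024, and noswap_max is documented in bytes); whether arithmetic expressions are accepted is not stated here, so none are used.

```xml
<configuration name="zcv">
  <section name="blob">
    <!-- Assumption: sizes are plain integers in bytes -->
    <!-- Start swapping at 512 MB (512 * 1024 * 1024) -->
    <hiwat>536870912</hiwat>
    <!-- Stop swapping once usage drops to 256 MB (256 * 1024 * 1024) -->
    <lowat>268435456</lowat>
    <!-- Objects below 16 KB are never swapped (documented default) -->
    <noswap_max>16384</noswap_max>
  </section>
</configuration>
```

Keeping lowat well below hiwat avoids swapping starting and stopping in rapid succession around a single threshold.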

Scanpath Options

The scanpath options are stored in the configuration block named scanpaths. Each section in this block has the name of a scanpath and contains settings specific to the given scanpath.

Settings that control trickling can also be configured here. Content filtering cannot be performed on partial files: the entire file has to be available on the firewall, and sending the file to the client starts only if no virus was found (or the file was successfully disinfected). As a result, instead of receiving a continuous stream of data as it would when connecting to the server directly, the client receives nothing for a while, and then the data suddenly starts to flow. This phenomenon is not a problem for small files, since these are transmitted and checked quickly, probably without the user ever noticing the delay, but it can be an issue for larger files, when the client application might time out. The delay can also be annoying when the bandwidth of the network on the client side and on the server side of the firewall differs significantly.

To avoid timeouts, a solution called trickling is used: the firewall starts to send small pieces of data to the client, so that the client sees that it is receiving something and does not time out. For further information on trickling, see the Virus filtering and HTTP Technical White Paper available at the BalaSys Documentation Page at http://www.balasys.hu/en/documentation/

The following options are available for each scanpath:

plugins

Comma-separated list of colon-separated pairs listing the modules to be executed in the scanpath. Each pair specifies a module and its instance (e.g.: html:filterscripts, nod32:paranoid).

quarantine_mode

Quarantine mode to be used. The original file is always the one that is quarantined.

always: Quarantine all objects rejected for any reason.
rejected: Quarantine objects that could not be disinfected.
modified+rejected: Quarantine only the original version of the files which were successfully disinfected. E.g.: if an infected object is found but it is successfully disinfected, the original (infected) object is quarantined. That way, the object is retained even if the disinfection eliminates some important information.
never: Disable quarantining, objects rejected for any reason are dropped.

threshold_oversize

Objects larger than threshold_oversize (in bytes) are not scanned, for performance and resource reasons (e.g. large archives, ISO files, etc.).

trickle_mode

Mode of trickling to be used. Default: NONE.

none: Trickling is disabled.
percent: Determine the amount of data to be trickled based on the size of the object. Data is sent to the client only when ZCV receives new data; the size of the data trickled is the set percentage of the total data received so far. This is the recommended method to use.
steady: Trickle a fixed amount of data at fixed time intervals.

trickle_percent

Amount of data to be trickled (percentage). Default value: 10.

trickle_steady_initial_delay

When an object is downloaded, trickling is started after this period (in seconds). Default value: 10.

trickle_steady_delay

Period (in seconds) between trickling data chunks.

trickle_steady_bytes

Amount of data (in bytes) that is sent to the client in a chunk during trickling. Default value: 128 bytes.
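
Putting the options above together, a scanpaths block might look like the following sketch. The html scanpath name matches the router.cfg example, and the html:filterscripts and clamav:internet pairs reuse instance names mentioned elsewhere on this page; the download scanpath name and the 100 MB oversize limit are hypothetical values chosen for illustration.

```xml
<configuration name="scanpaths">
  <!-- Scanpath referenced by the "html" action in router.cfg -->
  <section name="html">
    <plugins>html:filterscripts</plugins>
    <quarantine_mode>rejected</quarantine_mode>
    <trickle_mode>none</trickle_mode>
  </section>
  <!-- Hypothetical scanpath name, used only for this illustration -->
  <section name="download">
    <plugins>clamav:internet</plugins>
    <quarantine_mode>always</quarantine_mode>
    <!-- Skip objects above 100 MB (100 * 1024 * 1024 bytes) -->
    <threshold_oversize>104857600</threshold_oversize>
    <!-- Percent-based trickling, the recommended method -->
    <trickle_mode>percent</trickle_mode>
    <trickle_percent>10</trickle_percent>
  </section>
</configuration>
```

The trickle_steady_* options are omitted because they apply only when trickle_mode is set to steady.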

Modules

The following modules are available in ZCV:

sed: Filters and rewrites the input in stream mode, similarly to the UNIX 'sed' command.
nod32: Performs virus scanning on the incoming data with the NOD32 engine. The data is processed in file mode.
clamav: Performs virus scanning on the incoming data with the Clam AntiVirus engine. The data is processed in file mode.
html: Performs JavaScript/Java/ActiveX filtering of HTML data in stream mode.
spamassassin: Performs spam filtering on the incoming e-mails with the SpamAssassin engine. The data is processed in file mode.
mail-hdr: Performs filtering and manipulation on the headers of e-mail messages. The data can be processed both in file and stream mode.
modsecurity: ModSecurity is a platform-independent web application firewall (WAF) module that can be integrated into Zorp Gateway. It inspects HTTP(S) traffic and provides a policy definition language and an API for defining security rules.
program: Performs filtering and/or manipulation of the data with an external third-party application. The data can be processed in either file or stream mode.

The Sed module

The configuration name of the sed module is sed. This module has the following instance-specific options:

filter

The structure of this string is the following: a slash (/), the string to be replaced, a slash (/), the replacement string, a slash (/), and the options. Slashes in the strings have to be escaped with backslashes. The following options are available:

-g: Replace all occurrences of the string.
-i: Run in case-insensitive mode.

For example, the /example/sample/-g filter replaces all occurrences of 'example' with 'sample'.
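
Embedded in zcv.cfg, the filter above could be configured as in the sketch below. The instance name rewrite is hypothetical, and the sketch assumes that no other instance settings are required for the sed module.

```xml
<configuration name="sed">
  <!-- "rewrite" is a hypothetical instance name -->
  <section name="rewrite">
    <!-- Replace every occurrence of 'example' with 'sample' -->
    <filter>/example/sample/-g</filter>
  </section>
</configuration>
```

A scanpath would then refer to this instance as sed:rewrite in its plugins option.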

The NOD32 module

The nod32 module has the following instance-specific options:

scan_packed: Perform virus scanning on archived files. Default value: YES.
scan_suspicious: Perform virus scanning on suspicious files (suspicious files are often new variants of known viruses). Default value: NO.
heuristic_level: Level of heuristic sensitivity. The available levels are OFF, NORMAL, and HIGH. Default value: OFF.
archive_max_size: Archives larger than the specified value (in megabytes) are not scanned. Zero means unlimited. Default value: 10.
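
A stricter nod32 instance might be sketched as follows. The instance name paranoid is the one used in the plugins example of the scanpath options; writing boolean values as 1 and the heuristic level in lowercase follows the clamav example near the top of this page and is an assumption for this module.

```xml
<configuration name="nod32">
  <section name="paranoid">
    <mode>file</mode>
    <!-- Assumption: booleans are written as 1/0, as in the clamav example -->
    <scan_packed>1</scan_packed>
    <scan_suspicious>1</scan_suspicious>
    <!-- Assumption: lowercase spelling of the HIGH level is accepted -->
    <heuristic_level>high</heuristic_level>
    <!-- Zero means archives of any size are scanned -->
    <archive_max_size>0</archive_max_size>
  </section>
</configuration>
```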

The clamav module

The configuration name of the Clam AntiVirus module is clamav. The module has the following module options:

daemon_socket: The domain socket used to communicate with the clamav engine. Default value: /var/run/clamav/clamd.ctl

The clamav module has the following instance-specific options:

scan_packed: Perform virus scanning on archived files. Default value: YES.

The SpamAssassin module

The configuration name of the SpamAssassin module is spamassassin. The module has the following instance-specific options:

check_only

Only check the e-mails, but do not make any modification to the e-mail. The result of the spam filtering is returned to ZCV separately. Default value: FALSE.

host and port

The hostname and port number of the machine SpamAssassin is running on, if different from the ZCV host.

socketpath

The domain socket used to communicate with SpamAssassin if it is running on the ZCV host. Default value: /var/run/spamassassin.sock

username

The user under which SpamAssassin should filter e-mails. Default value: not set, the user running SpamAssassin is used (nobody).

timeout

Timeout value for the scanning requests in seconds. Default value: 60.

Note
If the timeout is set to -1 (unlimited), then no timeout is used for the connection if SpamAssassin is running on a remote host.

threshold

By default, ZCV rejects all e-mails that SpamAssassin detects as spam. However, to minimize the impact of false positives, if the spam score of an e-mail (as calculated by SpamAssassin) is above required_score (default value: 5) but below the value set in threshold, ZCV only marks the e-mail as spam and does not reject it. If the spam score of an e-mail is above threshold, the e-mail is automatically rejected. Default value: 10.0.
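
As a sketch of how these options combine, the instance below talks to a local SpamAssassin over the default socket and lowers the rejection threshold to 8.0. The instance name mail and the threshold value are hypothetical.

```xml
<configuration name="spamassassin">
  <!-- "mail" is a hypothetical instance name -->
  <section name="mail">
    <!-- SpamAssassin runs on the ZCV host: use the default domain socket -->
    <socketpath>/var/run/spamassassin.sock</socketpath>
    <timeout>60</timeout>
    <!-- Example value: reject e-mails scoring above 8.0 -->
    <threshold>8.0</threshold>
  </section>
</configuration>
```

With the default required_score of 5, an e-mail scoring 6.5 would only be marked as spam under this configuration, while an e-mail scoring 9 would be rejected.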

The HTML module

The configuration name of the html module is html.

The html module has the following instance-specific options:

filter_javascript

Remove JavaScript from HTML pages. Default value: NO. Enabling this option removes all javascript and script tags, and the conditional value prefixes (e.g.: onclick, onreset, etc.).

filter_activex

Remove ActiveX components from HTML pages. Default value: NO. Enabling this option removes the applet tags and the classid value prefix.

filter_java

Remove Java from HTML pages. Default value: NO. Enabling this option removes the java: and application/java-archive inclusions, as well as the applet tags.

filter_css

Remove CSS (cascading style sheets) from HTML pages. Default value: NO. Enabling this option removes the single link tags, the style tags and options, as well as the class options.

filter_custom

A whitespace-separated list of colon-separated pairs, specifying the headers, tags, etc. to be removed based on their names or their values.

The following HTML elements can be filtered:

Tags: Remove everything between the specified tag and its closing tag. Embedded structures are also handled. E.g.: closed-tag:ul

Single tags: Remove all occurrences of the specified single tag (img, hr, etc.). E.g.: tag:hr

Options: Remove options (e.g.: width, etc.) and their values. E.g.: option:width

Prefixes: Remove all options starting with the set prefix. E.g.: prefix:on will remove all options like onclick, etc.

buffer_size

This attribute controls the size of the internal buffer of this module.
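
The sketch below combines the built-in filters with a custom filter list. The instance name filterscripts is the one used in the plugins example of the scanpath options, and the filter_custom pairs are the examples listed above; writing enabled options as 1 follows the clamav example near the top of this page and is an assumption for this module.

```xml
<configuration name="html">
  <section name="filterscripts">
    <!-- Assumption: booleans are written as 1/0, as in the clamav example -->
    <filter_javascript>1</filter_javascript>
    <filter_activex>1</filter_activex>
    <!-- Whitespace-separated type:name pairs -->
    <filter_custom>closed-tag:ul tag:hr option:width prefix:on</filter_custom>
  </section>
</configuration>
```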

The mail header module

The configuration name of the mail header module is mail-hdr. A filter consists of a pattern (i.e. the header line to be found) enclosed within slashes (/), whitespace, the action to be performed on the header line, and an optional argument. The pattern and the argument can be regular expressions. To search for the pattern in case-insensitive mode, add an i character after the closing slash of the pattern. The following actions can be performed on the mail headers:

Append: Add the argument of the filter as a new header line after the match.
Discard: Discard the entire e-mail message. The argument is returned to the mail server sending the message as an error message.
Ignore: Remove the matching header line from the message.
Pass: Accept the matching header line. This action can be used to create exceptions from other filter rules.
Prepend: Add the argument of the filter as a new header line before the match.
Reject: Reject the entire e-mail message. The argument is returned to the sender of the message as an error message.
Replace: Replace the matching header line with the argument of the filter.

The module has the following instance-specific options:

filter

The list of filters to be applied on the mail headers. For example:

    <filter>
        /^Subject: hello$/i    DISCARD
        /^Date: (.*)/          APPEND "X-Date: \1 \1"
    </filter>

header_wrap_length

If a manipulated header line is longer than this value (in bytes), it is broken into new lines, none of which is longer than header_wrap_length. Default value: .

max_line_length

This attribute controls the maximum length of a header line.

The Modsecurity module

The configuration name of the Modsecurity module is modsecurity. The module has the following module options:

config_file: Ruleset configuration file for ModSecurity. Default value: not set.
transaction_timeout: Timeout value for transactions, in seconds. Default value: 60.
process_request_body: Request bodies will be buffered and processed by ModSecurity. Default value: YES.
process_response_body: Response bodies will be buffered by ModSecurity. Default value: YES.
request_body_limit: The maximum request body size ModSecurity will accept for buffering, in KBytes. Anything over the limit will be rejected with status code 413 (Request Entity Too Large). Default value: 10000.
request_body_no_files_limit: The maximum request body size ModSecurity will accept for buffering in KBytes, excluding the size of any files being transported in the request. Default value: 1000.
response_body_limit: The maximum response body size that will be accepted for buffering, in KBytes. Anything over this limit will be rejected with status code 500 (Internal Server Error). Default value: 1000.
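
Because these are module options rather than instance options, they would presumably be placed in the module-options block described in the Structure section. The sketch below rests on two assumptions that this page does not confirm: that the section inside module-options is named after the module, and that limits are written as plain integers. The path /etc/zcv/modsecurity.conf is a hypothetical location for the ruleset file.

```xml
<configuration name="module-options">
  <!-- Assumption: the section name is the module name -->
  <section name="modsecurity">
    <!-- Hypothetical path to the ModSecurity ruleset file -->
    <config_file>/etc/zcv/modsecurity.conf</config_file>
    <transaction_timeout>60</transaction_timeout>
    <!-- Limits are given in KBytes; these are the documented defaults -->
    <request_body_limit>10000</request_body_limit>
    <response_body_limit>1000</response_body_limit>
  </section>
</configuration>
```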

Author

This manual page was written by the BalaSys Documentation Team <documentation@balasys.hu>.
