2.1.6. The concept of the CF framework

CF is not a content vectoring engine; it is a framework for managing and configuring various third-party content vectoring modules (engines) through a uniform interface. PNS uses these modules to filter traffic. The modules run independently of PNS and do not even have to run on the same machine. PNS sends the data to be inspected to these modules, along with configuration parameters appropriate for the scenario. For example, a virus filtering module can inspect all files in the traffic, but with different parameters for files in HTTP downloads and for e-mail attachments. Different scenarios can also use different sets of modules to inspect the traffic. Continuing the example, HTTP traffic could be inspected with a virus filter and a content filter, and all client-side scripts could be removed. E-mails could be scanned for viruses using the same virus filtering module (but possibly with stricter settings) and also inspected by a spam filtering module.


Figure 2.4. Interaction of PNS and CF

  • A PNS proxy can send data for further inspection to a CF rule group.

  • A rule group is used to define a scenario (using a set of router rules).

  • The router rules of the scenario are condition – action pairs that determine how a particular object should be inspected. This decision is based on meta-information about the traffic or objects received from PNS and on information collected by CF.

    • The condition can be any information that PNS/CF can parse, for example, the client's IP address, the MIME-type of the object, and so on.

    • The action is either a default action (such as ACCEPT or REJECT) or a scanpath, that is, a list of content vectoring module instances (the modules and their settings corresponding to the scenario) that will inspect the traffic. Each rule group has a scanpath configured as its default, but the router rules in the group can select a different scanpath for certain conditions.
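The relationship between rule groups, router rules, and scanpaths can be sketched as a small data model. The Python below only illustrates the concepts listed above; it is not the configuration syntax or API of CF. All class, field, and module names are hypothetical, and the first-match evaluation order is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

# Hypothetical model of the CF concepts; none of these names come from CF itself.
ACCEPT = "ACCEPT"
REJECT = "REJECT"


@dataclass
class ModuleInstance:
    """A content vectoring module together with scenario-specific settings."""
    module: str
    settings: Dict[str, str] = field(default_factory=dict)


@dataclass
class Scanpath:
    """An ordered list of module instances that inspect the traffic."""
    name: str
    instances: List[ModuleInstance]


# An action is either a default action (ACCEPT/REJECT) or a scanpath.
Action = Union[str, Scanpath]


@dataclass
class RouterRule:
    """A condition-action pair; the condition examines meta-information."""
    condition: Callable[[Dict[str, str]], bool]
    action: Action


@dataclass
class RuleGroup:
    """A scenario: a default scanpath plus router rules that may override it."""
    name: str
    default_scanpath: Scanpath
    rules: List[RouterRule] = field(default_factory=list)

    def route(self, meta: Dict[str, str]) -> Action:
        # Assumption: the first matching rule decides; otherwise use the default.
        for rule in self.rules:
            if rule.condition(meta):
                return rule.action
        return self.default_scanpath


# Usage: an illustrative HTTP scenario with two router rules.
full_scan = Scanpath("full", [ModuleInstance("virus-filter"),
                              ModuleInstance("content-filter")])
virus_only = Scanpath("virus-only", [ModuleInstance("virus-filter")])

http_group = RuleGroup("http", default_scanpath=full_scan, rules=[
    RouterRule(lambda m: m.get("client_ip") == "10.0.0.66", REJECT),
    RouterRule(lambda m: m.get("mime_type", "").startswith("image/"), virus_only),
])
```

In this sketch, an object matching no router rule falls through to the default scanpath of the rule group, which mirrors the behavior described in the last bullet.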

The examples shown in Figure 2.5, Content vectoring scenarios in CF, can be expressed in the CF terms defined above as follows:


Figure 2.5. Content vectoring scenarios in CF

  1. There are two rule groups (scenarios) defined, one for HTTP traffic, one for SMTP.

  2. Router rules in the HTTP rule group call a scanpath.

  3. The scanpath includes instances of a virus filtering module, a content filtering module, and an HTML module; the HTML module instance is configured to remove all scripts.

This is only a basic example; further router rules could be used to optimize the decisions (for example, there is little point in trying to remove client-side scripts from downloaded files that are not HTML). Similarly, another rule group corresponds to the SMTP scenario, with a scanpath that includes a virus filtering and a spam filtering module instance.
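The two scenarios, and the optimization mentioned for non-HTML downloads, might be written down as follows. This is a minimal sketch in Python, not actual CF configuration: the module names, the setting keys (`sensitivity`, `remove_scripts`), and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: module names, setting keys, and values are illustrative only.

# HTTP scenario: virus filter, content filter, and an HTML module that strips scripts.
HTTP_SCANPATH = [
    ("virus-filter", {"sensitivity": "normal"}),
    ("content-filter", {}),
    ("html", {"remove_scripts": True}),
]

# SMTP scenario: the same virus filtering module with stricter settings, plus a spam filter.
SMTP_SCANPATH = [
    ("virus-filter", {"sensitivity": "strict"}),
    ("spam-filter", {}),
]


def http_scanpath_for(mime_type):
    """Router-rule style optimization: skip the HTML module for non-HTML objects."""
    if mime_type == "text/html":
        return HTTP_SCANPATH
    return [(module, settings) for module, settings in HTTP_SCANPATH
            if module != "html"]


def module_names(scanpath):
    """Return only the module names of a scanpath, in inspection order."""
    return [module for module, _ in scanpath]


print(module_names(http_scanpath_for("text/html")))
# ['virus-filter', 'content-filter', 'html']
print(module_names(http_scanpath_for("application/zip")))
# ['virus-filter', 'content-filter']
print(module_names(SMTP_SCANPATH))
# ['virus-filter', 'spam-filter']
```

Note that the virus filtering module appears in both scanpaths with different settings, which corresponds to using separate instances of the same module for different scenarios.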

The whole process is summarized in the following procedure.