Cloudmark Authority Data Exchange and Privacy
Introduction
This document describes how Cloudmark Authority, Cloudmark's message content anti-spam/virus filtering solution, can be deployed to ensure that no data that is private to subscribers is sent offsite. In this scenario, Cloudmark is completely compliant with the strictest privacy laws such as those of Switzerland, Russia and Germany.
Cloudmark Authority Message Processing
Cloudmark Authority Anti-Spam processes messages using the following mechanism:
- Messages are pre-processed to put them into a well-known state (for example, by removing extraneous whitespace from text and decoding compression formats).
- Sets of “fingerprinting” algorithms are run sequentially on the message content. Each algorithm “transforms” the message content in a unique way, producing a new piece of content that may contain extracts of the message or may be a new representation of it (for example, one algorithm performs entropy analysis on the message content).
- The transformed message content is subjected to an MD5 hashing process, producing the Cloudmark "fingerprint". Each algorithm can produce zero or more fingerprints for a given message.
- The fingerprints thus generated are compared against an in-memory cache of known spam fingerprints. The cache is entirely local to the memory and disk of the server running Cloudmark Authority.
The Cloudmark fingerprint cache is updated using HTTPS, and is signed for data integrity purposes.
In none of the above steps does any data leave the server that processes the message.
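The steps above can be sketched as follows. This is a minimal illustration and not Cloudmark's implementation: the two transform functions (`whole_body`, `urls_only`) are hypothetical stand-ins for the real fingerprinting algorithms, which this document does not specify, and the cache is modelled as a plain in-memory set.

```python
import hashlib
import re


def preprocess(text: str) -> str:
    """Put the message into a well-known state (here: collapse whitespace only)."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical transforms standing in for the real fingerprinting algorithms.
def whole_body(text: str) -> list[str]:
    return [text] if text else []


def urls_only(text: str) -> list[str]:
    return re.findall(r"https?://\S+", text)


ALGORITHMS = [whole_body, urls_only]


def fingerprints(message: str) -> set[str]:
    """Run each algorithm in turn and MD5-hash every transformed piece.

    An algorithm may yield zero or more pieces, so it may contribute
    zero or more fingerprints for a given message.
    """
    normalized = preprocess(message)
    result = set()
    for algorithm in ALGORITHMS:
        for transformed in algorithm(normalized):
            result.add(hashlib.md5(transformed.encode("utf-8")).hexdigest())
    return result


def is_spam(message: str, local_cache: set[str]) -> bool:
    """Compare against the local cache only; nothing leaves the server."""
    return not fingerprints(message).isdisjoint(local_cache)
```

Note that the comparison is a purely local set lookup: the only inputs are the message and the cache already held on the server.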
Cloudmark Authority Data Exchange
Data Received by Authority
Cloudmark Authority stops abuse by generating “fingerprints” on messages that are sent to it, and comparing those fingerprints against a local cache of known “bad” fingerprints. This process involves no data exchange between the Authority instance hosted by the service provider and the Cloudmark Service, and is thus performed entirely on the service provider's infrastructure.
The cache of bad fingerprints is updated every 60 seconds by retrieving the latest fingerprints from the Cloudmark service.
Because fingerprints are true cryptographic hashes, generated by the Cloudmark Service, they contain no personal data whatsoever.
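As a rough sketch of this one-way flow, the cache can be modelled as a set that is only ever written to by an inbound update. The class name, the `fetch_latest` callable and the merge-only update are assumptions made for illustration; the actual update format, its HTTPS transport and its signature check are not described here and are omitted.

```python
from typing import Callable, Iterable, Optional

UPDATE_INTERVAL_SECONDS = 60  # update frequency stated in the text


class FingerprintCache:
    """Local cache of known-bad fingerprints (hypothetical model)."""

    def __init__(self) -> None:
        self._known_bad: set[str] = set()
        self._last_update: Optional[float] = None

    def needs_update(self, now: float) -> bool:
        """True if no update has happened yet or the interval has elapsed."""
        if self._last_update is None:
            return True
        return now - self._last_update >= UPDATE_INTERVAL_SECONDS

    def update(self, fetch_latest: Callable[[], Iterable[str]], now: float) -> None:
        """Merge newly retrieved fingerprints; data only flows inbound."""
        self._known_bad.update(fetch_latest())
        self._last_update = now

    def contains(self, fingerprint: str) -> bool:
        return fingerprint in self._known_bad
```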
License Check
Cloudmark Authority performs a license check to ensure that use of the service is permitted. Every 60 seconds, it sends the license string stored in the license.cfg file on the server via an HTTP POST. This string contains no information about the server, so nothing is revealed if it is intercepted.
Example: license string
12345
dc37835d7e7ere24524281788070ecda8cec7719d7824f070a4d0dba507a7ef3
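To make the scope of this request concrete, the sketch below builds (but does not send) a POST whose body is the license string and nothing else. The endpoint URL and the content type are placeholders invented for the example; the real endpoint and wire format are not given in this document.

```python
import urllib.request

# Hypothetical endpoint; the real license service URL is not documented here.
LICENSE_URL = "http://license.example.invalid/check"


def build_license_request(license_string: str) -> urllib.request.Request:
    """Build a POST whose body is only the license string."""
    return urllib.request.Request(
        LICENSE_URL,
        data=license_string.encode("ascii"),
        method="POST",
        headers={"Content-Type": "text/plain"},  # assumed content type
    )


# Example using the sample license string shown above.
sample = "12345\ndc37835d7e7ere24524281788070ecda8cec7719d7824f070a4d0dba507a7ef3"
request = build_license_request(sample)
```

The request body carries no hostnames, addresses or message data, which is the property the text relies on.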
Statistical Data Sent By Authority
Under normal operation, Cloudmark Authority sends data to the Cloudmark Service (currently located in two data centers in the USA). No data about individual messages is sent as part of this process.
Rather, the data sent is in the form of general statistics about the Authority installation, for example:
- Number of messages scanned in a time interval
- Number of spam messages detected
- Number of times a particular fingerprint was seen
- Configuration parameters
- Issues encountered with the 60-second update download
Additional detailed statistics can also be enabled:
- Spam and legitimate mail seen from each connecting IP
- All of the fingerprints generated from each message scanned
None of the fingerprint data sent to Cloudmark as a summary identifies, or can be associated with, an individual subscriber or the content sent by that subscriber (see below).
Reporting of any of the statistical data described above can be disabled by the customer. However, doing so may adversely affect the ability of Cloudmark to respond quickly to new spam threats.
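The general statistics described above are aggregate counters rather than per-message records. A minimal sketch of such a summary follows; the class, field names and report layout are hypothetical, since the actual report format is not described in this document.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IntervalStats:
    """Aggregate counters for one reporting interval (hypothetical layout)."""

    messages_scanned: int = 0
    spam_detected: int = 0
    fingerprint_hits: Counter = field(default_factory=Counter)

    def record(self, matched_fingerprints: list[str], is_spam: bool) -> None:
        """Update the counters; no message content or subscriber is stored."""
        self.messages_scanned += 1
        if is_spam:
            self.spam_detected += 1
        self.fingerprint_hits.update(matched_fingerprints)

    def report(self, enabled: bool = True) -> Optional[dict]:
        """Return the summary, or None when reporting has been disabled."""
        if not enabled:
            return None
        return {
            "messages_scanned": self.messages_scanned,
            "spam_detected": self.spam_detected,
            "fingerprint_hits": dict(self.fingerprint_hits),
        }
```

For example, after three calls to `record`, the report holds only three totals and a per-fingerprint count; it cannot be traced back to any one of the three messages.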
Feedback from Subscribers
Service providers implementing Cloudmark Authority have the option to give their users the ability to report missed spam messages (False Negatives) as well as messages incorrectly tagged as spam (False Positives) to the Cloudmark Service.
The benefit of doing this is that the Cloudmark Service responds automatically to this feedback and corrects itself accordingly. The Cloudmark Service also tracks the trust of subscribers who report, in order to separate “good” reporters from “bad” reporters. Users can send feedback in two different ways.
Traditional Feedback Method
The traditional method is for the following data to be sent to the Cloudmark Service:
- The entire message is sent to the Cloudmark Service
- The email address of the reporter who sent the feedback is sent to the Cloudmark Service
The traditional method allows Cloudmark to perform analysis on the whole message in order to detect long-term trends. However, in countries/territories where privacy laws prevent the submission of personal data by subscribers (even, as in this case, where the submission is initiated by the subscriber themselves), there is an alternative method, which enables feedback to be sent which contains no personal/private data.
Private Feedback Method
The private feedback method sends the following data to the Cloudmark Service:
- Only the fingerprints of the message are sent to the Cloudmark Service
- A unique id (opaque to Cloudmark) representing the subscriber is sent to the Cloudmark Service
The private feedback method means that no personal/private data is sent to the Cloudmark service, either in terms of message content or subscriber identity. Cloudmark is still able to automatically respond to the feedback, and track the trust of subscribers.
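A private feedback report could look like the sketch below. The way the opaque id is derived here (an HMAC of the subscriber's address under a secret held only by the service provider) is an assumption chosen for illustration; this document states only that the id is unique and opaque to Cloudmark. The field names are likewise hypothetical.

```python
import hashlib
import hmac


def opaque_subscriber_id(provider_secret: bytes, subscriber: str) -> str:
    """Derive a stable id that is meaningless without the provider's secret."""
    return hmac.new(
        provider_secret, subscriber.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def private_feedback(
    message_fingerprints: set[str],
    subscriber: str,
    provider_secret: bytes,
    kind: str,
) -> dict:
    """Build a report containing fingerprints and an opaque id only."""
    if kind not in ("false_negative", "false_positive"):
        raise ValueError("kind must be 'false_negative' or 'false_positive'")
    return {
        "kind": kind,
        "fingerprints": sorted(message_fingerprints),
        "reporter": opaque_subscriber_id(provider_secret, subscriber),
    }
```

Because the same subscriber always maps to the same id, reporter trust can still be tracked over time, while neither the message nor the email address appears in the report.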
Conclusions
The following conclusions can be drawn about the security of Cloudmark fingerprints:
- No two-way encryption is used, thus there are no issues with key integrity.
- Because the message content is transformed and the original content discarded, there is no way to retrieve the original message content.
- Only hashing algorithms are used, thus there is no way to extract even the transformed message content.
- Even though MD5 has been shown to have certain weaknesses when used for digital signing/data integrity purposes, those weaknesses are not particularly relevant to the Cloudmark use of the algorithm for generating message fingerprints.
- All statistics reporting can be disabled.
- Feedback from subscribers on missed spam or incorrectly tagged spam can be reported in a private manner, or indeed not reported at all.