Quite some time ago I slapped together a couple of scripts to submit quarantined attachments from my mail server to Cuckoo (Article here).
I have finally found time to re-write this to use a Postfix content filter: extracting any attachments from an email processed by Postfix, then submitting to Cuckoo (on a different box) via Cuckoo’s REST API.
A couple of methods for achieving the same goal come to mind:
- Run as a Postfix after-queue filter: processing mail, submitting to Cuckoo, then returning it to the queue (or forwarding it to something else, such as Amavis). I initially started out with this in mind and wrote something which almost works as a simple content filter, though at the moment it doesn’t play nicely with virtual aliases.
- Copy all incoming mail to another mailbox and process from there. You could either poll the mailbox via IMAP and process mail (Which could be done locally on the Cuckoo box), or have Postfix pipe it through a script, thus never actually delivering the mail to a mailbox. I chose the latter option.
For now, since it is more or less a blind submission to Cuckoo without expecting any feedback, the BCC -> Pipe method makes much more sense as we don’t mess with the flow of mail (less to go wrong) and the implementation is more efficient.
However, the benefit of a content-filter design would become apparent if you want the system to make automated mail delivery decisions based on intelligence received from Cuckoo. This would be interesting to pursue, but you would first need to determine how to quantify a Cuckoo report into actionable data… Of course, the delivery delay introduced may be another challenge (though I would say no more of a delay than servers that implement greylisting). Perhaps you could compromise and:
- Strip the attachment and deliver the mail, then provide a link or additional email once processing is complete (Not a very elegant solution).
- Tag each email processed by Cuckoo, then index the tag for retroactive removal from mailboxes after processing has completed and a threat has been detected.
- You should probably not be delivering dangerous attachments anyway.
Note: There are a few issues you would need to consider before implementing something like this on anything other than your personal mail server (as I have done here). If there is any expectation of privacy for your users then that will need to be addressed (though I would consider this less invasive than most anti-spam scanners). In addition, consider any data security implications of having certain attachment types transferred to and stored on your Cuckoo machine.
Anyway, I digress. What I have now is a system which pipes a copy of incoming mail through a Python script which decodes applicable attachments and submits them to Cuckoo for analysis via the REST API.
The Python script itself is, in typical Python fashion, astonishingly simple. It takes a complete email message from STDIN and first checks if it is a multipart message (Ignoring if it is not):
if not msg.is_multipart():
    # Not multipart? Then we don't care, pass it back
    logging.debug("Ignoring non-multipart message from %s." % (msg['from']))
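For context, here is a minimal sketch of how that first step could be wired up. It is not the actual cuckoolyse.py: it assumes Python 3, and the function names `parse_if_multipart` and `main` are placeholders of my own.

```python
import email
import logging
import sys


def parse_if_multipart(raw_bytes):
    """Parse a raw message; return it only if it is multipart, else None."""
    msg = email.message_from_bytes(raw_bytes)
    if not msg.is_multipart():
        logging.debug("Ignoring non-multipart message from %s.", msg["from"])
        return None
    return msg


def main(stdin=None):
    """Read one complete message from STDIN, as a Postfix pipe transport supplies it."""
    if stdin is None:
        stdin = sys.stdin.buffer
    msg = parse_if_multipart(stdin.read())
    if msg is not None:
        logging.debug("Multipart message from %s accepted.", msg["from"])
    # Exit status 0 tells Postfix the copy was handled successfully.
    return 0
```

A script built this way would finish with `sys.exit(main())`, so that Postfix always sees a clean exit for the discarded copy.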
If the email is a multipart message, then we iterate over all the parts of the message and infer the content type from the headers. The content type is then checked against a predefined list (MIME types such as octet-stream, msword, zip, etc). I’ve also included a secondary magic number check after the attachment is decoded, because testing showed that you can end up submitting a lot of content declared as octet-stream which you don’t really want to analyse (PGP keys, for example).
for part in msg.walk():
    # TODO: Other mime types? pdf? doc?
    logging.debug("Processing mail part of type %s" % (part.get_content_type()))
    if part.get_content_type().strip() in mtypes:
        attachment = part.get_payload(decode=True)
        mtype = magic.from_buffer(attachment, mime=True)
        # Secondary check using magic
        # Sometimes we get octet-streams which we do not want to analyse
        if mtype not in mtypes:
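To illustrate the two-stage check in a form that runs without libmagic, here is a rough sketch. The small signature table is a stand-in of my own for `magic.from_buffer`, and the `DECLARED_TYPES` list and the `sniff` and `wanted_attachments` names are hypothetical, not taken from the real script.

```python
# Hypothetical allow-list of declared MIME types (the real list is not shown).
DECLARED_TYPES = {
    "application/octet-stream",
    "application/zip",
    "application/pdf",
    "application/msword",
}

# Tiny stand-in for libmagic: leading bytes mapped to a MIME type.
SIGNATURES = [
    (b"PK\x03\x04", "application/zip"),
    (b"%PDF-", "application/pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "application/msword"),
]


def sniff(data):
    """Return a MIME type guessed from the first bytes, or None if unknown."""
    for prefix, mime in SIGNATURES:
        if data.startswith(prefix):
            return mime
    return None


def wanted_attachments(msg):
    """Yield (filename, bytes) for parts that pass both checks."""
    for part in msg.walk():
        # Primary check: the type declared in the part's headers.
        if part.get_content_type() not in DECLARED_TYPES:
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        # Secondary check: what the decoded bytes actually look like.
        if sniff(payload) is None:
            continue
        yield part.get_filename() or "attachment", payload
```

With this arrangement, a PGP key sent as an octet-stream passes the header check but is dropped by the content check, which is the behaviour described above.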
Once a relevant attachment is identified the attachment is decoded into a temporary file object and included in a POST request to the REST interface running on the Cuckoo box. After submission is complete the copy of the mail is simply discarded, as delivery to the original mailbox continues in parallel and the original mail flow is left undisturbed.
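The submission step might look roughly like the sketch below. It uses only the standard library rather than whichever HTTP client the real script uses, and it skips the temporary file. The host name in `CUCKOO_API` is a made-up placeholder, and the `task_id` field in the response is an assumption that may differ between Cuckoo versions. The `tasks/create/file` endpoint with a form field named `file` is the one described in the Cuckoo REST API documentation.

```python
import json
import urllib.request
import uuid

# Hypothetical address of the Cuckoo API server; adjust to your own setup.
CUCKOO_API = "http://cuckoo.example.internal:8090"


def build_submission(filename, payload, api=CUCKOO_API):
    """Build (but do not send) a multipart/form-data POST carrying one file."""
    boundary = uuid.uuid4().hex
    # Keep the header line well-formed even for odd attachment names.
    safe_name = filename.replace('"', "_").replace("\r", "_").replace("\n", "_")
    body = b"".join([
        b"--" + boundary.encode() + b"\r\n",
        b'Content-Disposition: form-data; name="file"; filename="'
        + safe_name.encode("utf-8") + b'"\r\n',
        b"Content-Type: application/octet-stream\r\n\r\n",
        payload,
        b"\r\n--" + boundary.encode() + b"--\r\n",
    ])
    return urllib.request.Request(
        api + "/tasks/create/file",
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )


def submit(filename, payload, opener=urllib.request.urlopen):
    """Send the attachment and return the task id reported by Cuckoo, if any."""
    with opener(build_submission(filename, payload), timeout=30) as resp:
        return json.load(resp).get("task_id")
```

Passing the opener as a parameter is purely a convenience for testing without a live Cuckoo box; in normal use `submit(name, data)` is all that is needed.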
In order to actually get a copy of the mail to our script we make use of Postfix’s recipient BCC maps and per-user transport tables. I’m running iRedMail, which makes things somewhat specific to an LDAP-based iRedMail setup, but the concept remains the same.
We configure Postfix with recipient_bcc_maps to BCC all incoming mail with a recipient domain of tribalchicken.com.au to the user email@example.com. The email isn’t actually delivered and stored in the mailbox but the user must exist, otherwise Postfix will bounce the mail.
Before the transport tables can be configured a new pipe transport needs to be defined in Postfix’s master.cf:
cuckoolyse unix - n n - - pipe
  flags= user=cuckoolyse argv=/home/cuckoolyse/cuckoolyse.py
It is then possible to use the now-defined cuckoolyse transport for a specific user, which can be done using Postfix’s transport_maps option (as I mentioned, I am using iRedMail, so my BCC mapping and transport tables are contained in LDAP). Once configured correctly, mail will be BCC’d to the cuckoolyse address, and anything destined for that address will be handed to the cuckoolyse service instead of being delivered to a mailbox.
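For anyone not using LDAP, a flat-file equivalent might look something like the fragment below. This is a sketch only, not my actual configuration: the file paths are conventional defaults and `cuckoolyse@example.com` is a hypothetical address standing in for the BCC user.

```
# /etc/postfix/main.cf (hash tables standing in for the LDAP lookups)
recipient_bcc_maps = hash:/etc/postfix/recipient_bcc
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/recipient_bcc: BCC everything addressed to the domain
@tribalchicken.com.au    cuckoolyse@example.com

# /etc/postfix/transport: route the BCC address to the pipe service
cuckoolyse@example.com   cuckoolyse:
```

After editing the two lookup tables you would run `postmap` on each of them and then `postfix reload`.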
Of course, the Cuckoo API server needs to be running on the Cuckoo box. It’s simple to get running, so have a look at the Cuckoo documentation for more information. Keep in mind you may wish to consider encrypting the transport of submissions.
I’ve added the code for both the version I am using here as well as the original “simple content filter” to GitHub here:
Feel free to use and modify as you see fit, but if you find this helpful shoot me an email as I would be very interested to hear about other implementations. Also, if you would like more information about the Postfix side of things please get in touch and I will be more than happy to explain further.
Update: I added a check to determine if the sample had already been analysed by Cuckoo. Otherwise, I end up with this…
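One way such a check could be done is sketched below; it is not necessarily how the real script does it. It hashes the attachment and asks the Cuckoo API whether it already has a record of a file with that SHA-256, treating a 404 as "not seen before". The `CUCKOO_API` host and the `already_analysed` name are placeholders of my own, and note that a known file is not quite the same thing as a finished analysis.

```python
import hashlib
import urllib.error
import urllib.request

# Hypothetical address of the Cuckoo API server; adjust to your own setup.
CUCKOO_API = "http://cuckoo.example.internal:8090"


def already_analysed(payload, api=CUCKOO_API, opener=urllib.request.urlopen):
    """Return True if Cuckoo already knows a file with this SHA-256."""
    digest = hashlib.sha256(payload).hexdigest()
    url = "%s/files/view/sha256/%s" % (api, digest)
    try:
        with opener(url, timeout=30):
            return True  # 200: Cuckoo has a record of this sample
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # unknown sample, safe to submit
        raise  # anything else is a real error worth surfacing
```

Calling this before each submission and skipping the POST when it returns True avoids piling up duplicate tasks for the same attachment.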