Spam filtering with Exim

This page contains tips and tricks on how to filter spam using the Exim MTA.
Don't copy the lot. Just think of it as a source of inspiration.

The stuff below is based on Debian 12.x / Bookworm and Exim 4.9x.
I use Let's Encrypt and Dehydrated to generate a fullchain.pem and privkey.pem for TLS.
For DKIM and DMARC see;

Install

Select the following packages and anything that they depend on;

exim4-daemon-heavy
fuzzyocr
giflib-tools
libgif(7)
netpbm
poppler-utils
pyzor
razor
spamd (requires spamassassin)

Plus any documentation you might want to install and then do a lot of RTFM.

After this configure the lot;
Replace 'example.org' with your own domain and '*.example.org' with your own hostname. Replace '192.0.2.1' with your own IPv4 address, '2001:db8:1234:' with your own IPv6 prefix and 'RvdP' with your own initials.
In regexes the dots ('.') in IPv4 addresses are escaped with a backslash ('\.'), e.g. '192\.0\.2\.1'. In Exim lists the colons (':') in IPv6 addresses are doubled ('::'), e.g. '2001::db8::1234::', because a single colon is the list separator.
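
As a small illustration of these two escaping rules, here is a Python sketch. It is not part of the Exim configuration, and the helper names are made up for this example. It turns a plain address into the escaped forms used further down, and shows why an unescaped dot is too permissive in a regex.

```python
import re

def escape_ipv4_for_regex(address: str) -> str:
    """Escape the dots so that each one matches only a literal '.' in a regex."""
    return address.replace(".", r"\.")

def double_colons_for_exim_list(address: str) -> str:
    """Double every colon so that it is not read as a list separator."""
    return address.replace(":", "::")

print(escape_ipv4_for_regex("192.0.2.1"))
print(double_colons_for_exim_list("2001:db8:1234:"))
print(double_colons_for_exim_list("fe80::/10"))

# An unescaped dot matches any character, so this bogus string slips through.
print(bool(re.fullmatch("192.0.2.1", "192x0y2z1")))
print(bool(re.fullmatch(escape_ipv4_for_regex("192.0.2.1"), "192x0y2z1")))
```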

Main

main/000_localmacros

I created my own config file; '/etc/exim4/conf.d/main/000_localmacros'.

# RvdP, this is a local config file

# Enable checks; 1 = True = Enable
# ================================
MAIN_TLS_ENABLE = 1
MAIN_TLS_TRY_VERIFY_HOSTS = !*
TLS_DH_MIN_BITS = 512
# New
REMOTE_SMTP_SMARTHOST_TLS_VERIFY_HOSTS = !*
# Let's Encrypt certs
# tls_certificate
MAIN_TLS_CERTIFICATE = CONFDIR/fullchain.pem
# tls_privatekey
MAIN_TLS_PRIVATEKEY = CONFDIR/privkey.pem
CHECK_RCPT_REVERSE_DNS = 1
# New, built-in SPF check
# Disabled because I copied it to local_check_rcpt
# CHECK_RCPT_SPF = 1
# New, UTF-8
MAIN_SMTPUTF8_ADVERTISE_HOSTS = *

# RvdP, I have my own blacklist DNS
ROUTER_DNSLOOKUP_IGNORE_TARGET_HOSTS = \
	0.0.0.0 : 127.0.0.0/8 : \
	172.16.0.0/12 : 10.0.0.0/8 : 169.254.0.0/16 : \
	255.255.255.255 : 100.64.0.0/10 : \
	::::/80 : 64:ff9b::::/96 : fc00::::/7 : \
	fe80::::/10 : fec0::::/10

# Other options;
MAIN_QUALIFY_DOMAIN = example.org

# DKIM
DKIM_DOMAIN = example.org
DKIM_SELECTOR = dkm
DKIM_PRIVATE_KEY = /etc/exim4/keys/dkim_rsa.private
DKIM_CANON = relaxed

# My own variables
# ----------------
# I have a mail forward or list on these
hostlist spf_white_hosts = \
	*.gmane.org

# My own ACLs
# -----------
CHECK_CONNECT_LOCAL_ACL_FILE = /etc/exim4/local_check_connect
CHECK_RCPT_LOCAL_ACL_FILE = /etc/exim4/local_check_rcpt
CHECK_MIME_LOCAL_ACL_FILE = /etc/exim4/local_check_mime
CHECK_DATA_LOCAL_ACL_FILE = /etc/exim4/local_check_data

# I defined this
LOCAL_COMBI_WHITELIST = /etc/exim4/local_combi_whitelist
# Subscribed list names
LOCAL_SUBSCRIBED_LISTS_REGEX = \N(list1|list2|list3)\N
# Subscribed list header froms
LOCAL_SUBSCRIBED_FROMS_REGEX = \N(from1|from2|from3)\N

'MAIN_TLS_TRY_VERIFY_HOSTS = !*' will enable TLS even if Exim thinks that there is something wrong with the remote TLS implementation.
CHECK_RCPT_SPF is not enabled in this file. The SPF check is done in the RCPT phase instead.
'hostlist spf_white_hosts' are hosts where I have a '.forward' file. These are not SPF checked.
'/etc/exim4/local_combi_whitelist' is a combined host - sender-address whitelist. See local_combi_whitelist
'\N(list1|list2|list3)\N' is a regex alternation (OR) of subscribed mailing lists. These are checked at 'subscription check' in local_check_data. Replace list1, list2, list3 with substrings of the names of the lists you are subscribed to.
Sometimes mailing lists can only be identified by their content From (the From: header line). This is where '\N(from1|from2|from3)\N' comes in.
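
To get a feel for what such an alternation matches, here is a minimal Python sketch. The list names ('exim-users', 'debian-security-announce') and the helper names are hypothetical, and Python's re module only approximates the PCRE engine that Exim uses. The sketch builds the macro value from plain substrings and tries it against a lowercased header.

```python
import re

def build_exim_alternation(substrings):
    """Build an Exim-style alternation, wrapped in the N-quoting markers."""
    escaped = [re.escape(s.lower()) for s in substrings]
    return r"\N(" + "|".join(escaped) + r")\N"

def strip_exim_quoting(macro_value):
    """Drop the leading and trailing N-quoting markers so Python can use the pattern."""
    if macro_value.startswith(r"\N") and macro_value.endswith(r"\N"):
        return macro_value[2:-2]
    return macro_value

# Hypothetical list names, replace with substrings of your own lists.
macro = build_exim_alternation(["exim-users", "debian-security-announce"])
pattern = re.compile(strip_exim_quoting(macro))

wanted = "<mailto:exim-users-unsubscribe@lists.example.net>"
unwanted = "<mailto:leave@spam.example.com>"
print(bool(pattern.search(wanted.lower())))
print(bool(pattern.search(unwanted.lower())))
```

Because the match is a substring search, short substrings can match more lists than intended, so it pays to pick distinctive ones.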

main/01_exim4-config_listmacrosdefs

I edited this file. The first edit is just above the '# listen on all all interfaces?' remark;

# RvdP, primary_hostname
primary_hostname = kill-spam.example.org
# RvdP, make helo interface dependent
# Incoming connections + Callout
smtp_active_hostname = ${if eq{$interface_address}{192.0.2.1}\
   {ns4.example.org}{kill-spam.example.org}}

# listen on all all interfaces?

'192.0.2.1' is 'my' IPv4 address. Replace with your own IPv4 address.
'ns4.example.org' is the reverse lookup for '192.0.2.1' and 'kill-spam.example.org' the HELO used by my server. Replace these with your own.

Promote IPv6

Alternatively, use the following;

# RvdP, make helo IP address dependent
# Incoming connections + Callout
smtp_active_hostname = ${if eq{$interface_address}{192.0.2.1}\
   {please-use-ipv6.example.org}{kill-spam.example.org}}

Whenever someone connects using IPv4, the string 'Please-use-IPv6' will be logged.
Note: There has to be an 'A' (address) record for this name (Please-use-IPv6.example.org). Otherwise some spam filters might object.

main/02_exim4-config_options

At the top, a connect ACL;

### main/02_exim4-config_options
#################################


# RvdP, ACL at start of connection
acl_smtp_connect = check_connect

# Defines the access control list that is run when an

I added a mime ACL;

acl_smtp_rcpt = MAIN_ACL_CHECK_RCPT

# RvdP, mime ACL
acl_smtp_mime = check_mime

# Defines the access control list that is run when an

And a vrfy ACL;

acl_smtp_data = MAIN_ACL_CHECK_DATA


# RvdP, These options specify the Access Control Lists (ACLs) that
# are used to control the ETRN, EXPN, and VRFY commands.
# Where no ACL is defined, the command is locked out.

acl_smtp_vrfy = check_vrfy


# Message size limit. The default (used when MESSAGE_SIZE_LIMIT

The next edit is just above '# Domain used to qualify unqualified recipient addresses';

# For spam scanning, there is a similar option that defines the interface to
# SpamAssassin. You do not need to set this if you are using the default, which
# is shown in this commented example. As for virus scanning, you must also
# modify the acl_check_data access control list to enable spam scanning.

# RvdP, Enabled
spamd_address = 127.0.0.1 783

# Domain used to qualify unqualified recipient addresses

The next edits are just above '# In a minimaldns setup, update-exim4.conf guesses the hostname and';

# RvdP, use /etc/hosts first
host_lookup_order = byaddr:bydns

# The setting below causes Exim to try to initialize the system resolver
# library with DNSSEC support.  It has no effect if your library lacks
# DNSSEC support.
dns_dnssec_ok = 1

# In a minimaldns setup, update-exim4.conf guesses the hostname and

The next edit is just above '# Enable an efficiency feature.';

# RvdP, Disable
rfc1413_query_timeout = 0s


# Enable an efficiency feature.  We advertise the feature; clients

The next edit is just above '# Bounce handling';

# RvdP, other main config options

# RvdP, mild verbal abuse
bounce_message_text = "See http://www.example.org/spam/spam-policy.html"

# RvdP, complain about HELO after RCPT
helo_allow_chars = _

# RvdP, Check HELO after RCPT
helo_try_verify_hosts = *

# RvdP, No pipelining
pipelining_advertise_hosts = :

# RvdP, No chunking
chunking_advertise_hosts =

# RvdP, filter stuff
message_body_visible = 256K

# RvdP, 8 bit stuff
accept_8bitmime
print_topbitchars

# RvdP, UTF-8
headers_charset = UTF-8


# Bounce handling

ACL

acl/10_exim4-config_check_connect

This includes a hook to a rate limiting ACL.

# 10_exim4-config_check_connect

# RvdP, ACL at start of connection
check_connect:

# RvdP, Hook for local connect ACL file
  .ifdef CHECK_CONNECT_LOCAL_ACL_FILE
  .include CHECK_CONNECT_LOCAL_ACL_FILE
  .endif

  accept

acl/30_exim4-config_check_rcpt

Filter postmaster (My SpamAssassin setup does not spam filter abuse);

  # Accept mail to postmaster in any local domain, regardless of the source,
  # and without verifying the sender.
  #
  # RvdP, disabled
  #accept
  #  .ifndef CHECK_RCPT_POSTMASTER
  #  local_parts = postmaster
  #  .else
  #  local_parts = CHECK_RCPT_POSTMASTER
  #  .endif
  #  domains = +local_domains : +relay_to_domains


  # Deny unless the sender address can be verified.

Add recipient verification;

  # assumption is that they are your friends, and if they get onto black
  # list, it is a mistake.
  accept
    hosts = +relay_from_hosts
    # RvdP, added recipient verification
    verify = recipient
    control = submission/sender_retain
    control = dkim_disable_verify


  # Accept if the message arrived over an authenticated connection, from

For a blacklisted sender, refer to the blacklist page instead of postmaster.
Because log_message is commented out, the full message also gets logged.

  # the black list. See exim4-config_files(5) for details.
  deny
    # RvdP, changed
    #message = sender envelope address $sender_address is locally blacklisted here. If you think this is wrong, get in touch with postmaster
    message = sender envelope address $sender_address is locally blacklisted here. See http://www.example.org/spam/blacklist.html
    #log_message = sender envelope address is locally blacklisted.

For a blacklisted host, refer to the blacklist page instead of postmaster.
Because log_message is commented out, the full message also gets logged.

  # the black list. See exim4-config_files(5) for details.
  deny
    # RvdP, changed
    #message = sender IP address $sender_host_address is locally blacklisted here. If you think this is wrong, get in touch with postmaster
    message = sender IP address $sender_host_address is locally blacklisted here. See http://www.example.org/spam/blacklist.html
    #log_message = sender IP address is locally blacklisted.

Use callout verification with a long (100s) timeout;

  # Accept if the address is in a domain for which we are an incoming relay,
  # but again, only if the recipient can be verified.

  accept
    domains = +relay_to_domains
    endpass
    verify = recipient
    # RvdP, use a callout
    verify = recipient/callout=100s


  # At this point, the address has passed all the checks that have been

acl/35_exim4-config_check_mime

There is no '35_exim4-config_check_mime', so you have to create your own;

# 35_exim4-config_check_mime

# RvdP, ACL that is used before data ACL
check_mime:

  # My own ACL
  .ifdef CHECK_MIME_LOCAL_ACL_FILE
  .include CHECK_MIME_LOCAL_ACL_FILE
  .endif

  # accept otherwise
  accept

It simply refers to 'CHECK_MIME_LOCAL_ACL_FILE', which refers to 'local_check_mime'.

acl/50_exim4-config_check_vrfy

# 50_exim4-config_check_vrfy

# RvdP, ACL that is used after the VRFY command
check_vrfy:
  accept

Router

router/200_exim4-config_primary

Changed ignore to defer.

# here so that mail to relay_domains is handled separately.

smarthost:
  debug_print = "R: smarthost for $local_part@$domain"
  driver = manualroute
  domains = ! +local_domains
  transport = remote_smtp_smarthost
  route_list = * DCsmarthost byname
  # RvdP, changed
  #host_find_failed = ignore
  host_find_failed = defer
  same_domain_copy_routing = yes
  no_more
  
.endif


# The "no_more" above means that all later routers are for

Other files

These are not in '/etc/exim4/conf.d/' but in '/etc/exim4/'. Below are the complete files. Edit these to suit your needs.

local_check_connect

# RvdP, this is the local Connect ACL
# It slows down fast remote senders

# Deny if they keep going on
  deny
    message   = Too many connections from your host.
    hosts     = !+relay_from_hosts
    # ratelimit = <m> / <p> / <options> / <key>
    # m = Number of messages, p = Time
    # Default option is per_mail
    # Default key is $sender_host_address
    ratelimit = 15 / 1h

# Deny at connect
  deny
    message = You are blacklisted
    hosts   = *.censys-scanner.com : *.shodan.io

# Tmp host blacklisting
  deny
    message = Host $sender_host_address is locally blacklisted here. See http://www.example.org/spam/
    hosts   = ${if exists{CONFDIR/local_tmp_auth_host_blackl}\
                {CONFDIR/local_tmp_auth_host_blackl}\
              {}}

# Otherwise slow down
  defer
    message   = Busy. Try later.
    hosts     = !+relay_from_hosts
    condition = ${if >\
      # Extract int from float before compare
      {${extract{1}{.}{$sender_rate}}}\
      {10}\
    {yes}{no}}
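
The interplay between the two rate rules above can be sketched in Python. This is only a model of the thresholds, under the assumption that the ratelimit condition triggers once the computed rate exceeds the limit. It ignores the blacklist rules in between and the way Exim smooths the rate over time. The function name and the 'is_relay_host' flag are made up for this example.

```python
def connect_decision(sender_rate: str, is_relay_host: bool = False) -> str:
    """Model the two rate checks for one connection.

    sender_rate mimics $sender_rate, a string such as '12.4'.
    """
    if is_relay_host:
        return "accept"  # both rules carry hosts = !+relay_from_hosts
    if float(sender_rate) > 15:
        return "deny"    # ratelimit = 15 / 1h
    # Same idea as ${extract{1}{.}{$sender_rate}}: keep the integer part only.
    integer_part = int(sender_rate.split(".")[0])
    if integer_part > 10:
        return "defer"   # "Busy. Try later."
    return "accept"

for rate in ("3.0", "10.9", "11.0", "15.0", "15.1"):
    print(rate, connect_decision(rate))
```

Note that a rate of 10.9 is still accepted, because only the integer part is compared against 10.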

local_check_rcpt

# RvdP, this is the local RCPT ACL
# A lot of definitions are in /etc/exim4/conf.d/main/000_localmacros
# Options are in /etc/exim4/conf.d/main/02_exim4-config_options
    
# Checks are done in order: IP, Host, Helo, From, To.
# DNS based checks are in order of increasing DNS load.
# SMTP based checks in order of increasing SMTP load.
# Some of these rules defer rather than deny. These can then be whitelisted
# Defers are before denies.
# Reject (deny) is only in case of obvious malice.

# First whitelist temp recipients;
# Tell the DATA ACL that we did, and then accept
  warn
    set acl_m_epoch = $tod_epoch
    message         = X-Example-WL: $acl_m_epoch
    recipients      = ${if exists{CONFDIR/local_rcpt_whitelist}\
      {CONFDIR/local_rcpt_whitelist}\
    {}}
  
  accept
    recipients = ${if exists{CONFDIR/local_rcpt_whitelist}\
      {CONFDIR/local_rcpt_whitelist}\
    {}}
    
# Tmp host blacklisting
  deny
    message = Host $sender_host_address is locally blacklisted here. See http://www.example.org/spam/
    hosts   = ${if exists{CONFDIR/local_tmp_host_blackl}\
                {CONFDIR/local_tmp_host_blackl}\
              {}}
    
# Tmp sender blacklisting
  deny
    message = Sender envelope address $sender_address is locally blacklisted here. See http://www.example.org/spam/
    senders = ${if exists{CONFDIR/local_tmp_sender_blackl}\
                {CONFDIR/local_tmp_sender_blackl}\
              {}}

# Include whitelist
# Combines host name and email address
  .ifdef LOCAL_COMBI_WHITELIST
    .include LOCAL_COMBI_WHITELIST
  .endif

# Check hostname
  defer
    message   = Broken Reverse DNS; no host name found for IP address $sender_host_address. Please contact your ISP.
    condition = ${if and \
      {\
        {def:sender_host_address}\
        {!def:sender_host_name}\
      }\
    {yes}{no}}

# Check HELO after RCPT
  defer
    message   = Underscores are not allowed in hostnames. Please contact your ISP.
    condition = ${if match\
      {$sender_helo_name}\
      {\N.*_.*\N}\
    {yes}{no}}

# Helo can't be localhost, *.local, *.localdomain or *.lan
  defer
    message   = HELO can't be $sender_helo_name. Please contact your ISP.
    condition = ${if match\
      {${lc:$sender_helo_name}}\
      {\N(localhost|\.local(domain)?|\.lan)$\N}\
    {yes}{no}}

# Helo should be FQDN
  defer
    hosts     = !+relay_from_hosts
    message   = HELO should be Fully Qualified Domain Name. Please contact your ISP.
    condition = ${if !match\
      {$sender_helo_name}\
      {\N.*[A-Za-z].*\..*[A-Za-z].*\N}\
    {yes}{no}}

# A remote helo can't be mine
  deny
    message   = Using my HELO is identity theft.
    hosts     = ! : !+relay_from_hosts
    condition = ${if match\
      {$sender_helo_name}\
      {\N^(192\.0\.2\.1|2001:db8:1234:|(.*\.)?example\.org)$\N}\
    {yes}{no}}

# Domain can't be localhost or localdomain
  defer
    message   = $sender_address_domain is an incorrect domain. Please contact your ISP.
    hosts     = ! : !+relay_from_hosts
    condition = ${if match\
      {$sender_address_domain}\
      {\N(localhost|\.local(domain)?|\.lan)$\N}\
    {yes}{no}}

# A remote host using my Domain is wrong
  deny
    hosts     = ! : !+relay_from_hosts : !+spf_white_hosts
    message   = Using my domain is identity theft.
    condition = ${if match\
      {$sender_address_domain}\
      {\N^(.*\.)?example\.org$\N}\
    {yes}{no}}

HELO check. If HELO verification fails, there must be some other link between the HELO name and the host name: the same domain, the same network, or an MX relation. If there is none, it's most likely a zombie box.

# Try to match helo
# Conditions are AND-ed.
  defer
    message   = Lookup of $sender_helo_name failed or did not match. Please contact your ISP.
    hosts     = ! : !+relay_from_hosts
   !verify    = helo
    condition = ${if !eq\
      # Domains without TLD
      {${extract{-2}{.}{${lc:$sender_host_name}}}}\
      {${extract{-2}{.}{${lc:$sender_helo_name}}}}\
    {yes}{no}}
    condition = ${if \
      # Same network
      or {\
        {\
          # IPv4; Same /24, E.G.: 192.168.1.
          and {\
            {isip4{$sender_host_address}}\
            {!match \
              {${lookup dnsdb{a=$sender_helo_name}{$value}fail}}\
              {\
                ${extract{1}{.}{$sender_host_address}}\.\
                ${extract{2}{.}{$sender_host_address}}\.\
                ${extract{3}{.}{$sender_host_address}}\.\
              }\
            }\
          }\
        }\
        {\
          # IPv6; Same /48, E.G.: 2345:db8:4321:
          and {\
            {isip6{$sender_host_address}}\
            {!match \
              {${lookup dnsdb{aaaa=$sender_helo_name}{$value}fail}}\
              {\
                ${extract{1}{:}{$sender_host_address}}\:\
                ${extract{2}{:}{$sender_host_address}}\:\
                ${extract{3}{:}{$sender_host_address}}\:\
              }\
            }\
          }\
        }\
      }\
    {yes}{no}}
    condition = ${if !match\
      # Host is mx of helo
      {${lc:${lookup dnsdb{>: mxh=$sender_helo_name}{$value}fail}}}\
      {${lc:$sender_host_name}}\
    {yes}{no}}
    condition = ${if !match \
      # Host address is mx of helo address
      {${lookup dnsdb{a=${lookup dnsdb{>: mxh=$sender_helo_name}{$value}fail}}{$value}fail}}\
      {$sender_host_address}\
    {yes}{no}}
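
Two of the links tested above, the shared domain label and the shared network, can be tried out with a short Python sketch. It is an approximation under stated assumptions: the MX lookups are left out because they need DNS, the function names are made up, and the ipaddress module compares real /24 and /48 networks where the Exim rule compares text prefixes.

```python
import ipaddress

def second_level_label(name: str) -> str:
    """Mimic extract{-2}{.} on the lowercased name: the label left of the TLD."""
    labels = name.lower().split(".")
    return labels[-2] if len(labels) >= 2 else ""

def same_network(helo_address: str, host_address: str) -> bool:
    """True if both addresses share a /24 (IPv4) or a /48 (IPv6)."""
    host = ipaddress.ip_address(host_address)
    prefix = 24 if host.version == 4 else 48
    network = ipaddress.ip_network(f"{host_address}/{prefix}", strict=False)
    return ipaddress.ip_address(helo_address) in network

def helo_is_linked(helo_name, host_name, helo_address, host_address):
    """A link exists if the domain labels agree or the networks agree."""
    return (second_level_label(helo_name) == second_level_label(host_name)
            or same_network(helo_address, host_address))

print(second_level_label("Mail.Example.ORG"))
print(same_network("192.0.2.10", "192.0.2.200"))
print(helo_is_linked("smtp.foo.example", "host.bar.example",
                     "198.51.100.1", "203.0.113.9"))
```

Taking the second label from the right is a rough stand-in for "the domain": for a name such as mail.example.co.uk it yields 'co', so hosts under such suffixes always look related.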

# Block fake GMail / Google mail
  deny
    message   = This is a fake GMail mail. See http://www.example.org/spam/
    hosts     = ! *.google.com
    condition = ${if eq \
      {$sender_address_domain}\
      {gmail.com}\
    {yes}{no}}

# Block fake AOL mail
  deny
    message   = This is a fake AOL mail. See http://www.example.org/spam/
    hosts     = ! *.aol.com
    condition = ${if eq \
      {$sender_address_domain}\
      {aol.com}\
    {yes}{no}}

# Block fake Yahoo mail
  deny
    message   = This is a fake Yahoo mail. See http://www.example.org/spam/
    hosts     = ! *.yahoo.com
    condition = ${if eq\
      {$sender_address_domain}\
      {yahoo.com}\
    {yes}{no}}

# Enforce a message-size limit here and after DATA command
  defer
    message   = Message size $message_size is larger than limit of 1048576. Send a weblink to your data instead.
    condition = ${if >\
      {$message_size}{1048576}\
    {yes}{no}}

# Special cases;
# Firstsinganndy / shengxintga.com
# '^' indicates that the local part should be evaluated too.
  deny
    message = Fuck off morons
    senders = \N^firstsinganndy.*@gmail\.com\N

  deny
    message = Fuck off morons
    condition = ${if match\
      {$sender_address}\
      {\N^firstsinganndy.*@gmail\.com\N}\
    {yes}{no}}

Some addresses are rejected instead of deferred, or need regexes;

Google groups: subscribes you without your consent.
*@messagent.*: spams on behalf of others from a *@messagent.Spammers_Domain address.
myfanbox.com: subscribes you without your consent.
myspace: subscribes you without your consent.
NRC Handelsblad: a notorious Dutch spammer. Uses various nrc.nl subdomains.

# Google groups
  deny
    message = Fuck off morons
    hosts   = *.google.com
    senders = *@groups.bounces.google.com
# *@messagent.*
  defer
    message = Domain $sender_address_domain is locally blacklisted here. See http://www.example.org/spam/
    condition = ${if match\
      {$sender_address_domain}\
      {\N^messagent\..*$\N}\
    {yes}{no}}

# myfanbox.com
  deny
    message = Fuck off morons
    hosts   = *.sms.ac
    senders = fbNOREPLY@myfanbox.com
# myspace
  deny
    message = Fuck off morons
    hosts   = *.myspace.com
    senders = noreply@message.myspace.com
# NRC
  deny
    message = Fuck off morons
    condition = ${if match\
      {$sender_address_domain}\
      {\N^(.*\.)?nrc\.nl$\N}\
    {yes}{no}}

Not all blacklists support IPv6 yet. See IPv6 enabled RBLs

# Check IP address in DNS based blacklists
  deny
    hosts    = ! : !+relay_from_hosts
    message  = Host is listed in $dnslist_domain. Please contact your ISP.
    dnslists = \
      cbl.abuseat.org : \
      bl.spamcop.net : \
      sbl.spamhaus.org : \
      xbl.spamhaus.org

# Check host name in domain DNS based blacklists; Deny
  deny
    message  = Host name is listed in $dnslist_domain. Please contact your ISP.
    hosts    = ! : !+relay_from_hosts
    dnslists = \
      dbl.spamhaus.org/$sender_host_name

# Check email address domain in DNS based blacklists; Deny
  deny
    message  = Domain is listed in $dnslist_domain. Please contact your ISP.
    hosts    = ! : !+relay_from_hosts
    senders  = ! :
    dnslists = \
      dbl.spamhaus.org/$sender_address_domain
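
The dnslists rules above rely on the usual DNSBL convention: an IP address is looked up with its octets (or IPv6 nibbles) reversed and the list zone appended, while a domain key is simply prefixed to the zone. The Python sketch below only builds the query name and does no DNS lookup. The function name is made up for this example.

```python
import ipaddress

def dnsbl_query_name(key: str, zone: str) -> str:
    """Return the DNS name that is queried for 'key' in the list 'zone'."""
    try:
        address = ipaddress.ip_address(key)
    except ValueError:
        # Not an IP address: treat the key as a host or domain name.
        return f"{key.rstrip('.')}.{zone}"
    # reverse_pointer gives e.g. '1.2.0.192.in-addr.arpa'; drop the last two labels.
    reversed_labels = address.reverse_pointer.split(".")[:-2]
    return ".".join(reversed_labels) + "." + zone

print(dnsbl_query_name("192.0.2.1", "sbl.spamhaus.org"))
print(dnsbl_query_name("mail.example.com", "dbl.spamhaus.org"))
print(dnsbl_query_name("2001:db8::1", "xbl.spamhaus.org"))
```

The IPv6 form needs 32 nibble labels per query, which is one reason why IPv6 support differs between lists.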

Below is an SPF check.
This is done in the recipient (RCPT) phase rather than the MAIL FROM phase.

# Copied from /etc/exim4/conf.d/acl/30_exim4-config_check_rcpt
  .ifdef _HAVE_SPF
  deny
    !acl = acl_local_deny_exceptions
    senders = ! :
    hosts = ! : !+relay_from_hosts : !+spf_white_hosts
    spf = fail
    message = [SPF] $sender_host_address is not allowed to send mail from \
              ${if def:sender_address_domain {$sender_address_domain}{$sender_helo_name}}.
  defer
    !acl = acl_local_deny_exceptions
    senders = ! :
    hosts = ! : !+relay_from_hosts : !+spf_white_hosts
    spf = temperror
    message = Temporary DNS error while checking SPF record.  Try again later.

  warn
    spf = pass:softfail:neutral:permerror
    add_header = :at_start:$spf_received
  .endif

# Return to default Exim rcpt ACL

local_check_mime

# RvdP, this is the local MIME ACL
# A lot of definitions are in /etc/exim4/conf.d/main/000_localmacros
# Options are in /etc/exim4/conf.d/main/02_exim4-config_options

# First accept recipients from whitelist
  accept
    condition = ${if exists\
      {CONFDIR/local_rcpt_whitelist}\
    {yes}{no}}
    condition = ${if eq\
      {$h_x-example-wl:}\
      {$acl_m_epoch}\
    {yes}{no}}

# Include whitelist
# Combines host name and email address
  .ifdef LOCAL_COMBI_WHITELIST
    .include LOCAL_COMBI_WHITELIST
  .endif

# MS binaries
  deny
    message = Don't send binaries. Send sources instead.
    condition = ${if eq\
      {$mime_content_type}\
      {application/x-msdos-program}\
    {yes}{no}}

# Dos / Windows junk
  deny
    message = Attachment has unsupported file format. Try text or PDF instead.
    condition = ${if match\
      {$mime_filename}\
      {\N.+\.(bat|btm|cmd|com|cpl|dat|dll|docm|exe|jar|lnk|msi|pif|prf|reg|scr|vb|vbs)$\N}\
    {yes}{no}}

# Reject stuff inside zip
  deny
    decode = $mime_filename
    message = Attachment has unsupported file format inside zip file.
    condition = ${if match\
      {$mime_decoded_filename}\
      {\N.+\.zip$\N}\
    {yes}{no}}
    set acl_m6 = ${run{/bin/sh -c '/usr/local/sbin/check_zip.sh $message_exim_id'}}
    condition  = ${if eq \
      {$runrc}{1}\
    {true}{false}}

# Return to default Exim mime ACL

local_check_data

# RvdP, this is the local DATA ACL
# A lot of definitions are in /etc/exim4/conf.d/main/000_localmacros
# Options are in /etc/exim4/conf.d/main/02_exim4-config_options

Whitelists

# First whitelist temp recipients
# Check if the whitelist exists and then check if the X-Example-WL header line
# matches the epoch.
  accept
    condition = ${if exists\
      {CONFDIR/local_rcpt_whitelist}\
    {yes}{no}}
    condition = ${if eq\
      {$h_x-example-wl:}\
      {$acl_m_epoch}\
    {yes}{no}}

# Include whitelist
# Combines host name and email address
  .ifdef LOCAL_COMBI_WHITELIST
    .include LOCAL_COMBI_WHITELIST
  .endif

Mesg-Id blacklist

# Tmp message blacklisting, based on mesg-id
  deny
    message   = Message $h_message-id: is locally blacklisted here. See http://www.example.org/spam/
    #${lookup{<key>} <search type> {<file>} {<string1>} {<string2>}}
    condition = \
      ${lookup{$h_message-id:}\
      lsearch {CONFDIR/local_tmp_msg_blackl}\
    {yes}{no}}

Formal checks

# Checks are done in order: From, To, other header stuff, message content;
# DNS based checks are in order of increasing DNS load,
# SMTP based checks in order of increasing SMTP load.
# Some of these rules defer rather than deny. These can then be whitelisted.
# Defers are before denies.
# Reject (deny) is only in case of obvious malice.

# Insist on Date
  defer
    message   = RFC compliant date required.
    condition = ${if !match\
      {$h_date:}\
      {\N[0-9]\N}\
    {yes}{no}}

# Insist on From
  defer
    message   = RFC compliant from required.
    condition = ${if !match\
      {$h_from:}\
      {\N.+@.+\..+\N}\
    {yes}{no}}

# Insist on to
  defer
    message   = RFC compliant to required.
    condition = ${if !match\
      {$h_to:}\
      {\N.+\N}\
    {yes}{no}}

# Insist on Subject
  defer
    message   = RFC compliant subject required.
    condition = ${if \
      or {\
        {!match{$h_subject:}{\N.+\N}}\
        {match{${lc:$h_subject:}}{\N.?no subject.?\N}}\
      }\
    {yes}{no}}

# Subject: Re: should have content
  defer
    message   = RFC compliant subject required.
    condition = ${if \
      and {\
       {match{${lc:$h_subject:}}{\N^re:\N}}\
       {!match{${lc:$h_subject:}}{\N^re:.+\N}}\
       {!match{${lc:$h_subject:}}{\N^re: .+\N}}\
      }\
    {yes}{no}}

# Insist on Message-ID
  defer
    message   = RFC compliant message-id required.
    condition = ${if !match\
      {$h_message-id:}\
      {\N.+\N}\
    {yes}{no}}

# Valid UTF-8 sequences
  defer
    message   = Message headers contain non UTF-8 chars.
    condition = ${if or {\
        {\
          and {\
            {match{$rh_bcc:}{\N[\x80-\xff]\N}}\
            {!match{$rh_bcc:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_cc:}{\N[\x80-\xff]\N}}\
            {!match{$rh_cc:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_date:}{\N[\x80-\xff]\N}}\
            {!match{$rh_date:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_from:}{\N[\x80-\xff]\N}}\
            {!match{$rh_from:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_message-id:}{\N[\x80-\xff]\N}}\
            {!match{$rh_message-id:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_reply-to:}{\N[\x80-\xff]\N}}\
            {!match{$rh_reply-to:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_sender:}{\N[\x80-\xff]\N}}\
            {!match{$rh_sender:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
              {match{$rh_subject:}{\N[\x80-\xff]\N}}\
              {!match{$rh_subject:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
        {\
          and {\
            {match{$rh_to:}{\N[\x80-\xff]\N}}\
            {!match{$rh_to:}{\N^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$\N}}\
          }\
        }\
      }\
    {yes}{no}}

# Domain can't be localhost or localdomain
  defer
    message   = ${domain:$h_from:} is a wrong domain.
    hosts     = ! : !+relay_from_hosts
    condition = ${if match\
      {${domain:$h_from:}}\
      {\N^(localhost|localhost\.local(domain)?|local(domain)?)$\N}\
    {yes}{no}}
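
The 'Valid UTF-8 sequences' rule above applies the same two regexes to every header: a header is flagged when it contains a high byte and does not consist solely of printable ASCII and lead-byte/continuation-byte runs. The Python sketch below reproduces that test on raw bytes. The function name is made up, and Python's re module stands in for PCRE.

```python
import re

HIGH_BYTE = re.compile(rb"[\x80-\xff]")
UTF8_LIKE = re.compile(rb"^([\x20-\x7e]|[\xc2-\xf7][\x80-\xbf]+)+$")

def header_is_suspect(raw_header: bytes) -> bool:
    """True if the header has high bytes that do not look like UTF-8."""
    has_high_byte = HIGH_BYTE.search(raw_header) is not None
    looks_like_utf8 = UTF8_LIKE.match(raw_header) is not None
    return has_high_byte and not looks_like_utf8

print(header_is_suspect("Café".encode("utf-8")))    # UTF-8 encoded
print(header_is_suspect("Café".encode("latin-1")))  # bare 0xe9 byte
print(header_is_suspect(b"Hello"))                  # plain ASCII
```

The pattern is a heuristic rather than a full UTF-8 validator: it does not count continuation bytes, so a sequence such as b'\xc3\xa9\xa9' still passes.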

Blacklists

# Blacklist unreachable sender before h_from: verification
  deny
    message   = Sender content address $h_from: is locally blacklisted here. See http://www.example.org/spam/
    hosts     = ! : !+relay_from_hosts
    #${lookup{<key>} <search type> {<file>} {<string1>} {<string2>}}
    condition = \
      ${lookup{${address:$h_from:}}\
      nwildlsearch {CONFDIR/local_sender_blacklist}\
    {yes}{no}}

# Blacklist based on domain; Deny
  deny
    message  = Sender content domain is listed in $dnslist_domain. Please contact your ISP.
    hosts    = ! : !+relay_from_hosts
    dnslists = \
      dbl.spamhaus.org/${domain:$h_from:}

# Exp blacklist based on domain; Warn
  warn
    message  = Sender content domain is listed in $dnslist_domain. Please contact your ISP.
    hosts    = ! : !+relay_from_hosts
    dnslists = fulldom.rfc-clueless.org/${domain:$h_from:}

Callout

# require that there is a verifiable sender address in at least
# one of the "Sender:", "Reply-To:", or "From:" header lines.
# Do this only in case of DSN
#
# Do a regular verify first (bug workaround);
  defer
    senders = :
    message = No verifiable sender address in message headers.
   !verify  = header_sender
# And then a callout;
  defer
    senders = :
    message = No verifiable sender address in message headers.
   !verify  = header_sender/callout=100s

# Deny fake Message-Id
  deny
    message   = Using my domain is identity theft.
    hosts     = ! : !+relay_from_hosts : !+spf_white_hosts
    condition = ${if match\
      {$h_message-id:}\
      {\N^<.+@(.+\.)?example\.org>$\N}\
    {yes}{no}}

Charsets

# Check for spam specific charsets
# Subject
  defer
    message   = This looks like spam to me.
    condition = ${if match\
      {$rh_subject:}\
      {\N^=\?.+\?=$\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$rh_subject:}}\
      {\N(ascii|iso|utf|windows)\N}\
    {yes}{no}}

# Content
  defer
    message   = This looks like spam to me.
    condition = ${if match\
      {${lc:$h_content-type:}}\
      {\Ncharset\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_content-type:}}\
      {\N(ascii|iso|utf|windows)\N}\
    {yes}{no}}
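
The Subject rule above can be tried outside Exim with the Python sketch below. It assumes that the first condition is meant to recognise an RFC 2047 encoded-word of the form =?charset?encoding?text?=, with the question marks escaped. Stripping surrounding whitespace is also an assumption about how the header value arrives. The function name is made up for this example.

```python
import re

ENCODED_WORD = re.compile(r"^=\?.+\?=$")
KNOWN_CHARSET = re.compile(r"(ascii|iso|utf|windows)")

def subject_is_suspect(raw_subject: str) -> bool:
    """True for an encoded subject whose charset is not one of the usual ones."""
    subject = raw_subject.strip()
    is_encoded = ENCODED_WORD.search(subject) is not None
    has_known_charset = KNOWN_CHARSET.search(subject.lower()) is not None
    return is_encoded and not has_known_charset

print(subject_is_suspect("=?koi8-r?B?8NLJ18XU?="))
print(subject_is_suspect("=?UTF-8?B?SGVsbG8=?="))
print(subject_is_suspect("Hello world"))
```

Because the charset test runs over the whole lowercased value, an encoded payload that happens to contain 'iso' or 'utf' would also pass.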

Fake bounces

# Block fake bounces on ip address
  deny
    message   = This is a fake (joe job) or sub standard (lacking original headers) DSN.
    hosts     = ! : !+relay_from_hosts : !+spf_white_hosts
    senders   = :
    condition = ${if !match\
      {${lc:$message_body}}\
      {\N( |^)received: from .+(192\.0\.2\.1|2001:db8:1234:)\N}\
    {yes}{no}}
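
The idea behind this rule is that a genuine bounce quotes the original headers, including a Received line that names my own server. A minimal Python sketch of the same test follows. It assumes that newlines in the body have been turned into spaces before matching, which is why the regex allows a space in front of 'received:'. The function name is made up for this example.

```python
import re

OWN_RECEIVED = re.compile(
    r"( |^)received: from .+(192\.0\.2\.1|2001:db8:1234:)"
)

def bounce_looks_fake(body: str) -> bool:
    """True if the body quotes no Received line that mentions my addresses."""
    flattened = body.replace("\r", "").replace("\n", " ").lower()
    return OWN_RECEIVED.search(flattened) is None

genuine = (
    "This is the mail system.\n\n"
    "Received: from kill-spam.example.org ([192.0.2.1])\n"
    " by mx.example.net;\n"
    "Subject: hi\n"
)
forged = (
    "Your message could not be delivered.\n"
    "Received: from bot.example.net ([203.0.113.7])\n"
)
print(bounce_looks_fake(genuine))
print(bounce_looks_fake(forged))
```

Only the part of the body covered by message_body_visible (256K in this setup) is examined, so quoted headers beyond that size are not seen.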

URL Shorteners

You should be able to see where weblinks in mail lead to before you click on them. People using url shorteners clearly have something to hide. It's best to block mails containing them.

# Tiny URL and others
  defer
    message   = UrlShort What have you got to hide.
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nhref=(3d)?\"http(s)?://(9m\.no|bit\.ly|goo\.gl|ow\.ly|tinyurl\.com)/\N}\
    {yes}{no}}

Alternatively, you can set the SpamAssassin URL_SHORTENER_CHAINED and URL_SHORTENER_DISABLED scores to 10. SpamAssassin has a far more extensive URL shortener regex than the one above.
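
The Exim rule can be sanity-checked outside Exim with the Python sketch below. The body is lowercased before matching, so the quoted-printable form '=3D' shows up as '=3d', and the pattern in the sketch uses the lowercase form for that reason. The function name and the sample links are made up for this example.

```python
import re

SHORTENER_LINK = re.compile(
    r'href=(3d)?"http(s)?://(9m\.no|bit\.ly|goo\.gl|ow\.ly|tinyurl\.com)/'
)

def has_shortened_link(body: str) -> bool:
    """True if an HTML link in the body points at one of the listed shorteners."""
    return SHORTENER_LINK.search(body.lower()) is not None

print(has_shortened_link('<a href="https://bit.ly/abc">click</a>'))
print(has_shortened_link('<a href=3D"http://TinyURL.com/x">click</a>'))
print(has_shortened_link('<a href="https://www.example.org/long/path">ok</a>'))
```

Plain-text links without an href attribute are not caught by this pattern.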

Marketing

Blocks on 'x-message-type: marketing';

# Block Marketing
  deny
    message = Marketing sux
    condition = ${if match\
      {${lc:$h_x-message-type:}}\
      {\Nmarketing\N}\
    {yes}{no}}

Subscription Check

Below is a subscription check;
If you know exactly which lists your users are subscribed to, you can block unwanted mailing lists. Mailing list subscription is only allowed by means of a confirmed opt-in. For a lot of mailing list software this is in fact the default. Subscriptions without confirmation are rogue: it's illegal to subscribe people to mailing lists without their specific request, even when they are customers, and regardless of what terms and conditions might say.

# Block fake mailing lists, List-Subscribe
  defer
    message = I did not subscribe to this list
    condition = ${if match\
      {$h_list-subscribe:}\
      {\N.+\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_list-subscribe:}}\
      {LOCAL_SUBSCRIBED_LISTS_REGEX}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_from:}}\
      {LOCAL_SUBSCRIBED_FROMS_REGEX}\
    {yes}{no}}

# Block fake mailing lists, List-Unsubscribe
  defer
    message = I did not subscribe to this list
    condition = ${if match\
      {$h_list-unsubscribe:}\
      {\N.+\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_list-unsubscribe:}}\
      {LOCAL_SUBSCRIBED_LISTS_REGEX}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_from:}}\
      {LOCAL_SUBSCRIBED_FROMS_REGEX}\
    {yes}{no}}

# Block fake mailing lists, List-Help
  defer
    message = I did not subscribe to this list
    condition = ${if match\
      {$h_list-help:}\
      {\N.+\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_list-help:}}\
      {LOCAL_SUBSCRIBED_LISTS_REGEX}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_from:}}\
      {LOCAL_SUBSCRIBED_FROMS_REGEX}\
    {yes}{no}}

# Block fake mailing lists, List-Unsubscribe-Post
# if List-Unsubscribe-Post and List-Unsubscribe don't match
  defer
    message = I did not subscribe to this list
    condition = ${if match\
      {$h_list-unsubscribe-post:}\
      {\N.+\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_list-unsubscribe-post:}}\
      {LOCAL_SUBSCRIBED_LISTS_REGEX}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_list-unsubscribe:}}\
      {LOCAL_SUBSCRIBED_LISTS_REGEX}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_from:}}\
      {LOCAL_SUBSCRIBED_FROMS_REGEX}\
    {yes}{no}}

'LOCAL_SUBSCRIBED_LISTS_REGEX' and 'LOCAL_SUBSCRIBED_FROMS_REGEX' are defined in '/etc/exim4/conf.d/main/000_localmacros'.
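
If the list of subscribed lists gets long, the regex can be generated rather than typed. The sketch below is an assumption about how such a macro might be built, not a copy of my own definition: 'build_list_regex' is a made-up helper, the domains are placeholders, and the '\N(...)\N' wrapping assumes the macro holds the complete regex. Use lowercase domains, because the ACLs match against lowercased headers;

```shell
# Hypothetical helper: turn one list domain per line (stdin) into a regex
# alternation with the dots escaped, e.g. "a\.example|b\.example".
build_list_regex() {
    sed -e '/^[[:space:]]*$/d' -e 's/\./\\./g' | paste -s -d '|' -
}

# Example with made-up list domains; prints a macro line for 000_localmacros.
lists=$(printf 'lists.example.org\nnews.example.com\n' | build_list_regex)
printf 'LOCAL_SUBSCRIBED_LISTS_REGEX = \\N(%s)\\N\n' "$lists"
```

The same helper can be fed sender domains to produce a value for 'LOCAL_SUBSCRIBED_FROMS_REGEX'.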

Fake ASCII

# Check non ASCII in body when ASCII is specified
  defer
    message   = Charset not specified
    condition = ${if match\
      {${lc:$h_content-type:}}\
      {\Nus-ascii\N}\
    {yes}{no}}
    condition = ${if match\
      {$message_body}\
      {\N[\x80-\xff]\N}\
    {yes}{no}}
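
To see whether a saved message would trip this check, count the bytes with the high bit set. A minimal sketch, assuming the message body is available on stdin; 'count_high_bytes' is a made-up name;

```shell
# Hypothetical helper: count bytes >= 0x80 on stdin (0 means pure 7-bit ASCII).
count_high_bytes() {
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

# "café" in UTF-8 carries two high bytes (0xC3 0xA9), so this prints 2.
printf 'caf\303\251\n' | count_high_bytes
```

Anything above 0 in a mail that declares 'us-ascii' is what the ACL above defers on.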

Spammer specific

Below are various filters aimed at specific spammers. They all claim that you subscribed to their mailing lists. Edit to suit your needs.

# Block 3dkabel spam
  defer
    message   = I did not subscribe to this list.
    hosts     = *.mcdlv.net
    senders   = mcdlv.net : *@*.mcdlv.net
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nu ontvangt deze e-mail omdat u bent ingeschreven voor onze nieuwsbrief\N}\
    {yes}{no}}

# Block Betjeman en Barton spam
  defer
    message   = I did not subscribe to this list.
    condition = ${if match\
      {${lc:$h_from:}}\
      {\Nbetjemanandbarton\N}\
    {yes}{no}}
    condition = ${if match\
      {$h_list-unsubscribe:}\
      {\N.+\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$h_subject:}}\
      {\Nbestelling|factuur|order|pakbon\N}\
    {yes}{no}}

# Block Ebay spam
# Je hebt deze e-mail ontvangen omdat je in je berichtgevingsvoorkeuren hebt
# opgegeven dat je informatie over speciale acties, aanbiedingen en evenementen
# wilt ontvangen. Je bent joe@example.org op www.ebay.nl
  defer
    message   = I did not subscribe to this list.
    condition = ${if match\
      {${lc:$h_from:}}\
      {\Nebay\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nberichtgevingsvoorkeuren hebt opgegeven dat je informatie over speciale acties\N}\
    {yes}{no}}

# Block Hema spam
  defer
    message   = I did not subscribe to this list.
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\Nuitnodiging voor de beoordeling van je hema aankoop\N}\
    {yes}{no}}

# Block Marktplaats spam
# OK:
# Dit is een kopie van uw reactie op een advertentie van:
# Not OK:
# Marktplaats B.V. heeft deze e-mail naar u gestuurd. Op dit adres wordt u
# op de hoogte gehouden van nieuws,
# productwijzigingen en promoties van Marktplaats.nl. Indien u hier geen
# prijs op stelt, klik dan
  defer
    message   = I did not subscribe to this list.
    condition = ${if match\
      {${lc:$h_from:}}\
      {\Nmarktplaats\N}\
    {yes}{no}}
    condition = ${if !match\
      {${lc:$message_body}}\
      {\Ndit is een kopie van uw reactie op een advertentie van\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nop de hoogte gehouden\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nproductwijzigingen en promoties\N}\
    {yes}{no}}

Rogue webmail quota

# Block quota shit.
# general
  defer
    message   = I don't use webmail
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nmailbox.*exceeded.*(quota|limit).*administrator\N}\
    {yes}{no}}

# Beste gebruiker,
# =20
# Uw e-mail quota heeft overschreden Admin opslag termijn die is 20GB =
# zoals ingesteld door uw beheerder, bent u momenteel op 20.9GB, kunt u =
# niet in staat zijn om nieuwe e-mail verzenden of ontvangen totdat je =
# weer valideren uw mailbox. Voor re-valideren uw mailbox klik Dan Hier: =
# <http://www.opinionpower.com/Surveys/634059182.html>=20
#
# Line may be wrapped after dot;
# http://www.=
#   opinionpower.=
#   com/Surveys/
#
# For tests Exim converts newlines to spaces.
#
  defer
    message   = I don't use webmail
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nhttp://www\.(= )?opinionpower\.(= )?com/surveys/\N}\
    {yes}{no}}

# LED's
  defer
    message   = This looks like spam to me: led
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\N( led )\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$h_content-type:}}\
      {\Nmultipart\N}\
    {yes}{no}}

# Silly bugfix
  defer
    message   = This looks like spam to me: LED
    condition = ${if match\
      {$bh_subject:}\
      {\N( LED | Led | led )\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$h_content-type:}}\
      {\Nmultipart\N}\
    {yes}{no}}

# Nederlanders verdienen
  defer
    message   = This looks like spam to me: subject
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\Nnederlanders verdienen al miljoenen euro\N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\Nvanuit huis door gebruik te maken van deze maas in de wet om rijk te worden\N}\
    {yes}{no}}

# PayPal
  defer
    message   = Fix the login first
    senders   = service@paypal.nl
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\Nu hebt berichten van paypal\N}\
    {yes}{no}}

# PromoNews 
  deny
    message   = PromoNews sux.
    condition = ${if match\
      {${lc:$h_from:}}\
      {\Npromonews\N}\
    {yes}{no}}
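
The '(= )?' parts in the opinionpower rule above exist because quoted-printable may wrap the link after a dot, and the newline then shows up as a space. The sketch below imitates that in the shell; 'flatten_body' is a made-up helper and 'grep -E' only approximates Exim's matching;

```shell
# Hypothetical helper: mimic what the ACL sees. Newlines become spaces
# (as Exim does for $message_body), then the text is lowercased.
flatten_body() {
    tr '\n' ' ' | tr 'A-Z' 'a-z'
}

# A link soft-wrapped after each dot by quoted-printable encoding.
printf 'http://www.=\nopinionpower.=\ncom/Surveys/634059182.html\n' |
    flatten_body |
    grep -Eq 'http://www\.(= )?opinionpower\.(= )?com/surveys/' &&
    echo "rule would match"
```

Without the optional '= ' groups the wrapped variant would slip through, while the unwrapped link still matches either way.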

Phishing

# Phishing
# ========
# ABN_AMRO, ING, Rabo
  defer
    message   = This looks like a phish to me: bank
    condition = ${if match\
      {${lc:$message_body}}\
      {\N (abn( |-)?amro|ics|ing|rabo) \N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$message_body}}\
      {\N (beveiliging|klant|heer|meneer) \N}\
    {yes}{no}}
    condition = ${if match\
      {${lc:$message_body}}\
      {\Nhref=(3d)?\"http(s)?://\N}\
    {yes}{no}}

# ICS cards
  defer
    message   = This looks like a phish to me.
    condition = ${if match\
      {${lc:$h_subject:}}\
      {\Ngeblokkeerde toegang tot uw ics cards rekening\N}\
    {yes}{no}}
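
The bank rule only fires when all three conditions hold: a bank name, a salutation word and a link. The sketch below replays that logic in the shell so it can be tried on saved mail. 'looks_like_bank_phish' is a made-up name and 'grep -E' is an approximation of Exim's matching. The patterns are lowercase because the body is lowercased first;

```shell
# Hypothetical helper: succeeds when stdin looks like a bank phish
# according to the three conditions of the ACL above.
looks_like_bank_phish() {
    body=$(tr '\n' ' ' | tr 'A-Z' 'a-z')
    printf '%s' "$body" | grep -Eq ' (abn( |-)?amro|ics|ing|rabo) ' &&
        printf '%s' "$body" | grep -Eq ' (beveiliging|klant|heer|meneer) ' &&
        printf '%s' "$body" | grep -Eq 'href=(3d)?"https?://'
}

# Made-up sample: bank name, "klant" and a link are all present.
if printf 'Beste klant van de ING bank,\nKlik <a href="https://example.com/login">hier</a>\n' |
    looks_like_bank_phish; then
    echo "looks like a phish"
fi
```

Note the spaces around the word lists; a word directly followed by a comma or a full stop is not matched, which keeps false positives down at the cost of missing some phish.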

Limit message size

# Enforce a message-size limit
  defer
    message   = Message size $message_size is larger than limit of 4194304. Send a weblink to your data instead.
    condition = ${if >\
      {$message_size}{4194304}\
    {yes}{no}}
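
The magic number is simply 4 MiB in bytes. A small sketch to check the arithmetic and to test a saved message against the same limit; 'over_limit' is a made-up helper;

```shell
# 4 MiB expressed in bytes, the same number as in the ACL above.
limit=$((4 * 1024 * 1024))
echo "$limit"    # prints 4194304

# Hypothetical helper: succeeds when the file given as $1 is larger than $limit.
over_limit() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -gt "$limit" ]
}
```

Use it like 'over_limit saved-message.eml && echo too big'. By the same arithmetic, the '6144k' used in the Spamassassin ACL corresponds to 6291456 bytes.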

Spamassassin

# Spam score
  deny
    message   = X-Spam-Score: $h_x-spam-score:
    condition = ${if eq\
      {${lc:$h_x-spam:}}\
      {yes}\
    {yes}{no}}

# Don't spam check locally generated mail
# (Comment out for spam check test purposes)
  accept
    hosts = : +relay_from_hosts

# Don't spam filter abuse
  accept
    condition = ${if \
      and {\
        {match{$recipients}{\N<?abuse@(.+\.)?example\.org>?\N}}\
        {match{$h_to:}{\N<?abuse@(.+\.)?example\.org>?\N}}\
      }\
    {yes}{no}}

# Spam filter the rest
# Warn tmp at 00 points for debug
  warn
    message   = X-Spam-Score: $spam_score ($spam_bar).
    condition = ${if <\
      {$message_size}{6144k}\
    {1}{0}}
    spam      = spamd:true
    condition = ${if >\
      {$spam_score_int}{50}\
    {1}{0}}
  warn
    message   = X-Spam-Report: $spam_report
    condition = ${if <\
      {$message_size}{6144k}\
    {1}{0}}
    spam      = spamd:true
    condition = ${if >\
      {$spam_score_int}{50}\
    {1}{0}}
  deny
    message   = This message scored $spam_score spam points.
    condition = ${if <\
      {$message_size}{6144k}\
    {1}{0}}
    spam      = spamd:true
    condition = ${if >\
      {$spam_score_int}{50}\
    {1}{0}}

# Return to default Exim data ACL
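
Keep in mind that '$spam_score_int' is the spam score multiplied by ten, so the threshold of 50 above means 'more than 5.0 points'. A tiny sketch of the conversion, in case you want a different threshold; 'score_to_int' is a made-up helper;

```shell
# Hypothetical helper: convert a SpamAssassin score such as "5.0" to the
# integer form Exim exposes as $spam_score_int (score multiplied by ten).
score_to_int() {
    awk -v s="$1" 'BEGIN { printf "%d\n", s * 10 + (s < 0 ? -0.5 : 0.5) }'
}

score_to_int 5.0     # prints 50
score_to_int 12.3    # prints 123
```

So to reject at 8.5 points, compare '$spam_score_int' against 85.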

local_combi_whitelist

This bypasses a lot of checks!

# RvdP, local combined host - address whitelist
# A lot of definitions are in /etc/exim4/conf.d/main/000_localmacros

# Example
  accept
    hosts   = *.example.com
    senders = example.com

# 2nd example
  accept
    hosts   = *.example.org
    senders = joe@example.org

local_host_blackl

This file simply lists blacklisted hosts. One per line. Wildcards can be used.

local_sender_blackl

This file simply lists blacklisted senders. One per line. Wildcards can be used.

Spamd / SpamAssassin configuration

I run SpamAssassin as user spamd. A lot of SpamAssassin, Pyzor and Razor files are therefore in '~spamd/', which corresponds with '/var/lib/spamd/'.

/etc/default/spamd
Files in /etc/spamassassin/
- local.cf
- v310.pre
- FuzzyOcr.cf (FuzzyOcr.cf.real)
Auto update
- sa-update via proxy
Bayes
Image and PDF spam check

/etc/default/spamd

Enable spamd and run as user spamd and enable auto update from cron;

# /etc/default/spamd
# Duncan Findlay

# WARNING: please read README.spamd before using.
# There may be security risks.

# Options
# See man spamd for possible options. The -d option is automatically added.

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.

# RvdP, non priv
#OPTIONS="--create-prefs --max-children 5 --helper-home-dir"
OPTIONS="--create-prefs --max-children 5 --helper-home-dir --username spamd"

# Pid file
# Where should spamd write its PID to file? If you use the -u or
# --username option above, this needs to be writable by that user.
# Note that this setting is not used when spamd is managed by systemd
# RvdP, non priv
#PIDFILE="/run/spamd.pid"
PIDFILE="/run/spamd/spamd.pid"

# Set nice level of spamd
#NICE="--nicelevel 15"

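'/run/spamd/' ('/var/run/' is a symlink to '/run/') must be writable by user spamd, E.G. owned by spamd:spamd or group-writable for group spamd.
'/run/' disappears on reboot, so the directory needs to be recreated.
Below is an edited version of '/etc/init.d/spamd';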

test -f /etc/default/spamd && . /etc/default/spamd
  
# RvdP, /run disappears on reboot
if [ ! -d /run/spamd ]
then
        mkdir -m 2775 /run/spamd
        chown :spamd /run/spamd
fi

DOPTIONS="-d --pidfile=$PIDFILE"

If you run systemd, you should have systemd do this for you!
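
One way to let systemd recreate the directory at boot is a tmpfiles.d entry. The sketch below only prints such an entry; it mirrors the 'mkdir -m 2775' and 'chown :spamd' from the init script above. The file name 'spamd.conf' and the helper name are my own choice, not something the package ships;

```shell
# Hypothetical helper: print a tmpfiles.d line that recreates /run/spamd
# with mode 2775, owner root and group spamd.
tmpfiles_line() {
    printf 'd /run/spamd 2775 root spamd -\n'
}

tmpfiles_line

# To install it (as root), something like:
#   tmpfiles_line > /etc/tmpfiles.d/spamd.conf
#   systemd-tmpfiles --create /etc/tmpfiles.d/spamd.conf
```

With this in place the directory should exist again after every reboot, without editing the init script.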

/etc/default/spamassassin

Note: /etc/default/spamd and /etc/default/spamassassin are now separate files!
You may need to create /etc/default/spamassassin;

# /etc/default/spamassassin
# RvdP, used by /etc/cron.daily/spamassassin

# Set CRON;
CRON=1

# Proxy
http_proxy="http://proxy.example.org:8080/"
https_proxy="http://proxy.example.org:8080/"

See '/etc/cron.daily/spamassassin' for more info!

Files in /etc/spamassassin/

local.cf

I added a lot of stuff;

endif # Mail::SpamAssassin::Plugin::Shortcircuit

# RvdP, my own stuff
# ==================

# Config
# ------

# Pyzor home
#pyzor_options --homedir /home/spamd/.pyzor
pyzor_options --homedir /var/lib/spamd/.pyzor

# razor.conf
razor_config	/var/lib/spamd/.razor/razor-agent.conf

# Charset
report_charset		UTF-8

# Sensible contact
report_contact		postmaster@example.org
report_hostname		example.org

# I don't understand other languages
ok_languages		en nl
ok_locales		en nl

# Blacklisted in received lines
score RCVD_IN_SBL	10.0
score RCVD_IN_XBL	10.0
score RCVD_IN_BL_SPAMCOP_NET	10.0

# Blacklisted web sites
score URIBL_ABUSE_SURBL	10.0
score URIBL_CR_SURBL	10.0
score URIBL_MW_SURBL	10.0
score URIBL_PH_SURBL	10.0
score URIBL_WS_SURBL	10.0

# No Tiny URL
#score SHORTENED_URL_SRC	10.0
score URL_SHORTENER_CHAINED	10.0
score URL_SHORTENER_DISABLED	10.0

# No caps
score SUBJ_ALL_CAPS	10.0
score UPPERCASE_50_75	10.0
score UPPERCASE_75_100	10.0

# HTML with image and 0 - 400 bytes text
score HTML_IMAGE_ONLY_04	10.0

# Razor, identical to Pyzor
score RAZOR2_CHECK		2.0
score RAZOR2_CF_RANGE_51_100	2.0

# More bayes
score BAYES_60		 3.0
score BAYES_80		 4.0
score BAYES_95		 5.0
score BAYES_99		10.0
score BAYES_999		10.0

# SORBS Sux
score RCVD_IN_SORBS_DUL	0

# Barracuda Sux
#score RCVD_IN_BRBL_LASTEXT	0
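
With this many additions it is easy to end up with the same setting twice. The sketch below lists settings that occur more than once in a config file; 'find_repeated_settings' is a made-up helper and it is only a hint, because some directives (rule definitions, loadplugin) may legitimately repeat;

```shell
# Hypothetical helper: print each setting that occurs more than once.
# "score" lines are keyed on the rule name, everything else on the first word.
find_repeated_settings() {
    awk '
        /^[[:space:]]*(#|$)/ { next }    # skip comments and blank lines
        {
            key = ($1 == "score") ? $1 " " $2 : $1
            if (++seen[key] == 2) print key
        }
    ' "$@"
}

# Example with a made-up fragment; prints "pyzor_options".
printf '%s\n' \
    'pyzor_options --homedir /tmp/a' \
    'score BAYES_99 10.0' \
    'pyzor_options --homedir /tmp/b' \
    'score BAYES_999 10.0' |
    find_repeated_settings
```

Run it as 'find_repeated_settings /etc/spamassassin/local.cf', and follow up with 'spamassassin --lint' to catch plain syntax errors.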

v310.pre

Disabled spamcop auto reporting;

# SpamCop - perform SpamCop message reporting
#
#loadplugin Mail::SpamAssassin::Plugin::SpamCop

Enabled auto whitelist.

# AWL - do auto-whitelist checks
#
# RvdP, enabled
loadplugin Mail::SpamAssassin::Plugin::AWL

Enabled language guesser;
You need this for 'ok_languages'.

# TextCat - language guesser
#
# RvdP, enabled
loadplugin Mail::SpamAssassin::Plugin::TextCat

FuzzyOcr.cf

Logging is on. The log directory needs to be writable by the process owner (spamd).

# Logfile (make sure it is writable by the plugin)
# Default value: none
# RvdP, logging on
#focr_logfile /tmp/FuzzyOcr.log
focr_logfile /var/local/log/fuzzyocr/FuzzyOcr.log

Enable PDF scanning.

# These helpers must be defined before enabling PDF scanning
# RvdP, enable PDF scanning
focr_bin_helper pdfinfo, pdftops, pstopnm

Increase timeout.

# Timeout for the plugin, in seconds. (Maximum runtime of the plugin)
# Default value: 10
#focr_timeout 15
# RvdP, increase
focr_timeout 30

Increase maximum image size to scan;

#focr_max_height 800
#focr_max_width 800
# RvdP, set to 1920×1080
focr_max_height 1080
focr_max_width 1920

You need to get 'gifinter' and 'libgif.so'. For Debian these are in packages 'giflib-tools' and 'libgif7'

Auto update

On a non-systemd system 'CRON=1' in '/etc/default/spamassassin' takes care of this.
If I understand correctly, on a systemd based system, setting CRON=1 will install the appropriate timer the first time /etc/cron.daily/spamassassin is run.

sa-update via proxy

On a non-systemd system 'http_proxy="http://proxy.example.org:8080/"' in '/etc/default/spamassassin' takes care of this.
I also modified /etc/cron.daily/spamassassin;

if [ "$CRON" = "0" ] ; then
    exit 0
fi

# RvdP, Use proxy; export to child
if [ -n "${http_proxy}" ]
then
	export "http_proxy=${http_proxy}"
fi
if [ -n "${https_proxy}" ]
then
	export "https_proxy=${https_proxy}"
fi

# If the systemd timer is active, there's nothing else for us to do:
if [ -d /run/systemd/system ] && \
       systemctl is-enabled --quiet spamassassin-maintenance.timer; then
    exit 0
fi

I have no idea how to tell systemd to use a proxy server.
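
One possible approach, untested here, is a systemd drop-in that sets the proxy variables for the service behind the timer. The sketch below only prints such a drop-in. The service name 'spamassassin-maintenance.service' is an assumption derived from the timer name in the cron script above, and 'proxy_dropin' and 'proxy.conf' are made-up names;

```shell
# Hypothetical helper: print a systemd drop-in that passes a proxy URL ($1)
# to the service via its environment.
proxy_dropin() {
    cat <<EOF
[Service]
Environment=http_proxy=$1
Environment=https_proxy=$1
EOF
}

proxy_dropin 'http://proxy.example.org:8080/'

# To install it (as root), something like:
#   mkdir -p /etc/systemd/system/spamassassin-maintenance.service.d
#   proxy_dropin 'http://proxy.example.org:8080/' \
#       > /etc/systemd/system/spamassassin-maintenance.service.d/proxy.conf
#   systemctl daemon-reload
```

Check with 'systemctl show spamassassin-maintenance.service -p Environment' whether the variables arrived.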

Bayes

I have a spam archive and lots of saved non spam messages.

 su - spamd
~$ cd .spamassassin/
~$ sa-learn --mbox --spam Spam_File
~$ sa-learn --mbox --ham Ham_File

You can also use this for a single spam mail

~$ sa-learn --spam --mbox todays-spam

The default behaviour of sa-learn is to limit the maximum message size to 500 kB. By setting this to '0' there is no limit;

~$ sa-learn --spam --mbox --max-size 0 todays-spam
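
By default Bayes is not used until it has learned at least 200 spam and 200 ham messages, so it helps to know how many messages an mbox holds before feeding it to sa-learn. A minimal sketch; 'count_mbox_messages' is a made-up helper that simply counts the 'From ' separator lines;

```shell
# Hypothetical helper: count the messages in an mbox (file arguments or stdin)
# by counting the "From " separator lines.
count_mbox_messages() {
    grep -c '^From ' "$@"
}

# Example with a made-up two-message mbox; prints 2.
printf '%s\n' \
    'From a@example.org Mon Jan  1 00:00:00 2024' \
    'Subject: one' \
    '' \
    'body' \
    'From b@example.org Mon Jan  1 00:01:00 2024' \
    'Subject: two' \
    '' \
    'body' |
    count_mbox_messages
```

After learning, 'sa-learn --dump magic' shows how many spam and ham messages the database knows about.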

Image and PDF spam check

Uncommenting the following line in FuzzyOcr.cf should enable PDF scanning;

focr_bin_helper pdfinfo, pdftops, pstopnm

pdfinfo and pdftops are in package poppler-utils; pstopnm is in package netpbm.

Razor configuration

I edited '/etc/razor/razor-agent.conf' and '/var/lib/spamd/.razor/razor-agent.conf'. The log directory needs to be writable by the process owner (spamd).

# See razor-agent.conf (5)
# Change this to 5 for safer classification of MIME attachments.  This will let more spam through
logic_method = 4
# Change the next line to a file to stop using syslog
# RvdP, Dir
#logfile = /var/log/razor-agent.log
logfile = /var/log/razor/razor-agent.log

# RvdP, force ~spamd/
razorhome = /var/lib/spamd/.razor

And /etc/logrotate.d/razor;

# RvdP, Dir
# /var/log/razor-agent.log {
/var/log/razor/razor-agent.log {
	weekly
	rotate 3
	compress
	nomail
	notifempty
	missingok
}

Misc

Reporting to Pyzor

Create a dir .pyzor in your homedir. Copy the servers file to this dir. Create an empty mailfolder, E.G.: 'todays-spam'. Save the spam to this folder. Copy the spam to stdin of Pyzor and tell it to report the spam;

~$ cat mail/todays-spam | pyzor report

This should produce something like '(200, OK)'.

Delete the contents of 'todays-spam' before you save more spam.
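
Reporting and emptying the folder can be combined, so the folder is only emptied when the report went through. This is a sketch under assumptions: 'report_and_clear' is a made-up helper, and it assumes the reporting command exits non-zero on failure, which I have not verified for pyzor;

```shell
# Hypothetical helper: feed a mail folder ($1) to a reporting command
# (default: pyzor report) and empty the folder only when that succeeds.
report_and_clear() {
    folder=$1
    shift
    [ "$#" -gt 0 ] || set -- pyzor report
    "$@" < "$folder" && : > "$folder"
}

# Usage:
#   report_and_clear mail/todays-spam
```

The ': > file' construct truncates the file to zero length without removing it, so the mail client still finds the folder.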

DNS issues

Make sure you have a HELO name which resolves to the IP addresses of all the interfaces which your mailserver might use.
If you have a split DNS, list the addresses of all interfaces in the internal DNS, and just those of the external interfaces in the external DNS.
If your mailserver is behind reverse NAT / port forwarding, the HELO name should point to the external IP address too.

/etc/services

It's a good idea to put the ports used by various daemons in /etc/services (if not already there);

# Local services
spamd		783/tcp				# spamassassin daemon
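
To check whether an entry is already there, look the name up in the services file. A minimal sketch; 'service_port' is a made-up helper that reads '/etc/services' by default, or another file (or '-' for stdin) given as second argument;

```shell
# Hypothetical helper: print the port/protocol of service $1 from a
# services file ($2, default /etc/services; "-" reads stdin).
service_port() {
    awk -v name="$1" '$1 == name { print $2; exit }' "${2:-/etc/services}"
}

# Example with the line from above on stdin; prints 783/tcp.
printf 'spamd\t\t783/tcp\t\t\t\t# spamassassin daemon\n' | service_port spamd -
```

On glibc based systems 'getent services spamd' answers the same question.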