ClearOS Installation Checklist

I'm writing this checklist as I set up a new router for the home office, to remind me of the modifications I need to make to get a fresh deployment "just right" the first time.  ClearOS is a CentOS-based router distribution that lets one rapidly and easily deploy and manage routers and miscellaneous network services. CentOS itself is a de-branded flavour of Red Hat Enterprise Linux. Back when ClearOS flew under the ClarkConnect label, if one wanted certain parts of the product one had to either pay for an Office or Enterprise license, or use well-crafted Google queries to find the FTP credentials for the Enterprise repositories, grab the RPMs, rpm2tgz/rpm2cpio them and overlay them on the filesystem (and don't forget to fix the permissions!) to avoid unresolvable dependencies.

Fortunately, with the morph to ClearOS the Clear Foundation folks stopped charging for commodity software (i.e. web configuration modules for DMZ and Multi-WAN (load balancing/failover)) and started focusing on services you can live without. I warn against paying just for the tech support; you're much better off in the community forum - sparse though the posts are. From my experience (last dating from 2008, mind you) their Level 1 cuts off at putting the CD in the drive and may flat out refuse to support their product if you reveal it's being used in any setting more complex than a small office. If you're an intermediate *nix user and you can't figure out the problem on your own or with Google, chances are tech support can't (or won't) help you anyway; drop by the users forum.

Most of the routers I make these days are virtual machines and that goes beyond the scope of this checklist; however, I plan to cover my process in a future article. The machine this checklist will be based on is an AthlonXP 2500+ 1.83GHz with 384MB DDR266 and an 8GB CompactFlash card plugged into a CF-IDE adapter, which you can get on eBay for about $1CAD. The machine has an onboard NIC and two PCI NICs as well as a wifi card so it can be turned into an access point. I like to use CF cards instead of hard drives on my physical routers because - although the cheap ones can be quite slow - you don't have to worry about them up and dying on you for several years. The cards do wear out with repeated write cycles (though the limit is often in the high thousands) so if you choose to use them you should try to minimize the amount of data written to disk during day-to-day operation. A remote syslogd might be of great help.

We're going to assume you've already downloaded the installation ISO and burned it to disc. At the time of writing the current version is 5.1 SP1; the instructions below may not apply to future versions. Once you've booted you'll eventually be asked if you would like to let the installer automatically partition your hard drive or if you'd like to manually configure the partitioning. It's usually fine to let the installer do its thing, but if you're working in confined spaces like our 8GB flash card or have plenty of RAM, the oft-defaulted swap partition size of 1GB is a tad generous. Also choose to manually configure the partitioning if you would like to use software RAID. If you choose to RAID your drives use anything but level 0; I can't conceive of any need for high performance storage on a router (almost everything needed is loaded into RAM on startup) and reliability is priority one on mission-critical systems like these.

Chances are if you want to do anything with the storage on your router you want to outsource that operation to another machine. Your router is the gatekeeper for the network and if it becomes compromised the consequences could be worse than with any single workstation. The simplest way to reduce the risk of a service being exploited is to not run it, so your router shouldn't run anything it doesn't need for management or to route and protect the network (firewall, IPS, IDS etc). In following with that notion keep the number of users on the system to an absolute minimum. If you're the only person who should have access to the box you should be the only person with a user account. ClearOS allows root logins via SSH by default so you should create at least one user account for yourself in order to separate privileges.

Once you've completed the installation and rebooted you can connect to the management interface at https://lan-ip:81 and log in as root. You'll be asked to fill in a number of details to complete the installation process. Once that's completed register your router with ClearSDN. If you have not made an account with them yet you should do so at the ClearSDN portal first. I generally disable "send diagnostic reports" when I register the routers but you may be less paranoid and more helpful. Once your router has been registered go to Software Updates. You can enable or disable automatic updates. They are enabled by default and I don't like that one bit: what if one of the repositories gets hacked? What if a new RPM breaks something critical and I'm not around to fix it?

Don't waste your time with all the checkboxes; shell into your router and run the following:

  • yum update
    • Updates all packages currently installed
  • yum install screen
    • Screen is a handy tool for multitasking shells
  • yum install lynx
    • There's already a version of lynx that comes as part of the ClearOS text console and you can use it by symlinking it to /usr/bin and its config file to /etc but that's messy.
  • yum install links
    • You don't need this if you have lynx; I just like to install them both so I can type either. Depends on my mood.
  • yum install nmap
    • nmap is an invaluable network diagnostic and analysis tool.
  • yum groupinstall "Development Tools"
    • Install this on systems where you expect to be compiling third-party software. Wherever possible use RHEL/CentOS RPMs for the corresponding version of ClearOS. You probably don't want to install this on space-restricted systems. If you have space to kill it never hurts to be prepared.
  • yum install ncurses-devel
  • yum install kernel-devel
    • Install these on systems where you expect to be modifying the kernel; you will need the Development Tools group to compile the kernel or modules.
  • yum install net-snmp
    • Install this so you can monitor system statistics remotely (e.g. with Cacti).
  • yum install wpa_supplicant
    • You need this if you want to run an access point; ClearOS is only configured for WEP by default and can't be set up through the web config.

The ClearOS web config has an embedded MRTG package that graphs system vitals, but if you plan on remotely monitoring your router's statistics (load average, network traffic, etc.) you will probably want to install net-snmp. Depending on your configuration you may need to open UDP port 161. Here's a very short configuration sample that you can drop into /etc/snmp/snmpd.conf:

rouser  public
rocommunity  public localhost
syslocation  "Server Room"
syscontact  [email protected]
com2sec local    SUBNET    public
group MyROGroup v1         local
group MyROGroup v2c        local
group MyROGroup usm        local
view all    included  .1  80
access MyROGroup ""      any       noauth    exact  all    none   none

Replace SUBNET with the subnet or IP that should have access to SNMP data.

ClearOS is one of the few distributions that enable syncookies by default. You probably don't need to add these lines since syncookies override tcp_max_syn_backlog. I like to do it anyway just in case something fails on bootup. Per my previous article Defending Against the SYN Flood, add these lines to /etc/rc.d/rc.firewall.local:

echo 3096 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 2 > /proc/sys/net/ipv4/tcp_syn_retries
echo 1 > /proc/sys/net/ipv4/tcp_synack_retries
echo 1 > /proc/sys/net/ipv4/tcp_syncookies

Or not. It won't kill you either way.
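
The same four values can also be written as sysctl keys, which some people prefer to keep in /etc/sysctl.conf. The following is only a sketch: the helper name proc_to_sysctl is made up for this example, and keeping the values in /etc/sysctl.conf rather than rc.firewall.local is an assumption, not something this checklist depends on.

```shell
# Hypothetical helper: turn a /proc/sys path and a value into a sysctl.conf line,
# e.g. /proc/sys/net/ipv4/tcp_syncookies 1 becomes net.ipv4.tcp_syncookies = 1
proc_to_sysctl() {
    # Strip the /proc/sys/ prefix, then swap the remaining slashes for dots.
    key=$(printf '%s\n' "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g')
    printf '%s = %s\n' "$key" "$2"
}

# Print the four settings from the list above in sysctl.conf form.
proc_to_sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog 3096
proc_to_sysctl /proc/sys/net/ipv4/tcp_syn_retries 2
proc_to_sysctl /proc/sys/net/ipv4/tcp_synack_retries 1
proc_to_sysctl /proc/sys/net/ipv4/tcp_syncookies 1
```

Review the output before appending it to /etc/sysctl.conf; sysctl -p reloads that file.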

ClearOS comes with a special rule that snort will use to detect local SSH brute force attempts, but as we covered in my previous article Stifling Brute Force Attacks with fail2ban, fail2ban is highly extensible and can perform any operation that can be executed from the command line in response to any pattern match found in a given log file.  Fail2ban is not available in the default ClearOS repositories but we can use the RHEL 5 RPM available at http://dag.wieers.com/rpm/packages/fail2ban/. After installing the packages listed above the RPM should have only one dependency: gamin-python. Install fail2ban thus:

# yum install gamin-python
# wget http://rpmforge.sw.be/redhat/el5/en/i386/rpmforge/RPMS/fail2ban-0.8.1-1.el5.rf.noarch.rpm
# rpm -iv fail2ban-0.8.1-1.el5.rf.noarch.rpm

Return to the webconfig and make sure you have installed all the components and third party applications listed that you need, like the Advanced Firewall Module which is not installed by default. Configure your firewall, DHCP and VPN(s). Back at the command line let's clean out all the packages we just downloaded:

# yum clean all

At the command line run chkconfig --list:

acpid           0:off   1:off   2:on    3:on    4:on    5:on    6:off
avahi-daemon    0:off   1:off   2:off   3:on    4:on    5:on    6:off
avahi-dnsconfd  0:off   1:off   2:off   3:off   4:off   5:off   6:off
clamd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
cpuspeed        0:off   1:on    2:on    3:on    4:on    5:on    6:off
crond           0:off   1:off   2:on    3:on    4:on    5:on    6:off
cups            0:off   1:off   2:off   3:off   4:off   5:off   6:off
dansguardian-av 0:off   1:off   2:off   3:off   4:off   5:off   6:off
dnsmasq         0:off   1:off   2:on    3:on    4:on    5:on    6:off
fail2ban        0:off   1:off   2:off   3:on    4:on    5:on    6:off
firewall        0:off   1:off   2:on    3:on    4:on    5:on    6:off
freshclam       0:off   1:off   2:off   3:off   4:off   5:off   6:off
haldaemon       0:off   1:off   2:off   3:on    4:on    5:on    6:off
httpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
ipsec           0:off   1:off   2:off   3:off   4:off   5:off   6:off
iscsi           0:off   1:off   2:off   3:off   4:off   5:off   6:off
iscsid          0:off   1:off   2:on    3:on    4:on    5:on    6:off
kudzu           0:off   1:off   2:off   3:on    4:on    5:on    6:off
l7-filter       0:off   1:off   2:off   3:off   4:off   5:off   6:off
ldap            0:off   1:off   2:off   3:on    4:on    5:on    6:off
ldapsync        0:off   1:off   2:off   3:on    4:on    5:on    6:off
lm_sensors      0:off   1:off   2:on    3:on    4:on    5:on    6:off
lvm2-monitor    0:off   1:on    2:on    3:on    4:on    5:on    6:off
mcstrans        0:off   1:off   2:off   3:off   4:off   5:off   6:off
mdmonitor       0:off   1:off   2:off   3:off   4:off   5:off   6:off
mdmpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
messagebus      0:off   1:off   2:off   3:on    4:on    5:on    6:off
multipathd      0:off   1:off   2:off   3:off   4:off   5:off   6:off
mysqld          0:off   1:off   2:off   3:off   4:off   5:off   6:off
netconsole      0:off   1:off   2:off   3:off   4:off   5:off   6:off
netfs           0:off   1:off   2:off   3:on    4:on    5:on    6:off
netplugd        0:off   1:off   2:off   3:off   4:off   5:off   6:off
network         0:off   1:off   2:on    3:on    4:on    5:on    6:off
nmb             0:off   1:off   2:off   3:off   4:off   5:off   6:off
nscd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
ntpd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
openvpn         0:off   1:off   2:off   3:off   4:off   5:off   6:off
pptpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
rdisc           0:off   1:off   2:off   3:off   4:off   5:off   6:off
restorecond     0:off   1:off   2:on    3:on    4:on    5:on    6:off
saslauthd       0:off   1:off   2:off   3:on    4:on    5:on    6:off
smb             0:off   1:off   2:off   3:off   4:off   5:off   6:off
snmpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
snmptrapd       0:off   1:off   2:off   3:off   4:off   5:off   6:off
snort           0:off   1:off   2:off   3:off   4:off   5:off   6:off
snortsam        0:off   1:off   2:off   3:off   4:off   5:off   6:off
squid           0:off   1:off   2:off   3:off   4:off   5:off   6:off
sshd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
suvad           0:off   1:off   2:on    3:on    4:on    5:on    6:off
syslog          0:off   1:off   2:on    3:on    4:on    5:on    6:off
system-mysqld   0:off   1:off   2:on    3:on    4:on    5:on    6:off
syswatch        0:off   1:off   2:on    3:on    4:on    5:on    6:off
vpnwatchd       0:off   1:off   2:off   3:off   4:off   5:off   6:off
webconfig       0:off   1:off   2:on    3:on    4:on    5:on    6:off
winbind         0:off   1:off   2:off   3:off   4:off   5:off   6:off
wpa_supplicant  0:off   1:off   2:off   3:off   4:off   5:off   6:off
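
With a listing this long it can help to filter it down to the services that are actually switched on in a given runlevel. A minimal sketch, assuming the column layout shown above; the function name services_on is made up for this example.

```shell
# Hypothetical helper: read `chkconfig --list` output on stdin and print the
# names of services that are "on" in the given runlevel (default 3).
services_on() {
    awk -v tag="${1:-3}:on" '{
        for (i = 2; i <= NF; i++)
            if ($i == tag) { print $1; break }
    }'
}

# Demo on three lines copied from the listing above.
services_on 3 <<'EOF'
avahi-daemon    0:off   1:off   2:off   3:on    4:on    5:on    6:off
clamd           0:off   1:off   2:off   3:off   4:off   5:off   6:off
sshd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
EOF
```

On the router itself the input would come from the real command: chkconfig --list | services_on 3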

We want to turn off anything that starts in runlevels 2, 3, 4 or 5 that we don't need. This will make the router boot faster and use less RAM, which is particularly important if you're building a virtual machine. A fast power-cycle has obvious advantages for any connectivity device and one can comfortably fit a low-traffic ClearOS router into 96MB of RAM with room to breathe by disabling the right services. Based on the defaults shown here, these are some services you probably want to turn off - your mileage of course may vary:

  • avahi-daemon
  • avahi-dnsconfd
    • Zeroconf stuff. You only want it if you know what that means, and probably not even then.
  • haldaemon
    • Practically unused by anything but X
  • iscsi
  • iscsid
    • Obviously you want these if you really are using iSCSI.
  • kudzu
    • Checks for new hardware and can interrupt the boot process; it can be run from the command line anyway
  • lvm2-monitor
    • You only want this if you're using LVM
  • messagebus
    • Same as haldaemon
  • netfs
    • Leave this on if you're doing anything with NFS

Services you may want to disable include:

  • suvad
    • Talks to ClearSDN, disabling interferes with updates and ClearSDN services but it can be started on demand
  • lm_sensors
    • There's no hardware to monitor on a xen virtual machine
  • cpuspeed
    • Ditto
  • acpid
    • Ditto

Use the following syntax to remove init scripts from these runlevels:

# chkconfig --level 2345 iscsid off

And enable anything that should be turned on:

# chkconfig --level 2345 snmpd on
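
Typing that once per service gets tedious. The sketch below only prints the commands so they can be reviewed first; the function name chkconfig_cmds is made up for this example, and the service list is just the first few names from the list above.

```shell
# Hypothetical helper: print (not run) one chkconfig command per service name.
# Usage: chkconfig_cmds on|off service...
chkconfig_cmds() {
    state=$1
    shift
    for svc in "$@"; do
        printf 'chkconfig --level 2345 %s %s\n' "$svc" "$state"
    done
}

# Some of the services from the "probably want to turn off" list above.
chkconfig_cmds off avahi-daemon avahi-dnsconfd haldaemon iscsi iscsid kudzu
```

Once the printed commands look right, the same call can be piped to sh as root to apply them.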

Be sure not to touch any of the numerous LDAP services; ClearOS uses LDAP internally to manage the user accounts. If you don't know what a service does be sure to look it up before you disable it.

If your router includes a wireless card that requires firmware do not forget to download it to /lib/firmware.

Updatedb indexes the files on your mounted partitions for fast searching with the locate or slocate tool. You should run it once now that you have most of the files installed. By default, cron runs updatedb every night. This causes high I/O load and can be disabled by clearing its cron script's execute bit:

# chmod -x /etc/cron.daily/mlocate.cron

While the web config makes ClearOS what it is, I don't like the console configuration (slow, featureless and requires two logins...) - and I really don't like the graphical one (wtf?). These are a severe obstacle on Xen installations where it's difficult to navigate serial-based (no ptys to alt+Fx to) xenconsole out of the bottomless ncurses pit. Rather than loop-mount the image and configure the networking and reboot the VMs and shell in I prefer to disable that rubbish altogether. Edit /etc/inittab to reflect:

# Run gettys in standard runlevels
#1:2345:respawn:/sbin/mingetty --autologin=clearconsole tty1
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2

If this router is going to be in a high traffic environment you may find that snortreport.sh progressively uses more and more resources to compile the webconfig-accessible IPS/IDS reports until it starts overlapping with itself, causing extreme load. You can delete it, move it, or remove its execute bit:

# chmod -x /usr/sbin/snortreport.sh

ClarkConnect used to come with some fairly dangerous default snort rules, particularly if your router is intended to be the firewall for a public network. Things look a lot better now: the two rules I could remember always having to comment out now come commented by default. As things run I'll keep a list of false-positive generating rules here.

NOTE: That list can be found here: Bad Snort Rules

Snort rules take the form of lines in files located in /var/lib/suva/services/intrusion-protection/rules/ (formerly /var/lib/suva/services/snort/rules/).  Disabling a rule is as simple as prefixing it with a hash mark (#) and restarting snort:

# /etc/init.d/snort restart
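
If you know the sid of the offending rule, the hash mark can be added without opening an editor. A minimal sketch, assuming each rule sits on one line and carries a "sid:NUMBER;" option; the function name and the two demo rules are made up for this example.

```shell
# Hypothetical helper: read Snort rules on stdin and comment out the rule whose
# sid matches $1. Lines that are already commented are left alone.
disable_sid() {
    sed -e "/^[^#].*sid:[[:space:]]*$1;/s/^/#/"
}

# Demo with two made-up rules (the sids and messages are placeholders).
disable_sid 1000001 <<'EOF'
alert tcp any any -> any 80 (msg:"example one"; sid:1000001; rev:1;)
alert tcp any any -> any 22 (msg:"example two"; sid:1000002; rev:1;)
EOF
```

To change a real rules file, write the output to a new file, compare the two, move the new one into place and then restart snort as shown above.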

Check your Intrusion Prevention reports in the web config regularly when you first deploy your new firewall. Investigate any rule that appears multiple times to determine if your particular environment is triggering false positives. This is critical if you are protecting a public network, say a farm of web servers. One rule (that is now commented by default) would block an IP that sent or received a string of ascii 'a's, like: 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' because a long string of 'a's  is one signature of a certain buffer overflow attack. One day one of my users said "Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaah!" in a web-based chat room when I had a freshly deployed router out and he and everyone who saw it were swiftly blocked from the network. One can either live without IPS (I wouldn't recommend it) or one can mitigate the downtime through careful monitoring.

You may wish to lock down SSH by following my article on key-exchange. While by default (in gateway mode) ssh is not accessible on the external addresses one should never discount the possibility of attack from within. Any machine behind your firewall that can be compromised will be in a unique position to compromise other machines if attention is not paid to internal network security. There is no such thing as a trusted network, only more trustworthy.

Passwordless or Single Password SSH with Key Exchange

IMPORTANT UPDATE In the ten years since this article was published a lot has changed. Please see my updated article Generate and Automatically Load SSH Keys for Convenient Passwordless Authentication for a more secure and convenient implementation.

In the last two articles we have covered in detail the main flaw of any username-password authentication scheme and how to defend against attacks by increasing their time/resource cost. Unfortunately this does nothing to eliminate the problem but key-exchange authentication - while not unbreakable - changes the shape of the playing field and it's becoming an increasingly favoured authentication scheme for a myriad of applications including SSH and VPN protocols. This article will show you how to quickly generate and exchange keys with remote hosts and disable traditional password authentication.

First you will need to generate a key pair:

ssh-keygen -t rsa

You are going to have to decide here whether you want to encrypt your private key with a passphrase and enter one password every time you use key exchange or make login instant at the expense of a more vulnerable key. You need to consider the possible damage that could be done if a given machine with unrestricted shell access to other hosts is compromised.

The least level of protection you can apply is exchanging only lesser-privileged accounts as a first step toward higher levels, i.e. by using su. Key exchange suffers from this weakness only when the private key is stored locally and unencrypted. One could keep the private key (~/.ssh/id_rsa) on a USB stick, however your key is vulnerable when the device is mounted. Even when using an encrypted key if the file is intercepted it can eventually be cracked. Smart cards (themselves) do not share these weaknesses and will be the topic of an upcoming article.

Never allow root to directly shell into a machine regardless of the authentication scheme you choose to use; make sure your target's sshd_config includes:

PermitRootLogin no
AllowUsers user1 user2 user3

Where user1,2,3 are the names of specific users permitted to login. This may not be practical for larger or public installations.
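
It is worth confirming that the file really says what you think it says, since sshd uses the first value it finds for each keyword. A minimal sketch, assuming one keyword per line; the function name sshd_setting is made up for this example.

```shell
# Hypothetical helper: read an sshd_config on stdin and print the value of the
# first uncommented occurrence of keyword $1 (keywords are case-insensitive).
sshd_setting() {
    awk -v want="$1" '
        tolower($1) == tolower(want) {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the keyword, keep the value
            print
            exit
        }'
}

# Demo on a small sample; the commented-out line is ignored.
sshd_setting PermitRootLogin <<'EOF'
#PermitRootLogin yes
PermitRootLogin no
AllowUsers user1 user2 user3
EOF
```

Against a real host the input would be /etc/ssh/sshd_config, and sshd -t can be used to check the file for syntax errors before restarting the daemon.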

If the target account on the remote host has not yet used ssh you may have to create ~/.ssh. Add the new public key to the remote host's authorized keys list:

cat ~/.ssh/id_rsa.pub | ssh xxx.xxx.xxx.xxx "cat >> ~/.ssh/authorized_keys"

If you have not already shelled into the remote host from this account you will be prompted to accept its public key. You may then be prompted for your password. If the copy was successful you will be returned to the command line without a message. Try logging into the remote host; if you are not asked for a password, or you are asked for your private key's passphrase, you have successfully performed the key exchange.

Once you've finished exchanging keys with all of the hosts which should have access to the target you might proceed to disable password based authentication. The previous command will no longer work to import new keys, you will have to transfer them via other means (i.e. a host that has already exchanged keys with the target). Edit /etc/ssh/sshd_config to reflect:

PasswordAuthentication no
ChallengeResponseAuthentication no

and restart sshd. Try logging in from a machine that has not exchanged keys with it. You should see something like:

Permission denied (publickey).

In an ideal world you wouldn't run management services (SSH, webmin, snmp etc) on public address space. One can keep SSH from being exposed in the first place by making it listen on a private subnet and connecting to it via VPN. The only time I can see someone wanting to expose SSH in particular is to provide sftp and chances are you'll be dealing with a number of users where key exchange isn't practical. You can address this by keeping password authentication and enabling a chroot jail, which I'll cover in a future article. If an attacker does manage to break into an account despite your fail2ban setup they will at least be confined to their own little slice of the filesystem.

Brute Force and Flood Protection for Web Forms

In the last article I told you any username-and-password authentication system that is exposed to the Internet is inherently vulnerable to dictionary and brute force attack. If you must use such an authentication scheme you can defend it by implementing rate control. If you block an attacker from trying to log in for one hour after three failed attempts, they can try at most about 26,000 combinations in a year (three per hour, 24 hours a day, 365 days). In cryptanalytic terms that is abysmal and the odds are on your side that the attacker will have moved on by then.
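
The arithmetic behind that kind of estimate is easy to check for other lockout settings. A minimal sketch, assuming the attacker gets exactly the allowed number of tries per lockout period, never changes address, and spends no time on the guesses themselves; the function name is made up for this example.

```shell
# Hypothetical helper: upper bound on guesses per year when an attacker gets
# $1 tries and is then locked out for $2 seconds.
guesses_per_year() {
    echo $(( 365 * 24 * 3600 / $2 * $1 ))
}

guesses_per_year 3 3600   # three tries, one-hour lockout: prints 26280
guesses_per_year 3 600    # three tries, ten-minute lockout: prints 157680
```

Even the ten-minute lockout leaves an attacker far short of what an unthrottled dictionary attack could attempt in the same time.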

While porting your ban system to fail2ban might be a great idea it's probably overkill for situations where you have hundreds of legitimate users who might often forget their credentials; IP-bans are not generally considered good customer service. Many sites, including Google, will present the user with a CAPTCHA after three failed attempts and that's great but those are getting easier to crack every day.

For the sake of the pseudocode in this article we're going to assume you want to block the  potential attacker and politely tell them they have either a) failed to log in too many times, please come back in an hour or b) posted too recently, please try again. Since we want to be able to rate control two (and perhaps more in the future) different things and we don't want to make a mess of our database let's make one table called 'greylist' and use the type column to differentiate:

CREATE TABLE `demo_cat`.`greylist` (
`type` VARCHAR( 30 ) NOT NULL ,
`date` INT NOT NULL ,
`ip` VARCHAR( 15 ) NOT NULL ,
INDEX ( `ip` , `type` ) ,
INDEX ( `date` )
);

Now in your login script, for argument's sake, we'll say $outcome is a boolean representing whether the authentication was successful and $offset is the period of time we want to measure, in seconds. We'll start off by clearing everything that's out of date, a relatively inexpensive query to run every time there's a failure. After the table has been cleaned up we'll add an entry for the current failure and take a tally of all the entries for the user's IP. If the tally exceeds the retry $threshold we'll tell them to buzz off for an hour, change their password, show a CAPTCHA or whatever suits your site best.


   if (!$outcome) {
      $ip = mysql_real_escape_string($_SERVER['REMOTE_ADDR']);
      // Expire old entries, record this failure, then tally what is left for this IP
      mysql_query("delete from `greylist` where `type` = 'login' and `date` < '".(time() - $offset)."'");
      mysql_query("insert into `greylist` (`type`, `date`, `ip`) values ('login', '".time()."', '$ip')");
      $result = mysql_query("select `ip` from `greylist` where `type` = 'login' and `ip` = '$ip'");
      if (mysql_num_rows($result) > $threshold) {
         // Too many tries, what now?
      } else {
         // Please try again
      }
   }


It is as simple as that. Now let's use this to flood-protect our comments box:


   $ip = mysql_real_escape_string($_SERVER['REMOTE_ADDR']);
   // Expire old entries, record this post, then tally what is left for this IP
   mysql_query("delete from `greylist` where `type` = 'comment' and `date` < '".(time() - $offset)."'");
   mysql_query("insert into `greylist` (`type`, `date`, `ip`) values ('comment', '".time()."', '$ip')");
   $result = mysql_query("select `ip` from `greylist` where `type` = 'comment' and `ip` = '$ip'");
   if (mysql_num_rows($result) > $threshold) {
      // You posted too recently, please wait x seconds before trying again.
   } else {
      // Continue...
   }


A more sophisticated implementation of this concept is in use at Ychan, where users' posting patterns are analyzed to determine if they are computers, legitimate humans or computers trying to look like humans.
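
The sliding-window tally that both snippets above rely on can also be tried out away from PHP and MySQL. The sketch below is only an illustration of the counting rule: it assumes the greylist rows have been dumped as plain "type epoch ip" lines, and the function name and sample addresses are made up for this example.

```shell
# Hypothetical helper: stdin holds one "type epoch ip" entry per line, like rows
# of the greylist table. Print how many entries of type $1 from IP $2 are no
# older than $4 seconds before time $3.
greylist_tally() {
    awk -v type="$1" -v ip="$2" -v now="$3" -v offset="$4" '
        $1 == type && $3 == ip && $2 >= now - offset { n++ }
        END { print n + 0 }'
}

# Demo: at time 10000 with a 3600 second window, two login entries still count.
greylist_tally login 192.0.2.7 10000 3600 <<'EOF'
login 9990 192.0.2.7
login 9000 192.0.2.7
login 5000 192.0.2.7
comment 9995 192.0.2.7
login 9999 198.51.100.2
EOF
```

Comparing the printed tally against the threshold is then the same decision the PHP makes with mysql_num_rows.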

Stifling Brute Force Attacks with fail2ban

fail2ban is a package that monitors your log files for failed login attempts and executes a configured action, usually temporarily blocking the attacking IP with iptables for a set duration. Any exposed service that uses a username/password authentication scheme is vulnerable to dictionary and brute force attack; your first defense if you must expose such a service is to make such attacks as costly as possible, and that's where fail2ban comes in. By temporarily blocking an address for even 10 minutes after every 3 failed login attempts you make the process several orders of magnitude slower. Since fail2ban reads plain log files and can be configured for any action, one clever deployment could see a log server collecting logs from all the hosts on a network and sharing the relevant logs with the firewall via NFS, where fail2ban can quickly cut the attacker's access to the entire network. For the purposes of this article we will only focus on locking down SSH on a local host.

fail2ban is probably available in your distribution's package management system. Gentoo users type:

# emerge fail2ban

If the package is not available for your flavour you can compile it from source, available at http://sourceforge.net/project/showfiles.php?group_id=121032&package_id=132537:

# tar xjf fail2ban-*
# cd fail2ban-*/
# ./setup.py install
# cp /usr/local/src/fail2ban-*/files/{your distro or close match here}-init /etc/init.d/fail2ban

Then add the script to the appropriate runlevels. Gentoo users type:

# rc-update add fail2ban default

Despite the name, fail2ban jails are not like chroot or ssh jails. A 'jail' is the combination of a filter and an action. The filters are regular expressions used to search the log files for interesting lines such as login failures. These filters are located in /etc/fail2ban/filter.d/ and the action scripts are located in /etc/fail2ban/action.d/. By adding to and tying these filters and actions together in /etc/fail2ban/jail.conf you can re-purpose fail2ban to do just about any log event-triggered action imaginable; once you've given it a good mucking about locking down SSH may seem trite.
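
To get a feel for what a filter does, the idea can be imitated with ordinary text tools. This is only a sketch and not how fail2ban itself is implemented: the regular expression is a simplified stand-in for the real sshd filter, the function name is made up, and the log lines are invented samples.

```shell
# Hypothetical helper: read sshd log lines on stdin and print each source IP
# with at least $1 "Failed password" lines. This is roughly what a filter and a
# maxretry setting do together (fail2ban also applies a time window).
offenders() {
    sed -n 's/.*Failed password for .* from \([0-9.]*\) port.*/\1/p' |
        sort | uniq -c | awk -v max="$1" '$1 >= max { print $2 }'
}

# Demo: only 192.0.2.5 has two or more failures.
offenders 2 <<'EOF'
Jan  1 00:00:01 gw sshd[101]: Failed password for root from 192.0.2.5 port 4001 ssh2
Jan  1 00:00:03 gw sshd[102]: Failed password for invalid user bob from 192.0.2.5 port 4002 ssh2
Jan  1 00:00:09 gw sshd[103]: Failed password for root from 198.51.100.9 port 5001 ssh2
Jan  1 00:00:12 gw sshd[104]: Accepted password for alice from 203.0.113.4 port 6001 ssh2
EOF
```

For trying a real filter against a real log file, fail2ban ships its own fail2ban-regex tool.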

Open /etc/fail2ban/jail.conf and find [ssh-iptables], change the configuration block to look like this:

enabled = true
filter = sshd
action = iptables[name=SSH, port=ssh, protocol=tcp]
logpath = /var/log/sshd/current
maxretry = 5
# findtime = 600
# bantime = 600

You may need to edit logpath to reflect your system's settings. Set maxretry to however many failed login attempts you wish to allow over a given amount of time (findtime) until the source address is blocked for a given amount of time (bantime). The default findtime and bantime are both 600 seconds (10 minutes) and only need to be set if you would like to choose different durations. If you would like to be notified by e-mail when someone has been blocked (probably not a good idea on a busy public server) add this line to the jail:

mail-whois[name=SSH, dest=[email protected]]

Now make sure your SSH daemon is logging in verbose mode; add this line if you must to /etc/ssh/sshd_config:

LogLevel VERBOSE

If your sshd log entries contain the string pam_unix(sshd:auth) (Gentoo users here) you may need to modify the line starting with __daemon_re in /etc/fail2ban/filter.d/common.conf to look like:

__daemon_re = [\[\(]?%(_daemon)s(?:\([^\)]+\))?[\]\)]?:?

and configuration is over. Now start the server:

/etc/init.d/fail2ban start

If you run iptables --list you should see a fail2ban chain. Try breaking into SSH from another host; after a few tries you should be blocked from port 22 on the remote host. Running iptables-save will show you a rule in the fail2ban chain for the IP that was just blocked. Once the bantime limit has been reached you will regain access.
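
Picking the blocked addresses out of a long ruleset by eye gets old quickly. A minimal sketch, assuming the ban rules live in a chain whose name starts with "fail2ban-" and use "-s ADDRESS" to match the source; the function name and the sample ruleset are made up for this example.

```shell
# Hypothetical helper: read `iptables-save` output on stdin and print the source
# address of every rule appended to a chain whose name starts with "fail2ban-".
banned_ips() {
    awk '$1 == "-A" && $2 ~ /^fail2ban-/ {
        for (i = 3; i < NF; i++)
            if ($i == "-s") { print $(i + 1); break }
    }'
}

# Demo on an invented ruleset with one banned address.
banned_ips <<'EOF'
-A INPUT -p tcp -m tcp --dport 22 -j fail2ban-SSH
-A fail2ban-SSH -s 192.0.2.5 -j DROP
-A fail2ban-SSH -j RETURN
EOF
```

On a live host the input would be the output of iptables-save, run as root.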

Defending Against the SYN Flood

A SYN flood is a type of resource-starvation denial of service (DoS) attack in which the attacker creates enough "half open" connections to render a server inaccessible to the legitimate public. Because the attack takes advantage of weaknesses in the default configuration of most TCP implementations rather than raw strength, one attacker with a relatively low bandwidth connection can quickly take down a much better equipped server. The attacker only needs to send one SYN packet to establish a half-open connection on the defending server, which will in turn answer with a SYN-ACK and retransmit it a set number of times while it waits for the final ACK. Since the handshake has been initiated and the connection is being tracked, the deed is done; the attacker doesn't need to respond to the SYN-ACK, so the source address can be spoofed, making the task of tracing the attacker virtually impossible and the attack itself very difficult to block.

When you first come under attack it may not be obvious what is happening. The targeted host(s) will stop responding to your users, or respond only sporadically, and you may not even be able to shell into the machine. If you can gain access to the machine the telltale signs are:

  • Services are running but using no CPU or I/O
  • Traffic graphs flatline but the host(s) remain pingable
  • Services appear to be listening on the right ports, the firewall is clear, but you can't connect to them even locally
  • Multiple TCP-based services are affected
  • The output of netstat -n indicates an unusually high number of connections in the SYN_RECV state

All or most TCP services will seem to be affected because they all share the same connection queue. Unless your server is very overloaded, even on high traffic sites you should never see more than about 5 or 6 connections in the SYN_RECV state sustained over any period of time - particularly if you reduce the number of retries your kernel attempts as outlined below.
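
Counting those connections by hand is error-prone when the list is long. A minimal sketch, assuming the usual netstat -n layout where the state is the sixth column; the function name and the sample lines are made up for this example.

```shell
# Hypothetical helper: read `netstat -n` output on stdin and print the number of
# TCP connections in the SYN_RECV state (the state is the sixth column).
syn_recv_count() {
    awk '$1 ~ /^tcp/ && $6 == "SYN_RECV" { n++ } END { print n + 0 }'
}

# Demo on invented output: two of the three connections are half-open.
syn_recv_count <<'EOF'
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.1:80            198.51.100.7:40112      SYN_RECV
tcp        0      0 192.0.2.1:80            198.51.100.8:51331      SYN_RECV
tcp        0      0 192.0.2.1:22            203.0.113.4:60122       ESTABLISHED
EOF
```

On a live host the input would come from netstat -n; a number that stays well above a handful is the warning sign described above.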

Fortunately there are two ways to address this problem: stack tweaking and syncookies (for BSD/linux, other implementations exist). Since the SYN flood relies on a lengthy timeout and limited number of available connections the obvious first step is to increase these limits. Having a lot of extra RAM comes in handy here since it takes RAM to track the connections. In fact, in preventing most resource starvation tactics throwing more RAM (if available) at the problem is always a good blind first step - though never the solution. We can manipulate these values through the /proc interface:

# echo 3096 > /proc/sys/net/ipv4/tcp_max_syn_backlog

tcp_max_syn_backlog limits the number of half-open connections the kernel will track. This is the limit that gets exhausted when regular users are no longer able to connect.

# echo 2 > /proc/sys/net/ipv4/tcp_syn_retries
# echo 1 > /proc/sys/net/ipv4/tcp_synack_retries

tcp_syn_retries is the number of times the kernel will retransmit a SYN packet, waiting longer before each attempt, when trying to establish an outbound connection. This won't do you any good for SYN flood protection but it can mitigate the effects of some amplification/redirection techniques that use your hosts as soldiers. tcp_synack_retries limits the number of times the kernel will retransmit its SYN-ACK in response to a half-open connection. The default is 5 and that means an attacker's connection could last in the queue for up to 180 seconds. If the attacker can open an easy 300 new half-open connections in that period it becomes clear how quickly your connection queue can be overrun. Setting this value too low can cause problems for people on weak links like dialup.
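
To get a feel for what lowering tcp_synack_retries buys you, here is a small arithmetic sketch. It assumes the kernel doubles its retransmission timeout after every attempt and gives up once the last timeout expires; the function name synack_hold_time and the initial timeout you pass in are assumptions of mine, and the real initial timeout depends on the kernel version.

```shell
# Estimate how long (in seconds) a half-open connection is held, assuming the
# retransmission timeout doubles after every attempt.
#   $1 = value of tcp_synack_retries
#   $2 = initial retransmission timeout in seconds (assumed; varies by kernel)
synack_hold_time() {
    retries=$1
    rto=$2
    total=0
    i=0
    # One wait for the original SYN-ACK plus one per retransmission.
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        i=$((i + 1))
    done
    echo "$total"
}

synack_hold_time 5 3   # prints 189
synack_hold_time 1 3   # prints 9
```

With an assumed 3 second initial timeout the default of 5 retries works out to 189 seconds, which is in the same ballpark as the 180 seconds quoted above, while a single retry drops that to 9 seconds.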

Obviously this isn't going to be enough; finite resources will always be finite resources. Syncookies are an ingenious little invention that, in a nutshell, validate that traffic coming to the host is sent from a real computer rather than a packet generator. They do this by sending a simple type of cryptographic challenge in the headers of outgoing packets, which is "responded" to in the headers of incoming packets by the mechanics of TCP itself. Because spoofed traffic doesn't have a legitimate sending host behind it to "hear" the challenge, it (probably) does not contain a valid response and the connection is swiftly discarded.

Syncookies are not enabled by default and enabling them will override the value in tcp_max_syn_backlog, but it won't hurt you to increase it anyway:

# echo 1 > /proc/sys/net/ipv4/tcp_syncookies

Most distributions include a "local" script that runs at the end of init, and yours may have one specifically for the firewall. On Gentoo I put these commands in /etc/conf.d/local.start and on ClearOS in /etc/rc.d/rc.firewall.local. Note that since a NAT router doesn't terminate the connections itself and only passes them through, simply turning on syncookies on your firewall will not protect everything behind it; each server needs the settings applied as well.
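
For the local script itself, something along these lines collects the settings from this section in one place. It is a minimal sketch, not the one true way to do it: the function name apply_syn_settings is made up, and the optional directory argument exists only so you can dry-run it against a scratch directory before trusting it with the real /proc tree.

```shell
# Apply the SYN flood settings from this section.
# $1 is the directory holding the tunables; it defaults to the real /proc path
# and is only overridable so the function can be dry-run against a scratch directory.
apply_syn_settings() {
    dir=${1:-/proc/sys/net/ipv4}
    echo 3096 > "$dir/tcp_max_syn_backlog" &&
    echo 2    > "$dir/tcp_syn_retries" &&
    echo 1    > "$dir/tcp_synack_retries" &&
    echo 1    > "$dir/tcp_syncookies"
}

# In the local startup script (run as root):
#   apply_syn_settings
```

Chaining the writes with && means the function returns a failure as soon as one of them does, which is worth checking for if your init scripts log exit codes.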

If you want to centralize or introduce a degree of separation between your SYN flood protection and regular servers you can use proxies. Squid and Apache both work as reverse proxies, and SOCKS proxies may work as well (don't quote me on that).

I was caught with my pants down once; I hadn't enabled syncookies on just one VM and it got SYN flooded (Murphy's law, of course). That's a mistake you only make once. It underscored for me the importance of following some sort of thorough lockdown procedure before you deploy a new machine. That will be the subject of an upcoming article where I will attempt to compile a definitive checklist.

If you are running a virtualized environment or have the space for enough servers, the easiest way to mitigate the harm a resource starvation attack can do to the continuity of your operations is to compartmentalize and space services out as much as possible. If you have a web server and a DNS server each 1-1 NATted to a public address and an attacker hits you on port 80, only the web server is going to lock up; your DNS, and therefore mail and so on, should continue to operate until of course they figure it out. If you have to run DNS, mail, web and RADIUS, try to run them on different servers rather than one despite the overhead. When one plans a public-facing network one should think less in terms of bare economics and more in terms of capacity to absorb attack.