Posts Tagged ‘proc’

Removing virbr0 or Why The Fsck is My Dom0 NATting?

I noticed today that one of my new Xen dom0s was coughing up our old friend, the “ip_conntrack: table full, dropping packet” message. If you like to get your money’s worth out of your dedis the RAM available to dom0 is probably limited – meaning a correspondingly low default ip_conntrack_max. I’m sure you can see how this might be a problem, even more so if it is lower than the ip_conntrack_max of your virtual machines.

None of my previous CentOS dedis had NAT/conntrack modules loaded by default and this dom0 had no need for NAT – being of a fully bridged configuration and routing only public IPs. My first guess was that this dedi’s redhatty initrd loaded the modules through the typical mash-everything-against-the-kernel-and-see-what-sticks approach so I tried removing the NAT and connection tracking related modules:

# rmmod iptable_nat
ERROR: Module iptable_nat is in use

OK, let’s take a look at the tables:

[root@cl-t067-252cl ~]# iptables-save
# Generated by iptables-save v1.3.5 on Sat Jul 21 21:27:40 2012
*nat
:PREROUTING ACCEPT [931:50495]
:POSTROUTING ACCEPT [446:25128]
:OUTPUT ACCEPT [7:502]
-A POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -p tcp -j MASQUERADE --to-ports 1024-65535 
-A POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -p udp -j MASQUERADE --to-ports 1024-65535 
-A POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -j MASQUERADE 
COMMIT

It seems I have a subnet I was not aware of…

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

Who put that there? libvirt, apparently. According to that article not only is our problem ip_conntrack_max, but:

However, NAT slows down things and only recommended for desktop installations.

Seems highly logical to me. Their solution didn’t look very permanent so I first deleted the symlink in the autostart directory for “default”:

# cd /etc/libvirt/qemu/networks/autostart/
# ls -lsah
total 16K
8.0K drwx------ 2 root root 4.0K Jul 21 21:17 .
8.0K drwx------ 3 root root 4.0K May 14 09:18 ..
   0 lrwxrwxrwx 1 root root   14 Jul 21 21:17 default.xml -> ../default.xml
# rm default.xml
# cd ..
# cp default.xml ~/
# /etc/init.d/libvirtd restart

That didn’t do anything at all. Still had virbr0, still had the iptables rules and still had the kernel modules.

Reboot.

Apparently that was the wrong thing to do. All of my interfaces, bridges, etc seemed to come back up (except virbr0) and the NAT/conntrack modules were missing but not a single VM was routing.

On to their method:

# virsh net-destroy default
# virsh net-undefine default
# service libvirtd restart

Everything looks great. You still have the NAT/conntrack modules loaded but we should be able to take those out one by one.

# lsmod | grep nat
iptable_nat            40517  0 
ip_nat                 52973  2 ipt_MASQUERADE,iptable_nat
ip_conntrack           91749  4 ipt_MASQUERADE,iptable_nat,ip_nat,xt_state
nfnetlink              40457  2 ip_nat,ip_conntrack
ip_tables              55329  2 iptable_nat,iptable_filter
x_tables               50377  7 xt_physdev,ipt_MASQUERADE,iptable_nat,xt_state,ipt_REJECT,xt_tcpudp,ip_tables
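Before reaching for rmmod it helps to know which of these modules are still referenced. The third column of lsmod is the use count, and a module only unloads cleanly when that count is 0. A minimal sketch, assuming the three-column lsmod layout shown above (module_refcount is a made-up helper name, not part of any package):

```shell
#!/bin/bash
# Print the use count for one module, reading lsmod-style output on stdin.
# Hypothetical helper; assumes the "Module Size Used-by" column layout.
module_refcount() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }   # third column is the use count
        END       { exit !found }           # non-zero exit if not loaded
    '
}
```

Run it as `lsmod | module_refcount ip_nat`. Anything that prints a non-zero count has to wait until the modules named in its "used by" column are gone.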

Reboot.

Boned again. Now default.xml is missing (that’s the work of net-undefine, which deletes the persistent definition; net-destroy only stops the running network). Good thing we made a backup first!

# cd /etc/libvirt/qemu/networks/
# cp ~/default.xml ./
# ln -s ../default.xml autostart/default.xml
# reboot

OK. Screw it. We’ll do it the hard way.

#!/bin/bash
ifconfig virbr0 down
iptables -t nat -D POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -p tcp -j MASQUERADE --to-ports 1024-65535
iptables -t nat -D POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -p udp -j MASQUERADE --to-ports 1024-65535
iptables -t nat -D POSTROUTING -s 192.168.122.0/255.255.255.0 -d ! 192.168.122.0/255.255.255.0 -j MASQUERADE
iptables -D INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT 
iptables -D INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT 
iptables -D INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT 
iptables -D INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT 
iptables -D FORWARD -d 192.168.122.0/255.255.255.0 -o virbr0 -m state --state RELATED,ESTABLISHED -j ACCEPT 
iptables -D FORWARD -s 192.168.122.0/255.255.255.0 -i virbr0 -j ACCEPT 
iptables -D FORWARD -i virbr0 -o virbr0 -j ACCEPT 
iptables -D FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable 
iptables -D FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
rmmod iptable_nat
rmmod ipt_MASQUERADE
rmmod ip_nat
rmmod xt_state
rmmod ip_conntrack
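Typing out every -D rule by hand is error-prone, so here is a sketch that derives the delete commands from iptables-save output instead. It only prints the commands for review and runs nothing. The function name and the default interface and subnet strings are my own assumptions, taken from the rules above; adjust them if libvirt picked something else on your box.

```shell
#!/bin/bash
# Turn every "-A" rule that mentions the libvirt bridge or its subnet into
# the matching "iptables -t <table> -D ..." command. Dry run: prints only.
# Hypothetical helper; reads iptables-save output on stdin.
virbr_delete_cmds() {
    local iface="${1:-virbr0}"
    local subnet="${2:-192.168.122.0/}"
    awk -v iface="$iface" -v subnet="$subnet" '
        /^\*/ { table = substr($0, 2); next }      # "*nat" -> table "nat"
        /^-A / && (index($0, iface) || index($0, subnet)) {
            sub(/^-A /, "-D ")                     # append becomes delete
            sub(/[ \t]+$/, "")                     # trim trailing blanks
            print "iptables -t " table " " $0
        }
    '
}
```

`iptables-save | virbr_delete_cmds` should list the same deletions as the script above. Eyeball the output before pasting it anywhere, since a broad match string could catch rules you want to keep.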

HOW DO YOU LIKE ME NOW?!

ip_conntrack: table full, dropping packet.

Connections in to and out of your network are working sporadically. Your router’s dmesg is flooded with “ip_conntrack: table full, dropping packet.” What do you do?

This condition occurs when the connection tracking table has reached its limit. Connection tracking is a function of Netfilter that stores information like the source and destination IP addresses, port numbers, protocol type, state and timeout of a two-way connection. This facility lets us create sophisticated and informed Netfilter rules that could not be derived accurately by inspecting packet headers one at a time.

The conntrack table takes the form of a memory structure; if there were no constraints on the size of the table it could conceivably start knocking off userspace processes if it became too large (i.e. under DoS conditions). Entries in the conntrack table expire either when their timeout has been reached or the connection has been properly closed. In cases where connections are not being closed according to protocol (poor network connectivity, DoS, spoof attack, etc.) the table can fill rapidly causing an intermittent denial of service condition on your network.
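To see what is actually filling the table, you can tally the tracked TCP connections by state. This is a sketch that assumes the ip_conntrack-era listing in /proc/net/ip_conntrack, where the fourth field of a tcp line is the state; the function name is made up for illustration.

```shell
#!/bin/bash
# Tally tracked TCP connections by state, most common first.
# Hypothetical helper; reads conntrack table lines on stdin.
conntrack_states() {
    awk '
        $1 == "tcp" { count[$4]++ }               # 4th field: TCP state
        END { for (s in count) print count[s], s }
    ' | sort -rn
}
```

Feed it with `conntrack_states < /proc/net/ip_conntrack`. A pile of SYN_RECV or SYN_SENT entries with little ESTABLISHED traffic suggests connections that are never completing.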

The most prominent symptom of a full connection tracking table is that your old, running connections (secure shell sessions, for example) will continue to function while it becomes impossible to establish new ones. Worse, as the entries continue to time out and the table keeps filling up you may “get lucky” and establish a new connection here and there, making the situation much more confusing.

Depending on the situation you may have one or two options. If you have gobs and gobs of RAM available or, for example, the attack is low-volume, you can adjust the entry limit of the table. First, check what the current limit is:

# cat /proc/sys/net/ipv4/ip_conntrack_max
65536

You can see how full the table currently is by running:

# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count
62168
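The two numbers are easier to read side by side. A small sketch that prints the count, the limit and the percentage used; the function name is made up, and the default paths are simply the two /proc files used above (pass other paths as arguments if your kernel puts them elsewhere).

```shell
#!/bin/bash
# Report conntrack table usage as "count / max (NN%)".
# Hypothetical helper; both file paths can be overridden.
conntrack_usage() {
    local count_file="${1:-/proc/sys/net/ipv4/netfilter/ip_conntrack_count}"
    local max_file="${2:-/proc/sys/net/ipv4/ip_conntrack_max}"
    local count max
    count=$(cat "$count_file") || return 1
    max=$(cat "$max_file") || return 1
    # Integer arithmetic, so the percentage is rounded down.
    echo "$count / $max ($(( count * 100 / max ))%)"
}
```

With the values shown here it would report 62168 / 65536 (94%), which is uncomfortably close to the ceiling.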

ip_conntrack_max is determined as a multiple of how much RAM the system boots up with but generally stops at 65536 regardless. You may find that this isn’t even enough for a high volume network under normal conditions. We can adjust the limit temporarily thus:

# echo 131072 > /proc/sys/net/ipv4/ip_conntrack_max

If this turns out to be your magic bullet and you’re sure no other actions need to be taken to mitigate your particular situation, add the following line to /etc/sysctl.conf:

net.ipv4.netfilter.ip_conntrack_max = 131072

To load the value from sysctl.conf run:

# sysctl -p
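If you push this setting to several machines, appending blindly to sysctl.conf leaves duplicate lines behind. The following is a sketch of an idempotent setter, not anything sysctl ships with: set_sysctl_conf is a hypothetical helper that drops any existing assignment of the key and appends the new one. It expects the file to exist already.

```shell
#!/bin/bash
# Set "key = value" in a sysctl.conf-style file exactly once.
# Hypothetical helper; the file argument defaults to /etc/sysctl.conf.
set_sysctl_conf() {
    local key="$1" value="$2" file="${3:-/etc/sysctl.conf}"
    local tmp
    tmp=$(mktemp) || return 1
    # Copy every line except an existing (uncommented) assignment of key.
    awk -v key="$key" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, key) == 1 && substr(line, length(key) + 1) ~ /^[ \t]*=/ { next }
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file so its permissions survive.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_sysctl_conf net.ipv4.netfilter.ip_conntrack_max 131072` followed by `sysctl -p`. Running it twice leaves a single line in the file.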

If you don’t have the option of throwing more RAM at the problem you may be forced to make an executive decision in the interest of preserving network services for legitimate clients. You can decrease the load on the conntrack table by removing rules that use stateful logic (i.e. containing “-t nat” or “-m state”). The brute force option is to rmmod the ip_conntrack module:

# rmmod ip_conntrack

However this may not be possible in all environments. The other option is to flush your rules and set the default policy to allow:

# iptables -P INPUT ACCEPT
# iptables -P FORWARD ACCEPT
# iptables -P OUTPUT ACCEPT
# iptables -F

This is also typically the effect of

# /etc/init.d/iptables stop
or
# /etc/init.d/firewall stop

Defending Against the SYN Flood

A SYN flood is a type of resource-starvation denial of service (DoS) attack in which the attacker creates enough “half open” connections to render a server inaccessible to the legitimate public. Because the attack takes advantage of weaknesses in the default configuration of most TCP implementations rather than raw strength, one attacker with a relatively low bandwidth connection can quickly take down a much better equipped server. The attacker only needs to send one SYN packet to establish a half-open connection on the defending server, which will in turn answer with a SYN-ACK and retransmit it a set number of times while it waits for the final ACK. Once the handshake has been initiated and the connection is being tracked the deed is done; the attacker never needs to respond to the SYN-ACK, so the source address can be spoofed, making the task of tracing the attacker virtually impossible and the attack itself very difficult to block.

When you first come under attack it may not be obvious what is happening. The targeted host(s) will stop responding to your users, or respond only sporadically, and you may not even be able to shell into the machine. If you can gain access to the machine the telltale signs are:

  • Services are running but using no CPU or I/O
  • Traffic graphs flatline but the host(s) remain pingable
  • Services appear to be listening on the right ports, the firewall is clear, but you can’t connect to them even locally
  • Multiple TCP-based services are affected
  • The output of netstat -n indicates an unusually high number of connections in the SYN_RECV state
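The last check in that list is easy to script. A minimal sketch that counts half-open connections, assuming the usual netstat column layout where the state is the sixth field of a tcp line (count_syn_recv is a made-up name):

```shell
#!/bin/bash
# Count TCP sockets in the SYN_RECV state.
# Hypothetical helper; reads "netstat -n" output on stdin.
count_syn_recv() {
    awk '
        $1 ~ /^tcp/ && $6 == "SYN_RECV" { n++ }
        END { print n + 0 }                 # print 0 when nothing matched
    '
}
```

Run it as `netstat -n | count_syn_recv`, ideally a few times in a row. A number that stays high between runs is the pattern to worry about.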

All or most TCP services will seem to be affected because they all share the same connection queue. Unless your server is very overloaded, even on high traffic sites you should never see more than about 5 or 6 connections in the SYN_RECV state sustained over any period of time – particularly if you reduce the number of retries your kernel attempts as outlined below.

Fortunately there are two ways to address this problem: stack tweaking and syncookies (on BSD and Linux; other implementations exist). Since the SYN flood relies on a lengthy timeout and a limited number of available connections the obvious first step is to shorten the former and raise the latter. Having a lot of extra RAM comes in handy here since it takes RAM to track the connections. In fact, in preventing most resource starvation tactics throwing more RAM (if available) at the problem is always a good blind first step – though never the solution. We can manipulate these values through the /proc interface:

# echo 3096 > /proc/sys/net/ipv4/tcp_max_syn_backlog

tcp_max_syn_backlog limits the number of half-open connections the kernel will track. This is the limit that gets exhausted when regular users are no longer able to connect.

# echo 2 > /proc/sys/net/ipv4/tcp_syn_retries
# echo 1 > /proc/sys/net/ipv4/tcp_synack_retries

tcp_syn_retries is the number of times the kernel will retransmit a SYN packet, waiting longer before each attempt, when trying to establish an outbound connection. This won’t do you any good for SYN flood protection but it can mitigate the effects of some amplification/redirection techniques that use your hosts as soldiers. tcp_synack_retries limits the number of times the kernel will retransmit its SYN-ACK response to a half-open connection. The default is 5, which means an attacker’s connection could last in the queue for up to 180 seconds. If the attacker can open an easy 300 new half-open connections in that period it becomes clear how quickly your connection queue can be overrun. Setting this value too low can cause problems for people on weak links like dialup.
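To get a feel for how tcp_synack_retries translates into queue time, here is a rough model rather than anything read out of the kernel. It assumes the retransmission timeout starts at 3 seconds and doubles after every attempt; both the starting value and the function name are my assumptions, and kernels with a different initial timeout will give different numbers.

```shell
#!/bin/bash
# Estimate how long a half-open connection is held for a given
# tcp_synack_retries value. Rough model: the timeout starts at
# initial_rto seconds (assumed 3) and doubles after each attempt.
synack_hold_seconds() {
    local retries="$1" rto="${2:-3}" total=0 i=0
    # One wait after the first SYN-ACK plus one after each retry.
    while [ "$i" -le "$retries" ]; do
        total=$(( total + rto ))
        rto=$(( rto * 2 ))
        i=$(( i + 1 ))
    done
    echo "$total"
}
```

Under these assumptions `synack_hold_seconds 5` comes out at roughly three minutes, in line with the figure above, while `synack_hold_seconds 1` is only a few seconds. That difference is the whole point of lowering the value.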

Obviously this isn’t going to be enough; finite resources will always be finite resources. Syncookies are an ingenious little invention that, in a nutshell, validate that traffic coming to the host is sent from a real computer rather than a packet generator. They do this by sending a simple type of cryptographic challenge in the headers of outgoing packets that is “responded” to in the headers of incoming packets by the mechanics of TCP itself. Because spoofed traffic doesn’t have a legitimate sending host behind it to “hear” the challenge it (probably) does not contain a valid response and the connection is swiftly discarded.

Syncookies may not be enabled by default on your kernel. Once enabled they only kick in when the SYN backlog overflows, so the value in tcp_max_syn_backlog still matters and it won’t hurt you to increase it anyway:

# echo 1 > /proc/sys/net/ipv4/tcp_syncookies

Most distributions include a “local” script that runs at the end of init; yours may have one specifically for the firewall. On Gentoo I put these rules in /etc/conf.d/local.start and on ClearOS in /etc/rc.d/rc.firewall.local. Note that since NAT doesn’t handle the connections themselves and only passes them through, simply turning on syncookies on your firewall will not protect everything behind it.
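For the local script itself, the four settings from this article can be wrapped in one function. This is only a sketch: apply_syn_settings is a made-up name, the values are the ones used above, and the optional directory argument exists so you can point it at a scratch directory for a dry run instead of the live /proc tree.

```shell
#!/bin/bash
# Apply the SYN flood settings discussed above in one go.
# Hypothetical helper; the target directory defaults to the live /proc tree.
apply_syn_settings() {
    local root="${1:-/proc/sys/net/ipv4}"
    echo 3096 > "$root/tcp_max_syn_backlog" &&   # room for more half-open connections
    echo 2    > "$root/tcp_syn_retries" &&       # fewer outbound SYN retries
    echo 1    > "$root/tcp_synack_retries" &&    # drop unanswered SYN-ACKs sooner
    echo 1    > "$root/tcp_syncookies"           # turn syncookies on
}
```

Paste the function into your local script and call `apply_syn_settings` with no arguments as root. It stops at the first write that fails, so a non-zero exit status means not every setting was applied.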

If you want to centralize or introduce a degree of separation between your SYN flood protection and regular servers you can use proxies. Squid and Apache both work in reverse, and SOCKS proxies may work as well (don’t quote me on that).

I was caught with my pants down once: I hadn’t enabled syncookies on just one VM and it got SYN flooded (Murphy’s law, of course). That’s a mistake you only make once. It underscored for me the importance of following some sort of thorough lockdown procedure before you deploy a new machine. That will be the subject of an upcoming article where I will attempt to compile a definitive checklist.

If you are running a virtualized environment or have the space for enough servers, the easiest way to mitigate the harm a resource starvation attack can do to the continuity of your operations is to compartmentalize and space services out as much as possible. If you have a web server and a DNS server, each 1-1 NATted to a public address, and an attacker hits you on port 80, only the web server is going to lock up. Your DNS, and therefore mail and so on, should continue to operate, until of course they figure it out. If you have to run DNS and mail and web and RADIUS, try to run them on different servers rather than one despite the overhead; when one plans a public-facing network one should think less in terms of bare economics and more in terms of capacity to absorb attack.

foxpa.ws
Made in Canada  •  There's a fox in the Gibson!  •  2010-12