
Remote Controlled Netfilter with ClearOS API


In my last post I shared a scriptlet that could be used to remotely block access to your network with apache, sudo and iptables. That script suffers from a major flaw: appended iptables rules are read last, so anywhere a universal ACCEPT rule precedes the script's additions they are ignored. Another major drawback is that the rules disappear if the firewall is reloaded, the host is rebooted and so on. Fortunately ClearOS has an easy-to-use API that lets you directly manipulate its firewall properties the same way webconfig does. This script doesn't require sudo rules or apache to be running. It DOES require ClearOS. To install it, place the script in /var/webconfig/htdocs and chown it to webconfig.

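The SSL certificate webconfig provides will probably cause problems, so call the script like this if you use wget, substituting real values for $name and $ip (a shell will not expand variables inside single quotes):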

wget -O - 'https://192.168.8.1:81/rcleartables.php?action=deny&name=$name&ip=$ip' --no-check-certificate >/dev/null 2>&1

Note that this script requires a name variable. It should be a unique identifier containing only letters and numbers (no spaces). I keep the name and other data associated with the blocks on the client end of things, so the blocks can be removed by a button that executes the script with action=remove, and stale blocks can be cleaned up by way of recorded timestamps. How you choose to extend the functionality is up to you.
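The client-side bookkeeping described above could look something like the following. This is only a sketch under assumptions: the ledger file, its "name ip epoch" line format and the function names are all hypothetical. Only the URL and the action=remove parameter come from the script itself.

```shell
#!/bin/sh
# Hypothetical client-side ledger: one "name ip epoch" record per line.
LEDGER="${LEDGER:-/tmp/rcleartables.ledger}"
FW_URL="${FW_URL:-https://192.168.8.1:81/rcleartables.php}"

# Append a record for a block that was just requested.
record_block() {  # usage: record_block NAME IP [EPOCH]
    printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}" >> "$LEDGER"
}

# Print the IPs of blocks recorded more than MAX_AGE seconds ago.
stale_ips() {  # usage: stale_ips MAX_AGE [NOW]
    awk -v max="$1" -v now="${2:-$(date +%s)}" \
        'now - $3 > max { print $2 }' "$LEDGER"
}

# Print the URL that removes the block for one IP.
remove_url() {  # usage: remove_url IP
    printf '%s?action=remove&ip=%s\n' "$FW_URL" "$1"
}
```

A cron job could then loop over the output of stale_ips 86400 and fetch each remove_url with wget, using --no-check-certificate as above.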

<?php
/*
           # Remote Controlled iptables ClearOS API
           # June 2010 http://foxpa.ws
           # WTFPL v.2 http://foxpa.ws/wtfpl/

/// DOCUMENTATION

DANGER: Improperly configured, this script could be used by an attacker to
        block legitimate traffic.

This script adds or removes a name/IP pair passed to it through the GET
variables "ip", "name" and "action" to or from the ClearOS Incoming Block
firewall ruleset. Valid action values are deny, and remove.
The script will exit with a 0 on error or a 1 upon successful execution.
Place the script in /var/webconfig/htdocs and chown it to webconfig.

To block an IP, one would GET request it thus:
https://address:81/rcleartables.php?action=deny&ip=222.222.222.222&name=
On a successful block you would receive HTTP headers and a single 1 in the
body, or a 0 if the block was unsuccessful.

$whitelist is an array of IP addresses that should never be blocked
$allowed_clients should be an array of IP addresses allowed to have access to
this script. leave it blank to allow any host (not recommended).
$shared_secret is an optional key that can be passed to the script as an MD5
hash via GET var "key" to authenticate your application. Blank to disable.
$log_path should be the path to the specific file you would like to log actions
to. Blank to disable logging. Remember to update your log rotater's config.
*/

// CONFIGURATION
$whitelist = array();
$allowed_clients = array('');
$shared_secret = '';
$log_path = '/var/log/riptables.log';

// FUNCTIONS
function log_action($line)
{
	global $log_path, $remote_addr;
	if(!empty($log_path))
	{
		$fh = fopen($log_path, 'a');
		$date = date("Y-m-d H:i:s", time());
		fwrite($fh, "$date $remote_addr - $line\n");
		fclose($fh);
	}
}

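// SANITY CHECKING
$remote_addr = $_SERVER['REMOTE_ADDR']; // set before any log_action() call so the client is logged
if(empty($_GET['ip']))
{
	log_action("IP not specified");
	die('0');
}
if(isset($_GET['action']) and $_GET['action'] == 'deny' and empty($_GET['name']))
{
	log_action('Rule name not specified');
	die('0');
}
$ip = $_GET['ip'];
$name = isset($_GET['name']) ? $_GET['name'] : ''; // name is optional for action=remove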
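$octets = explode('.', $ip);
if(count($octets) != 4) // a dotted quad has exactly four octets
{
	log_action("Invalid IP Address $ip");
	die('0');
}
foreach($octets as $octet)
{
	if(!ctype_digit($octet) or $octet > 255) // digits only, 0 through 255
	{
		log_action("Invalid IP Address $ip");
		die('0');
	}
}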
$ip = escapeshellcmd($ip);
if(!empty($shared_secret) and (!isset($_GET['key']) or $_GET['key'] != md5($shared_secret)))
{
	log_action("DANGER Invalid shared secret. Remember to hash your key variable with MD5.");
	die('0');
}
if(!empty($allowed_clients[0]))
{
	$valid = false;
	foreach($allowed_clients as $allowed)
	{
		if($allowed == $remote_addr) // comparison, not assignment
			$valid = true;
	}
	if(!$valid)
	{
		log_action("DANGER Client is not in \$allowed_clients array. This could be a sign of exposure.");
		die('0');
	}
}

// THE BRAINS
require_once("/var/webconfig/api/FirewallIncoming.class.php");
$fw = new FirewallIncoming();

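if($_GET['action'] == 'deny')
{
	if(in_array($ip, $whitelist)) // never block whitelisted addresses
	{
		log_action("Refused to block whitelisted address $ip");
		die('0');
	}
	$fw->AddBlockHost($name, $ip);
	$fw->Restart();
	log_action("$ip was blocked");
	print('1');
}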
elseif($_GET['action'] == 'remove')
{
	$fw->DeleteBlockHost($ip);
	$fw->Restart();
	log_action("$ip was removed");
	print('1');
}
else
{
	log_action('Invalid action parameter.');
	die('0');
}

?>

Remote Controlled Netfilter with httpd, iptables and sudo


For a more secure and robust method of executing commands remotely, please see Part Six of my Mass Virtual Hosting series.

Is your web server behind a Linux firewall? Have you ever wanted to quickly make a ban at the firewall level from within your site? This extremely simple script will help you accomplish just that.

ClearOS users please see this article instead.

Requirements:

  • A Linux router/firewall
  • Netfilter and iptables
  • An httpd capable of running PHP
  • PHP
  • Probably sudo

Selecting and configuring your httpd goes beyond the scope of this article. I recommend lighttpd but apache works just as well for a slightly higher memory footprint. For obvious reasons, make sure the web server is only reachable from the private network.

It's important to point out here that we're talking about exposing a service with root access to iptables on your firewall; this is not without great risk and you must take every precaution to secure and restrict access to the web daemon.

Once you have your httpd configured and running, install riptables.php somewhere in the default host's document root and load it in your web browser. If you get a '0' in the page body the script is running properly. In most cases you will need to add a line to sudoers that grants your httpd access to your iptables binary. Open sudoers thus:

# visudo

and add this line, replacing apache with the account your httpd runs under and /sbin/iptables with the full path to your iptables binary (some systems, including ClearOS, replace /sbin/iptables with a shell script; use /sbin/iptables-bin instead):

apache ALL=(root) NOPASSWD: /sbin/iptables

Save sudoers and open riptables.php. Adjust the configuration variables to reflect your environment. Depending on your system you may need to seed the logfile to give your httpd write access:

# touch /var/log/riptables.log
# chown httpd: /var/log/riptables.log

where httpd is the account your web daemon runs under. Save the script and do a test run; from a root shell on your router type:

# iptables-save | grep "222.222.222.222"

You should see no results. Load the following URL into your browser, replacing the appropriate parts such as IP address:

http://192.168.0.1/riptables.php?action=deny&ip=222.222.222.222

If your browser loaded a '1' then the block was added successfully. Run the iptables-save line again. You should see:

-A INPUT -s 222.222.222.222 -j DROP
-A FORWARD -s 222.222.222.222 -j DROP
-A OUTPUT -d 222.222.222.222 -j DROP

If so everything is in working order. If not, check your log. Remove the block thus:

http://192.168.0.1/riptables.php?action=remove&ip=222.222.222.222

An additional mechanism for authentication, which comes in handy if the IP(s) you have granted access run multiple web apps, is the shared secret. Put a passphrase into the $shared_secret variable, then when you call the script from your webapp append the key variable containing the MD5 hash of the secret.
The script can be called from within your web application by crafting a GET request or as simply as running wget:

exec("wget -O - 'http://$fw_address/riptables.php?action=$action&ip=$ip' >/dev/null 2>&1");
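If you set $shared_secret, the key parameter has to carry the MD5 hash of the passphrase. Here is a small shell sketch of building such a URL, assuming md5sum from coreutils is available. The function name riptables_url is made up for illustration and is not part of riptables.php.

```shell
#!/bin/sh
# Build a riptables.php URL, appending key=<md5 of secret> when a secret is given.
riptables_url() {  # usage: riptables_url FW_ADDRESS ACTION IP [SECRET]
    url="http://$1/riptables.php?action=$2&ip=$3"
    if [ -n "${4:-}" ]; then
        # printf '%s' avoids hashing a trailing newline, matching PHP's md5()
        key=$(printf '%s' "$4" | md5sum | cut -d' ' -f1)
        url="$url&key=$key"
    fi
    printf '%s\n' "$url"
}
```

The result can be handed straight to wget, for example: wget -O - "$(riptables_url 192.168.0.1 deny 222.222.222.222 mypassphrase)" >/dev/null 2>&1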

Download riptables.php here.

Note that rules in a chain are evaluated in order. If an ACCEPT rule for 0.0.0.0/0 sits before the rules this script adds, it must be removed or the script's rules will never be reached.
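One way around the ordering problem, instead of removing the earlier ACCEPT rule, is to insert the block at the head of each chain with -I rather than appending it with -A. This is not what riptables.php does; it is a sketch of an alternative. The IPTABLES variable and the function name block_first are hypothetical, and by default this is a dry run that only prints the commands.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set IPTABLES=/sbin/iptables (as root) to apply them for real.
IPTABLES="${IPTABLES:-echo iptables}"

# Insert DROP rules at position 1 so they are evaluated before any ACCEPT.
block_first() {  # usage: block_first IP
    $IPTABLES -I INPUT 1 -s "$1" -j DROP
    $IPTABLES -I FORWARD 1 -s "$1" -j DROP
    $IPTABLES -I OUTPUT 1 -d "$1" -j DROP
}
```

Running iptables -L INPUT -n --line-numbers afterwards shows where each rule sits in the chain.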

<?php
/*
           # Remote Controlled iptables
           # June 2010 http://foxpa.ws/
           # WTFPL v.2 http://foxpa.ws/wtfpl/

/// DOCUMENTATION

DANGER: This script requires sudo and an httpd in most environments.
        If you don't know why that's dangerous, you don't want to use this
        script.

DANGER: Improperly configured, this script could be used by an attacker to
        block legitimate traffic.

This script blocks, allows or removes an IP as passed to it through the GET
variables "ip" and "action." Valid action values are deny, accept, remove.
The script will exit with a 0 on any error or a 1 on a successful execution.

To block an IP, one would GET request it thus:
   http://server-address/riptables.php?action=deny&ip=222.222.222.222
On a successful block you would receive HTTP headers and a single 1 in the
body, or a 0 if the block was unsuccessful.

To grant httpd access to iptables you may need to edit sudoers to reflect:
apache ALL=(root) NOPASSWD: /sbin/iptables
where apache is whatever account your web daemon of choice runs under.

$whitelist is an array of IP addresses that should never be blocked
$allowed_clients should be an array of IP addresses allowed to have access to
this script. leave it blank to allow any host (not recommended).
$shared_secret is an optional key that can be passed to the script as an MD5
hash via GET var "key" to authenticate your application. Blank to disable.
$sudo_path should be the complete path to your sudo binary. Leave blank
if you do not require sudo.
$iptables_path should be set to reflect your system's configuration. Typical
locations include /sbin and /usr/sbin. Be sure to leave off the trailing slash.
$iptables_bin should be the name of your iptables binary, usually just iptables
except on systems where it has been replaced with a shell script.
$log_path should be the path to the specific file you would like to log actions
to. Blank to disable logging. Remember to update your log rotater's config.
*/

// CONFIGURATION
$whitelist = array();
$allowed_clients = array();
$shared_secret = '';
$sudo_path = '/usr/bin/sudo';
$iptables_path = '/sbin';
$iptables_bin = 'iptables';
$log_path = '/var/log/riptables.log';

// FUNCTIONS
function log_action($line)
{
	global $log_path, $remote_addr;
	if(!empty($log_path))
	{
		$fh = fopen($log_path, 'a');
		$date = date("Y-m-d H:i:s", time());
		fwrite($fh, "$date $remote_addr - $line\n");
		fclose($fh);
	}
}

function ip_deny($ip)	// This function drops all packets from an IP
{
	global $sudo_path, $iptables_path, $iptables_bin;
	$string1 = "$sudo_path $iptables_path/$iptables_bin -A INPUT -s $ip -j DROP";
	$string2 = "$sudo_path $iptables_path/$iptables_bin -A FORWARD -s $ip -j DROP";
	$string3 = "$sudo_path $iptables_path/$iptables_bin -A OUTPUT -d $ip -j DROP";
	exec($string1, $output1, $return1);
	exec($string2, $output2, $return2);
	exec($string3, $output3, $return3);
	if($return1 != 0 or $return2 != 0 or $return3 != 0) // Check for non-zero exit status
	{
		log_action("Attempted to block $ip but failed. Error:\nString1: $string1\nOutput1: {$output1[0]}\nString2: $string2\nOutput2: {$output2[0]}\nString3: $string3\nOutput3: {$output3[0]}\n");
		return 0;
	}
	else
	{
		log_action("$ip was blocked.");
		return 1;
	}
}

function ip_accept($ip) // This function explicitly allows an address, useful where DROP is default
{
	global $sudo_path, $iptables_path, $iptables_bin;
	$string1 = "$sudo_path $iptables_path/$iptables_bin -A INPUT -s $ip -j ACCEPT";
	$string2 = "$sudo_path $iptables_path/$iptables_bin -A FORWARD -s $ip -j ACCEPT";
	$string3 = "$sudo_path $iptables_path/$iptables_bin -A OUTPUT -d $ip -j ACCEPT";
	exec($string1, $output1, $return1);
	exec($string2, $output2, $return2);
	exec($string3, $output3, $return3);
	if($return1 != 0 or $return2 != 0 or $return3 != 0) // Check for non-zero exit status
	{
		log_action("Attempted to accept $ip but failed. Error:\nString1: $string1\nOutput1: {$output1[0]}\nString2: $string2\nOutput2: {$output2[0]}\nString3: $string3\nOutput3: {$output3[0]}\n");
		return 0;
	}
	else
	{
		log_action("$ip was accepted.");
		return 1;
	}
}

function ip_remove($ip) // This function indelicately removes a block or explicit accept from an IP
{
	global $sudo_path, $iptables_path, $iptables_bin;
	exec("$sudo_path $iptables_path/$iptables_bin -D INPUT -s $ip -j DROP");
	exec("$sudo_path $iptables_path/$iptables_bin -D FORWARD -s $ip -j DROP");
	exec("$sudo_path $iptables_path/$iptables_bin -D OUTPUT -d $ip -j DROP");
	exec("$sudo_path $iptables_path/$iptables_bin -D INPUT -s $ip -j ACCEPT");
	exec("$sudo_path $iptables_path/$iptables_bin -D FORWARD -s $ip -j ACCEPT");
	exec("$sudo_path $iptables_path/$iptables_bin -D OUTPUT -d $ip -j ACCEPT");
	log_action("$ip was removed.");
	return 1;
}

// SANITY CHECKING
if(!empty($_GET['ip']))
	$ip = $_GET['ip'];
else
	die('0');
$remote_addr = $_SERVER['REMOTE_ADDR'];
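$octets = explode('.', $ip);
if(count($octets) != 4) // a dotted quad has exactly four octets
{
	log_action("Invalid IP Address $ip");
	die('0');
}
foreach($octets as $octet)
{
	if(!ctype_digit($octet) or $octet > 255) // digits only, 0 through 255
	{
		log_action("Invalid IP Address $ip");
		die('0');
	}
}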
$ip = escapeshellcmd($ip);
if(!empty($shared_secret) and (!isset($_GET['key']) or $_GET['key'] != md5($shared_secret)))
{
	log_action("DANGER Invalid shared secret. Remember to hash your key variable with MD5.");
	die('0');
}
if(!empty($allowed_clients[0]))
{
	$valid = false;
	foreach($allowed_clients as $allowed)
	{
		if($allowed == $remote_addr) // comparison, not assignment
			$valid = true;
	}
	if(!$valid)
	{
		log_action("DANGER Client is not in \$allowed_clients array. This could be a sign of exposure.");
		die('0');
	}
}

// THE BRAINS
if($_GET['action'] == 'deny')
{
	foreach($whitelist as $whiteip)
	{
		if($ip == $whiteip)
			die('0');
	}
	ip_remove($ip);
	print(ip_deny($ip));
}
elseif($_GET['action'] == 'accept')
{
	ip_remove($ip);
	print(ip_accept($ip));
}
elseif($_GET['action'] == 'remove')
{
	ip_remove($ip);
	print('1');
}
else
{
	log_action('Invalid action parameter.');
	die('0');
}

?>

Geofence with iptables: Blocking Countries at the Firewall


In some situations one may find it useful to block entire countries or restrict access to only one or a few. This is a technique known as geofencing, and if you've ever tried to watch a video only to be told that it's not available in your region you have been the victim of it. Geofencing, like geolocation, is possible because blocks of IP address space are handed out to specific countries, and additional details such as the province or city of the address holder may be obtained through reverse-whois. Data collected below the country level can be unreliable; often the location of a head office for a national ISP will appear to be the source of all of its users.

ahorli on the Clear forums just posted their geofencing solution for ClearOS at http://www.clearfoundation.com/component/option,com_kunena/Itemid,232/catid,7/func,view/id,10382/. It is intended to block specific countries that tend to produce a high volume of spam and automated attacks (in this case, Russia and China). I thought it would be neat to reverse the script so I could block every country except a specific one or two. Obviously this kind of tactic isn't going to stop someone who really wants into your box from outside the geofence - there's everything from proxies to VPNs to exploit. My interest here is in reducing automated attacks to those originating in the motherland, because that's the only place I expect to be connecting to our hypothetical server.

Download this script and put it somewhere appropriate; I would suggest /sbin or /usr/sbin. In order to work, it requires that your default INPUT policy is DROP or REJECT. As mentioned above, geofencing is more art than science: when I ran this script my own subnet was not unblocked. I strongly recommend including your headquarters in the ALLOWSUBNET variable or you may find yourself one day without access. As you can see, MAXZONEAGE is set to 6, so if we pop this in cron.weekly it should refresh its fence list every week. You should also add the script to your firewall or local init scripts; on ClearOS use /etc/rc.d/rc.firewall.local.
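The linked script is not reproduced here, but the core of an allow-only geofence can be sketched roughly as follows. Everything in it is an assumption for illustration: the zone file is taken to be a local list with one CIDR block per line, and the names allow_zone and IPTABLES are hypothetical. It prints the commands instead of running them unless IPTABLES is pointed at the real binary.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
IPTABLES="${IPTABLES:-echo iptables}"

# Append an ACCEPT rule for every CIDR block listed in a zone file.
# With a default INPUT policy of DROP, anything not listed stays blocked.
allow_zone() {  # usage: allow_zone ZONEFILE
    while read -r cidr; do
        case "$cidr" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        $IPTABLES -A INPUT -s "$cidr" -j ACCEPT
    done < "$1"
}
```

In the spirit of ALLOWSUBNET, it would be prudent to accept your own subnet before loading any zone file so a bad or stale list cannot lock you out.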

Blocking ICMP Echo Requests (Pings) to your Linux Firewall with iptables


It is generally considered poor form and a violation of some arcane RFC for a host to ignore ICMP echo requests (common "pings"), and turning them off does not afford you any additional "security" per se. That being said, there are a number of very good reasons you might want to ignore pings in the wild. Because of the time it takes to accurately port scan a host, bulk scanning operations generally ping a host first to determine if it is worth spending the time and resources needed to scan the address. If your host is configured to drop pings you instantly take yourself off the radar of such robots, sparing your resources for, say, combating directed attacks rather than the automated attacks that follow such scans.

If you're dealing with a single host it isn't necessary to specify the IP or interface but on a firewall you probably want to be able to ping its internal interface from the internal network. We're going to assume that eth0 represents the external interface:

# iptables -A INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j DROP

To specify an IP or subnet use the -s flag in place of -i. The --icmp-type 8 flag specifies that only ICMP echo requests are to be blocked; we want to leave type 0 replies alone so hosts behind and including the firewall can ping and receive responses from hosts beyond the router/firewall.

You may have existing rules that accept pings; you must delete these. For example:

# iptables-save | grep icmp
-A INPUT -i eth0 -p icmp -m icmp --icmp-type 0 -j ACCEPT
-A INPUT -i eth0 -p icmp -m icmp --icmp-type 3 -j ACCEPT
-A INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j ACCEPT
-A INPUT -i eth0 -p icmp -m icmp --icmp-type 11 -j ACCEPT
-A INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j DROP

You can see our rule at the bottom. The third rule from the top conflicts with this so let's remove it:

# iptables -D INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j ACCEPT

As you can see, it's as simple as switching the add (-A) flag to delete (-D) and now our rule works. To automate this process you should add these lines to your firewall startup script or your "local" init script where available.

To save these rules on Gentoo make sure you have the iptables init script in the default runlevel and run:

# /etc/init.d/iptables save

if there is no conflicting firewall script that adds an ACCEPT rule for ICMP requests. Otherwise you may wish to use /etc/conf.d/local.start.

ClearOS users should add something like this to /etc/rc.d/rc.firewall.local:

/sbin/iptables -D INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j ACCEPT
/sbin/iptables -A INPUT -i eth0 -p icmp -m icmp --icmp-type 8 -j DROP
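For a startup script it can help to wrap the two commands so they tolerate a missing ACCEPT rule and do not pile up duplicate DROP rules on every reload. The following is a sketch only: the function name drop_pings and the IPTABLES variable are hypothetical, and by default it prints the commands rather than running them.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
IPTABLES="${IPTABLES:-echo iptables}"

drop_pings() {  # usage: drop_pings EXTERNAL_INTERFACE
    # Remove a conflicting ACCEPT rule; ignore the error if there is none.
    $IPTABLES -D INPUT -i "$1" -p icmp -m icmp --icmp-type 8 -j ACCEPT 2>/dev/null || true
    # Remove our own DROP rule from a previous run so it is not duplicated.
    $IPTABLES -D INPUT -i "$1" -p icmp -m icmp --icmp-type 8 -j DROP 2>/dev/null || true
    # Drop echo requests arriving on the external interface.
    $IPTABLES -A INPUT -i "$1" -p icmp -m icmp --icmp-type 8 -j DROP
}
```

With IPTABLES=/sbin/iptables set, calling drop_pings eth0 performs the same delete and add steps shown above.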

Gentoo Xen 4 Migration


I was stoked to find out Xen 4 had finally made it into portage a couple of weeks ago. The improvements are so sweet I had been checking gentoo-portage.com every few days or so. Let's take a look at a few of the advances Xen has made since 3.x:

  • Better performance and scalability: 128 vcpus per guest, 1 TB of RAM per host, 128 physical CPUs per host (as a default, can be compile-time increased to lots more).
  • Blktap2 for VHD image support, including high-performance snapshots and cloning.
  • Improved IOMMU PCI passthru using hardware accelerated IO virtualization techniques (Intel VT-d and AMD IOMMU).
  • VGA primary graphics card passthru support to an HVM guest for high performance graphics using direct access to the graphics card GPU from the guest OS.
  • TMEM allows improved utilization of unused (for example page cache) PV guest memory. more information: http://oss.oracle.com/projects/tmem/
  • Memory page sharing and page-to-disc for HVM guests: Copy-on-Write sharing of identical memory pages between VMs. This is an initial implementation and will be improved in upcoming releases.
  • New Linux pvops dom0 kernel 2.6.31.x as a default, 2.6.32.x also available. You can also use linux-2.6.18-xen dom0 kernel with Xen 4.0 hypervisor if you want.
  • Netchannel2 for improved networking acceleration features and performance, smart NICs, multi-queue support and SR-IOV functionality.
  • Online resize of guest disks without reboot/shutdown.
  • Remus Fault Tolerance: Live transactional synchronization of VM state between physical servers. run guests synchronized on multiple hosts simultaneously for preventing downtime from hardware failures.
  • RAS features: physical cpu/memory hotplug.
  • Libxenlight (libxl): a new C library providing higher-level control of Xen that can be shared between various Xen management toolstacks.
  • PV-USB: Paravirtual high-performance USB passthru to both PV and HVM guests, supporting USB 2.0 devices.
  • gdbsx: debugger to debug ELF guests
  • Support for Citrix WHQL-certified Windows PV drivers, included in XCP (Xen Cloud Platform).
  • Pygrub improvements: Support for PV guests using GRUB2, Support for guest /boot on ext4 filesystem, Support for bzip2- and lzma-compressed bzImage kernels

What tickles me the most is the Remus Fault Tolerance. It basically lets you run a standby instance of a VM on a different physical server and constantly updates that VM with the master's status, I/O etc. If the master VM dies, the standby kicks in so fast there may be no perceivable downtime. I've been dying for something that provides solid HA that's well supported and works out of the box - not to mention does its job transparently for years. Now that it's a core feature of Xen, competing technologies will be compelled to introduce their own easy-to-use HA solutions, which I hope could usher in a golden age of reliability.

The original intent was to migrate 9 physical 32-bit servers the week it came out, most of them running kernel 2.6.21 on Xen 3.2.1. This was not to be. I was determined to make the new 2.6.32 kernel work (I had heard that .32 was going to be the new .18 in terms of adoption/support) and the thing just won't work with megaraid. I haven't tried it with cciss yet and I don't intend to; for the time being I have downgraded all of the dom0 kernels to 2.6.18. Things seem to be very stable now and much faster.

PAE, or physical address extension, allows 32-bit processors to address up to 64GB of memory. When I first started working with Xen I had no idea that I had omitted PAE from my hypervisor build (it is not a default USE flag) nor that every shrinkwrapped Xen kernel out there required it. To make matters worse, the 2.6.21 dom0 kernel I was using on all of the servers for the sake of consistency lacked the ability to enable PAE at all - even by manually editing the .config, something I still haven't figured out. That kernel was eventually hard masked, then removed from portage. This situation cost me a lot of time because every new image I wanted to import required special preparation to work with my "foreign" domU kernel and without pleasantries like initramfs and pygrub.

I'm going to start off by showing you the make.conf that will be used in this article:

CFLAGS="-O2 -march=pentium4 -pipe -fomit-frame-pointer -mno-tls-direct-seg-refs -fstack-protector-all"
#CFLAGS="-O2 -march=pentium4 -pipe -fomit-frame-pointer -mno-tls-direct-seg-refs"
CXXFLAGS="${CFLAGS}"
CHOST="i686-pc-linux-gnu"
MAKEOPTS="-j4"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://distro.ibiblio.org/pub/linux/distributions/gentoo/ http://www.gtlib.gatech.edu/pub/gentoo "
SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage"
USE="-alsa cracklib curlwrappers -gnome -gtk -kde -X -qt sse png snmp cgi usb cli berkdb bzip2 crypt curl ftp ncurses snmp xml zip zlib sse2 offensive geoip nptl nptlonly acm flask xsm pae pygrub xen"
FEATURES="parallel-fetch -collision-protect"
LINGUAS="en_CA en"

If you're upgrading from an existing Xen installation you may want to disable collision protection; past experience with 3.x upgrades has been that sometimes portage will see every Xen-related file in /boot as a potential conflict. Note the second CFLAGS line that's commented out. Some packages don't compile well (or at all) with Stack Smashing Protection (particularly glibc), so I update them individually with the second CFLAGS enabled before any sort of emerge world or deep update. SSP is enabled by default if you are using the hardened profile. I choose not to use the hardened profile because it can be needlessly problematic/inflexible and in this application the term "hardened" is misleading. If your dom0 isn't going to be exposed to the Internet (and it certainly should not be) it might be safe to omit -fstack-protector-all altogether, but there is no such thing as paranoia. Also bear in mind that an attacker gaining access to a dom0 can be more devastating than an attacker gaining access to all of the VMs running on it individually.

Depending on the circumstances at the time you read this, one may or may not have to add the following to /etc/portage/package.keywords:

app-emulation/xen
app-emulation/xen-tools
sys-kernel/xen-sources

Sync portage and run an emerge --update --deep --newuse world --ask; if you see xen 4 in the package list you're on the right track. Compile away. You might be interested in this article on global updates with gentoo.

Once Xen has been upgraded it's time to build the new kernels. Follow the usual routine, making sure to enable xen backend drivers in the dom0 and frontend drivers in the domU. I like to make a monolithic domU kernel so there's no mess with installing or updating modules to the VMs. Make sure you have IP KVM/Console redirection if you're going to be booting this machine remotely and a non-xen fallback kernel configured in /boot/grub/grub.conf in case the hypervisor fails. Xen and some x86 BIOSes can be configured to use a serial console; a null modem to another server in the rack is often all you need.

I got all sorts of shit from the 2.6.32/34 kernels; for instance, the kernel won't build properly if you enable Export Xen attributes in sysfs (on by default in 2.6.32-xen-r1). I got this message at the end of make and was not particularly successful at tracking down solutions:

WARNING: vmlinux.o (__xen_guest): unexpected non-allocatable section. Did you forget to use "ax"/"aw" in a .S file? Note that for example <linux/init.h> contains section definitions for use in .S files.

I don't know what - if anything - needs the Xen /sys interface to work, so it's probably no big deal.

When compiling at least versions 2.6.32 and 2.6.34 with the Xen compatibility code set to 3.0.2, you can expect this error at the end of the kernel build:

  MODPOST vmlinux.o
WARNING: vmlinux.o (__xen_guest): unexpected non-allocatable section.
Did you forget to use "ax"/"aw" in a .S file?
Note that for example <linux/init.h> contains
section definitions for use in .S files.

  GEN     .version
  CHK     include/generated/compile.h
  UPD     include/generated/compile.h
  CC      init/version.o
  LD      init/built-in.o
  LD      .tmp_vmlinux1
ld: kernel image bigger than KERNEL_IMAGE_SIZE
ld: kernel image bigger than KERNEL_IMAGE_SIZE
make: *** [.tmp_vmlinux1] Error 1

This seems to be fixable by upping the lowest version to at least 3.0.4.

In all cases 2.6.29, 2.6.32 and the yet-un-portaged 2.6.34 kernels panicked on bootup if the megaraid driver was compiled in or made available in an initrd. After four days of dusk-until-dawn tinkering I got tired of fucking with it and decided to go with 2.6.18, which compiled and booted without a hitch.

Previously I had been doing all sorts of contorted things to the networking configuration but since I was dealing with a clean slate anyway I decided to set things up the Gentoo way. The Gentoo way of Xen networking is to abandon Xen networking. Suddenly life's great. In the set of four servers I migrated this week all of them have one physical interface on an external-facing VLAN and another interface on an internal VLAN as depicted in the diagram (left). I wanted to make it so I could take a router VM and move it from physical server to physical server as quickly as possible, and this is how I did it (thanks xming on the Gentoo forums):

  1. Edit /etc/conf.d/rc and change RC_PLUG_SERVICES to look like this: RC_PLUG_SERVICES="!net.*". This will prevent Gentoo's hotplug script from automatically starting your interfaces on bootup.
  2. Remove existing interfaces from default runlevel, i.e. rc-update del net.eth0 default
  3. Configure one bridge that connects to the external VLAN and one bridge that connects to the internal VLAN
    config_extbr0=("null")
    bridge_extbr0="eth0"

    config_xenbr0=("x.x.x.x/24")
    bridge_xenbr0="eth1"
    routes_xenbr0=("default via x.x.x.y")

  4. Create init scripts for the new bridges, i.e. cd /etc/init.d; ln -s net.lo net.extbr0; ln -s net.lo net.xenbr0
  5. Add the bridges to the default runlevel: rc-update add net.extbr0 default; rc-update add net.xenbr0 default
  6. Edit /etc/xen/xend-config.sxp and comment out (network-script network-bridge) and add (vif-script vif-bridge bridge=xenbr0)
  7. Edit VM configuration files, edit VIFs to connect to the appropriate bridge, i.e: vif = ['mac=00:16:3e:XX:XX:XX,bridge=xenbr0' ]
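Steps 4 and 5 are repetitive enough to script when you have several hosts to convert. The following is a rough sketch and not part of the original procedure as I ran it: the function name setup_bridges and the RUN variable are hypothetical, and by default it only prints the commands so you can review them first.

```shell
#!/bin/sh
# Dry run by default: prints each command. Set RUN="" (as root) to execute.
RUN="${RUN-echo}"

# Create a net.<bridge> init script symlink and add it to the default runlevel.
setup_bridges() {  # usage: setup_bridges BRIDGE...
    for br in "$@"; do
        $RUN ln -s net.lo "/etc/init.d/net.$br"
        $RUN rc-update add "net.$br" default
    done
}
```

For the layout above the call would be setup_bridges extbr0 xenbr0.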

I found that my Gentoo VMs needed this line added to their config in order to get any connection to xenconsole at all:

extra="xencons=tty console=tty1"

My ClearOS VMs, for the first time running the kernel that ships with them, needed a more dramatic approach. I added this line to their config files:

extra="xencons=ttyS"

and then in the VM's /etc/inittab I added this line to make it talk on what would be its serial port:

s0:12345:respawn:/sbin/mingetty ttyS0

I had some minor complaints from init about the dom0 kernel being too old for some udev feature so I added this line to /etc/portage/package.mask and rebuilt it:

>sys-fs/udev-124-r2