Simulating WAN network delay

Motivation

When testing the version of AutoFS from RHEL-3, Update 4 in a global WAN environment, I discovered an interesting bug. When logged into a client machine in London, every now and then attempts to access a mount from an NFS server in Boston would fail, perhaps 25% of the time. The server was clearly online, since mounts from other clients worked fine, and there were no obvious errors in the logs of either the client or the server. After a little more testing I discovered that mounts between London and LA would fail 100% reliably [sic]. This led me to believe that some part of the AutoFS mount process was sensitive to network round trip time. Sure enough, there was a piece of code which checked for 'liveness' of the server by sending an RPC ping with a fixed timeout of 100ms. Well, on my particular WAN, ping times between London and Boston were 100ms +/- 5ms - no wonder it failed seemingly at random. Once identified, the bug itself was trivially fixable by letting the code fall back to a longer timeout if no server being tested had replied within the short timeout. I then got to thinking about how you might build a system under which AutoFS could be reliably tested in a lab, without requiring co-location of part of the system half-way around the globe.
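
To make the fix concrete, here is a minimal Python sketch of the fallback idea. It is not the AutoFS code: the function name first_responder, the 1000ms fallback value and the use of a list of simulated round trip times are all assumptions made purely for illustration.

```python
SHORT_TIMEOUT_MS = 100    # the fixed timeout described in the text
LONG_TIMEOUT_MS = 1000    # hypothetical fallback value, not taken from the real fix


def first_responder(rtts_ms, short=SHORT_TIMEOUT_MS, long=LONG_TIMEOUT_MS):
    """Return the index of the server to use, or None if none replied.

    rtts_ms holds one simulated round trip time (in ms) per candidate server.
    """
    for timeout in (short, long):
        # Servers whose reply would have arrived within this timeout
        replied = [i for i, rtt in enumerate(rtts_ms) if rtt <= timeout]
        if replied:
            # Prefer the fastest server that answered in time
            return min(replied, key=lambda i: rtts_ms[i])
    # Nobody answered, even with the longer timeout
    return None
```

With only the short timeout, a server answering in 103ms is treated as dead; with the fallback it is still found, while a nearby server that answers within 100ms is preferred as before.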

Planned solution

The obvious solution is to figure out a way to introduce an arbitrary network packet delay between two hosts on the same subnet. I considered two possibilities:

Set up a virtual interface on one of the hosts using the tun driver, and have the user space daemon behind it queue up packets for X ms before sending them out. A few routing table entries and iptables rules could then be used to redirect traffic on eth0 via the fake tun0 interface.
Use the iptables QUEUE target to intercept traffic on the actual network interface, redirecting packets to a userspace program which delays them.

In the end I chose the second one, since it seemed likely to require less work to implement and set up.

Implementation

CPAN has a Perl module, IPQueue.pm, which is a front end to the libipq.so library that is part of the iptables codebase. With this, the Perl userspace daemon becomes obscenely easy to write. In pseudo-code:

forever
  foreach queued packet
    if queue time > requested delay
      accept packet

  wait for an incoming packet

  add packet to queue

The actual code is in the script delay-net.pl.
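
For readers who do not have the script to hand, the queueing logic from the pseudo-code can be sketched in a few lines of Python. This is not a port of delay-net.pl and does not talk to libipq: the class name DelayQueue, its method names and the injectable clock are all hypothetical, and packets are represented by arbitrary Python objects.

```python
import time
from collections import deque


class DelayQueue:
    """Hold packets for at least delay_ms before releasing them."""

    def __init__(self, delay_ms, clock=time.monotonic):
        self.delay = delay_ms / 1000.0   # requested delay in seconds
        self.clock = clock               # injectable so the logic can be tested
        self.pending = deque()           # (arrival_time, packet), oldest first

    def add(self, packet):
        """Record a newly intercepted packet together with its arrival time."""
        self.pending.append((self.clock(), packet))

    def release_due(self):
        """Return the packets that have waited at least the requested delay."""
        now = self.clock()
        due = []
        while self.pending and now - self.pending[0][0] >= self.delay:
            due.append(self.pending.popleft()[1])
        return due

    def next_wait(self):
        """Seconds until the oldest packet is due, or None if nothing is queued."""
        if not self.pending:
            return None
        return max(0.0, self.pending[0][0] + self.delay - self.clock())
```

One design point the pseudo-code glosses over: if the daemon blocks indefinitely waiting for the next packet, anything already queued is only released when more traffic arrives. In this sketch next_wait() is meant to be used as the timeout on that blocking wait, which is my assumption rather than something the original script is known to do.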

Running it

The first task is to load up the necessary iptables kernel modules:

modprobe iptable_filter
modprobe ip_queue
modprobe ipt_ttl

Now start the daemon - it's important to do this before adding the iptables rules to QUEUE traffic, because packets sent to QUEUE are dropped when no userspace program is listening, so otherwise you'll potentially lock yourself out of the machine! It takes the delay in milliseconds as its only command line argument, so let's delay for 300 milliseconds:

./delay-net.pl 300

Then add a rule to redirect incoming traffic from either the entire network, or better, from a particular host.

iptables -A INPUT --source 192.168.16.4 -j QUEUE

If you now run 'ping' from the host mentioned in the iptables rule, you should see a nice (approximate) 300ms delay.

Downloads

For convenience here are downloads of the various packages required to run on a RHEL-3 system

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
