Debian - HFSC + IMQ (Manchester - 17.05.2007)

Author: Rafal Rajs (ElessaR) (email: elessar1@poczta.wp.pl)

Introduction

Hello all again,

Welcome to the next small version of my traffic shaping solution. It is finally time to update it for the current 2.6 Linux kernel. In this article I focus on a newer queuing discipline, HFSC. I decided to switch after a rather poor experience with HTB, especially where low latency matters, as with VoIP.

Like the previous one, this solution is based on IMQ. I have also added support for CONNLIMIT (limits the number of TCP connections per user, IP address, service, etc.) and IPP2P (a layer 7 filter for detecting P2P traffic).

Let's get to the point.

Why am I doing this

I repeat many points from my previous article (Polish only), because my requirements have not changed much since then.

I've created my solution mainly because:

1. most of the examples you can find on the Internet shape outgoing traffic on several network interfaces

This often makes the shaping rules very complex and forces us to shape LAN traffic as well. To exclude LAN traffic we have to write special exception rules, which is inconvenient. These solutions also shape outgoing and incoming traffic separately, which is not a good idea in certain environments.

2. lack of complete solutions based on IMQ

IMQ is a virtual device that can be used to shape incoming and outgoing traffic together, from one or many network interfaces. The examples that can be found are quite simple and usually describe the case where several network interfaces are used (for example, 2 connections to the Internet).

3. most solutions compile the kernel and iptables from sources, so their installation is not "registered" in the system

This point matters for maintenance. If our modified components are not registered in the OS, we cannot check which files belong to a component, where they reside in the filesystem or what version they have, and we cannot uninstall them cleanly.

I will present a solution that should solve the above problems:

1. a complete shaping solution for incoming and outgoing traffic on the external (Internet) interface only, based on HFSC and IMQ
2. Debian packages with modified versions of the 2.6 kernel and iptables

Before we move on to the implementation, a few preliminary steps need explaining.

Example network diagram

This article focuses on small home networks. Diagram of the example network can be found below:

The gateway has 2 network interfaces. In our case it acts as a firewall, NAT and traffic shaper. It can also serve e-mail, WWW and Samba (sharing files and folders with Windows machines).

Internet connection

To begin with, we have to work out what kind of network connection we have, what its download and upload rates are, and how many simultaneous connections it can survive (or how many simultaneous connections still leave the latency low).

We need to find a server in the ISP's network (FTP is perfect) to run our tests against. Download a big file and note the rate. Upload a big file to the server and note the rate. Then upload the same file again at 70-90% of the connection bandwidth and start a download at the same time. Observe the rates. Did they change? Repeat the tests several times to be completely sure.

The issue with UPLOAD traffic I have noticed lately is latency. Even if I use only 50-60% of the upload on my ADSL connection with a single SCP session, pinging one of my ISP's servers reveals a strange increase in latency for some percentage of the packets:

80% upload (300kbps)

Pinging fifteenth.www.demon.net [194.159.80.39] with 32 bytes of data:
Reply from 194.159.80.39: bytes=32 time=54ms TTL=250
Reply from 194.159.80.39: bytes=32 time=33ms TTL=250
Reply from 194.159.80.39: bytes=32 time=167ms TTL=250
Reply from 194.159.80.39: bytes=32 time=45ms TTL=250
Reply from 194.159.80.39: bytes=32 time=109ms TTL=250
Reply from 194.159.80.39: bytes=32 time=110ms TTL=250
Reply from 194.159.80.39: bytes=32 time=66ms TTL=250
Reply from 194.159.80.39: bytes=32 time=72ms TTL=250
Reply from 194.159.80.39: bytes=32 time=131ms TTL=250
Reply from 194.159.80.39: bytes=32 time=33ms TTL=250
Reply from 194.159.80.39: bytes=32 time=32ms TTL=250
Reply from 194.159.80.39: bytes=32 time=86ms TTL=250
Reply from 194.159.80.39: bytes=32 time=153ms TTL=250

50% upload (190kbps)

Pinging fifteenth.www.demon.net [194.159.80.39] with 32 bytes of data:
Reply from 194.159.80.39: bytes=32 time=31ms TTL=250
Reply from 194.159.80.39: bytes=32 time=35ms TTL=250
Reply from 194.159.80.39: bytes=32 time=34ms TTL=250
Reply from 194.159.80.39: bytes=32 time=35ms TTL=250
Reply from 194.159.80.39: bytes=32 time=35ms TTL=250
Reply from 194.159.80.39: bytes=32 time=123ms TTL=250
Reply from 194.159.80.39: bytes=32 time=36ms TTL=250
Reply from 194.159.80.39: bytes=32 time=324ms TTL=250
Reply from 194.159.80.39: bytes=32 time=181ms TTL=250
Reply from 194.159.80.39: bytes=32 time=43ms TTL=250
Reply from 194.159.80.39: bytes=32 time=34ms TTL=250

If anybody knows the reason for this behaviour, please let me know. My explanation is that ADSL connections are *really* optimized for download only.
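Comparing such ping logs by eye is tedious, so a small helper can summarise them. The sketch below is not part of the original scripts: the function name ping_summary and the default 100 ms threshold are my own assumptions. It reads ping output on standard input, extracts every "time=NNms" value and prints the number of replies, the average, the maximum and how many replies exceeded the threshold.

```shell
#!/bin/bash
# ping_summary [LIMIT_MS]
# Hypothetical helper: summarise "time=NNms" values from ping output on stdin.
# LIMIT_MS (default 100) is the latency above which a reply counts as slow.
ping_summary() {
    awk -v limit="${1:-100}" '
        match($0, /time=[0-9.]+ ?ms/) {
            # skip the 5 characters of "time=", the numeric prefix is converted by +0
            t = substr($0, RSTART + 5, RLENGTH - 5) + 0
            n++
            sum += t
            if (t > max) max = t
            if (t > limit) slow++
        }
        END {
            if (n == 0) { print "no replies found"; exit 1 }
            printf "replies=%d avg=%.1fms max=%.1fms over_%dms=%d\n", n, sum / n, max, limit, slow
        }
    '
}
```

Saving each test run to a file and running, for example, "ping_summary 100 < upload80.txt" (file name hypothetical) makes it easier to compare different upload rates.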

Another issue is the number of simultaneous connections. For testing we can use a P2P application, a torrent client for example. Set its upload/download limits to 60-70% of the connection bandwidth and start with a maximum of only 100 connections. Increase the maximum on subsequent tries, up to 500-600. During the test, observe the latency of the ping responses. The right maximum number of connections is the one at which the ping latency is still acceptable. Then divide this value by the number of users and use the result in the CONNLIMIT rule.

MAX CONNECTION = 200
NUMBER of USERS = 10
MAX NUMBER of connections per user = 20

CONNLIMIT will allow you to limit every user to a maximum of 20 (for example) TCP connections.
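The same division can be done in the shell, so that the limit follows the measured value automatically. This is only a sketch: MAX_CONNECTIONS and NUMBER_OF_USERS are illustrative names, and MAX_CONNS is the variable that the firewall rules later in this article expect.

```shell
#!/bin/bash
MAX_CONNECTIONS=200   # value found in the latency test above
NUMBER_OF_USERS=10

# Integer division; any remainder is simply dropped.
MAX_CONNS=$(( MAX_CONNECTIONS / NUMBER_OF_USERS ))

echo "max TCP connections per user: $MAX_CONNS"
```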

Possible solutions

If the upload traffic decreases the download rate, you are in trouble. At the moment there is no stable implementation that allows you to shape upload and download together. To be more precise, there is none if you also want to shape the traffic of the local gateway (where IMQ is set up). When you try to shape upload, download and local server traffic on one IMQ device, your system will freeze after a few minutes or the kernel will simply panic. This happens with kernel 2.6 and the IMQ implementation from www.linuximq.net.

I had very similar problems with IMQ from www.linuximq.net and kernel 2.4. Back then I got kernel panics whenever I used local traffic shaping. The solution I found was to use a different IMQ implementation, written by Jiri Fojtasek (hyperfighter.jinak.cz/qos/). This allowed me to create one general limit for the connection, MAX_TRANSFER, as the root queue. MAX_DOWNLOAD and MAX_UPLOAD were children of the MAX_TRANSFER queue (details - Polish only).

Unfortunately, Jiri's implementation does not exist anymore, so I cannot shape download, upload and local traffic on one IMQ device. I need to use 2 IMQ devices: one for upload and another for download. This means that upload and download have to be shaped separately, but shaping local traffic is still possible and the system is stable with this solution.

Another alternative is the IFB device, which is built into the new 2.6 kernel. It is supposed to be a stable replacement for IMQ. Unfortunately, this device cannot work with NAT as effectively as IMQ does now. You will not be able to shape packets going to or coming from clients behind NAT when using the external interface only. The author of IFB knows about the problem with NAT, but currently (May 2007) does not have time to add support for it.

Traffic shaping

To ensure that every user in our network can use the internet connection at any time, we have to create network traffic limits. They are called Traffic Shaping limits or Quality of Service (QoS). These limits keep the internet connection usable even when all users are connected.

For this purpose, we set the following limits:

1. Limit the available network bandwidth with guaranteed low latency for high priority traffic

After measuring the available download and upload bandwidth, we have the MAX_DOWNLOAD and MAX_UPLOAD variables set. To ensure that the internet connection is available to all users, we divide MAX_DOWNLOAD and MAX_UPLOAD by the number of users in our network. That way, every user has some bandwidth of the internet connection reserved and can use it at any time.

For the traffic shaping algorithm to work efficiently, the sum of all reserved user bandwidth has to be less than the maximum network limits. In our case it is around 80% of the maximum values.

Moreover, our setup ensures that a user can receive the maximum network bandwidth when he is the only user connected to the network. This is achieved by setting the UL (upper limit) rate parameter in the HFSC script to the maximum network bandwidth. The SC (service curve) parameter holds the reserved (guaranteed) bandwidth of each user.
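As a rough illustration of how SC and UL fit together in a tc command, here is a minimal sketch. It is not taken from the hfsc script itself: the function name, the class numbers (parent 1:1, class 1:11), the device imq0 and the kbit unit are all assumptions. By default it only prints the command; set TC=tc and run it as root to apply it.

```shell
#!/bin/bash
# Dry run by default: the command is printed, not executed.
TC="${TC:-echo tc}"

# hfsc_user_class DEV CLASSID RESERVED_KBIT MAX_KBIT   (hypothetical helper)
# sc = guaranteed (reserved) rate for the user
# ul = upper limit the user may reach when the link is otherwise idle
hfsc_user_class() {
    local dev="$1" classid="$2" reserved="$3" max="$4"
    $TC class add dev "$dev" parent 1:1 classid "1:$classid" \
        hfsc sc rate "${reserved}kbit" ul rate "${max}kbit"
}

# Example: 150 kbit reserved, up to 900 kbit when nobody else is active.
hfsc_user_class imq0 11 150 900
```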

The diagram below presents our structure of queues:

Two queues are created for every user. The first contains high priority traffic, the second the rest of the traffic. For high priority traffic a PFIFO queue (First In, First Out) is used, to keep the delay as low as possible and to forward packets in the right order. For the rest of the traffic an SFQ queue is used. SFQ tries to divide the bandwidth fairly and does not let one connection dominate it.
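Attaching the two leaf queues could look roughly like the sketch below. The function name, the class numbers 21 and 31, the pfifo limit of 10 packets and the sfq perturb interval of 10 seconds are assumptions chosen for illustration, not values from the original script. As before, the commands are only printed unless TC=tc is set.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed.
TC="${TC:-echo tc}"

# user_leaf_qdiscs DEV PRIO_CLASSID REST_CLASSID   (hypothetical helper)
user_leaf_qdiscs() {
    local dev="$1" prio="$2" rest="$3"
    # High priority class: short plain FIFO, packets leave in arrival order.
    $TC qdisc add dev "$dev" parent "1:$prio" handle "$prio:" pfifo limit 10
    # Remaining traffic: SFQ shares the class fairly between connections.
    $TC qdisc add dev "$dev" parent "1:$rest" handle "$rest:" sfq perturb 10
}

user_leaf_qdiscs imq0 21 31
```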

2. Limit the number of available TCP connections

Some internet connections cope badly with a high number of simultaneous connections. If yours is one of them, limiting the number of TCP connections per user can help to keep the connection stable and fast. This can be achieved with the CONNLIMIT module in iptables.

Installation

First, we need to get the proper patches:

1. Patch IMQ for kernel 2.6.18
2. Patch IMQ for IPTABLES 1.3.6
3. POM patches

Preparing KERNEL
solaris# apt-get install linux-source-2.6.18
solaris# cd /usr/src
solaris# tar -jxf linux-source-2.6.18.tar.bz2
solaris# ln -s linux-source-2.6.18 linux
solaris# cd linux

Patch IMQ
solaris# patch -p1 < ../linux-2.6.18-imq1.diff

I assume that you are using the default Debian kernel config. Many options are selected in it, but most of them are compiled as modules. This means you have to request the loading of specific modules explicitly (usually in /etc/modules) in order to use them. Modules that are not listed will not be loaded and so will not use any memory. Therefore, it should be efficient enough.

A small update to the above is needed. Module loading changed at some point in the 2.6 kernels (probably around 2.6.15) with the introduction of the UDEV package (hotplug management daemon) in the Debian distribution. This daemon loads kernel modules automatically when the appropriate hardware is detected. To prevent certain modules from loading, use the "/etc/modprobe.d/blacklist" config file.

The most important advantage of basing your config on the standard distribution config is that upgrading the kernel and the configuration described here will be very easy. Time is precious, as you should know ;)

solaris# cp /boot/config-2.6.18-3-686 /usr/src/linux-source-2.6.18/.config (copy your kernel config)
solaris# make menuconfig

We have to modify only:

# Networking --> Networking options --> Network packet filtering ---> IP: Netfilter Configuration --->
    IMQ target support (NEW)
(the same for IP6 if you want to use it)

# Device Drivers ---> Network device support --->
    IMQ (intermediate queueing device) support (NEW)
# (2) Number of IMQ devices
# IMQ behavior (PRE/POSTROUTING) (IMQ AB)

Choosing this option will make IMQ hook like this:
    PREROUTING: After NAT
    POSTROUTING: Before NAT
Preparing IPTABLES
solaris# apt-get source iptables
solaris# cd /usr/src/iptables-1.3.6.0debian1/
solaris# dch -v 1.3.6.0debian1-99.imq (add notes about the changes)
solaris# dpkg-buildpackage -us -uc
solaris# cd debian/build/iptables_profectio/

Patch IMQ:
solaris# patch -p1 < ../../../../iptables-1.3.6-imq.diff

CONNLIMIT and IPP2P

Apply the CONNLIMIT and IPP2P patches to the kernel and iptables sources:

solaris# tar -jxf patch-o-matic-ng-20070508.tar.bz2
solaris# cd patch-o-matic-ng-20070508
solaris# ./runme --download
Successfully downloaded external patch geoip
Successfully downloaded external patch condition
Successfully downloaded external patch IPMARK
Successfully downloaded external patch connlimit
Successfully downloaded external patch ipp2p
Successfully downloaded external patch time
Successfully downloaded external patch ACCOUNT
Hey! KERNEL_DIR is not set.
Where is your kernel source directory? [/usr/src/linux]
Hey! IPTABLES_DIR is not set.
Where is your iptables source code directory? [/usr/src/iptables] /usr/src/iptables-1.3.6.0debian1/debian/build/iptables_profectio/
solaris# ./runme connlimit
solaris# ./runme ipp2p

Compiling KERNEL
solaris# make-kpkg --append_to_version -686.imq.by.elessar --revision=rev.01 debian
solaris# dch (add notes about the changes)
solaris# make-kpkg --initrd --append_to_version -686.imq.by.elessar --revision=rev.01 kernel_image

Select the following options as modules when they appear:

Connections/IP limit match support (IP_NF_MATCH_CONNLIMIT) [N/m/?] (NEW)
IPP2P match support (IP_NF_MATCH_IPP2P) [N/m/?] (NEW)
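Once the build has finished, it may be worth checking that the options really ended up in the kernel config. The helper below is a sketch of my own, not part of the original procedure; it accepts any list of option names and reports which of them are set to "y" or "m" in a given config file.

```shell
#!/bin/bash
# check_kernel_options CONFIG_FILE OPTION...   (hypothetical helper)
# Prints one line per option; returns non-zero if any option is not =y or =m.
check_kernel_options() {
    local config="$1"
    shift
    local opt missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)$" "$config"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return $missing
}

# Example call. The first two names follow the prompts shown above; the two
# IMQ names are assumptions about what the IMQ patch calls its options.
# check_kernel_options /usr/src/linux/.config \
#     CONFIG_IP_NF_MATCH_CONNLIMIT CONFIG_IP_NF_MATCH_IPP2P \
#     CONFIG_IMQ CONFIG_IP_NF_TARGET_IMQ
```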

Compiling IPTABLES

solaris# cd /usr/src/iptables-1.3.6.0debian1/debian/build/iptables_profectio/extensions/
solaris# chmod +x .IMQ-test*
solaris# pico Makefile (append the string "IMQ ipp2p" to the end of the PF_EXT_SLIB variable)
solaris# cd ..
solaris# make
solaris# cd ../../../
solaris# dpkg-buildpackage -us -uc -nc

(We use the [-nc] option in the package build command in order to keep our modifications to the source code.)

Package verification
solaris# dpkg -c iptables_1.3.6.0debian1-100.imq_i386.deb | grep IMQ
-rw-r--r-- root/root 3776 2007-05-08 22:45 ./lib/iptables/libipt_IMQ.so
solaris# dpkg -c iptables_1.3.6.0debian1-100.imq_i386.deb | grep conn
-rw-r--r-- root/root 4400 2007-05-08 22:45 ./lib/iptables/libipt_connlimit.so
-rw-r--r-- root/root 4288 2007-05-08 22:45 ./lib/iptables/libipt_connmark.so
-rw-r--r-- root/root 9904 2007-05-08 22:45 ./lib/iptables/libipt_conntrack.so
-rw-r--r-- root/root 5856 2007-05-08 22:45 ./lib/iptables/libipt_connbytes.so
-rw-r--r-- root/root 4288 2007-05-08 22:45 ./lib/iptables/libip6t_connmark.so
solaris# dpkg -c iptables_1.3.6.0debian1-100.imq_i386.deb | grep ipp
-rw-r--r-- root/root 8432 2007-05-08 22:45 ./lib/iptables/libipt_ipp2p.so

Install packages

solaris# dpkg -i iptables_1.3.6.0debian1-99.imq_i386.deb
solaris# dpkg -i linux-image-2.6.18-686.imq.by.elessar_rev.01_i386.deb
solaris# apt-get install iproute (if it is not installed yet)

Configuration

We now have the modified kernel and iptables packages that are needed for the IMQ and CONNLIMIT functionality. It is time to focus on the shaping configuration.

I've created 3 configuration files in /etc/scripts/:
- hfsc
- firewall
- nat

All these files use definitions from the globals file. The globals file should be customized by the network administrator.

The following init scripts were created in /etc/init.d/:
- hfsc
- firewall
- nat

You can find the following definitions in the globals file:

- a few obvious options, especially for a Polish reader ;)

ZEW=eth1               # external interface
WEW=eth0               # internal interface
LAN="172.16.1.0/24"    # internal network IP range
IP_ZEW="10.1.1.100"    # External IP address

- a few quite important ones

L_USERS=5              # the number of users in the network + server's external IP address
USER_IP[1]="172.16.1.2"
USER_IP[2]="172.16.1.10"
USER_IP[3]="172.16.1.11"
USER_IP[4]="172.16.1.12"
USER_IP[5]=$IP_ZEW     # the last address is always the External IP address of the server

You can add another user in the following way:

L_USERS=6
USER_IP[1]="172.16.1.2"
USER_IP[2]="172.16.1.10"
USER_IP[3]="172.16.1.11"
USER_IP[4]="172.16.1.12"
USER_IP[5]="172.16.1.20"   (new address)
USER_IP[6]=$IP_ZEW

After adding a new user, you have to restart the HFSC script to add the new shaping queues.

- a few essential options:

MAX_UPLOAD=120
MAX_DOWNLOAD=900

The above variables decide what maximum bandwidth is assigned to the main shaping queues. Their values should be estimated using the techniques described in the Internet connection section.
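Because the queues are generated from L_USERS and the USER_IP entries, a mismatch between the two (for example after adding a user and forgetting to raise L_USERS) is easy to introduce. The following check is a sketch of my own, not part of the original scripts; check_users is a hypothetical name and the values are shortened examples in the style of the globals file.

```shell
#!/bin/bash
# Example values in the style of the globals file (shortened for illustration).
IP_ZEW="10.1.1.100"
L_USERS=3
USER_IP[1]="172.16.1.2"
USER_IP[2]="172.16.1.10"
USER_IP[3]=$IP_ZEW

# Returns 0 when USER_IP[1..L_USERS] are all set and the last one is IP_ZEW.
check_users() {
    local i
    for (( i = 1; i <= L_USERS; i++ )); do
        if [ -z "${USER_IP[$i]}" ]; then
            echo "USER_IP[$i] is not set, but L_USERS=$L_USERS" >&2
            return 1
        fi
    done
    if [ "${USER_IP[$L_USERS]}" != "$IP_ZEW" ]; then
        echo "USER_IP[$L_USERS] should be the external address $IP_ZEW" >&2
        return 1
    fi
    echo "$L_USERS entries OK"
}

check_users
```

Such a check could be run right after the globals file has been sourced and before any queues are created.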

The limits are set in the following way: every user (recognized by IP address) receives a reserved part of the download and upload bandwidth. The sum of the reserved parts is less than the maximum limit itself (to make the HFSC algorithm more efficient, as mentioned above).

# reserved upload for every user
USER_UPLOAD=$[$MAX_UPLOAD/($L_USERS+1)]
# reserved download for every user
USER_DOWNLOAD=$[$MAX_DOWNLOAD/($L_USERS+1)]

Every user queue (both the download and the upload one) is divided into 2 further queues. The first one contains the important traffic. That way we can "prioritize" some services like SSH, mail, online games etc. by setting the low latency parameter in HFSC. Currently the important traffic contains only ICMP pings, DNS and SSH.

The important traffic takes 55% of the reserved bandwidth and the rest of the traffic 25%. Together the two queues get less than the reserved limit, again because of the HFSC algorithm.

USER_D_PRIOR=$[$USER_DOWNLOAD*55/100]
USER_D_RESZTA=$[$USER_DOWNLOAD*25/100]
USER_U_PRIOR=$[$USER_UPLOAD*55/100]
USER_U_RESZTA=$[$USER_UPLOAD*25/100]
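To see what these formulas produce, here they are worked through with the example values from the globals file (MAX_UPLOAD=120, MAX_DOWNLOAD=900, L_USERS=5). The sketch uses the $(( )) arithmetic syntax, which gives the same results as $[ ]; note that the shell uses integer arithmetic, so fractions are dropped.

```shell
#!/bin/bash
MAX_UPLOAD=120
MAX_DOWNLOAD=900
L_USERS=5

USER_UPLOAD=$(( MAX_UPLOAD / (L_USERS + 1) ))       # 120 / 6 = 20
USER_DOWNLOAD=$(( MAX_DOWNLOAD / (L_USERS + 1) ))   # 900 / 6 = 150

USER_D_PRIOR=$(( USER_DOWNLOAD * 55 / 100 ))        # 8250 / 100 = 82
USER_D_RESZTA=$(( USER_DOWNLOAD * 25 / 100 ))       # 3750 / 100 = 37
USER_U_PRIOR=$(( USER_UPLOAD * 55 / 100 ))          # 1100 / 100 = 11
USER_U_RESZTA=$(( USER_UPLOAD * 25 / 100 ))         #  500 / 100 = 5

echo "download: reserved=$USER_DOWNLOAD prior=$USER_D_PRIOR rest=$USER_D_RESZTA"
echo "upload:   reserved=$USER_UPLOAD prior=$USER_U_PRIOR rest=$USER_U_RESZTA"
```

With 5 entries the reserved download bandwidth adds up to 5 * 150 = 750 out of 900, so some headroom always remains below the maximum.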

How can we forward packets to HFSC queues?

First, we need to redirect all traffic on the external interface to the IMQ devices:

$IPTABLES -t mangle -A POSTROUTING -o $ZEW -j IMQ --todev 1 #upload
$IPTABLES -t mangle -A PREROUTING -i $ZEW -j IMQ --todev 0 #download
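The IMQ devices have to exist and be up before packets can be queued on them. The sketch below shows one way this might be done; it assumes that IMQ was built as a module named imq which accepts a numdevs parameter, and that the devices are called imq0 and imq1. The function name imq_up is hypothetical. By default the commands are only printed; set RUN to an empty string and run as root to execute them.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed.
RUN="${RUN-echo}"

imq_up() {
    $RUN modprobe imq numdevs=2   # assumed module name and parameter
    $RUN ip link set imq0 up      # download device (--todev 0)
    $RUN ip link set imq1 up      # upload device (--todev 1)
}

imq_up
```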

The next step was a big problem with the 2.4 kernel. To distinguish traffic from different users on a machine with NAT, we had to use an additional NAT patch and some tricks with packet marking. Fortunately, kernel 2.6 has the IMQ behavior (PRE/POSTROUTING) option. Setting it to AB lets us see outgoing packets before NAT (they have LAN source IP addresses) and incoming packets after NAT (they have LAN destination IP addresses). Because of this behaviour, we can create very simple and clear rules for directing packets to the proper queues using TC filters.

For example:

# important traffic upload rules
$TCF_U prio 3 u32 match ip protocol 1 0xff match ip src ${USER_IP[${i}]} flowid 1:$[$N_CLASS_U_1+$i]
$TCF_U prio 3 u32 match ip sport 22 0xffff match ip src ${USER_IP[${i}]} flowid 1:$[$N_CLASS_U_1+$i]
$TCF_U prio 3 u32 match ip dport 22 0xffff match ip src ${USER_IP[${i}]} flowid 1:$[$N_CLASS_U_1+$i]
$TCF_U prio 3 u32 match ip sport 53 0xffff match ip src ${USER_IP[${i}]} flowid 1:$[$N_CLASS_U_1+$i]
$TCF_U prio 3 u32 match ip dport 53 0xffff match ip src ${USER_IP[${i}]} flowid 1:$[$N_CLASS_U_1+$i]

# important traffic download rules
$TCF_D prio 2 u32 match ip protocol 1 0xff match ip dst ${USER_IP[${i}]} flowid 1:$[$N_CLASS_D_1+$i]
$TCF_D prio 2 u32 match ip sport 22 0xffff match ip dst ${USER_IP[${i}]} flowid 1:$[$N_CLASS_D_1+$i]
$TCF_D prio 2 u32 match ip dport 22 0xffff match ip dst ${USER_IP[${i}]} flowid 1:$[$N_CLASS_D_1+$i]
$TCF_D prio 2 u32 match ip sport 53 0xffff match ip dst ${USER_IP[${i}]} flowid 1:$[$N_CLASS_D_1+$i]
$TCF_D prio 2 u32 match ip dport 53 0xffff match ip dst ${USER_IP[${i}]} flowid 1:$[$N_CLASS_D_1+$i]

Number of connections

To limit the number of TCP connections per user, I have put the following rules in the firewall script. The FORWARD chain is where connections of internal users can be limited.

j=1    # assumed start: index of the first internal user
while [ $j -lt $L_USERS ]
do
    $IPTABLES -A FORWARD -s ${USER_IP[${j}]} -p tcp -m connlimit --connlimit-above $MAX_CONNS \
        -j REJECT --reject-with tcp-reset
    j=$[$j+1]
done

Limiting the server itself (when it runs P2P software, for example) can be done in the OUTPUT chain. The server limit needs one exception: SSH access should always be allowed.

$IPTABLES -A OUTPUT -o $ZEW -s ${USER_IP[${L_USERS}]} -p tcp --sport ! 22 -m connlimit \
    --connlimit-above $MAX_CONNS -j REJECT --reject-with tcp-reset

TODO

1. ipp2p implementation to create 3 traffic classes

  class 1 = important
  class 2 = standard WWW and e-mail services
  class 3 = P2P

This will allow users to keep a good connection to the Internet while using P2P software.

2. Improved traffic shaping statistics

3. MAC filtering and managed switches to detect rogue clients

End

That's all. I have run many tests to confirm that this configuration works.

I hope you enjoyed the article. If you have any comments, please send them to me.

best regards
Rafal Rajs