dummynet

1. Description

dummynet is a flexible tool originally designed for testing networking protocols, and since then (mis)used for bandwidth management. It simulates/enforces queue and bandwidth limitations, delays, packet losses, and multipath effects. It also implements a variant of Weighted Fair Queueing called WF2Q+. It can be used on users' workstations, or on FreeBSD machines acting as routers or bridges.

To give an idea of what you can do with dummynet, e.g. by using it on your workstation, or by putting a PC with two Ethernet cards between your network and your router and booting it from the floppy image below, here are a few examples.

These rules limit the total ICMP traffic (inbound+outbound) to 50Kbit/s:

    ipfw add pipe 1 icmp from any to any
    ipfw pipe 1 config bw 50Kbit/s queue 10

These rules limit inbound traffic to 300Kbit/s for each host on your network 10.1.2.0/24:

    ipfw add pipe 2 ip from any to 10.1.2.0/24
    ipfw pipe 2 config bw 300Kbit/s queue 20 mask dst-ip 0x000000ff

If you want all machines to share a single link evenly, you should use instead:

    ipfw add queue 1 ip from any to 10.1.2.0/24
    ipfw queue 1 config weight 5 pipe 2 mask dst-ip 0x000000ff
    ipfw pipe 2 config bw 300Kbit/s

And these rules simulate an ADSL link to the moon:

    ipfw add pipe 3 ip from any to any out
    ipfw add pipe 4 ip from any to any in
    ipfw pipe 3 config bw 128Kbit/s queue 10 delay 1000ms
    ipfw pipe 4 config bw 640Kbit/s queue 30 delay 1000ms

dummynet works by intercepting packets (selected by ipfw rules; ipfw is one of the FreeBSD firewalls) on their way through the protocol stack, and passing them through one or more objects called queues and pipes, which simulate the effects of bandwidth limitations, propagation delays, bounded-size queues, packet losses, and multipath. Pipes are fixed-bandwidth channels.
Queues, instead, represent queues of packets, each associated with a weight, which share the bandwidth of the pipe they are connected to in proportion to their weights. Each pipe and queue can be configured separately, so you can apply different limitations/delays to different traffic according to the ipfw rules (e.g. selecting on protocols, addresses and port ranges, interfaces, etc.). Pipes and queues can be created dynamically, so with a single set of rules you can apply independent limitations to all hosts in a subnet, or to all types of traffic, etc. You can also configure the system to build cascades of pipes, so you can simulate networks with multiple links and paths between source(s) and destination(s).

2. Performance, status and availability

Unlike other traffic shaping packages which run in userland, dummynet has very little overhead, as all processing is done within the kernel. There is no data copying involved in moving packets through pipes, just a bit of pointer shuffling, and the implementation is able to handle thousands of pipes with O(log N) cost. The WFQ variant we implement, called WF2Q+, has a complexity which is O(log N) in the number of active flows, so again it is able to handle thousands of flows efficiently.

dummynet has been part of FreeBSD since Sept. 1998. It was rewritten recently (Jan. 2000 and June 2000), so the most recent, feature-rich and robust versions are in FreeBSD 3.4-STABLE and newer releases. You don't need to install FreeBSD on your hard disk to use it, as below you will find a bootable single-floppy version of FreeBSD which includes dummynet, bridging, and a lot of other goodies.

Dummynet is being heavily used by lots of people, and the code seems to be extremely stable and robust, especially in the 3.4-STABLE version and above. Bug fixes are generally applied to the FreeBSD source tree and are available from the CVS tree or in newer snapshots/releases of FreeBSD. From time to time I update the floppy image on this site as well.

3. Support

If you have found a bug, please report it to me by email, but don't forget to include information on which version of FreeBSD and dummynet you are using, your rules (ipfw show; ipfw pipe show), your configuration (bridge or router), etc.

If you have a simple question, again just email me and I generally try to reply as soon as possible. Again, please supply details!

For more complex things (like "I have no time to learn how to use it, I just want this work done"), or customizations and additions of new features to dummynet/ipfw, I am available (through my department) for doing support on a contract basis. Email luigi@iet.unipi.it to discuss details.

This said, FreeBSD users should be able to use dummynet without the need for support. The relevant manpages (ipfw(8), dummynet(4), bridge(4)) are a great source of information, so please read up-to-date versions of them before asking questions. You can also try posting on the various FreeBSD mailing lists or newsgroups; they are usually a very good source of information.

4. Using dummynet

Dummynet is entirely controlled by the ipfw commands and a set of sysctl variables.

4.1 Basic ipfw commands

The basic structure of ipfw commands is

    ipfw add [N] [prob X] action PROTO from SRC to DST [options]

where

* N is the rule number;
* X is a number between 0 and 1 that, when present, indicates the probability of getting a match on this rule if all other fields are correct. The default is a deterministic match;
* action is one of the actions executed on a match, which can be any of allow, deny, skipto N, pipe N and others. To send a packet to a dummynet pipe, we have to use pipe N;
* PROTO is the protocol type we want to match (IP, TCP, UDP, ...);
* SRC and DST are address specifiers (we can use addresses with netmasks, optionally followed by ports or port ranges);
* options can be used to restrict the match to packets coming from/to specific interfaces, or carrying some TCP flags or ICMP options, or bridged, etc.
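
The fields described above can be seen side by side in a small ruleset. This is only a sketch: the rule numbers, addresses and ports are arbitrary, and the IPFW variable and the example_rules function are made up for this example (they are not part of ipfw). By default the script just prints the commands; run it as root with IPFW=ipfw in the environment to actually install the rules.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# IPFW is a variable invented for this sketch, not an ipfw feature.
IPFW="${IPFW:-echo ipfw}"

example_rules() {
    # N=100, action "pipe 1", PROTO tcp, SRC a subnet, DST any host on port 80
    $IPFW add 100 pipe 1 tcp from 10.1.2.0/24 to any 80
    # No rule number given, so ipfw assigns one; "prob 0.05" makes the rule
    # fire on roughly 5% of the inbound UDP packets matching the other fields
    $IPFW add prob 0.05 deny udp from any to any in
    # A port range in DST
    $IPFW add 300 deny tcp from any to 10.1.2.0/24 6000-6063
    # Catch-all rule for everything else
    $IPFW add 65000 allow ip from any to any
    # The pipe named by rule 100 must exist, or its packets are dropped
    $IPFW pipe 1 config bw 1Mbit/s queue 20
}

example_rules
```

Printing the commands first is a deliberate choice here: a mistyped firewall rule on a remote machine can lock you out, so it helps to read the generated ruleset before applying it.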

4.2 Sysctl variables

The following are the main sysctl variables that control the behaviour of ipfw, bridging and dummynet.

Controlling ipfw

The firewall is mostly controlled by ipfw, and the sysctl variables only serve to give global configuration and default parameters.

net.inet.ip.fw.enable: 1
    Enables the firewall in the IP stack.
net.inet.ip.fw.one_pass: 1
    Forces a single pass through the firewall. If set to 0, packets coming out of a pipe will be reinjected into the firewall starting with the rule after the matching one. NOTE: there is always one pass for bridged packets.
net.inet.ip.fw.dyn_buckets: 256
    Desired hash table size used for dynamic rules.
net.inet.ip.fw.curr_dyn_buckets: 256 (readonly)
    Current hash table size used for dynamic rules.
net.inet.ip.fw.dyn_count: 3 (readonly)
    Current number of dynamic rules.
net.inet.ip.fw.dyn_max: 1000
    Max number of dynamic rules. If you exceed this limit, you will have to wait for a rule to expire before being able to create a new one.
net.inet.ip.fw.dyn_ack_lifetime: 300
net.inet.ip.fw.dyn_syn_lifetime: 20
net.inet.ip.fw.dyn_fin_lifetime: 20
net.inet.ip.fw.dyn_rst_lifetime: 5
net.inet.ip.fw.dyn_short_lifetime: 5
    Lifetime (in seconds) for various types of dynamic rules.

Controlling dummynet

dummynet, too, is mostly controlled by ipfw, with the sysctl variables serving mostly for default parameters.

net.inet.ip.dummynet.hash_size: 64
    Size of hash table for dynamic pipes.
net.inet.ip.dummynet.expire: 1
    Delete dynamic pipes when they become empty.
net.inet.ip.dummynet.max_chain_len: 16
    Max ratio between number of dynamic queues and hash buckets. When you exceed (max_chain_len*buckets) queues on a pipe, packets not matching any of these will all be put into the same default queue.

Controlling bridging

Bridging is almost exclusively controlled by sysctl variables.

net.link.ether.bridge_cfg: ed2:1,rl0:1,
    Set of interfaces for which bridging is enabled, and the cluster they belong to.
net.link.ether.bridge: 0
    Enables bridging when set to 1.
net.link.ether.bridge_ipfw: 0
    Enables ipfw for bridging when set to 1.

4.3 Pipe and queue configuration

The following ipfw commands control dummynet pipes:

* ipfw pipe NN config ...
  This command is used to create or reconfigure a pipe. NN is the numeric identifier (between 1 and 65535) of the pipe. Issuing the configuration command multiple times results in the pipe being reconfigured.
* ipfw [-s field] pipe [NN] show
  This command shows the parameters of a pipe. If the pipe is a dynamic one (see the mask parameter), then all dynamic pipes created from this one are listed. The list can be very long. The -s option allows you to sort the listing on one of the four counters associated with the pipe.
* ipfw pipe NN delete
  Destroys a single pipe. Remember that packets sent to a non-existing pipe are silently dropped.
* ipfw pipe flush
  Destroys all pipes.

The following parameters can be configured for a pipe, by adding them to the pipe config... line:

* Bandwidth: bw NNunit
  NN is the bandwidth assigned to the pipe; unit (which must follow the number with no intervening spaces) can be any of bit/s Kbit/s Mbit/s Byte/s KByte/s MByte/s or non-ambiguous abbreviations. A bandwidth of 0 (or no bandwidth) results in no bandwidth limitations (hence, no queues will ever build up).
* Queue size: queue NN [unit]
  Sets the queue size, in slots if only NN is specified, otherwise in Bytes or KBytes. When there is no room in the queue, packets are dropped. The default queue size is 50 slots.
  The combination of bandwidth and queue size determines the queueing delay. Be careful not to use too large queues with low bandwidths, or you might end up with several seconds of queueing delay. For instance, the default 50 slots filled with 1500-byte packets hold 600,000 bits, which take 12 seconds to drain at 50Kbit/s. Also be careful when you specify the queue size in packets: if you run tests over the loopback interface, a packet can be very large, e.g. 16KB, again resulting in huge delays.
* Delay: delay NN ms
  Sets the propagation delay of the pipe, in milliseconds.
  Note that the queueing delay component is independent of the propagation delay. Also note that all delays are approximated with a granularity of 1/HZ seconds (HZ is typically 100, but we suggest using HZ=1000 and maybe even larger values).
* Random Packet Loss: plr X
  X is a floating point number between 0 and 1 which causes packets to be dropped at random. This is generally done to simulate lossy links. The default is 0, or no loss.
* Dynamic queue creation: mask ...
  It is possible to associate a mask with a pipe so that bandwidth and queue limitations are enforced separately for packets belonging to different flows. The mask command lets you specify which parts of the following fields contribute to identifying a flow:

      [proto N] [src-ip N] [dst-ip N] [src-port N] [dst-port N]

  where N is a bitmask in which significant bits are set to 1. You can specify one or more masks, or the all keyword to mean that all fields are fully significant. The default (when no masks are specified) is to ignore all fields, so that all packets are considered to belong to the same flow. Whenever a new flow is encountered, a new queue (with the specified bandwidth and queue size) is created.
  WARNING!!! The number of dynamic queues that can be created in this way can become very large. They are accessed through a hash table, whose size you can define using the buckets NN specifier after the mask command.

To use WF2Q+, packets must be passed to queues, which in turn must be connected to a pipe. The following ipfw commands control dummynet queues:

* ipfw queue NN config ...
  This command is used to create or reconfigure a queue. NN is the numeric identifier (between 1 and 65535) of the queue. Issuing the configuration command multiple times results in the queue being reconfigured.
* ipfw queue NN delete
  Destroys a single queue. Remember that packets sent to a non-existing queue are silently dropped.
* ipfw queue flush
  Destroys all queues.
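
As an illustration of how queues are connected to a pipe, here is a minimal sketch with two queues sharing one pipe. The weights, the bandwidth and the port number are arbitrary choices; the IPFW variable and the wf2q_example function are made up for this example (they are not part of ipfw). The pipe and weight parameters used here are the ones described in the parameter list for queues. By default the script only prints the commands; run it as root with IPFW=ipfw to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# IPFW is a variable invented for this sketch, not an ipfw feature.
IPFW="${IPFW:-echo ipfw}"

wf2q_example() {
    # One fixed-bandwidth pipe ...
    $IPFW pipe 1 config bw 1Mbit/s
    # ... shared by two queues with weights 3 and 1
    $IPFW queue 1 config pipe 1 weight 3
    $IPFW queue 2 config pipe 1 weight 1
    # TCP traffic to port 80 goes to queue 1, everything else to queue 2
    $IPFW add queue 1 tcp from any to any 80
    $IPFW add queue 2 ip from any to any
}

wf2q_example
```

With these weights, when both queues have packets waiting, queue 1 should get about 3/4 of the pipe (750Kbit/s) and queue 2 about 1/4 (250Kbit/s).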

The following parameters can be configured for a queue, by adding them to the queue config... line:

* Pipe: pipe NN
  NN is the identifier of the pipe used for regulating traffic.
* Weight: weight NN
  NN is the weight (1..100, default 1) associated with the queue.
* Per-Flow queueing: mask ...
  The syntax is the same as for pipes. However, all queues created dynamically will share the parent pipe's bandwidth according to the weight.
* Queue size, Random Packet Loss:
  Same as for pipes.

5. Using dummynet for testing protocols

Dummynet was originally created to test network protocols and applications, possibly even on a standalone system. As a consequence, some of its features, such as delay emulation, random loss, etc., are explicitly designed for that purpose.

There are a few things you should keep in mind when doing such tests, to avoid getting incorrect results. They are all obvious things, but it is still better to keep them in mind.

o Choosing a reasonable buffer size.
  As said earlier, packets can be subject to a delay which is proportional to the total queue size (in bytes), and inversely proportional to the bandwidth. At low bandwidths, these queueing delays can be extremely high, especially if the queue size is defined in terms of packets and packets are large. The default queue size is almost certainly too large for most purposes, and it is often preferable to define the queue size in terms of bytes rather than packets.

o Half-duplex vs. Full-duplex channels.
  With the exception of shared-medium networks such as Ethernet, most links that you want to simulate for your experiments are full duplex. As such, the proper configuration is the following:

      ipfw add pipe 1 ip from A to B
      ipfw add pipe 2 ip from B to A
      ipfw pipe 1 config ...
      ipfw pipe 2 config ...

  Should you really need to model a half-duplex network, then you can use the following sequence. But think twice before you do so, as it is often an unrealistic model.

      ipfw add pipe 3 ip from A to B
      ipfw add pipe 3 ip from B to A
      ipfw pipe 3 config ...

o Interactions between bridging and multicast
  You can use ipfw (and dummynet) in a bridge by setting some sysctl variables:

      sysctl -w net.link.ether.bridge=1
      sysctl -w net.link.ether.bridge_ipfw=1

  and then specifying your firewall configuration.
  Be careful when you run experiments involving multicast traffic through a dummynet-enabled bridge. Unless you set the rules right, multicast traffic in a bridge goes through the firewall code twice: once during forwarding at level 2, and once when the packet is passed to the local IP stack of the bridge. Starting from Feb. 2000, there are two ways to avoid this problem. One involves a sysctl variable:

      sysctl -w net.inet.ip.fw.enable=0

  which prevents the firewall from being invoked at the IP level. Otherwise, you can use the bridged specifier in your ruleset to match only bridged packets:

      ipfw add pipe 1 ip from any to any bridged

o Running over the loopback interface.
  Dummynet was originally designed for running experiments on a standalone machine. The loopback interface lets you run senders and receivers on the same machine, but you should remember a few things:

  + The firewall is invoked on all packets.
    This means that if you have a configuration such as

        ipfw add pipe 4 ip from 127.0.0.1 to 127.0.0.1
        ipfw pipe 4 config delay 100ms

    and do a simple ping 127.0.0.1, you will see a delay of approximately 400ms. In fact the ICMP request goes through the pipe twice (once down, once up), and the same holds for the ICMP reply. For the same reason, if you also have bandwidth or queue limitations, remember that the queue sees the traffic multiple times.
    You can partially overcome this problem by using additional ipfw options, e.g. specifying a direction for matching packets, or the uid of the sending or receiving process.
    Alternatively, you can assign multiple aliases to the loopback interface, and make sure that the sender and receiver bind their local endpoints to different addresses, so that you will have distinct rules matching traffic in the two directions.

  + The MTU of the loopback interface defaults to 16KB.
    The usual default for Ethernet is 1500, and for point-to-point links it is often smaller (576 or so). You can fix this simply by redefining the MTU to the desired value with

        ifconfig lo0 mtu 1500

o TCP defaults.
  Be very careful when using TCP, especially between processes running on the same machine, or on the same subnet. Apart from the MTU issue mentioned earlier, at least on FreeBSD, TCP starts with a full window when the remote endpoint is on the same subnet as one of the local addresses. You need a simple fix in the source (tcp_input.c, I believe) to change this behaviour in FreeBSD 3.x, whereas FreeBSD 4.x has sysctl variable(s) to set the initial window.
  Secondly, when you do experiments on configurations with a large delay-bandwidth product, remember that many applications use the default window size, which is small, something like 16KB. You might end up not using the full bandwidth just because your data transfer is window-limited.

5.1 Simulating multipath

One nice feature of the new version of dummynet is the ability to simulate multiple paths between sender and receiver. This is done using probabilistic match, e.g.:

    ipfw add prob 0.33 pipe 1 ip from A to B
    ipfw add prob 0.5 pipe 2 ip from A to B
    ipfw add pipe 3 ip from A to B
    ipfw pipe 1 config ...
    ipfw pipe 2 config ...
    ipfw pipe 3 config ...

Given the right packet, the first rule will match with probability 1/3; in the remaining 2/3 of cases we move to the second rule, which will match with probability 1/2 (so overall 1/2 * 2/3 = 1/3), and the remaining 1/3 of cases will move to the third rule, which has a deterministic match.
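
The same reasoning extends to unequal splits: each rule only sees the packets that the previous rules did not match, so its prob value is the desired share divided by the share still remaining. The sketch below aims at a 50% / 30% / 20% split. The addresses are placeholders from the documentation ranges, and the IPFW, A and B variables and the multipath_rules function are made up for this example. It also assumes net.inet.ip.fw.one_pass=1, so that a packet leaving one pipe is not matched again by the following rules. By default the commands are only printed; run as root with IPFW=ipfw to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# IPFW, A and B are variables invented for this sketch.
IPFW="${IPFW:-echo ipfw}"
A="${A:-192.0.2.1}"       # placeholder sender address
B="${B:-198.51.100.1}"    # placeholder receiver address

multipath_rules() {
    # Target split: 50% / 30% / 20% of the packets from A to B.
    # Rule 1 sees every packet: prob = 0.5
    $IPFW add prob 0.5 pipe 1 ip from "$A" to "$B"
    # Rule 2 sees the remaining 50%: prob = 0.3 / 0.5 = 0.6
    $IPFW add prob 0.6 pipe 2 ip from "$A" to "$B"
    # Rule 3 sees the remaining 20% and matches deterministically
    $IPFW add pipe 3 ip from "$A" to "$B"
}

multipath_rules
```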
We can then configure the three pipes as desired to emulate phenomena such as packet reordering.

-----------------------------------------------------------------------

6. Related links

Here I collect some info on how to do various ipfw-related things. Most of this is just URLs collected from the mailing lists, so the reliability of the info might be different (for good or bad) from what is on this page.

o PPP Over Ethernet
  Detailed instructions on how to set up a PPPoE connection.
o ALTQ
  Alternate queueing scheme.

-----------------------------------------------------------------------

A PicoBSD floppy

To conclude... if you want to try dummynet, here is a bootable floppy image of a system with FreeBSD, bridging, ipfw, dummynet, natd, ppp, drivers for a few interfaces, and accessible via telnet.

To set up this system, download the 1.44MB image, pico.000608.bin, and copy it to a floppy using dd under FreeBSD, or rawrite under DOS/Windows. Then put the floppy into a machine with hopefully at least one interface, and wait for it to boot. When the system comes up, log in as root, password "setup", and you can play with bridging, ipfw and dummynet using the above commands.

-----------------------------------------------------------------------

Luigi Rizzo
Dipartimento di Ingegneria dell'Informazione -- Univ. di Pisa
via Diotisalvi 2 -- 56126 PISA
tel. +39-050-568533 Fax +39-050-568522
email: luigi@iet.unipi.it