rstade/TrafficEngine

TrafficEngine Overview

TrafficEngine is a stateful user-space TCP traffic generator written in Rust with the following properties:

It can be used for (load-)testing TCP-based application servers and TCP proxies. TrafficEngine maintains TCP state and can therefore set up and release complete TCP connections.

Multi-core scaling is supported by steering packets of the same TCP connection, based on either the TCP port or the IP address, to the core that handles that connection. Port resources can therefore be assigned to cores (based on the parameter dst_port_mask in the configuration file). Alternatively, if the NIC does not support port masks, steering can be based on the IP address.
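The steering idea can be illustrated with a minimal sketch. This is not TrafficEngine's actual code: the function names core_for_port and core_for_ip are hypothetical, the mask is assumed to consist of contiguous bits, and the modulo mapping onto cores is an assumption made for illustration only.

```rust
use std::net::Ipv4Addr;

/// Hypothetical helper (not part of TrafficEngine): derive the owning core
/// from the destination port. `dst_port_mask` selects the port bits that are
/// used for steering; a contiguous mask is assumed.
fn core_for_port(dst_port: u16, dst_port_mask: u16, n_cores: usize) -> usize {
    assert!(n_cores > 0, "at least one core is required");
    if dst_port_mask == 0 {
        // No port bits selected: every connection lands on core 0.
        return 0;
    }
    // Shift the masked bits down so that they form a small integer.
    let shift = dst_port_mask.trailing_zeros();
    let bucket = (dst_port & dst_port_mask) >> shift;
    bucket as usize % n_cores
}

/// Hypothetical fallback for NICs without port-mask support: steer by the
/// last octet of the destination IPv4 address.
fn core_for_ip(dst_ip: Ipv4Addr, n_cores: usize) -> usize {
    assert!(n_cores > 0, "at least one core is required");
    dst_ip.octets()[3] as usize % n_cores
}
```

With a mask of 0x0f00 and four cores, for example, port 0x1234 yields bucket 2 and is therefore handled by core 2. Because all packets of a connection carry the same port, they always reach the same core.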

TrafficEngine builds on NetBricks, which itself utilizes DPDK for user-space networking. Starting with version 0.2.0, the more generic code has been moved to an application-independent crate, netfcts (in the sub-directory netfcts).

TrafficEngine Installation

First install NetBricks. TrafficEngine needs the branch e2d2-rstade from the fork at https://github.com/rstade/Netbricks. The required NetBricks version is tagged (starting with v0.2.0). Install NetBricks locally on your (virtual) machine by following the description of NetBricks. The installation path of e2d2 needs to be updated in the dependency section of Cargo.toml of TrafficEngine.

Note that a local installation of NetBricks is necessary, as it includes DPDK and some C libraries for interfacing the Rust code of NetBricks with DPDK. If the optional KNI interface is needed, the DPDK kernel module needs to be recompiled each time the kernel version changes. This can be done with the script build.sh of NetBricks. Note also that the Linux linker ld needs to be made aware of the location of the .so libraries created by NetBricks. This can be done using ldconfig.

Secondly, TrafficEngine depends on the crate netfcts. netfcts is an extension to NetBricks with helper functions and data structures, and needs to be built using the locally installed NetBricks to ensure consistent dependencies.

The network interfaces of the test machine need to be prepared (see prepNet.sh):

First a network interface for user-space DPDK is needed. This interface is used by the engine to connect to servers (in the example configuration this interface uses PCI slot 07:00.0). The latest code is tested with NIC X520-DA2 (82599).

Secondly an extra Linux interface is required which is used by the test modules for placing server stacks.

For some integration tests both interfaces must be interconnected. Physical interfaces may be connected by a crossover cable. Virtual interfaces may be connected, for example, to a host-only network of the hypervisor. Running Wireshark on the Linux interface allows us to observe the traffic exchanged between clients, the TrafficEngine and the servers. However, as Wireshark may not keep up with the transmission speeds of modern line cards, packets may be missing from the capture.

In addition, some parameters, such as the Linux interface name (linux_if) and the IP / MAC addresses in the test module configuration files tests/*.toml, need to be adapted.

The test results below were obtained on a 2-socket NUMA server, each socket hosting 4 physical cores, running the real-time kernel of CentOS 7.5.

Testing

The executables must currently be run with superuser rights, as otherwise DPDK cannot be initialized. However, to avoid having to run Cargo itself as root, the shell script test.sh can be used, for example:

  • "./test.sh test_as_client --release" or "./test.sh test_as_server --release".

The script requires installation of the jq tool, e.g. by running "yum install jq".

In addition, the script can run a simple loopback helper tool called macswap:

  • "./test.sh macswap --release"

This tool can be used in cases where the loopback mode of the NIC is not working, as happened with the X710DA2. Ideally, the tool should be run on a second server. It swaps the source and destination MAC addresses and sends the frames back towards the origin.
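The MAC swap itself can be sketched in a few lines. The following is an illustration of the technique on a raw byte buffer, not the macswap source; the function name swap_mac_addresses is hypothetical, and a standard Ethernet header layout (destination MAC in bytes 0..6, source MAC in bytes 6..12) is assumed.

```rust
/// Hypothetical helper (not the macswap source): swap destination and source
/// MAC addresses of an Ethernet frame in place.
/// Returns false if the buffer is too short to hold both addresses.
fn swap_mac_addresses(frame: &mut [u8]) -> bool {
    const MAC_LEN: usize = 6;
    if frame.len() < 2 * MAC_LEN {
        return false;
    }
    // Bytes 0..6 hold the destination MAC, bytes 6..12 the source MAC.
    let (dst, rest) = frame.split_at_mut(MAC_LEN);
    dst.swap_with_slice(&mut rest[..MAC_LEN]);
    true
}
```

Everything after the first twelve bytes (EtherType and payload) is left untouched, so the frame can be sent back out of the port on which it arrived.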

Performance

Our test scenario is as follows:

  • We connect the client side with the server side of TrafficEngine by using the loopback feature of the NIC (see loopback_run.toml). For this we used an 82599-based NIC.
  • After the client has set up the TCP connection, it sends a small payload packet to the server. After receiving the payload, the server side releases the TCP connection. In total we exchange seven packets per connection.
  • The same TrafficEngine instance operates concurrently as client and as server. Therefore, when comparing our cps figures with the cps of a TCP server, our figures can be approximately doubled.
  • Tests were run on a two-socket server with two rather old 4-core L5520 CPUs @ 2.27GHz with 32K/256K/8192K L1/L2/L3 cache and a recent CentOS 7.6 real-time kernel, e.g. from repository: http://linuxsoft.cern.ch/cern/centos/7/rt/CentOS-RT.repo. We also performed the basic tuning steps to isolate the cores which are running our working threads. The real-time kernel increases determinism significantly versus the usual CentOS non-real-time kernel. For more information see rt-tuning.md.
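The arithmetic behind the scenario can be written down as a small sketch. The helper names below are hypothetical and any input value is a placeholder rather than a measured result. The factor of two is the approximation stated above, not an exact conversion.

```rust
/// Packets exchanged per connection in the scenario described above.
const PACKETS_PER_CONNECTION: u64 = 7;

/// Packets on the wire per second for a given connections-per-second rate.
fn packets_per_second(cps: u64) -> u64 {
    cps * PACKETS_PER_CONNECTION
}

/// Rough cps figure to use when comparing against a server-only system:
/// the engine acts as client and server at once, so the measured value is
/// approximately doubled.
fn server_only_equivalent_cps(measured_cps: u64) -> u64 {
    measured_cps * 2
}
```

For a placeholder input of 100,000 cps, this gives 700,000 packets per second on the wire and a comparison figure of roughly 200,000 cps.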

The following figure shows the achieved connections per second as a function of the number of cores used for forwarding pipelines. The measurements are based on NetBricks using DPDK 18.11. The upper curve shows the results with generation of connection records switched off. Each point is the average of four runs with 2 million TCP connections per core each. The lower curve shows the results with generation of connection records switched on. In the latter case each run has 200 thousand TCP connections per core.

[Figure: TrafficEngine performance]

Limitations

Currently only a basic TCP state machine without retransmission, flow control, etc., is implemented.
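To make concrete what a basic state machine without retransmission means, here is a minimal sketch for the client side of the test scenario (active open, passive close). It follows the standard TCP state names but is not TrafficEngine's implementation. The types TcpState and Event and the function next_state are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TcpState {
    Closed,
    SynSent,
    Established,
    CloseWait,
    LastAck,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    SendSyn,
    RecvSynAck,
    RecvFin,
    SendFin,
    RecvAck,
}

/// Return the next state, or None if the event is not expected in the
/// current state. There are no timers, so a lost packet is never resent.
fn next_state(state: TcpState, event: Event) -> Option<TcpState> {
    use Event::*;
    use TcpState::*;
    match (state, event) {
        (Closed, SendSyn) => Some(SynSent),
        (SynSent, RecvSynAck) => Some(Established),
        (Established, RecvFin) => Some(CloseWait),
        (CloseWait, SendFin) => Some(LastAck),
        (LastAck, RecvAck) => Some(Closed),
        _ => None,
    }
}
```

A full TCP stack would add retransmission timers, window handling and the remaining states. In this sketch an unexpected event simply yields None.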