January 19, 2015

Sniffing flows with flowparser

FlowParser is a simple C++ library for sniffing and capturing IP flows. It works by listening on an interface (or reading from a .pcap file) and keeping state for every flow seen. A flow is identified by its 5-tuple – IP source address, IP destination address, IP protocol and transport layer source and destination ports. Currently only TCP, UDP and ICMP are supported, over IPv4.
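To make the 5-tuple idea concrete, here is a minimal sketch of how per-flow state could be keyed on it. This is not FlowParser's actual code: FiveTuple, FiveTupleHash, FlowState, FlowTable and RecordPacket are hypothetical names made up for illustration, and the hash mixing is just one common choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <tuple>
#include <unordered_map>

// Hypothetical flow key (not FlowParser's own type): the IPv4 5-tuple.
struct FiveTuple {
  uint32_t src_ip;    // IPv4 source address
  uint32_t dst_ip;    // IPv4 destination address
  uint8_t protocol;   // IP protocol number (1 = ICMP, 6 = TCP, 17 = UDP)
  uint16_t src_port;  // transport-layer source port
  uint16_t dst_port;  // transport-layer destination port

  bool operator==(const FiveTuple& other) const {
    return std::tie(src_ip, dst_ip, protocol, src_port, dst_port) ==
           std::tie(other.src_ip, other.dst_ip, other.protocol,
                    other.src_port, other.dst_port);
  }
};

// Combines the hashes of the five fields into a single value.
struct FiveTupleHash {
  std::size_t operator()(const FiveTuple& k) const {
    std::size_t h = 0;
    auto mix = [&h](std::size_t v) {
      h ^= v + 0x9e3779b9 + (h << 6) + (h >> 2);
    };
    mix(std::hash<uint32_t>()(k.src_ip));
    mix(std::hash<uint32_t>()(k.dst_ip));
    mix(std::hash<uint32_t>()(k.protocol));
    mix(std::hash<uint32_t>()(k.src_port));
    mix(std::hash<uint32_t>()(k.dst_port));
    return h;
  }
};

// Hypothetical per-flow counters, updated once per packet.
struct FlowState {
  uint64_t packets = 0;
  uint64_t bytes = 0;
};

using FlowTable = std::unordered_map<FiveTuple, FlowState, FiveTupleHash>;

// Looks up (or creates) the state for the packet's flow and updates it.
void RecordPacket(FlowTable* table, const FiveTuple& key,
                  uint64_t payload_bytes) {
  FlowState& state = (*table)[key];
  state.packets += 1;
  state.bytes += payload_bytes;
}
```

Two packets with the same 5-tuple land in the same table entry, while changing any one of the five fields (for example the protocol) starts a new flow.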

Why another sniffing tool? After all, the combination of tcpdump / BPF filters / Wireshark is very powerful and seems to work just fine for most people.

The short answer is that if your requirements can be expressed as a BPF filter, then FlowParser is not for you. If you want to do fancier things with flows, or you want to do online filtering on a live interface, then maybe FlowParser suits your needs.

Requirements and Installation

The only dependencies are libpcap and a relatively recent C++11 compiler. You can read about installation here. For the rest of this post I will assume that FlowParser is already installed.

Example

Here is a very simple example. Imagine you have a big .pcap trace (several GBytes) and you want to figure out which flows last for more than 10 seconds and contain more than 100 MB of application data. Here are some options:

You can write a Python script using dpkt and do flow reconstruction manually.
You can roll your own solution using libpcap directly. While not difficult, writing the code would be tedious and boring.

Or you can give FlowParser a try. Here is a short description of the more important bits. When using FlowParser we have to include the main header:

#include <flowparser/flowparser.h>

This assumes that FlowParser is installed globally and the headers are in the default include path. If you don’t want to install it, you can just include the header from the checked-out repository.

Each FlowParser instance is configured with a FlowParserConfig object:

FlowParserConfig fp_cfg;
fp_cfg.OfflineTrace(filename);

auto queue_ptr = std::make_shared<flowparser::Parser::FlowQueue>();
fp_cfg.FlowQueue(queue_ptr);

FlowParser fp(fp_cfg);

The first part sets the .pcap file to read from. FlowParser wraps libpcap and will open/close the file for us. The second part is more interesting.

As the file is read (or packets are received from a live interface), state is kept and updated for each flow. At the end of the trace those in-memory flows are evicted. If a FlowQueue is set in the FlowParserConfig instance, the evicted flows will be handed to it. A FlowQueue is a queue that transfers flows as unique pointers. The client (the application that we are writing) can read those flows from the queue, taking ownership.

Now that we have the FlowParser object, we have to tell the application what to do with the flows that will get handed to it at the end of the trace:
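If the hand-off sounds abstract, the sketch below shows one way a queue like this could work. It is not FlowParser's implementation: BlockingPtrQueue and Produce are hypothetical names, and using a nullptr item as the end-of-stream marker is an assumption, chosen because the consumer in this post stops when it reads nullptr. Only the name ConsumeOrBlock is borrowed from the real API.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical blocking queue that moves items around as unique pointers.
template <typename T>
class BlockingPtrQueue {
 public:
  // Hands ownership of `item` to the queue and wakes one waiting consumer.
  // Assumption: a nullptr item signals that no more items will follow.
  void Produce(std::unique_ptr<T> item) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      items_.push(std::move(item));
    }
    cv_.notify_one();
  }

  // Blocks until an item is available, then returns it. Ownership moves to
  // the caller, so the item is freed when the returned pointer goes away.
  std::unique_ptr<T> ConsumeOrBlock() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !items_.empty(); });
    std::unique_ptr<T> item = std::move(items_.front());
    items_.pop();
    return item;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::unique_ptr<T>> items_;
};
```

The producer (the parser, in our case) would call Produce once per evicted flow and once more with nullptr when the trace ends, while the consumer loops on ConsumeOrBlock until it gets nullptr back.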

std::thread th([&queue_ptr] {
  while (true) {
    std::unique_ptr<Flow> flow_ptr = queue_ptr->ConsumeOrBlock();
    if (!flow_ptr) {
      break;
    }

    FlowInfo info = flow_ptr->GetInfo();
    uint64_t duration = info.last_rx - info.first_rx;
    if (duration < kTenSeconds || info.total_payload_seen < k100MB) {
      continue;
    }

    std::cout << flow_ptr->key().ToString() << "\n";
  }
});

This is a thread that will read from the queue until nullptr is returned. Every flow that is shorter than 10 seconds or that has less than 100 MB of application payload will be skipped. Note that since the flows are held by unique pointers, skipped flows will get freed automatically. For flows which satisfy our requirements, we print the key. The key contains the 5-tuple.
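The filtering condition can also be pulled out into a small predicate, which makes it easy to test without a trace. The sketch below uses a hypothetical stand-in struct, FlowInfoSketch, with the same three fields the loop reads; IsLongHeavyFlow and the two constants are made-up names too. It assumes, as the full listing does, that timestamps are in microseconds. It also adds a guard against last_rx being smaller than first_rx, which the loop above does not have.

```cpp
#include <cstdint>

// Hypothetical stand-in for the fields of FlowInfo used in this post.
struct FlowInfoSketch {
  uint64_t first_rx;            // time of the first packet, in microseconds
  uint64_t last_rx;             // time of the last packet, in microseconds
  uint64_t total_payload_seen;  // application payload, in bytes
};

const uint64_t kMinDurationMicros = 10ULL * 1000 * 1000;  // 10 seconds
const uint64_t kMinPayloadBytes = 100ULL * 1000 * 1000;   // 100 MB

// Returns true for flows that the loop above would print, i.e. flows that
// are neither shorter than 10 seconds nor lighter than 100 MB.
bool IsLongHeavyFlow(const FlowInfoSketch& info) {
  if (info.last_rx < info.first_rx) {
    return false;  // Avoids unsigned wrap-around on inconsistent timestamps.
  }
  uint64_t duration = info.last_rx - info.first_rx;
  return duration >= kMinDurationMicros &&
         info.total_payload_seen >= kMinPayloadBytes;
}
```

Like the loop above, this keeps flows that sit exactly on either threshold, since only strictly smaller values are skipped.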

Here is a complete self-contained piece of code which does what we need:

#include <cstdint>
#include <iostream>
#include <memory>
#include <string>
#include <thread>
#include <flowparser/flowparser.h>

using flowparser::Flow;
using flowparser::FlowParserConfig;
using flowparser::FlowParser;
using flowparser::FlowInfo;

static const uint64_t kTenSeconds = 10000000; // microseconds
static const uint64_t k100MB = 100000000; // bytes

int main(int argc, char *argv[]) {
  if (argc != 2) {
    std::cout << "Supply exactly one argument.\n";
    return -1;
  }

  std::string filename(argv[1]);

  FlowParserConfig fp_cfg;
  fp_cfg.OfflineTrace(filename);

  auto queue_ptr = std::make_shared<flowparser::Parser::FlowQueue>();
  fp_cfg.FlowQueue(queue_ptr);

  FlowParser fp(fp_cfg);

  std::thread th([&queue_ptr] {
    while (true) {
      std::unique_ptr<Flow> flow_ptr = queue_ptr->ConsumeOrBlock();
      if (!flow_ptr) {
        break;
      }

      FlowInfo info = flow_ptr->GetInfo();
      uint64_t duration = info.last_rx - info.first_rx;
      if (duration < kTenSeconds || info.total_payload_seen < k100MB) {
        continue;
      }

      std::cout << flow_ptr->key().ToString() << "\n";
    }
  });

  fp.RunTrace();
  th.join();
  return 0;
}

You can copy-paste the above into a file named example_one.cc and compile it with:

g++ -g -std=c++11 -Wall -Wextra -O2 -c -o example_one.o example_one.cc
g++ example_one.o -o example_one -g -lflowparser -lpcap

Even though this example reads flows from a file, it is also possible to do the same on a live interface. More on this later.

Posted by Nikola Gvozdiev
