3. Requirements Analysis

There is a lot of software available for implementing the various parts of the NetFlow architecture. Most products address analysis tasks, but there is still a wealth of choice for collectors. There are somewhat fewer probes, since most probes come with a router's operating system (e.g. Cisco IOS), although some vendors limit this feature to their higher-end devices. Our problem is to work out which software is best for us, and to do this we need some requirements analysis: first decide which features are important, then determine which software provides them.

Table 4. NetFlow Probe Selection

Software | v.9? | Linux? | Debian? | Maintained? | Comm. Opinion | Sampling? | Source | Cost

Considering that we really want version 9 support for IPv6, we can choose from softflowd, nProbe or pmacct. pmacct is developmental, so it may be harder to support and the community's experience with it may be less settled. nProbe seems very good and is aimed particularly at performance, but the cost is quite high; although nProbe is available free to universities and for research, I’d rather not have to deal with licensing issues. That leaves us with softflowd, which is widely available and for which I found at least one comprehensive article about building an integrated NetFlow system around it.

In retrospect, softflowd does a good job, although I do wish it would export the incoming and outgoing interfaces. Unfortunately, libpcap-based probes can’t really get that information; the probe would need to be more closely tied to the routing subsystem.

softflowd had NetFlow version 9 support added two releases ago, so it should be fairly stable. It is also available from the package repository, so keeping up to date should be less of an issue, although the software seems to be in a stable-but-slowly-updated phase of its life, due to developer priorities. softflowd also has one extra feature that may be rather nice: the ability to read in a pcap file (as produced by tcpdump) and emit NetFlow records based on it, which is useful for testing and evaluation.
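The pcap replay feature lends itself to a repeatable test run. The following is a minimal sketch in Python that only assembles such an invocation; it assumes softflowd's -r (read from a pcap file), -n (collector host:port) and -v (export version) options as described in its manual page. The function name, file name, address and port are placeholders chosen for illustration, so check the manual page of your installed version before relying on it.

```python
import shlex


def softflowd_replay_command(pcap_path, collector_host, collector_port, version=9):
    """Build the argument list for replaying a pcap file through softflowd.

    Assumed softflowd options (verify against your local manual page):
      -r FILE       read packets from a pcap file instead of an interface
      -n HOST:PORT  send the resulting NetFlow records to this collector
      -v VERSION    NetFlow export version
    """
    return [
        "softflowd",
        "-r", str(pcap_path),
        "-n", f"{collector_host}:{collector_port}",
        "-v", str(version),
    ]


if __name__ == "__main__":
    # Placeholder values: a capture made with tcpdump and a collector
    # listening on the local machine.
    cmd = softflowd_replay_command("capture.pcap", "127.0.0.1", 9995)
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps file names with spaces intact if the list is later handed to subprocess.run.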

Okay, so we have an apparent winner for the ‘probe’ component of our NetFlow architecture. We still require a ‘collector’ and some analysis or reporting tools, for which there are numerous commercial and open-source products.

Using NetFlow version 9 as a must-have requirement, we can immediately discount much of the field, including tools that are otherwise commonly used. One fairly serious contender, which is free, is nfdump. It is written by people who actively work on and use the product in projects such as Internet2, which feature technologies such as IPv6 and other assorted goodies, so it is reasonable to think that it should be capable of satisfying our needs (but note that we don’t yet know how hard it might be to use). A look at resources such as mailing list archives, on-line bug reports and release history confirms that it is actively maintained and developed, with both a stable and a developmental branch and a friendly and responsive developer. nfdump can optionally be integrated with a larger product called NfSen, which is a web reporting frontend.

So now we have some components that should be suitable for the ‘probe’ and ‘collector’… but what about the reporting and analysis tools? To answer that question, we need to do some further requirements analysis as to what we want to do with the data.

Some likely activities include network accounting / monitoring / diagnosis. Using NetFlow we can ask questions such as:

As we only have limited time available, we shall concentrate on the easier area of accounting, although there is plenty of scope to expand. It is also the area most likely to be of interest in smaller networks. Note that many areas commonly require a mix of pre-determined and ad-hoc reporting facilities:

A suite of programs called flow-tools is a common free solution for these tasks, generally coupled with other tools to do things like graphing. FlowScan is another common free tool that can do some useful reporting, such as producing a graph showing the composition of traffic (web traffic, FTP, etc.) over time. Unfortunately, neither of these tools yet supports NetFlow version 9, so we shan’t cover them.

Fortunately, nfdump has some rather useful querying facilities, including a filter mechanism similar to that of tcpdump, which supports many of the ad-hoc queries we may wish to make, including Top-N style questions. Therefore, we shall use nfdump for some of our reporting and querying examples as well as for our ‘collector’ component.
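To make the idea of a filtered Top-N question concrete, here is a minimal sketch in Python of what such a query computes. It is not how nfdump is implemented: the flow record layout (dictionaries with src, dst, dport and bytes keys), the function name and the sample addresses are all assumptions made for illustration, and the predicate argument merely stands in for a filter expression.

```python
from collections import Counter


def top_n_talkers(flows, n=10, predicate=None):
    """Return the n source addresses that sent the most bytes.

    flows     -- iterable of dicts with 'src', 'dst', 'dport' and 'bytes'
                 keys (a hypothetical record layout, not nfdump's format)
    predicate -- optional function taking a flow and returning True if the
                 flow should be counted; plays the role of a filter
    """
    totals = Counter()
    for flow in flows:
        if predicate is None or predicate(flow):
            totals[flow["src"]] += flow["bytes"]
    # most_common() sorts by descending byte count
    return totals.most_common(n)


# Invented sample records using documentation address ranges.
SAMPLE_FLOWS = [
    {"src": "192.0.2.10", "dst": "198.51.100.5", "dport": 80, "bytes": 1500},
    {"src": "192.0.2.11", "dst": "198.51.100.5", "dport": 80, "bytes": 4000},
    {"src": "192.0.2.10", "dst": "198.51.100.9", "dport": 22, "bytes": 9000},
    {"src": "192.0.2.10", "dst": "198.51.100.5", "dport": 80, "bytes": 500},
]

if __name__ == "__main__":
    # Who sent the most traffic overall?
    print(top_n_talkers(SAMPLE_FLOWS, n=2))
    # Who sent the most traffic to destination port 80?
    print(top_n_talkers(SAMPLE_FLOWS, n=2, predicate=lambda f: f["dport"] == 80))
```

The two calls show why the filter matters: the same host can top the overall list yet rank lower once the question is narrowed to a single service.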

This leaves us with at least one part of our reporting system for which there are surprisingly few tools available: the accounting system. This is because such systems are commonly complex and customised, and vary greatly according to a) how the accounting data is collected, b) the policies by which accounting is done, and c) integration with billing and payment systems. Note that there can be a reactive quality to b and c, such as what may happen when you introduce a quota policy. As such, accounting systems are often developed in-house; we shall see how we could use nfdump to collect the basic data, but processing that data and turning it into a report is an exercise left to the reader.
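As a starting point for that exercise, the following is a minimal sketch in Python of the simplest kind of accounting: totalling bytes sent and received per local host. It assumes the flow data has already been reduced to (source, destination, bytes) tuples by some earlier step; that tuple layout, the function name and the example network are assumptions for illustration, and a real system would add the policy and billing logic described above.

```python
import ipaddress
from collections import defaultdict


def account_by_local_host(flows, local_network):
    """Total the bytes sent and received by each host on the local network.

    flows         -- iterable of (src, dst, nbytes) tuples (hypothetical
                     layout, produced by whatever extracts the flow data)
    local_network -- network in CIDR notation, e.g. "192.0.2.0/24"
    """
    net = ipaddress.ip_network(local_network)
    usage = defaultdict(lambda: {"sent": 0, "received": 0})
    for src, dst, nbytes in flows:
        # A flow between two local hosts is counted once for each of them.
        if ipaddress.ip_address(src) in net:
            usage[src]["sent"] += nbytes
        if ipaddress.ip_address(dst) in net:
            usage[dst]["received"] += nbytes
    return dict(usage)


if __name__ == "__main__":
    # Invented sample data using documentation address ranges.
    flows = [
        ("192.0.2.10", "198.51.100.5", 1000),
        ("198.51.100.5", "192.0.2.10", 5000),
        ("192.0.2.11", "192.0.2.10", 200),
    ]
    for host, totals in sorted(account_by_local_host(flows, "192.0.2.0/24").items()):
        print(host, totals["sent"], totals["received"])
```

Hosts outside the local network never appear in the result, which is usually what a per-user or per-department report wants; the same function accepts an IPv6 prefix if the flow addresses are IPv6.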