There is a lot of software available for implementing the various parts of the NetFlow architecture. Most products target analysis tasks, but there is still a wealth of choices for collectors, and somewhat fewer for probes, since most probes come with the router’s operating system (e.g. Cisco IOS), although some vendors limit this feature to their higher-end devices. Our problem is to figure out which software is best for us. To answer that, we need to do some requirements analysis: first work out which features are important, then which software provides them.
Table 4. NetFlow Probe Selection
Considering that we really want version 9 support for IPv6, we can choose from softflowd, nProbe or pmacct. pmacct is still developmental, so it may be harder to support and its community may have less experience with stable deployments. nProbe seems very good and is particularly aimed at performance, but the cost is quite high; although nProbe is available free for universities and research, I’d rather not have to deal with licensing issues. That leaves us with softflowd, which is widely available and for which I found at least one comprehensive article about building an integrated NetFlow system around it.
In retrospect, softflowd does do a good job, although I do wish it would export the incoming and outgoing interfaces. Unfortunately, libpcap-based probes can’t really get that information; the probe would need to be more closely tied to the routing subsystem.
softflowd had NetFlow version 9 support added two releases ago, so it should be fairly stable. It is also available via package repositories, so keeping up to date should be less of an issue, although the software seems to be in a stable-but-slowly-updated phase of its life, due to developer priorities. softflowd also has one extra feature that may be rather nice: the ability to read in a pcap file (as produced by tcpdump) and emit NetFlow records based on it, which is useful for testing and evaluation.
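To make the pcap replay idea concrete, here is a minimal sketch that only assembles a softflowd command line and prints it. The `-r` (read a pcap file), `-n` (collector host:port) and `-v` (NetFlow version) options are taken from the softflowd manual page as I understand it, so check them against your installed version. The file name `capture.pcap`, the collector address `127.0.0.1:9995` and the helper name `softflowd_replay` are assumptions for illustration only.

```python
from __future__ import annotations

import shlex


def softflowd_replay(pcap_path: str,
                     collector: str = "127.0.0.1:9995",
                     version: int = 9) -> list[str]:
    """Build (but do not run) a softflowd command that replays a pcap file.

    The collector address and default version are illustrative assumptions.
    """
    return [
        "softflowd",
        "-r", pcap_path,      # read packets from a capture file, not an interface
        "-n", collector,      # where to send the exported NetFlow datagrams
        "-v", str(version),   # NetFlow export version
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand.
    print(shlex.join(softflowd_replay("capture.pcap")))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without softflowd installed; pass the list to `subprocess.run` once you are happy with it.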
Okay, so we have an apparent winner for the ‘probe’ component of our NetFlow architecture. We still require a ‘collector’ and some analysis or reporting tools, for which there are numerous commercial and open-source products.
Using NetFlow version 9 as a must-have requirement, we can immediately discount much of the field, including many tools that are otherwise commonly used. One fairly serious contender, which is free, is nfdump. This is a tool written by people actively working on and using the product in projects such as Internet2, which feature technologies such as IPv6 and other assorted goodies, so it is reasonable to think that it should be capable of satisfying our needs. Note, however, that we don’t yet know how hard it might be to use. A look at resources such as mailing list archives, on-line bug reports and release history confirms that it is actively maintained and developed, with a stable and a developmental branch and a friendly and responsive developer. nfdump can be integrated with a larger product called NfSen, which is a web reporting frontend, though it is not required.
So now we have some components that should be suitable for the ‘probe’ and ‘collector’… but what about the reporting and analysis tools? To answer that question, we need to do some further requirements analysis as to what we want to do with the data.
Some likely activities include network accounting / monitoring / diagnosis. Using NetFlow we can ask questions such as:
Who (is/has been) using the bandwidth the most?
This could be used as a data source for targeted education of, say, users who are using resources inappropriately.
How much data did user/computer/group X send/receive internationally/domestically/locally?
This could be used for billing or, if such reports are generated frequently, to alert someone about potential budget blowouts.
How much of our bandwidth use is comprised of application X?
Perhaps I could use this data to make an informed decision about deploying caching to move data off my expensive WAN links.
Has our change to X reduced/impacted network load as expected? Are there any unexpected consequences?
This is generally, like many of these, a graphing/trending application, but may also be facilitated by a weather-map.
Are there any unusual spikes that might indicate security incidents such as Denial of Service or internet worms?
This is a common use of NetFlow accounting data in a security setting.
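The first of the questions above, who is using the most bandwidth, boils down to summing bytes per source address and sorting. The sketch below shows that aggregation on a handful of made-up records. The `Flow` record is a hypothetical, cut-down stand-in for whatever your collector exports, not a real NetFlow version 9 template, and the addresses are documentation examples.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record (not a real NetFlow v9 template)."""
    src: str
    dst: str
    dst_port: int
    octets: int


def top_talkers(flows: Iterable[Flow], n: int = 3) -> list[tuple[str, int]]:
    """Return the n source addresses that sent the most bytes, largest first."""
    totals: Counter = Counter()
    for flow in flows:
        totals[flow.src] += flow.octets
    return totals.most_common(n)


# Made-up sample data for illustration only.
SAMPLE = [
    Flow("10.0.0.5", "192.0.2.10", 80, 1_200_000),
    Flow("10.0.0.7", "192.0.2.10", 443, 300_000),
    Flow("10.0.0.5", "198.51.100.4", 25, 50_000),
    Flow("10.0.0.9", "192.0.2.10", 80, 900_000),
]

if __name__ == "__main__":
    for address, octets in top_talkers(SAMPLE, n=2):
        print(f"{address:15} {octets:>10}")
```

The same pattern answers most Top-N questions: change the key from the source address to the destination port, or to an address pair, and the rest stays the same.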
As we only have limited time available, we shall concentrate on the easier area of accounting, although there is plenty of scope to expand. It is also the area most likely to be needed in smaller networks. Note that many areas commonly require a mix of pre-determined and ad-hoc reporting facilities:
“Hello, I’d like to know why my network bill is so high…”
The network bill would be a pre-determined report, but the ability to drill down (i.e. look at increasing levels of detail) will very likely need to be ad-hoc.
“Okay, so it seems host X was broken into by an exploit on port Y… what other machines may have been affected this way?”
The detection of an exploit may not have come from a NetFlow based report, but investigating traffic to port Y would generally require some degree of ad-hoc reporting, which could be done by a pre-defined parameterised report.
“Hmmm, can we recognise new applications on our network that we don’t currently track?”
It is common to try to track the amount of data used by protocols such as HTTP, SMTP, FTP (in its various modes) and others, but there will be some Other category, which you might like to take a closer look at occasionally.
“Argh! What is causing these spikes in traffic?”
In this situation, NetFlow itself may not be enough, because NetFlow only gives you flow-level statistics; it doesn’t give you the packet data itself. That said, NetFlow can give you enough visibility on the network to begin making intelligent hypotheses as to what might be happening, and then you can introduce more ad-hoc monitoring in the parts of the network that seem best placed to answer the question. Finally, note that many of these LAN-activity questions will be best answered using NetFlow data from switches, which are likely to produce a very large amount of data.
“Why are my Voice-over-IP sessions so slow sometimes?”
You might use NetFlow to determine what else was happening in the network when the problem manifests itself. Perhaps there is a lot of traffic from machines automatically downloading patches or submitting backups, perhaps there are people viewing YouTube videos at the time, or perhaps someone is sending photos via e-mail which is choking upstream bandwidth. These are all questions that NetFlow can help you answer quickly.
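The question about recognising new applications can be illustrated with a small port-based classifier that keeps an Other bucket and then lists which ports make up that bucket. This is only a sketch: the port map is a deliberately short, illustrative assumption, the flow tuples are made up, and port numbers alone will misclassify things such as passive-mode FTP data connections. Treating the lower-numbered port as the service port is a heuristic, not a rule.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Tuple

# Illustrative port-to-application map (an assumption); extend it for your network.
WELL_KNOWN_PORTS = {20: "FTP", 21: "FTP", 25: "SMTP", 80: "HTTP", 443: "HTTPS"}

# A flow here is (source port, destination port, octets): a hypothetical,
# cut-down record rather than a real NetFlow v9 template.
PortFlow = Tuple[int, int, int]


def classify(src_port: int, dst_port: int) -> str:
    """Name the application by whichever port is well known, else 'Other'."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "Other"


def composition(flows: Iterable[PortFlow]) -> Counter:
    """Total octets per application category."""
    totals: Counter = Counter()
    for src_port, dst_port, octets in flows:
        totals[classify(src_port, dst_port)] += octets
    return totals


def unclassified_ports(flows: Iterable[PortFlow]) -> Counter:
    """Total octets per port for 'Other' flows.

    Heuristic: the lower-numbered port is assumed to be the service port.
    """
    totals: Counter = Counter()
    for src_port, dst_port, octets in flows:
        if classify(src_port, dst_port) == "Other":
            totals[min(src_port, dst_port)] += octets
    return totals


# Made-up sample data for illustration only.
SAMPLE = [
    (51000, 80, 5_000),
    (80, 51000, 20_000),
    (40000, 25, 1_000),
    (45000, 6881, 7_000),
    (6881, 45001, 3_000),
]

if __name__ == "__main__":
    print(composition(SAMPLE).most_common())
    print(unclassified_ports(SAMPLE).most_common(5))
```

Looking at the top few entries of `unclassified_ports` from time to time is one way to decide which applications deserve their own category.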
A suite of programs called flow-tools is a common free solution for these tasks, generally coupled with other tools to do things like graphing. FlowScan is another common free tool that can do some useful reporting, such as producing a graph showing the composition of traffic (web traffic, FTP, etc.) over time. Unfortunately, neither of these tools supports NetFlow version 9 yet, so we shan’t cover them.
Fortunately, nfdump has some rather useful querying facilities, including a filter mechanism similar to tcpdump, that supports many of the ad-hoc queries we may wish to make, including Top-N style questions. Therefore, we shall use nfdump for some of our reporting and querying examples as well as our ‘collector’ component.
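As a taste of what such a query looks like, the sketch below assembles an nfdump Top-N command line and prints it without running anything. It assumes the `-R` (read a directory of capture files), `-s` (statistic and ordering) and `-n` (number of entries) options and the tcpdump-like filter syntax behave as described in the nfdump manual page, so verify them against your installed version. The directory `/var/cache/nfdump` and the helper name `nfdump_top_n` are hypothetical.

```python
from __future__ import annotations

import shlex


def nfdump_top_n(data_dir: str,
                 stat: str = "srcip",
                 order_by: str = "bytes",
                 n: int = 10,
                 flow_filter: str = "") -> list[str]:
    """Build (but do not run) an nfdump Top-N query.

    data_dir is wherever your collector writes its files (an assumption here).
    """
    argv = [
        "nfdump",
        "-R", data_dir,                # read all capture files in this directory
        "-s", f"{stat}/{order_by}",    # statistic to compute and how to order it
        "-n", str(n),                  # how many entries to list
    ]
    if flow_filter:
        argv.append(flow_filter)       # tcpdump-like filter, passed as one argument
    return argv


if __name__ == "__main__":
    command = nfdump_top_n("/var/cache/nfdump",
                           flow_filter="proto tcp and dst port 80")
    print(shlex.join(command))
```

Keeping the filter as a single list element avoids shell quoting problems if the list is later handed to `subprocess.run`.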
This leaves us with at least one part of our reporting system for which there are surprisingly few available tools: the accounting system. This is because such systems are commonly complex and customised, and vary greatly according to a) how the accounting data is collected, b) the policies by which accounting is done, and c) integration with billing and payment systems. Note that there can be a reactive quality to b and c, such as what may happen when you introduce a quota policy. As such, accounting systems are often developed in-house; we shall see how we could use nfdump to collect the basic data, but processing the data and making that into a report is an exercise left to the reader.
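As a starting point for that exercise, here is a minimal sketch of the core of such an accounting step: splitting each local host’s traffic into local, domestic and international, sent and received. Everything specific in it is an assumption: the flow tuples are made up, `LOCAL_NETS` stands in for your own address space, and `DOMESTIC_NETS` is a placeholder for a prefix list you would have to obtain from your provider or a routing registry.

```python
from __future__ import annotations

import ipaddress
from collections import Counter
from typing import Iterable, Tuple

# Placeholder prefix lists (assumptions); replace with your own networks.
LOCAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]
DOMESTIC_NETS = [ipaddress.ip_network("203.0.113.0/24")]

# A flow here is (source address, destination address, octets).
AddrFlow = Tuple[str, str, int]


def locality(address: str) -> str:
    """Classify an address as 'local', 'domestic' or 'international'."""
    ip = ipaddress.ip_address(address)
    if any(ip in net for net in LOCAL_NETS):
        return "local"
    if any(ip in net for net in DOMESTIC_NETS):
        return "domestic"
    return "international"


def account(flows: Iterable[AddrFlow]) -> Counter:
    """Sum octets per (local host, direction, locality of the other end)."""
    usage: Counter = Counter()
    for src, dst, octets in flows:
        if locality(src) == "local":
            usage[(src, "sent", locality(dst))] += octets
        if locality(dst) == "local":
            usage[(dst, "received", locality(src))] += octets
    return usage


# Made-up sample data for illustration only.
SAMPLE = [
    ("10.0.0.5", "10.0.0.9", 100),
    ("10.0.0.5", "203.0.113.8", 2_000),
    ("10.0.0.5", "198.51.100.4", 30_000),
    ("10.0.0.7", "198.51.100.4", 500),
    ("198.51.100.4", "10.0.0.5", 999),
]

if __name__ == "__main__":
    for (host, direction, where), octets in sorted(account(SAMPLE).items()):
        print(f"{host:12} {direction:9} {where:14} {octets:>8}")
```

Pricing, quotas and report layout would sit on top of these totals, and those are exactly the parts that depend on local policy.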