Netflow data in SQL DB
From GEANT2-JRA1 Wiki
Depending on the netflow collector tool, netflow data is typically stored in a plain text format. There are number of tools which can help user to analise netflow data afterwards, like flowscan or Stager. Those tools use different storing formats, like RRD or SQL database, but they use netflow data files to access row data.
However, parsing big amount of data stored in a flat files is not always the best solution. Is there a point of storing row netflow data in a SQL database?
CARNet flow center
CARNet netflow-based monitoring system is based on a flow-tools collector tool. Due to the certain limitations of flowscan front-end application, we decided to implement different visualization and netflow data analysis system.
Basic idea is simple: every 5 minutes some reports are generated, and data is stored in a SQL database (in this case mySQL). PHP-based frontend web application is then created in order to present data. Therefore, not ALL netflow data is stored in a database, but aggregated data instead. Some granularity is obviously lost, but for the purpose of our needs it was not crucial.
The idea is implemented with name CARNet flow center. At the moment, this system consists of:
- 4 netflow collector boxes
- 40 netflow enabled routers and L3 switches. Full netflow export is used for all of the boxes that are supporting full netflow, but for some platforms however only sampled netflow is possible. At the moment, 5 boxes are using sampling, and the rest of the boxes full netflow export.
Following flow-tools reports are generated every 5 minutes for a period of 5 minutes, using flow-stat utility, for octets, flows and packets:
- Overall summary
- Average packet size distribution.
- IP protocol
- TCP/UDP port
- top talkers
Not all flows are stored in database. Following criteria is used:
- only first 10 top talkers/ports/protocols are stored, or
- stored are only those data which exceed 3% of overall statistics (for example, all protocols which exceed threshold of 3% of overall octets, flows or packets are stored when 'TCP/UDP port' report is in use)
Stager
Stager is another example of how aggregated netflow data can be stored in a database. Please check details on Stager.
