Python socket and struct

The script, netFlow_v5_parser.py, was modified from Brian Rak's blog post at http://blog.devicenull.org/2013/09/04/python-netflow-v5-parser.html; the changes were made mostly for Python 3 compatibility and to parse additional NetFlow version 5 fields. We chose v5 instead of v9 because v9 is more complex: it introduces templates, which would make for a difficult-to-grasp introduction to NetFlow. Since NetFlow version 9 is an extended format of the original NetFlow version 5, all the concepts we introduce in this section are applicable to it.

Because NetFlow packets are represented in bytes over the wire, we will use the struct module included in the standard library to convert bytes into native Python data types.

We will start by using the socket module to bind to a port and listen for UDP datagrams. With socket.AF_INET, we indicate that we will be listening on an IPv4 address; with socket.SOCK_DGRAM, we specify that we expect UDP datagrams:

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('0.0.0.0', 9995))

We will start a loop and retrieve information off the wire:

while True:
    buf, addr = sock.recvfrom(1500)
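To try the bind-and-receive pattern without a real NetFlow exporter, the following minimal sketch sends a single datagram to itself over the loopback interface. The loopback address, the OS-assigned port (port 0), the two-second timeout, and the four-byte payload are all assumptions made for illustration; they are not part of netFlow_v5_parser.py.

```python
import socket

# Receiver: same socket type as the collector, but bound to loopback and to
# port 0 so that the OS picks a free port (an assumption for this demo).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))
receiver.settimeout(2.0)  # do not block forever if nothing arrives
port = receiver.getsockname()[1]

# Sender: stands in for the exporter; the payload is a made-up 4-byte stub.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'\x00\x05\x00\x01', ('127.0.0.1', port))

# recvfrom returns the payload and the (address, port) pair of the sender.
buf, addr = receiver.recvfrom(1500)
print(len(buf), 'bytes from', addr[0])

sender.close()
receiver.close()
```

The 1500-byte buffer matches the script above and corresponds to a typical Ethernet MTU, so a single NetFlow v5 datagram should fit in one read.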

The following line is where we begin to deconstruct, or unpack, the packet. In the first argument, !HH, the exclamation mark specifies network (big-endian) byte order, and each H specifies the C type to read (H = 2-byte unsigned short integer):

(version, count) = struct.unpack('!HH',buf[0:4])
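As a small illustration of why the byte-order character matters, the sketch below unpacks four hand-built bytes twice: once in network order and once in little-endian order. The sample bytes are an assumption chosen to encode version 5 and a flow count of 9.

```python
import struct

# Hypothetical first four bytes of a NetFlow v5 datagram:
# version = 5 and count = 9, each a 16-bit unsigned integer, big-endian.
sample = bytes([0x00, 0x05, 0x00, 0x09])

# '!' selects network (big-endian) byte order.
version, count = struct.unpack('!HH', sample)

# '<' selects little-endian; the same bytes now decode to different numbers.
le_version, le_count = struct.unpack('<HH', sample)

print(version, count)        # 5 9
print(le_version, le_count)  # 1280 2304
```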

If you do not remember the NetFlow version 5 header off the top of your head (that was a joke by the way), you can always refer to http://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html#wp1006108 for a quick refresher. The rest of the header can be parsed accordingly, depending on the byte location and data type:

(sys_uptime, unix_secs, unix_nsecs, flow_sequence) = struct.unpack('!IIII', buf[4:20])
(engine_type, engine_id, sampling_interval) = struct.unpack('!BBH', buf[20:24])
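The three unpack calls above can also be combined into a single format string that covers the whole 24-byte header. The following is a sketch of that alternative, not the way the original script does it; the names HEADER, Header, and parse_header are hypothetical, and the sample values are borrowed from the output shown later in this section.

```python
import struct
from collections import namedtuple

# One format for the whole header: 2 shorts, 4 ints, 2 bytes, 1 short = 24 bytes.
HEADER = struct.Struct('!HHIIIIBBH')

# Hypothetical container; field names follow the variables used in the text.
Header = namedtuple(
    'Header',
    'version count sys_uptime unix_secs unix_nsecs flow_sequence '
    'engine_type engine_id sampling_interval')

def parse_header(buf):
    """Unpack the first 24 bytes of a datagram into a Header tuple."""
    if len(buf) < HEADER.size:
        raise ValueError('datagram shorter than a NetFlow v5 header')
    return Header._make(HEADER.unpack_from(buf))

# Build a sample header with pack() so that the example is self-contained.
sample = HEADER.pack(5, 9, 290826756, 1489636168, 401224368, 77616, 0, 0, 0)
header = parse_header(sample)
print(header.version, header.count, header.flow_sequence)  # 5 9 77616
```

Compiling the format once with struct.Struct avoids re-parsing the format string for every datagram, and the length check guards against truncated packets.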

The for loop that follows fills the nfdata dictionary with the flow records, unpacking the source address and port, destination address and port, packet count, and byte count of each record, and prints the information to the screen:

for i in range(0, count):
    try:
        base = SIZE_OF_HEADER+(i*SIZE_OF_RECORD)
        data = struct.unpack('!IIIIHH',buf[base+16:base+36])
        input_int, output_int = struct.unpack('!HH', buf[base+12:base+16])
        nfdata[i] = {}
        nfdata[i]['saddr'] = inet_ntoa(buf[base+0:base+4])
        nfdata[i]['daddr'] = inet_ntoa(buf[base+4:base+8])
        nfdata[i]['pcount'] = data[0]
        nfdata[i]['bcount'] = data[1]
        ...
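For readers who want to experiment with the record layout in isolation, here is a self-contained sketch of the same idea that decodes each 48-byte record with one format string. It is not the elided remainder of the script: the names RECORD and parse_records, the sport and dport keys, and the bounds check are assumptions added for illustration, and the trailing 12 bytes of each record are simply skipped.

```python
import struct
from socket import inet_aton, inet_ntoa

SIZE_OF_HEADER = 24
SIZE_OF_RECORD = 48

# srcaddr, dstaddr, nexthop (4 bytes each), input, output (H),
# packets, bytes, first, last (I), srcport, dstport (H), then 12 skipped bytes.
RECORD = struct.Struct('!4s4s4sHHIIIIHH12x')

def parse_records(buf, count):
    """Return a dict of flow records keyed by their position in the datagram."""
    nfdata = {}
    for i in range(count):
        base = SIZE_OF_HEADER + i * SIZE_OF_RECORD
        if base + SIZE_OF_RECORD > len(buf):
            break  # datagram is shorter than the header's count claims
        (saddr, daddr, _nexthop, input_int, output_int,
         pcount, bcount, _first, _last, sport, dport) = RECORD.unpack_from(buf, base)
        nfdata[i] = {
            'saddr': inet_ntoa(saddr),
            'daddr': inet_ntoa(daddr),
            'pcount': pcount,
            'bcount': bcount,
            'sport': sport,   # hypothetical key name
            'dport': dport,   # hypothetical key name
        }
    return nfdata

# Sample datagram: a zeroed 24-byte header followed by one made-up record.
record = RECORD.pack(inet_aton('10.0.0.9'), inet_aton('10.0.0.5'),
                     inet_aton('0.0.0.0'), 1, 2, 6, 487, 0, 0, 52912, 8000)
nfdata = parse_records(bytes(SIZE_OF_HEADER) + record, 1)
flow = nfdata[0]
print('%s:%d -> %s:%d %d packets %d bytes' % (
    flow['saddr'], flow['sport'], flow['daddr'], flow['dport'],
    flow['pcount'], flow['bcount']))
```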

The output of the script allows you to visualize the header as well as the flow content at a glance:

Headers:
NetFlow Version: 5
Flow Count: 9
System Uptime: 290826756
Epoch Time in seconds: 1489636168
Epoch Time in nanoseconds: 401224368
Sequence counter of total flow: 77616
0 192.168.0.1:26828 -> 192.168.0.5:179 1 packts 40 bytes
1 10.0.0.9:52912 -> 10.0.0.5:8000 6 packts 487 bytes
2 10.0.0.9:52912 -> 10.0.0.5:8000 6 packts 487 bytes
3 10.0.0.5:8000 -> 10.0.0.9:52912 5 packts 973 bytes
4 10.0.0.5:8000 -> 10.0.0.9:52912 5 packts 973 bytes
5 10.0.0.9:52913 -> 10.0.0.5:8000 6 packts 487 bytes
6 10.0.0.9:52913 -> 10.0.0.5:8000 6 packts 487 bytes
7 10.0.0.5:8000 -> 10.0.0.9:52913 5 packts 973 bytes
8 10.0.0.5:8000 -> 10.0.0.9:52913 5 packts 973 bytes

Note that in NetFlow version 5, the size of the record is fixed at 48 bytes; therefore, the loop and script are relatively straightforward. However, in the case of NetFlow version 9 or IPFIX, after the header, there is a template FlowSet (http://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html) that specifies the field count, field type, and field length. This allows the collector to parse the data without knowing the data format in advance.
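To make the template idea concrete, the following simplified sketch decodes one data record by walking a list of (field type, field length) pairs, which is roughly what a collector holds after it has read a template FlowSet. It deliberately leaves out FlowSet headers, template IDs, and padding, so it is not a working v9 parser. The function name decode_record and the sample record are hypothetical; the field type numbers are the ones defined in RFC 3954.

```python
import struct
from socket import inet_aton, inet_ntoa

# A few NetFlow v9 field types (numbers per RFC 3954).
FIELD_NAMES = {
    1: 'IN_BYTES',
    2: 'IN_PKTS',
    7: 'L4_SRC_PORT',
    8: 'IPV4_SRC_ADDR',
    11: 'L4_DST_PORT',
    12: 'IPV4_DST_ADDR',
}
ADDRESS_FIELDS = {8, 12}

def decode_record(template, data):
    """Decode one record using an ordered list of (field_type, length) pairs."""
    record = {}
    offset = 0
    for field_type, length in template:
        raw = data[offset:offset + length]
        if len(raw) < length:
            raise ValueError('record is shorter than the template describes')
        name = FIELD_NAMES.get(field_type, 'field_%d' % field_type)
        if field_type in ADDRESS_FIELDS:
            record[name] = inet_ntoa(raw)
        else:
            # Numeric fields are big-endian integers of the advertised length.
            record[name] = int.from_bytes(raw, 'big')
        offset += length
    return record

# Hypothetical template: the exporter decides which fields to send and in what order.
template = [(8, 4), (12, 4), (7, 2), (11, 2), (2, 4), (1, 4)]

# Made-up data record matching that template.
data = (inet_aton('10.0.0.9') + inet_aton('10.0.0.5')
        + struct.pack('!HHII', 52912, 8000, 6, 487))
decoded = decode_record(template, data)
print(decoded)
```

Because the layout comes from the template rather than from constants in the code, the same function can decode records from exporters that send different sets of fields.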

As you may have guessed, there are other tools that save us the trouble of parsing NetFlow records one by one. Let's look at one such tool, called ntop, in the next section.
