Searching with regular expressions

For our first search, we will simply use the regular expression module to look for the terms we are interested in. A simple loop over the lines of the file is enough:

#!/usr/bin/env python3

import re, datetime

startTime = datetime.datetime.now()

with open('sample_log_anonymized.log', 'r') as f:
    for line in f.readlines():
        if re.search('ACLLOG-5-ACLLOG_FLOW_INTERVAL', line):
            print(line)

endTime = datetime.datetime.now()
elapsedTime = endTime - startTime
print("Time Elapsed: " + str(elapsedTime))

Searching through the log file takes about 6/100 of a second:

$ python3 python_re_search_1.py
2014 Jun 29 19:21:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,

2014 Jun 29 19:26:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,

Time Elapsed: 0:00:00.065436

Compiling the search term is generally recommended for more efficient searching. It will not affect us much here, since the search is already fast; in fact, the overhead of Python's interpreted nature can actually make it slower. However, it will make a difference when we search through a larger body of text, so let's make the change:

searchTerm = re.compile('ACLLOG-5-ACLLOG_FLOW_INTERVAL')

with open('sample_log_anonymized.log', 'r') as f:
    for line in f.readlines():
        if re.search(searchTerm, line):
            print(line)

The elapsed time is actually longer:

Time Elapsed: 0:00:00.081541
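
One likely reason the compiled version shows no gain is that re.search(searchTerm, line) still goes through the module-level function on every line, and the re module already caches recently used string patterns. The usual way to benefit from compilation is to call the compiled pattern's own search() method. The following is a minimal sketch of that variant, not a measured result; the find_matches helper is a hypothetical name, and the second sample line is invented for illustration.

```python
import re

# Compile once, outside the loop.
search_term = re.compile('ACLLOG-5-ACLLOG_FLOW_INTERVAL')


def find_matches(lines, pattern):
    """Return the lines in which the compiled pattern matches."""
    # pattern.search() calls the compiled object directly,
    # skipping the module-level re.search() wrapper.
    return [line for line in lines if pattern.search(line)]


# The first line is copied from the output shown earlier;
# the second is a made-up line that should not match.
sample_lines = [
    '2014 Jun 29 19:21:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,\n',
    '2014 Jun 29 19:22:00 Nexus-7000 %EXAMPLE-6-PLACEHOLDER: not a real log message\n',
]

for line in find_matches(sample_lines, search_term):
    print(line, end='')
```

With a real log, the open file object can be passed in place of sample_lines. Iterating over the file directly, rather than calling readlines(), also avoids loading the whole file into memory at once.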

Let's expand the example a bit. To simulate having several files and multiple terms to search through, we will copy the original file to a new file:

$ cp sample_log_anonymized.log sample_log_anonymized_1.log

We will also search for the PAM: Authentication failure term, and add another loop so that both files are searched:

term1 = re.compile('ACLLOG-5-ACLLOG_FLOW_INTERVAL')
term2 = re.compile('PAM: Authentication failure')

fileList = ['sample_log_anonymized.log', 'sample_log_anonymized_1.log']

for log in fileList:
    with open(log, 'r') as f:
        for line in f.readlines():
            if re.search(term1, line) or re.search(term2, line):
                print(line)

We can now see the difference in performance as well as expand our search capabilities:

$ python3 python_re_search_2.py
2016 Jun 5 16:49:33 NEXUS-A %DAEMON-3-SYSTEM_MSG: error: PAM: Authentication failure for illegal user AAA from 172.16.20.170 - sshd[4425]

2016 Sep 14 22:52:26.210 NEXUS-A %DAEMON-3-SYSTEM_MSG: error: PAM: Authentication failure for illegal user AAA from 172.16.20.170 - sshd[2811]

<skip>

2014 Jun 29 19:21:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,

2014 Jun 29 19:26:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,

<skip>

Time Elapsed: 0:00:00.330697
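
The two searches per line can also be folded into a single pattern using alternation, so each line is scanned once. This is a sketch of an alternative, not the script that produced the timing above; search_lines is a hypothetical helper, and the third sample line is invented for illustration.

```python
import re

# One pattern with alternation (|) replaces the two separate searches.
# re.escape() keeps any regex metacharacters in the terms literal.
terms = ['ACLLOG-5-ACLLOG_FLOW_INTERVAL', 'PAM: Authentication failure']
combined = re.compile('|'.join(re.escape(term) for term in terms))


def search_lines(lines, pattern=combined):
    """Yield every line that contains at least one of the terms."""
    for line in lines:
        if pattern.search(line):
            yield line


# The first two lines are copied from the output above;
# the last is a made-up line that should not match.
sample_lines = [
    '2016 Jun 5 16:49:33 NEXUS-A %DAEMON-3-SYSTEM_MSG: error: PAM: Authentication '
    'failure for illegal user AAA from 172.16.20.170 - sshd[4425]',
    '2014 Jun 29 19:21:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,',
    '2014 Jun 29 19:22:00 Nexus-7000 %EXAMPLE-6-PLACEHOLDER: not a real log message',
]

for line in search_lines(sample_lines):
    print(line)
```

To search several files, the same generator can be called once per open file inside the loop over fileList.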

Of course, when it comes to performance tuning, it is a never-ending race to zero that sometimes depends on the hardware you are using. But the important point is to regularly audit your log files using Python, so you can catch the early signals of a potential breach.
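
As one illustration of what such an audit might look for, the sketch below counts PAM authentication failures per source address. The pattern is an assumption based only on the two failure lines shown in the earlier output, so other log formats would need a different expression; count_failures is a hypothetical helper.

```python
import re
from collections import Counter

# Capture the IPv4 address that follows "from" in a PAM failure message.
# Assumption: failures look like the two sample lines shown earlier.
failure = re.compile(
    r'PAM: Authentication failure .* from (\d{1,3}(?:\.\d{1,3}){3})'
)


def count_failures(lines):
    """Count authentication-failure lines per source address."""
    counts = Counter()
    for line in lines:
        match = failure.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# All three lines are copied from the output shown earlier.
sample_lines = [
    '2016 Jun 5 16:49:33 NEXUS-A %DAEMON-3-SYSTEM_MSG: error: PAM: Authentication '
    'failure for illegal user AAA from 172.16.20.170 - sshd[4425]',
    '2016 Sep 14 22:52:26.210 NEXUS-A %DAEMON-3-SYSTEM_MSG: error: PAM: Authentication '
    'failure for illegal user AAA from 172.16.20.170 - sshd[2811]',
    '2014 Jun 29 19:21:18 Nexus-7000 %ACLLOG-5-ACLLOG_FLOW_INTERVAL: Src IP: 10.1 0.10.1,',
]

# List addresses from most to fewest failures.
for address, total in count_failures(sample_lines).most_common():
    print(address, total)
```

An address with an unusually high count is a reasonable candidate for closer inspection, although what counts as unusual depends on the environment.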
