Having fun analyzing nginx logs to find malicious attackers on the net (ง'̀-'́)ง (day 37)

What makes you sleepless at night?

is it because of a ghost or scary stories?

is it because you have an important meeting tomorrow?

or is it because you have an exam?

For me, what keeps me up all night is thinking about what happens to a website that I just created: is it safe from attackers (certainly not), or did I miss some security adjustment that leads to a vulnerability?

well, I'm not the most security-savvy programmer in the world. I'm still learning, and there is a big possibility that I will make a mistake, but for me a mistake can be a valuable investment for myself (or yourself) to get better

so from this idea, I want to know more about what attackers typically do when attacking a website. In this post, I'm going to show you how I analyzed attacks against a website that I have permission to work on, and also some interesting findings that I got from the analysis

Background:

All of this analysis comes from the traffic targeted at a website that I have permission to access. I set up the website to use NGINX as a reverse proxy (thanks to ko Mario who showed me how to do this, it's really helpful) that forwards the traffic to the web framework.

so the only place that records all of the HTTP traffic is the access.log file. This log is generated by NGINX itself and contains one line for every request a client sends to the server. To get a better view, below is an example of the contents of the log file:

103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///queue-stats/pngbehavior.htc HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///stats/pngbehavior.htc HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///config.php HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///admin/common/content.css HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///html/recordings/index.php HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///freepbx/recordings/index.php HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"
103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] "GET ///html//admin/config.php HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"

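so we can assume the format goes like this (ignoring the two "-" placeholders after the IP):

1st field => source IP
2nd field => timestamp
3rd field => request line (method, path and HTTP version)
4th field => HTTP status code
5th field => bytes sent in the response
6th field => referer
7th field => user agent
the last field => "-" in these samples; this looks like NGINX's stock "main" log format, where this position holds $http_x_forwarded_for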

link: https://easyengine.io/tutorials/nginx/log-parsing/

There are a lot of scripts and programs on GitHub to parse the NGINX log, so I don't have to reinvent the wheel from scratch

Crafting a parser:

I'm going to use this parser regex, link: https://gist.github.com/hreeder/f1ffe1408d296ce0591d

first, we load our example data into a variable, then we compile the regex and match it against our example data. Finally, we dump the result using the groupdict() method of the match object.
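A minimal sketch of that step, using one of the sample lines above. The pattern here is a simplified one written for this sketch (it is not the gist's regex) and only borrows the same group names; `LINE_RE` and the extra `method` and `protocol` groups are my own naming.

```python
import re

# Simplified pattern for the log lines shown above (an assumption of this
# sketch, not the gist's regex). The trailing "-" field is left unmatched.
LINE_RE = re.compile(
    r'(?P<ipaddress>\S+) \S+ \S+ \[(?P<dateandtime>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<statuscode>\d{3}) (?P<bytessent>\d+) '
    r'"(?P<refferer>[^"]*)" "(?P<useragent>[^"]*)"'
)

sample = ('103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] '
          '"GET ///config.php HTTP/1.1" 404 153 "-" '
          '"python-requests/2.12.4" "-"')

match = LINE_RE.search(sample)
fields = match.groupdict()  # dict keyed by the named groups
for key, value in fields.items():
    print(key, "=>", value)
```

Every value comes back as a string, so the status code and byte count still need `int()` if you want to do arithmetic on them.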

this is really cool! the regex already groups the data under the intended keys

with this regex, I can just create a simple python script that dumps all of the log contents into dict format

hmm, it seems the program hits an exception when processing line 72, let's find out what that is:

it seems that there are some faulty lines that the regex cannot match, so the search returns nothing (None) and the call to groupdict() fails. Let's add a try-except block to just skip these lines
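The skipping step can be sketched as below. Instead of try-except, this version checks for the `None` that `re.search` returns on a non-matching line, which also lets us keep the skipped lines for inspection. The pattern is a simplified stand-in for the gist's regex, `parse_lines` is a hypothetical helper name, and the two faulty lines are made up.

```python
import re

# Simplified stand-in pattern (an assumption of this sketch).
LINE_RE = re.compile(
    r'(?P<ipaddress>\S+) \S+ \S+ \[(?P<dateandtime>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<statuscode>\d{3}) (?P<bytessent>\d+) '
    r'"(?P<refferer>[^"]*)" "(?P<useragent>[^"]*)"'
)

def parse_lines(lines):
    """Return (parsed, skipped): dicts for matching lines, raw text otherwise."""
    parsed, skipped = [], []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            skipped.append(line)      # keep it so we can look at it later
        else:
            parsed.append(match.groupdict())
    return parsed, skipped

lines = [
    '103.9.124.70 - - [15/Nov/2019:00:57:59 +0000] '
    '"GET ///config.php HTTP/1.1" 404 153 "-" "python-requests/2.12.4" "-"',
    '',                 # made-up faulty line: empty
    'not a log line',   # made-up faulty line: garbage
]
parsed, skipped = parse_lines(lines)
print(len(parsed), "parsed,", len(skipped), "skipped")
```

Keeping the skipped lines around is a design choice: a bare `pass` hides how much of the log the regex fails to cover.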

ok, let's find out how many unique IPs come to the website. I create another function that takes care of it:

I create a global list so that the function can append every IP address to it, and then I use the set function to remove all duplicate IP addresses
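A small sketch of the de-duplication, assuming the IPs have already been collected into a list. The sample list is made up from addresses that appear in this post.

```python
# IPs as they would be collected from the parsed lines (made-up sample)
ip_address = [
    "103.9.124.70", "202.162.19.114", "103.9.124.70",
    "77.247.181.162", "202.162.19.114",
]

unique_ips = set(ip_address)  # duplicates collapse automatically
print(len(unique_ips), "unique IPs:", sorted(unique_ips))
```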

next, I want to find out which of these IP addresses browse the website most intensively

we can use Counter from the python collections library to count how many times each IP appears in our log
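A minimal sketch of the counting, again on a made-up list of IPs (the hit counts here are illustrative, not the real numbers from the log).

```python
from collections import Counter

# made-up sample; in the real script this list is filled from the log
ip_address = [
    "202.162.19.114", "103.9.124.70", "202.162.19.114",
    "77.247.181.162", "103.9.124.70", "202.162.19.114",
]

counts = Counter(ip_address)
# most_common(n) returns the n busiest IPs, highest count first
for ip, hits in counts.most_common(5):
    print(ip, hits)
```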

ok, let's focus on the top 5 IP addresses in the list. I discard the IP addresses 134.219.227.22, 36.84.147.244, 114.125.198.23, 134.219.227.21 and the IPs in 66.102.8.0/24 since they come from me or my colleagues
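One way to do the discarding in code is the standard `ipaddress` module, which can test membership in a /24. This is only a sketch; `OWN_IPS`, `OWN_NET` and `is_own` are names I made up for it.

```python
import ipaddress

# addresses and network that belong to me or my colleagues (from the text)
OWN_IPS = {"134.219.227.22", "36.84.147.244",
           "114.125.198.23", "134.219.227.21"}
OWN_NET = ipaddress.ip_network("66.102.8.0/24")

def is_own(ip):
    """True if the IP is one of ours and should be left out of the ranking."""
    return ip in OWN_IPS or ipaddress.ip_address(ip) in OWN_NET

candidates = ["66.102.8.17", "134.219.227.22", "103.9.124.70"]
outsiders = [ip for ip in candidates if not is_own(ip)]
print(outsiders)
```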

so these are the top 5 IP addresses:

202.162.19.114

103.9.124.70

112.29.140.222

122.51.226.167

77.247.181.162

I want to group all of the connections from each of them. In order to do this, I create another function that does the grouping for each IP, and it goes like this.

I create a global dictionary that maps each of the mentioned IPs to a list:

and change the main code a little bit

if you run it, we get a nice grouped variable for each IP
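The grouping can be sketched like this. `group_entry` is a hypothetical helper name, the entries are trimmed to two keys to keep it short, and 192.0.2.10 is a documentation address standing in for any IP outside the top 5.

```python
TOP_IPS = ["202.162.19.114", "103.9.124.70", "112.29.140.222",
           "122.51.226.167", "77.247.181.162"]

# one empty bucket per IP we care about
suspicious_ip = {ip: [] for ip in TOP_IPS}

def group_entry(entry):
    """Append a parsed log entry to its IP's bucket, ignore other IPs."""
    bucket = suspicious_ip.get(entry["ipaddress"])
    if bucket is not None:
        bucket.append(entry)

entries = [
    {"ipaddress": "103.9.124.70", "url": "///config.php"},
    {"ipaddress": "103.9.124.70", "url": "///stats/pngbehavior.htc"},
    {"ipaddress": "192.0.2.10", "url": "/"},  # not in the top 5, dropped
]
for entry in entries:
    group_entry(entry)

print({ip: len(rows) for ip, rows in suspicious_ip.items()})
```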

let's dump it into a CSV file to get a better representation. I create another function to dump the dictionary data into a CSV file:

and a little touch-up to the main function:

run it and we get a nicely formatted and grouped CSV file:
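A sketch of the dump using `csv.DictWriter`. To keep it self-contained it writes into an in-memory buffer instead of a file; `dump_rows` is a hypothetical name, and the single row is built from one of the sample log lines.

```python
import csv
import io

HEADER = ["dateandtime", "ipaddress", "url", "bytessent",
          "refferer", "useragent", "statuscode"]

def dump_rows(grouped, out):
    """Write every entry of every IP bucket as one CSV row."""
    writer = csv.DictWriter(out, fieldnames=HEADER)
    writer.writeheader()
    for entries in grouped.values():
        writer.writerows(entries)

grouped = {
    "103.9.124.70": [{
        "dateandtime": "15/Nov/2019:00:57:59 +0000",
        "ipaddress": "103.9.124.70",
        "url": "///config.php",
        "bytessent": "153",
        "refferer": "-",
        "useragent": "python-requests/2.12.4",
        "statuscode": "404",
    }],
}

buffer = io.StringIO()
dump_rows(grouped, buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

To write a real file, pass `open("nginx_log.csv", "w", newline="")` instead of the buffer.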

Findings:

- Interesting request

I took a look at some of the connections generated by these 5 IPs and found that most of this traffic is likely generated by a botnet, judging by the intensity of the connections.

for example, let's analyze the request rate over time of one IP address (202.162.19.114), because it has the most traffic among the top IPs

the code I created to generate the diagram is attached at the end of this post.
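The counting behind such a diagram can be sketched as follows, without the plotting part. It parses the timestamp with `datetime.strptime` and buckets by minute; the timestamps are made up in the log's format, and `per_minute` is a hypothetical helper name. Note that `%b` expects English month names, which holds under the default locale.

```python
from collections import Counter
from datetime import datetime

# made-up timestamps in the same format as the access log
timestamps = [
    "15/Nov/2019:00:57:59 +0000",
    "15/Nov/2019:00:57:59 +0000",
    "15/Nov/2019:00:58:03 +0000",
]

def per_minute(stamps):
    """Return sorted (HH:MM, request count) pairs."""
    counts = Counter()
    for stamp in stamps:
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[when.strftime("%H:%M")] += 1
    return sorted(counts.items())

buckets = per_minute(timestamps)
print(buckets)
```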

this is really interesting: although the traffic window is only a couple of minutes, in each of those minutes the IP sends at least a hundred requests, which is similar to botnet traffic. But what do these IPs request from our web server?

I categorize the queries into two types:

the 1st type is the query for brute-forcing file names, like the following ones:

/edmin.php
/sconfig.php
/indax.php
/logo.php
/o.php
/shell.php
/tools.php
/asjc.php
/test.php
/fuck.php
/freebook.php
/goodbook.php
/tools.php
/indexl.php
/sql.php
/conf.php
/pagefile.php
/settings.php
/system.php
/test123.php
/db.init.php
/error.php
/099.php
/_404.php
/Alarg53.php
/lapan.php
/pk1914.php
/sllolx.php
/Skri.php
/db_desql.php
/mx.php
/wshell.php
/xshell.php
/qq.php
/conflg.php
/conflg.php
/lindex.php
/phpstudy.php
/phpStudy.php
/weixiao.php
/feixiang.php
/ak47.php
/ak48.php
/xiao.php
/yao.php
/defect.php
/webslee.php
/q.php
/pe.php
/hm.php
/sz.php
/cainiao.php
/zuoshou.php
/zuo.php
/aotu.php
/aotu7.php
/cmd.php
/cmd.php
/bak.php
/system.php
/l6.php
/l7.php
/l8.php
/q.php
/56.php
/mz.php
/yumo.php
/min.php

I found some references on the internet that already talk about this type of URL probe. They say this traffic is caused by generic bots that scan the entire internet looking for known filenames of software with known bugs/holes
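A rough way to spot this kind of prober in parsed log entries is to count how many different missing .php files each IP asked for. The sketch below assumes the probes were answered with 404; the IPs are documentation addresses, the pairing of IPs and paths is made up, and `php_probe_counts` is a hypothetical helper name.

```python
from collections import defaultdict

def php_probe_counts(entries):
    """Count distinct missing .php paths requested by each IP."""
    probes = defaultdict(set)
    for entry in entries:
        # drop any query string before looking at the file extension
        path = entry["url"].split("?", 1)[0].strip().lower()
        if entry["statuscode"] == "404" and path.endswith(".php"):
            probes[entry["ipaddress"]].add(path)
    return {ip: len(paths) for ip, paths in probes.items()}

entries = [
    {"ipaddress": "192.0.2.10", "url": "/shell.php", "statuscode": "404"},
    {"ipaddress": "192.0.2.10", "url": "/cmd.php", "statuscode": "404"},
    {"ipaddress": "192.0.2.10", "url": "/cmd.php", "statuscode": "404"},
    {"ipaddress": "192.0.2.20", "url": "/index.html", "statuscode": "200"},
]
counts = php_probe_counts(entries)
print(counts)
```

An IP with dozens of distinct misses in a few minutes is a good candidate for a closer look; where to put the threshold is a judgment call.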

references:

https://alittlebrighter.svbtle.com/hack-attempt

https://security.stackexchange.com/questions/190534/help-on-what-to-do-with-these-suspicious-logs

https://stackoverflow.com/questions/53636935/tomcat-server-under-attack

https://gist.github.com/acosonic/ca16bb27f34aa9bee8d92fc1a741830a

the 2nd type is the query that contains some sort of payload targeting a vulnerable web framework, like the following ones:

/%75%73%65%72/%72%65%67%69%73%74%65%72?%65%6c%65%6d%65%6e%74%5f%70%61%72%65%6e%74%73=%74%69%6d%65%7a%6f%6e%65%2f%74%69%6d%65%7a%6f%6e%65%2f%23%76%61%6c%75%65&%61%6a%61%78%5f%66%6f%72%6d=1&%5f%77%72%61%70%70%65%72%5f%66%6f%72%6d%61%74=%64%72%75%70%61%6c%5f%61%6a%61%78

decoded version:

/user/register?element_parents=timezone/timezone/#value&ajax_form=1&_wrapper_format=drupal_ajax

/index.php?s=%2f%69%6e%64%65%78%2f%5c%74%68%69%6e%6b%5c%61%70%70%2f%69%6e%76%6f%6b%65%66%75%6e%63%74%69%6f%6e&function=%63%61%6c%6c%5f%75%73%65%72%5f%66%75%6e%63%5f%61%72%72%61%79&vars[0]=%6d%645&vars[1][]=%48%65%6c%6c%6f%54%68%69%6e%6b%50%48%50

decoded version:

/index.php?s=/index/\think\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=HelloThinkPHP => this is an attempt at the ThinkPHP RCE exploit: https://securitynews.sonicwall.com/xmlpost/thinkphp-remote-code-execution-rce-bug-is-actively-being-exploited/

/elrekt.php?s=%2f%69%6e%64%65%78%2f%5c%74%68%69%6e%6b%5c%61%70%70%2f%69%6e%76%6f%6b%65%66%75%6e%63%74%69%6f%6e&function=%63%61%6c%6c%5f%75%73%65%72%5f%66%75%6e%63%5f%61%72%72%61%79&vars[0]=%6d%645&vars[1][]=%48%65%6c%6c%6f%54%68%69%6e%6b%50%48%50

decoded version:

/elrekt.php?s=/index/\think\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=HelloThinkPHP => this is an attempt at the ThinkPHP RCE exploit: https://securitynews.sonicwall.com/xmlpost/thinkphp-remote-code-execution-rce-bug-is-actively-being-exploited/

/install/lib/ajaxHandlers/ajaxServerSettingsChk.php?rootUname=%3Becho%20-n%20HellorConfig%7Cmd5sum%20%23

decoded version:

/install/lib/ajaxHandlers/ajaxServerSettingsChk.php?rootUname=;echo -n HellorConfig|md5sum # => this is an attempt at the rConfig RCE described in https://shells.systems/rconfig-v3-9-2-authenticated-and-unauthenticated-rce-cve-2019-16663-and-cve-2019-16662/

/%73%65%65%79%6F%6E/%68%74%6D%6C%6F%66%66%69%63%65%73%65%72%76%6C%65%74

decoded version:

/seeyon/htmlofficeservlet => this probes for the Seeyon vulnerability described in http://wyb0.com/posts/2019/seeyon-htmlofficeservlet-getshell/

/%75%73%65%72%2e%70%68%70

decoded version:

/user.php, but what is unique about this attack is that the bot sends a "referer" header in the request that contains some sort of SQL injection:

554fcae493e564ee0dc75bdf2ebf94caads|a:3:{s:2:\x22id\x22;s:3:\x22'/*\x22;s:3:\x22num\x22;s:141:\x22*/ union select 1,0x272F2A,3,4,5,6,7,8,0x7b247b24524345275d3b6469652f2a2a2f286d6435284449524543544f52595f534550415241544f5229293b2f2f7d7d,0--\x22;s:4:\x22name\x22;s:3:\x22ads\x22;}554fcae493e564ee0dc75bdf2ebf94ca"

this targets a vulnerability in ECShop => https://github.com/SecWiki/CMS-Hunter/tree/master/Ecshop/ecshop2.x_code_execute

I'm not an expert on web attacks, but I assume that this botnet percent-encodes its payloads to bypass detection by a firewall, packet filter or WAF. The payloads above are encoded only once; the reference below describes double encoding, a related evasion trick.
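The decoding itself is a one-liner with `urllib.parse.unquote` from the standard library. The sketch below decodes two of the paths from the log and also checks whether a second decoding pass changes anything, which would hint at double encoding. The variable names are my own.

```python
from urllib.parse import unquote

# two of the encoded paths seen in the log
encoded_paths = [
    "/%75%73%65%72%2e%70%68%70",
    "/%73%65%65%79%6F%6E/%68%74%6D%6C%6F%66%66%69%63%65%73%65%72%76%6C%65%74",
]

decoded_paths = [unquote(path) for path in encoded_paths]
for raw, clear in zip(encoded_paths, decoded_paths):
    print(raw, "=>", clear)

# a path that still changes on a second pass would be double-encoded
double_encoded = [p for p in decoded_paths if unquote(p) != p]
print("double-encoded:", double_encoded)
```

Running the decoder over the url column before grouping makes the encoded and plain variants of the same probe end up together.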

reference:

https://www.owasp.org/index.php/Double_Encoding

- Interesting user-agent

I enumerated the user agents of all of the connections and came up with the following list:

Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0
Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)
python-requests/2.20.0
python-requests/2.12.4
masscan/1.0 (https://github.com/robertdavidgraham/masscan)
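A sketch of how the agents that openly identify a tool could be separated from the browser-looking ones. The marker prefixes are my own choice based on the list above, and `is_tool_agent` is a hypothetical helper name.

```python
# prefixes of agents that announce a script or scanner (my own pick)
TOOL_MARKERS = ("python-requests/", "masscan/")

def is_tool_agent(user_agent):
    """True if the agent string starts with a known tool prefix."""
    return user_agent.startswith(TOOL_MARKERS)

user_agents = {
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0",
    "python-requests/2.20.0",
    "python-requests/2.12.4",
    "masscan/1.0 (https://github.com/robertdavidgraham/masscan)",
}

tools = sorted(ua for ua in user_agents if is_tool_agent(ua))
print(tools)
```

A browser-looking agent proves nothing either way, since the string is fully under the client's control.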

The first five user agents are not really interesting: the browser strings are expected, and a botnet can also mask its agent to look like a legitimate browser, although some of them still use the python-requests agent

The last user agent actually caught my attention, since I had never seen this kind of user agent before

I searched for the tool and it led me to this GitHub page: https://github.com/robertdavidgraham/masscan

the tool is described as follows:

This is an Internet-scale port scanner. It can scan the entire Internet in under 6 minutes, transmitting 10 million packets per second, from a single machine.
Its input/output is similar to nmap, the most famous port scanner. When in doubt, try one of those features.

the following is the list of IP addresses that sent the masscan user agent, followed by the time each one was seen:

104.248.63.201
170.238.36.20
207.180.224.136
178.128.94.31
5.189.176.208
170.238.36.66
200.2.162.34
155.93.118.14
173.212.252.245
5.189.163.253
190.13.136.237
51.38.57.199
122.155.11.55
170.238.36.21

104.248.63.201 => 14/Nov/2019:21:44:39
170.238.36.20 => 14/Nov/2019:22:07:46
207.180.224.136 => 14/Nov/2019:22:35:05
178.128.94.31 => 15/Nov/2019:01:37:19
5.189.176.208 => 15/Nov/2019:02:20:04
170.238.36.66 => 15/Nov/2019:02:38:45
200.2.162.34 => 15/Nov/2019:02:48:38
155.93.118.14 => 15/Nov/2019:08:53:34
173.212.252.245 => 15/Nov/2019:12:19:34
5.189.163.253 => 15/Nov/2019:12:58:39
190.13.136.237 => 15/Nov/2019:18:24:54
51.38.57.199 => 15/Nov/2019:18:45:19
122.155.11.55 => 15/Nov/2019:19:50:08
170.238.36.21 => 15/Nov/2019:21:00:59

if we map each IP to its time frame like the above, we don't see any pattern; it seems that each of these IPs only scanned once

I tried to find any botnet that could have used masscan in its program. After a short time searching on the internet, it seems most of the botnets that use masscan fall into the category of IoT botnets (example: https://www.sangfor.com/source/blog-network-security/1101.html)

I also used passive DNS replication data to find out whether each IP address is related to, or belongs to, a certain domain name.
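One way to check for a pattern is to look at the gaps between consecutive scans. The sketch below uses the first four timestamps from the list above; `gaps_in_seconds` is a hypothetical helper name.

```python
from datetime import datetime

# first four masscan sightings, taken from the list above
first_seen = {
    "104.248.63.201": "14/Nov/2019:21:44:39",
    "170.238.36.20": "14/Nov/2019:22:07:46",
    "207.180.224.136": "14/Nov/2019:22:35:05",
    "178.128.94.31": "15/Nov/2019:01:37:19",
}

def gaps_in_seconds(stamps):
    """Seconds between each scan and the next one, in time order."""
    times = sorted(datetime.strptime(s, "%d/%b/%Y:%H:%M:%S") for s in stamps)
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]

gaps = gaps_in_seconds(first_seen.values())
print(gaps)
```

Regular gaps would hint at one scheduler driving all of the scanners; irregular ones fit the reading that these are unrelated one-off scans.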

104.248.63.201 => avans.com (url: https://securitytrails.com/list/ip/104.248.63.201)
170.238.36.20 => nameserver for igtelecom.com.br (url: https://securitytrails.com/list/ip/170.238.36.20)
207.180.224.136 => datactor.fi (url: https://securitytrails.com/list/ip/207.180.224.136)
178.128.94.31 => 1000score.com (url: https://securitytrails.com/list/ip/178.128.94.31)
5.189.176.208 => ingxpress.de (url: https://securitytrails.com/list/ip/5.189.176.208)
170.238.36.66 => rendebem.com.br (url: https://securitytrails.com/list/ip/170.238.36.66)
200.2.162.34 => speedtest.sr (url: https://securitytrails.com/list/ip/200.2.162.34)
155.93.118.14 => nbaph.org.ng (url: https://securitytrails.com/list/ip/155.93.118.14)
173.212.252.245 => odoo.upets.nl (url: https://securitytrails.com/list/ip/173.212.252.245)
5.189.163.253 => contabo.host (url: https://securitytrails.com/list/ip/5.189.163.253)
190.13.136.237 => stag.helpcom.cl (url: https://securitytrails.com/list/ip/190.13.136.237)
51.38.57.199 => radiomarcabarcelona.com (url: https://securitytrails.com/list/ip/51.38.57.199)
122.155.11.55 => dlf.ac.th (url: https://securitytrails.com/list/ip/122.155.11.55)
170.238.36.21 => nameserver for igtelecom.com.br (url: https://securitytrails.com/list/ip/170.238.36.21)

That's all of the analysis that I could come up with in a short period of time. I hope this analysis can be used as a reference to create packet filtering or WAF rules to strengthen your network perimeter.

Enjoy :D

Below I have attached the source code of the parser if you want to experiment with it.

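import csv
import re
import sys
from collections import Counter

# every source IP seen in the log (filled by list_ip)
ip_address = []
# the five IPs under investigation, each mapped to its parsed log entries
suspicious_ip = {"202.162.19.114":[],"103.9.124.70":[],"112.29.140.222":[],"122.51.226.167":[],"77.247.181.162":[]}

# compiled once instead of on every call; only GET and POST over HTTP/1.1 match.
# the referer and user agent groups use [^"]* so they stop at their own closing
# quote and do not swallow the extra "-" field at the end of each line
lineformat = re.compile(r"""(?P<ipaddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(?P<dateandtime>\d{2}\/[a-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2} (\+|\-)\d{4})\] ((\"(GET|POST) )(?P<url>.+)(http\/1\.1")) (?P<statuscode>\d{3}) (?P<bytessent>\d+) (["](?P<refferer>[^"]*)["]) (["](?P<useragent>[^"]*)["])""", re.IGNORECASE)

def parse_log(log_data):
    # return the named groups as a dict
    # raises AttributeError when the line does not match (res is None)
    res = re.search(lineformat,log_data)
    return res.groupdict()

def list_ip(log_data):
    # collect the source IP of a parsed line
    ip_address.append(log_data['ipaddress'])

def counter_ip(ip_address):
    # count how many times each IP appears
    return Counter(ip_address)

def top_5_ip(ip_mal):
    # keep the entry only if it belongs to one of the five IPs
    temp = ip_mal['ipaddress']
    if temp in suspicious_ip:
        suspicious_ip[temp].append(ip_mal)

def dump_data():
    header = ['dateandtime','ipaddress','url','bytessent','refferer','useragent','statuscode']
    # newline="" stops the csv module from writing blank rows on Windows
    with open("nginx_log.csv","w",newline="") as file:
        writer = csv.DictWriter(file,delimiter=',',fieldnames=header)
        writer.writeheader()
        for x in suspicious_ip:
            for y in suspicious_ip[x]:
                writer.writerow(y)


# read the log file given as the first command-line argument
with open(sys.argv[1],"r") as file:
    for x in file:
        try:
            log = parse_log(x)
            top_5_ip(log)
        except AttributeError:
            # the regex did not match this line, skip it
            pass

dump_data()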

and here is the code to generate the time graph:

import csv
import matplotlib.pyplot as plt

data_time = {}

# count how many requests the IP sent in each minute
with open("nginx_log.csv","r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        if row['ipaddress'] == "202.162.19.114":
            # characters 12 to 16 of the timestamp are the HH:MM part
            minute = row['dateandtime'][12:17]
            data_time[minute] = data_time.get(minute, 0) + 1

print(data_time)

# create diagram: minutes on the x axis, request counts on the y axis
lists = sorted(data_time.items())
x,y = zip(*lists)

plt.plot(x,y)
plt.show()

Court of Analysis and Testing
