Going one step further into reconnaissance, we need to find out whether the site contains any page or directory that is not linked from what is shown to the common user, for example, a login page to the intranet or to the administration of a content management system (CMS). Finding a page like this expands our testing surface considerably and can give us important clues about the application and its infrastructure.
In this recipe, we will use the robots.txt file to discover files and directories that may not be linked from anywhere in the main application.
Browse to http://192.168.56.102/vicnum/.
Now add robots.txt to the URL, and we will see the following screenshot:
This file tells search engines that indexing the directories jotto and cgi-bin is not allowed for any user agent (crawler). However, this doesn't mean that we cannot browse them.
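The manual step of reading robots.txt and picking out the restricted directories can be sketched in Python. This is a minimal sketch, not part of the original recipe: the function names disallowed_paths, candidate_urls, and fetch_robots are hypothetical, and it assumes the Disallow entries are relative to the directory that serves the file, as the lab URL suggests.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def disallowed_paths(robots_text):
    """Return the paths named in Disallow lines, in order, without duplicates."""
    paths = []
    for raw_line in robots_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow value restricts nothing, so it is skipped.
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


def candidate_urls(base_url, robots_text):
    """Turn each disallowed path into a URL worth visiting by hand.

    Assumption: paths are resolved against base_url. A path that starts
    with "/" resolves against the server root instead.
    """
    return [urljoin(base_url, path) for path in disallowed_paths(robots_text)]


def fetch_robots(base_url):
    """Download robots.txt from the given base URL (hypothetical helper)."""
    with urlopen(urljoin(base_url, "robots.txt"), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Against the lab machine, candidate_urls(base, fetch_robots(base)) with base set to http://192.168.56.102/vicnum/ would list the URLs to try next. Only run it against systems we are authorized to test.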
Let's browse to http://192.168.56.102/vicnum/cgi-bin/.
We can click and navigate directly to any of the Perl scripts in this directory.
Let's browse to http://192.168.56.102/vicnum/jotto/.
Open the file named jotto. You will see something similar to the following screenshot:
Jotto is a game about guessing five-character words; could this be the list of possible answers? Check it by playing the game; if it is, we have already hacked the game!
robots.txt is a file used by web servers to tell search engines which directories or files they may index and which they are not allowed to look into. From an attacker's perspective, it tells us whether the server has a directory that is accessible but hidden from the public through what is called "security through obscurity" (that is, assuming that users won't discover that something exists if they are not told about it).
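To illustrate why this is only obscurity, the following sketch uses Python's standard urllib.robotparser module the way a well-behaved crawler would. The SAMPLE_ROBOTS content is a hypothetical file written for illustration (it assumes root-anchored paths, as in a conventional robots.txt) and is not a copy of the lab server's file; the crawler_may_fetch helper and the ExampleBot name are also assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /vicnum/jotto/
Disallow: /vicnum/cgi-bin/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS.splitlines())


def crawler_may_fetch(url, user_agent="ExampleBot"):
    """Return what a compliant crawler would decide for this URL."""
    return parser.can_fetch(user_agent, url)


# A compliant crawler indexes the main page but skips the listed directories.
print(crawler_may_fetch("http://192.168.56.102/vicnum/"))             # True
print(crawler_may_fetch("http://192.168.56.102/vicnum/jotto/jotto"))  # False
```

The check happens entirely on the client side. The server does not enforce it, so a browser or a tool that never consults the file can still request the same URLs.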