Hakrawler Tutorial: Fast Web Crawler for Bug Bounty

Whether you're into penetration testing, OSINT, or bug bounty hunting, this fast, easy-to-use Go-based crawler is an essential addition to your recon toolbox.

On a newly configured Ubuntu ARM64 virtual machine, we will install Hakrawler, run practical examples, and learn how to use it efficiently.

Setup

"Let's start by installing Hakrawler on a fresh Ubuntu installation."

Terminal Commands:

sudo apt update && sudo apt upgrade -y
sudo apt install golang git -y

"First, we install Go and Git. Once that's finished, we'll use go install to fetch Hakrawler."

go install github.com/hakluke/hakrawler@latest

After installation, the binary lives at ~/go/bin/hakrawler. Let's add that directory to our PATH permanently.

echo 'export PATH=$PATH:~/go/bin' >> ~/.bashrc
source ~/.bashrc

"Now, let's see if it works."

hakrawler --help

"Awesome! Hakrawler is now installed and working."

Usage

"Let's crawl our first website. For testing, I'll use https://www.openexploit.in as a safe target."

echo https://www.openexploit.in | hakrawler

"Hakrawler reads URLs from standard input and starts crawling." You can also feed it multiple URLs from a file, like this:

cat urls.txt | hakrawler

"To demonstrate, let's create a small urls.txt file."

echo -e "https://httpbin.org\nhttps://openexploit.in" > urls.txt
cat urls.txt | hakrawler
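When you crawl several targets this way, the combined output often contains repeated URLs. A sort -u pass keeps the results readable; a minimal sketch with stand-in data (a real run would pipe hakrawler's output here instead of printf):

```shell
# Merge and deduplicate crawl results from multiple targets.
printf '%s\n' \
  'https://httpbin.org/get' \
  'https://openexploit.in/login' \
  'https://openexploit.in/login' \
  'https://openexploit.in/about' \
  | sort -u
```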

Bug Bounty-Style Recon

An actual bug bounty recon chain might look like this. Assume you are testing target.com: you first gather subdomains with Haktrails, filter for live hosts with httpx, then crawl them with Hakrawler.

echo openexploit.in | haktrails subdomains | httpx | hakrawler

"You get a list of all subdomains, find out which are live, and pipe them straight into Hakrawler for crawling, all in a single line!"
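A chain like this can also pull in a lot of out-of-scope noise (third-party CDNs, analytics hosts, and so on), so a common follow-up is to filter for the target's domain and deduplicate. Here is a minimal sketch on simulated crawler output, since the real pipeline needs haktrails API access and a live target:

```shell
# Stand-in for real hakrawler output: keep only in-scope URLs, dedupe.
printf '%s\n' \
  'https://a.openexploit.in/login' \
  'https://a.openexploit.in/login' \
  'https://cdn.thirdparty.com/lib.js' \
  'https://b.openexploit.in/api/v1/users' \
  | grep 'openexploit\.in' \
  | sort -u
```

In a real run you would replace the printf with the recon pipeline itself and the same grep | sort -u tail applies unchanged.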

Options & Flags

"Hakrawler provides a number of helpful flags. Let's review a few that I use frequently."

Subdomains:

echo https://www.openexploit.in | hakrawler -subs

Depth:

echo https://example.com | hakrawler -d 3

HTTP Headers:

echo https://www.openexploit.in | hakrawler -h "User-Agent: OpenExploitBot;;Referer: https://google.com"

JSON output:

echo https://www.openexploit.in | hakrawler -json
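JSON output is handy for post-processing. The sketch below pulls just the URL value out of a single line with grep and sed; note that the field names shown are an assumption about the output shape, so check a real -json line from your own run first:

```shell
# One line in the assumed shape of hakrawler -json output;
# extract the URL value without needing jq.
echo '{"Source":"href","URL":"https://www.openexploit.in/login"}' \
  | grep -o '"URL":"[^"]*"' \
  | sed 's/"URL":"//; s/"$//'
```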

Proxy (such as Burp Suite):

echo https://www.openexploit.in | hakrawler -proxy http://127.0.0.1:8080

"You have more control over how and what you crawl with each of these flags."
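These flags also combine freely; for instance, a live run such as `echo https://www.openexploit.in | hakrawler -subs -d 3` crawls subdomains three levels deep, and piping the result through grep narrows it to file types you care about. Sketched below with simulated output, since an actual crawl needs network access:

```shell
# Stand-in for hakrawler output: keep only JavaScript files,
# which are often worth reviewing for endpoints and secrets.
printf '%s\n' \
  'https://www.openexploit.in/app.js' \
  'https://www.openexploit.in/login' \
  'https://static.openexploit.in/vendor.min.js' \
  | grep '\.js$'
```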

A Typical Problem and Its Solution

Sometimes you won't get any URLs back. Don't panic: this is usually because the domain redirects to a subdomain. For example:

echo https://openexploit.in | hakrawler

"If this redirects to https://www.openexploit.in, you'll get no results unless you either add -subs to include subdomains or crawl the redirected URL directly."
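A quick way to confirm the redirect yourself is to inspect the response headers (for example with `curl -sI https://openexploit.in`) and pull out the Location target. The sketch below runs that extraction on a canned header block so it works offline:

```shell
# Simulated HTTP response headers (curl -sI would produce these live);
# grab the Location header's target and strip the CRLF line ending.
printf 'HTTP/1.1 301 Moved Permanently\r\nLocation: https://www.openexploit.in/\r\n\r\n' \
  | grep -i '^location:' \
  | awk '{print $2}' \
  | tr -d '\r'
```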

Bonus: Using Docker

Prefer not to install Go? Here's how to run Hakrawler with Docker instead.

DockerHub:

echo https://httpbin.org | docker run --rm -i hakluke/hakrawler:v2 -subs

Local:

git clone https://github.com/hakluke/hakrawler
cd hakrawler
sudo docker build -t hakluke/hakrawler .
echo https://httpbin.org | sudo docker run --rm -i hakluke/hakrawler -subs

Final Words

Hakrawler is fast, simple, and very useful for recon and for finding endpoints quickly during bug bounty hunts or pentests.

Until the next time, be safe and curious!
