The Art of Reconnaissance Part 1 - How Hackers Find Vulnerabilities Before You Do

Introduction

Every successful cyber attack begins long before the first exploit is launched. It starts with reconnaissance: the methodical process of gathering information about a target. This phase is so critical that professional penetration testers often spend 60–70% of their time just collecting data before attempting any actual hacking.

In this article, I’ll walk you through the techniques ethical hackers and bug bounty hunters use to discover vulnerabilities in organizations. Whether you’re a developer wanting to secure your applications, a security enthusiast, or someone curious about how hackers think, this guide will change how you view digital security.

“Give me six hours to chop down a tree, and I will spend the first four sharpening the axe.” (attributed to Abraham Lincoln)

This quote perfectly captures the hacker mindset: reconnaissance is sharpening the axe.

The Two Types of Reconnaissance

1) Passive Reconnaissance

Passive recon involves gathering information without directly interacting with the target. Think of it as observing from a distance: the target never knows you’re looking.

Examples include:

  • Searching public records and databases
  • Analyzing social media profiles
  • Reading job postings (they reveal tech stacks!)
  • Using search engines creatively
  • Examining DNS records

2) Active Reconnaissance

Active recon requires direct interaction with the target’s systems. This leaves traces and could be detected by security monitoring.

Examples include:

  • Port scanning
  • Web crawling
  • Subdomain brute-forcing (guessing names and resolving them over DNS)
  • Vulnerability scanning

Let’s dive into the specific techniques.

Subdomain Enumeration: Finding Hidden Doors

Most security teams focus on protecting their main domain, but forget about the dozens (or hundreds) of subdomains hiding in the shadows. These forgotten subdomains often run outdated software, contain sensitive data, or have weaker security controls.

Why Subdomains Matter

Consider these examples:

  • staging.company.com: might run an older, unpatched version of the application
  • dev.company.com: might have debug mode enabled
  • old.company.com: might be a forgotten server from 2019 that nobody maintains

I’ve personally found:

  • Admin panels on subdomains such as admin.target.com that still accepted default credentials

Tools of the Trade

Subfinder: my go-to tool for passive subdomain enumeration:

subfinder -d target.com -o subfinder.txt

Amass: more comprehensive but slower:

amass enum -passive -d target.com -o amass.txt

Certificate Transparency logs: certificates issued by public CAs are recorded in public logs, so every hostname on them is searchable. Check crt.sh:

https://crt.sh/?q=%.target.com
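
If you would rather script the crt.sh lookup than click through the web page, here is a minimal Python sketch. It assumes crt.sh's `output=json` mode, where each row carries a `name_value` field that may hold several newline-separated names. The function names are my own for illustration, and `fetch_crtsh` is defined but never called here, since it makes a live request.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain: str) -> str:
    """Build the crt.sh JSON query URL for every name under `domain`."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def names_from_crtsh(entries: list, domain: str) -> list:
    """Extract unique hostnames under `domain` from decoded crt.sh rows."""
    found = set()
    for entry in entries:
        # Assumption: name_value can contain several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


def fetch_crtsh(domain: str) -> list:
    """Query crt.sh over the network. Only use on domains you may test."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=30) as response:
        return names_from_crtsh(json.load(response), domain)
```

Writing the result of `fetch_crtsh("target.com")` to a file, one name per line, gives you a crt.sh list to sit alongside the Subfinder and Amass output.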

Pro Tip

Combine multiple tools and deduplicate. Each tool draws on different sources, and combining them gives you the most complete picture:

cat subfinder.txt amass.txt crtsh.txt | sort -u > all_subdomains.txt
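
The one-liner above treats `WWW.target.com`, `www.target.com.` and `*.target.com` as different entries. As a rough sketch, the same merge can be done in Python with a little normalization first. The normalization rules (lowercasing, stripping a trailing dot and a leading wildcard) are my assumptions about what usually counts as a duplicate, not something the tools guarantee.

```python
from pathlib import Path


def normalize(line: str) -> str:
    """Lowercase a hostname; strip whitespace, a trailing dot, and a wildcard."""
    host = line.strip().lower().rstrip(".")
    return host[2:] if host.startswith("*.") else host


def merge_subdomains(paths: list) -> list:
    """Merge tool outputs into one sorted, deduplicated list.

    Files that do not exist are skipped, so a tool you did not run
    does not break the merge.
    """
    hosts = set()
    for path in paths:
        file = Path(path)
        if not file.exists():
            continue
        for line in file.read_text().splitlines():
            host = normalize(line)
            if host:
                hosts.add(host)
    return sorted(hosts)
```

Calling `merge_subdomains(["subfinder.txt", "amass.txt", "crtsh.txt"])` and writing the result out one host per line gives the same kind of `all_subdomains.txt`, minus the near-duplicates.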

Google Dorking: The Search Engine as a Weapon

Google indexes more than you think. Using advanced search operators, hackers can find sensitive files, exposed admin panels, and vulnerable servers all through Google.

Dangerous Dorks

Find exposed configuration files:

site:target.com filetype:env
site:target.com filetype:yml password
site:target.com filetype:config

Find login panels:

site:target.com inurl:admin
site:target.com inurl:login
site:target.com intitle:"Dashboard"

Find exposed documents:

site:target.com filetype:pdf confidential
site:target.com filetype:xlsx
site:target.com filetype:doc internal

Find error messages (these reveal the technology in use):

site:target.com "sql syntax error"
site:target.com "php error"
site:target.com "stack trace"
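
Typing these out for every domain gets tedious, so here is a small sketch that generates the same dorks for any domain and turns each one into a search URL you can open by hand. The category names and the `build_dorks` / `search_url` helpers are illustrative names of my own. The script only builds strings and sends nothing to Google.

```python
import urllib.parse

# The dork fragments from the list above, grouped by what they look for.
DORK_TEMPLATES = {
    "config files": ["filetype:env", "filetype:yml password", "filetype:config"],
    "login panels": ["inurl:admin", "inurl:login", 'intitle:"Dashboard"'],
    "documents": ["filetype:pdf confidential", "filetype:xlsx", "filetype:doc internal"],
    "error messages": ['"sql syntax error"', '"php error"', '"stack trace"'],
}


def build_dorks(domain: str) -> dict:
    """Return every dork scoped to `domain`, grouped by category."""
    return {
        category: [f"site:{domain} {fragment}" for fragment in fragments]
        for category, fragments in DORK_TEMPLATES.items()
    }


def search_url(dork: str) -> str:
    """Turn a dork into a Google search URL with the query safely encoded."""
    return "https://www.google.com/search?" + urllib.parse.urlencode({"q": dork})


for category, dorks in build_dorks("example.com").items():
    print(category)
    for dork in dorks:
        print("   ", search_url(dork))
```

Run it against your own domain first. It doubles as the self-audit checklist for the defense tip later in this section.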

The Google Hacking Database

The Google Hacking Database (GHDB), hosted on Exploit-DB, maintains thousands of proven Google dorks for finding vulnerabilities. It’s terrifying how much sensitive data is just… indexed.

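Defense Tip

Use robots.txt properly: it only asks crawlers to skip pages, it does not protect them, and every path it lists is itself visible to attackers. Implement authentication on sensitive pages, and regularly Google your own domain with these dorks. You might be surprised what you find.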

Technology Fingerprinting: Know Your Enemy

Understanding what technologies a target uses is crucial. Each technology has its own set of known vulnerabilities.

What to Look For

  • Web servers: Apache, Nginx, IIS
  • Frameworks: React, Angular, Laravel, Django
  • CMS: WordPress, Drupal, Joomla
  • Databases: often exposed by error messages
  • Cloud providers: AWS, Azure, GCP
  • WAFs: Cloudflare, Akamai, AWS WAF

Detection Methods

HTTP headers reveal a lot:

Server: nginx/1.18.0
X-Powered-By: PHP/7.4.3
X-AspNet-Version: 4.0.30319
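
To pull headers like these out of a response automatically, something along these lines works as a starting point. It is a sketch, not a full fingerprinting tool: the three header names are just the ones shown above, `split_banner` assumes the common `product/version` banner shape, and `fetch_headers` is defined but not called because it sends a real HEAD request.

```python
import re
import urllib.request
from typing import Optional, Tuple

# Headers from the example above that tend to disclose the stack.
DISCLOSURE_HEADERS = ("server", "x-powered-by", "x-aspnet-version")


def fingerprint(headers: dict) -> dict:
    """Keep only technology-disclosing headers, keyed by lowercase name."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in DISCLOSURE_HEADERS if name in lowered}


def split_banner(value: str) -> Tuple[str, Optional[str]]:
    """Split 'nginx/1.18.0' into ('nginx', '1.18.0'); version may be None."""
    match = re.match(r"\s*([^/\s]+)(?:/(\S+))?", value)
    if match is None:
        return ("", None)
    return (match.group(1), match.group(2))


def fetch_headers(url: str) -> dict:
    """Send a HEAD request. Only use on systems you are allowed to test."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


sample = {"Server": "nginx/1.18.0", "X-Powered-By": "PHP/7.4.3", "Content-Type": "text/html"}
for name, value in fingerprint(sample).items():
    print(name, split_banner(value))
```

From the defender's side, an empty result from `fingerprint` is the goal: if your own servers return these headers, consider turning them off.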

Wappalyzer: browser extension that identifies technologies instantly.

WhatWeb: command-line alternative:

whatweb target.com

BuiltWith: online service showing a site’s technology stack.

Why This Matters

If I know you’re running WordPress 5.4.1, I can immediately search for CVEs affecting that version. If I see you’re using an old PHP version, I know to look for type juggling vulnerabilities, where loose == comparisons treat values of different types as equal.

The Wayback Machine: Ghosts of Websites Past

The Internet Archive’s Wayback Machine is a goldmine for hackers. It stores historical snapshots of websites, including:

  • Old pages that were removed (but might still exist)
  • Previous versions of JavaScript files (with hardcoded API keys)
  • Removed endpoints that are still functional
  • Old documentation revealing internal systems

How to Use It

Web interface:

https://web.archive.org/web/*/target.com/*

Command line with waybackurls:

echo target.com | waybackurls > urls.txt
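
If you don't have waybackurls installed, the Wayback Machine's CDX API returns similar data. The sketch below builds the query, parses the JSON (where the first row is a header), and keeps URLs whose paths end in extensions that are often worth a look. The extension list and function names are my own assumptions, so tune them to your target. `fetch_wayback` is defined but not called, since it hits the network.

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Assumed shortlist of extensions that often hold secrets or old code.
INTERESTING_EXTENSIONS = (".js", ".json", ".env", ".bak", ".sql", ".zip")


def cdx_url(domain: str, limit: int = 1000) -> str:
    """Build a CDX query for every archived URL under `domain`."""
    params = {
        "url": f"{domain}/*",
        "output": "json",
        "fl": "original",      # return only the original URL column
        "collapse": "urlkey",  # one row per distinct URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urllib.parse.urlencode(params)}"


def urls_from_cdx(rows: list) -> list:
    """Drop the header row and return unique URLs, sorted."""
    return sorted({row[0] for row in rows[1:] if row})


def interesting(urls: list) -> list:
    """Keep URLs whose path ends with one of the extensions above."""
    return [
        url for url in urls
        if urllib.parse.urlsplit(url).path.lower().endswith(INTERESTING_EXTENSIONS)
    ]


def fetch_wayback(domain: str) -> list:
    """Query the CDX API over the network and return archived URLs."""
    with urllib.request.urlopen(cdx_url(domain), timeout=60) as response:
        return urls_from_cdx(json.load(response))
```

Filtering on the path rather than the whole URL means `app.js?v=1` still matches, which matters because old JavaScript bundles are where hardcoded keys tend to turn up.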

Defense Tip

When deprecating endpoints, actually remove them from your server. Don’t just hide the links.

See you in the next part.